Difference between revisions of "NameIssue"

From Phyloinformatics
Revision as of 12:53, 15 December 2006

Two new Bio::SimpleAlign methods have been written and tested:

Generate and set unique short names

"$aln->set_displayname_safe" assigns serial names to the sequences and returns the original names in a hash reference so that they can be restored later.

    set_displayname_safe

      Title     : set_displayname_safe
      Usage     : ($new_aln, $ref_name)=$ali->set_displayname_safe()
      Function  : Assign machine-generated serial names to sequences in input order.
                  Designed to protect names during PHYLIP runs. Each name is a
                  10-character string from "S000000001" to "S999999999". Restore
                  the original names using "restore_displayname".
      Returns   : 1. a new $aln with system names;
                  2. a hash ref for restoring names
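
The naming scheme described in the POD can be illustrated without BioPerl. The following is only a minimal sketch in plain Perl, not the code in SimpleAlign.pm: the helper name `make_safe_names` and the example sequence names are hypothetical, and the sketch works on a list of names rather than on an alignment object.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): give each original name a
# 10-character serial name, in input order, and record how to map it back.
sub make_safe_names {
    my @original = @_;
    my (@safe, %restore);
    my $serial = 0;
    for my $name (@original) {
        $serial++;
        my $short = sprintf("S%09d", $serial);   # "S" plus 9 zero-padded digits
        push @safe, $short;
        $restore{$short} = $name;                # serial name => original name
    }
    return (\@safe, \%restore);
}

# Example names are made up for illustration.
my ($safe, $restore) = make_safe_names(
    'Homo_sapiens_cytochrome_b',
    'Pan_troglodytes_cytochrome_b',
);
print "$_ => $restore->{$_}\n" for @$safe;
```

Keying the hash by the serial name makes the later restore step a single lookup per sequence.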

Restore long names

"$aln->restore_displayname" restores the original sequence names from the hash reference returned by "set_displayname_safe".

     restore_displayname

      Title     : restore_displayname
      Usage     : $aln_name_restored=$ali->restore_displayname($hash_ref)
      Function  : Restore original sequence names (after running $ali->set_displayname_safe)
      Returns   : a new $aln with names restored.
      Argument  : a hash reference of names from "set_displayname_safe".
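
The restore step is the reverse lookup. Again, this is only a plain-Perl sketch under assumptions, not the method in SimpleAlign.pm: `restore_names` is a hypothetical helper, the example names are made up, and leaving unknown names unchanged is a choice made here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): replace each serial name with
# its original name. Names absent from the hash are left unchanged, which
# is an assumption of this sketch.
sub restore_names {
    my ($names, $restore) = @_;
    return [ map { exists $restore->{$_} ? $restore->{$_} : $_ } @$names ];
}

# Lookup hash of the kind returned by the naming step (serial => original).
my %lookup = (
    S000000001 => 'Homo_sapiens_cytochrome_b',
    S000000002 => 'Pan_troglodytes_cytochrome_b',
);

# The order of sequences may differ after an external program has run.
my $restored = restore_names([ 'S000000002', 'S000000001', 'outgroup' ], \%lookup);
print "$_\n" for @$restored;
```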

  1. New files: SimpleAlign.pm and SimpleAlign.t
  2. A use case: Run Phylip::SeqBoot while preserving names

<perl>
#!/usr/bin/perl -w
# Run SeqBoot (Phylip) without corrupting your sequence names
use strict;
use Bio::AlignIO;
use Bio::Tools::Run::Phylo::Phylip::SeqBoot;

# Read the alignment that carries the long sequence names
my $long_name_file = shift @ARGV;
my $in  = Bio::AlignIO->new(-file => $long_name_file);
my $aln = $in->next_aln();

my @params = ('datatype' => 'SEQUENCE', 'replicates' => 5);
my $seq = Bio::Tools::Run::Phylo::Phylip::SeqBoot->new(@params);

# Differs from the original "run": the names are also saved in $name_ref
my ($aln_ref, $name_ref) = $seq->run($aln);

my $aio = Bio::AlignIO->new(-file => ">alignment.bootstrap.new", -format => "clustalw");
foreach my $ai (@{$aln_ref}) {
    $ai = $ai->restore_displayname($name_ref);   # restore sequence names
    $aio->write_aln($ai);
}
</perl>