Difference between revisions of "NameIssue"

From Phyloinformatics
Line 1: Line 1:
- ###Two new Bio::SimpleAlign methods are written & tested: "$aln->set_displayname_safe" (assign serial names to sequences & store/resturn original names in a hash) and "$aln->restore_displayname". New files: SimpleAlign.pm and SimpleAlign.t
+ #Two new Bio::SimpleAlign methods are written & tested:
- ###A use case: Run Phylip::SeqBoot while preserving names
+ ##"$aln->set_displayname_safe" (assign serial names to sequences & store/return original names in a hash). POD:
+ <pre>
+     set_displayname_safe
+
+       Title     : set_displayname_safe
+       Usage     : ($new_aln, $ref_name)=$ali->set_displayname_safe()
+       Function  : Assign machine-generated serial names to sequences in input order.
+                   Designed to protect names during PHYLIP runs. Assign 10-char string
+                   in the form of "S000000001" to "S999999999". Restore the original
+                   names using "restore_displayname".
+       Returns   : 1. a new $aln with system names;
+                   2. a hash ref for restoring names
+
+ </pre>
+ ##"$aln->restore_displayname".
+ <pre>
+     restore_displayname
+
+       Title     : restore_displayname
+       Usage     : $aln_name_restored=$ali->restore_displayname($hash_ref)
+       Function  : Restore original sequence names (after running $ali->set_displayname_safe)
+       Returns   : a new $aln with names restored.
+       Argument  : a hash reference of names from "set_displayname_safe".
+
+ </pre>
+ ##New files: SimpleAlign.pm and SimpleAlign.t
+ #A use case: Run Phylip::SeqBoot while preserving names

  <perl>
  #!/usr/bin/perl -w
- # Run SeqBoot (Phylip) without
+ # Run SeqBoot (Phylip) without corrupting your sequence names

  use Bio::AlignIO;
Line 13: Line 39:
  my $in=new Bio::AlignIO(-file=>$long_name_file);
  my $aln=$in->next_aln();
- #my ($aln2,$ref_name)=$aln->set_displayname_safe();
- #print Dumper($ref_name);

  my @params = ('datatype'=>'SEQUENCE','replicates'=>5);
  my $seq = Bio::Tools::Run::Phylo::Phylip::SeqBoot->new(@params);

- #my $aln_ref = $seq->run($aln2);
+ my ($aln_ref, $name_ref) = $seq->run($aln); # diff to the original "run": save names in "$name_ref"
- my ($aln_ref, $name_ref) = $seq->run($aln);

- #my $aio = Bio::AlignIO->new(-format=>"phylip");
+ my $aio = Bio::AlignIO->new(-file=>">alignment.bootstrap.new",-format=>"clustalw");
- my $aio = Bio::AlignIO->new(-format=>"fasta");
- #$aln->set_displayname_flat();
- #my $aio = Bio::AlignIO->new(-file=>">alignment.bootstrap.new",-format=>"clustalw");
-
  foreach my $ai(@{$aln_ref}){
- #  $aio->write_aln($ai->restore_displayname($name_ref));
+    $ai=$ai->restore_displayname($name_ref); # restore sequence names
-   foreach my $seq ($ai->each_seq()) {
+    $aio->write_aln($ai);
-     print $seq->id(), "\n";
-   }
-
-   print "\n";
-    $ai=$ai->restore_displayname($name_ref);
-    foreach my $seq ($ai->each_seq()) {
-     print $seq->id(), "\n";
-   }
-   print "===\n"
- $aio->write_aln($ai);

  }
  </perl>

Revision as of 12:48, 15 December 2006

  1. Two new Bio::SimpleAlign methods have been written and tested:
    1. "$aln->set_displayname_safe" (assigns serial names to sequences and returns the original names in a hash). POD:
    set_displayname_safe

      Title     : set_displayname_safe
      Usage     : ($new_aln, $ref_name)=$ali->set_displayname_safe()
      Function  : Assign machine-generated serial names to sequences in input order.
                  Designed to protect names during PHYLIP runs. Assign 10-char string
                  in the form of "S000000001" to "S999999999". Restore the original
                  names using "restore_displayname".
      Returns   : 1. a new $aln with system names;
                  2. a hash ref for restoring names

    2. "$aln->restore_displayname". POD:
    restore_displayname

      Title     : restore_displayname
      Usage     : $aln_name_restored=$ali->restore_displayname($hash_ref)
      Function  : Restore original sequence names (after running $ali->set_displayname_safe)
      Returns   : a new $aln with names restored.
      Argument  : a hash reference of names from "set_displayname_safe".

    3. New files: SimpleAlign.pm and SimpleAlign.t
  2. A use case: Run Phylip::SeqBoot while preserving names

<perl>

#!/usr/bin/perl -w
# Run SeqBoot (Phylip) without corrupting your sequence names

use strict;
use Bio::AlignIO;
use Bio::Tools::Run::Phylo::Phylip::SeqBoot;
use Data::Dumper;

# Read the first alignment from the file named on the command line
my $long_name_file = shift @ARGV;
my $in  = Bio::AlignIO->new(-file => $long_name_file);
my $aln = $in->next_aln();

# Bootstrap the alignment: sequence data, 5 replicates
my @params = ('datatype' => 'SEQUENCE', 'replicates' => 5);
my $seq = Bio::Tools::Run::Phylo::Phylip::SeqBoot->new(@params);

# Differs from the original "run": the saved names also come back, in $name_ref
my ($aln_ref, $name_ref) = $seq->run($aln);

# Write every replicate in ClustalW format, with the original names restored
my $aio = Bio::AlignIO->new(-file => ">alignment.bootstrap.new", -format => "clustalw");
foreach my $ai (@{$aln_ref}) {
    $ai = $ai->restore_displayname($name_ref); # restore sequence names
    $aio->write_aln($ai);
}
</perl>
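
The name-protection idea behind the two methods can be sketched in plain Perl, without BioPerl. PHYLIP's classic format reserves 10 characters for each name, so long names that share a prefix can collide once truncated. The sketch below is only an illustration of the mapping described in the POD above, not the code in SimpleAlign.pm: the helper names `make_safe_names` and `restore_names` and the two example sequence names are assumptions made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): give each name a 10-character
# serial name ("S000000001", "S000000002", ...) in input order.
# Returns an array ref of serial names and a hash ref (serial => original).
sub make_safe_names {
    my @names = @_;
    die "too many names for a 9-digit serial\n" if @names > 999_999_999;
    my (@safe, %original_of);
    my $serial = 0;
    for my $name (@names) {
        my $safe_name = sprintf 'S%09d', ++$serial;
        push @safe, $safe_name;
        $original_of{$safe_name} = $name;
    }
    return (\@safe, \%original_of);
}

# Hypothetical helper: translate serial names back through the hash ref.
sub restore_names {
    my ($safe_names, $original_of) = @_;
    my @restored;
    for my $safe_name (@$safe_names) {
        die "no original name recorded for '$safe_name'\n"
            unless exists $original_of->{$safe_name};
        push @restored, $original_of->{$safe_name};
    }
    return @restored;
}

# Example names (made up): both would truncate to "Borrelia_b" in 10 characters.
my @long_names = ('Borrelia_burgdorferi_B31', 'Borrelia_burgdorferi_N40');

my ($safe, $map) = make_safe_names(@long_names);
my @restored     = restore_names($safe, $map);

print "$safe->[$_] => $restored[$_]\n" for 0 .. $#restored;
```

Because the serial names are assigned in input order and kept in a hash, the round trip does not depend on the names surviving the external program unchanged; only the 10-character serials need to survive.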