PhyloSoC:BioPerl integration of the NeXML exchange standard and Bio::Phylo toolkit/Benchmark Results

From Phyloinformatics
Revision as of 11:32, 20 July 2009 by Chmille4

Benchmarking

The two main methods of each module (_parse and write) were benchmarked using the Benchmark.pm module included with Perl. Several data files were used, and the results were normalized against the most time-consuming method (i.e., Bio::Nexml::write is set to 100% and the other methods are expressed relative to it).

Representative results

Below are the results for two NeXML documents taken from the nexml.org website. Characters.xml contains two relevant matrices and was used for the Bio::Nexml, Bio::AlignIO, and Bio::SeqIO tests; Trees.xml contains three trees and was used for the Bio::TreeIO and Bio::Nexml tests. The two files contain roughly the same amount of data.

Bio::TreeIO::nexml

_parse() 5 wallclock secs ( 4.67 usr + 0.04 sys = 4.71 CPU) @ 10.62/s (n=50) Normalized: 22%

write() 2 wallclock secs ( 2.10 usr + 0.01 sys = 2.11 CPU) @ 23.70/s (n=50) Normalized: 8%

Bio::AlignIO::nexml

_parse() 14 wallclock secs (13.28 usr + 0.07 sys = 13.35 CPU) @ 3.75/s (n=50) Normalized: 60%

write() 14 wallclock secs ( 9.25 usr + 0.08 sys = 9.33 CPU) @ 5.36/s (n=50) Normalized: 60%

Bio::SeqIO::nexml

_parse() 13 wallclock secs (13.13 usr + 0.08 sys = 13.21 CPU) @ 3.79/s (n=50) Normalized: 56%

write() 4 wallclock secs ( 3.90 usr + 0.03 sys = 3.93 CPU) @ 12.72/s (n=50) Normalized: 17%

Bio::Nexml

_parse() 11 wallclock secs (10.63 usr + 0.08 sys = 10.71 CPU) @ 4.67/s (n=50) Normalized: 47%

write() 23 wallclock secs (22.76 usr + 0.22 sys = 22.98 CPU) @ 2.18/s (n=50) Normalized: 100%

Analysis

Overall, the methods performed acceptably but are on the slow side: each benchmark ran 50 iterations (n=50), and the slowest method, Bio::Nexml::write, managed only 2.18 iterations per second. The benchmarks point to both Bio::AlignIO methods (60% each), Bio::Nexml::write (100%), and Bio::SeqIO::_parse (56%) as the most likely candidates for optimization.
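
The normalization used in the results can be re-derived from the raw timings. The sketch below is not part of the original benchmark code. It copies the wallclock and CPU seconds from the results into a table and expresses each method as a percentage of the slowest one. The page does not say which time base was used, so both are computed. The reported figures appear to track the wallclock ratios more closely than the CPU ratios (for example, Bio::AlignIO write at 60%), but that is an inference from the numbers rather than a documented fact. The names RESULTS and normalize are illustrative.

```python
# Timings copied from the results: (wallclock secs, CPU secs, reported %).
RESULTS = {
    "Bio::TreeIO::nexml::_parse":  (5, 4.71, 22),
    "Bio::TreeIO::nexml::write":   (2, 2.11, 8),
    "Bio::AlignIO::nexml::_parse": (14, 13.35, 60),
    "Bio::AlignIO::nexml::write":  (14, 9.33, 60),
    "Bio::SeqIO::nexml::_parse":   (13, 13.21, 56),
    "Bio::SeqIO::nexml::write":    (4, 3.93, 17),
    "Bio::Nexml::_parse":          (11, 10.71, 47),
    "Bio::Nexml::write":           (23, 22.98, 100),
}


def normalize(results, index):
    """Express each method's time as a percentage of the slowest method.

    index selects the time base: 0 for wallclock seconds, 1 for CPU seconds.
    """
    slowest = max(timing[index] for timing in results.values())
    return {name: 100.0 * timing[index] / slowest
            for name, timing in results.items()}


by_wallclock = normalize(RESULTS, 0)
by_cpu = normalize(RESULTS, 1)

# Print the reported value next to both recomputed values for comparison.
for name, (_wall, _cpu, reported) in RESULTS.items():
    print(f"{name:30s} reported {reported:3d}%  "
          f"wallclock {by_wallclock[name]:5.1f}%  CPU {by_cpu[name]:5.1f}%")
```

Because Benchmark.pm reports wallclock time in whole seconds, the wallclock-based percentages are coarse. Normalizing on CPU seconds would give a finer comparison if the benchmarks were rerun.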