Difference between revisions of "PhyloSoC:Biodiversity Conservation Algorithms and GUI"
Revision as of 20:03, 16 July 2007
- 1 News
- 2 Project Overview
- 3 Appropriate bio* package:
- 4 GUI Options
- 5 Gateway to other code
- 6 Timeline of Goals for Klaas
- 6.1 Weeks < 1
- 6.2 Week 1
- 6.3 Week 2
- 6.4 Week 3
- 6.5 Week 4
- 6.6 Week 5
- 6.7 Week 6
- 6.8 Mid term Evaluation (end Week 6)
- 6.9 Week 7
- 6.10 Week 8
- 6.11 Week 9
- 6.12 Week 10
- 6.13 Week 11
- 6.14 Week 12
- 6.15 Beyond SoC
- 7 Literature References
- 8 External Links
- This page started --[Klaas Hartmann] 09:53, 8 May 2007 (EDT)
More detail at: SoC Application
Appropriate bio* package:
- Yes, and this was probably also partially influenced by my perception that Rutger would play a more significant role in this project than it now appears he will. Also, BioPerl is the most advanced Bio* toolkit, and BioJava doesn't offer much functionality or many object models to assist with this project, so if the project were written in Java it would potentially use only a limited set of existing functionality from BioJava. It is unclear to me how extensive a change to the project plan the Google SoC rules allow at this point in the project. --Tobias.thierer 16:44, 8 May 2007 (EDT)
- I imagine if we had compelling reasons to choose a different Bio* package it would be okay as this is only a change of the project within the mentoring organisation -- especially if Hilmar and others in NESCent were happy for us to do this. If we wanted to utilise Mesquite or Geneious this might be a very different matter (I think this is probably an unlikely choice due to this and license issues for Geneious). --Klaas 17:50, 8 May 2007 (EDT)
- most comprehensive bio* package
- Rutger is very familiar with this language
- Perl has a tendency to encourage unclear code (Klaas' opinion)
- Python (especially with scipy) is great for scientific computing (Klaas' personal opinion)
- Klaas has some experience creating cross-platform stand-alone GUIs with Python and wxPython
- Tobias has only about 2 hours experience with Python (writing one script of probably less than 100 lines)
- Tobias is very familiar with Java
- Less need to worry about platform cross compatibility
This was suggested to Klaas by Arne Mooers
- I don't have much experience with Mesquite myself but have heard fairly negative comments about it at the Phyloinformatics hackathon. I think the main point of criticism was that the GUI was overloaded and busy with too many options in an unstructured way. I'm not sure if we want to follow that path. If we went for Java then my personal preference would probably be Geneious (which also has an open API and everyone with the free Geneious Basic could run your plugin), but of course I'm 100% biased because I work for Biomatters and know the Geneious API very well. --Tobias.thierer 16:58, 8 May 2007 (EDT)
- Existing GUI
- Existing user base
- Cross platform compatible (Java)
- User base does not really overlap with target audience
- GUI too extensive and complicated -- our features would be buried in it
Klaas' concluding thoughts
The existing code in the bio* packages that will affect this project is:
- Tree objects
- Routines for loading trees from files/databases
- Routines for displaying trees/creating graphical representations
- Existing implementations of indices (Bio::Phylo)
- BioJava doesn't have any of these, but JEBL (Java Evolutionary Biology Library) can read and write trees from/to Nexus files and has a fairly advanced tree viewer GUI component (I think this alone took longer to develop than this entire project aims to). I'm not entirely convinced by the JEBL tree model (it appears hard to create a good tree object model, at least in Java), but it shouldn't be too hard to write code for. I'm not sure what you mean by those species indices - I'll try calling you again in your office later today and we can discuss. --Tobias.thierer 16:49, 8 May 2007 (EDT)
- It would be great to have a tree viewing component that can just be dropped in place. Species indices are species specific measures that measure the relative biodiversity of a species. Examples include Equal Splits, Pendant Edge and Quadratic Entropy. They are all pretty easy to code up (probably knock them off in a day). --Klaas 17:38, 8 May 2007 (EDT)
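As a rough illustration of how small these indices are, here is a minimal Python sketch of Pendant Edge and Equal Splits. It is not the Bio::Phylo implementation: the tree representation (a dict mapping each non-root node to its parent and the length of the branch above it) and the function names are assumptions made for this example. Equal Splits is taken here to mean that each branch's length is shared equally among the daughter branches at every node below it.

```python
from collections import defaultdict

# Hypothetical example tree: node -> (parent, length of the branch above node).
# The root itself has no entry.
TREE = {
    "X": ("root", 2.0),
    "C": ("root", 3.0),
    "A": ("X", 1.0),
    "B": ("X", 1.0),
}


def pendant_edge(tree, leaf):
    """Length of the branch that connects the leaf to the rest of the tree."""
    return tree[leaf][1]


def equal_splits(tree, leaf):
    """Sum of branch lengths from leaf to root, each branch shared equally
    among the daughter branches at every node passed on the way up."""
    n_children = defaultdict(int)
    for parent, _length in tree.values():
        n_children[parent] += 1

    score = 0.0
    share = 1.0  # fraction of the current branch credited to this leaf
    node = leaf
    while node in tree:  # stops at the root, which has no entry
        parent, length = tree[node]
        score += length * share
        share /= n_children[parent]
        node = parent
    return score


if __name__ == "__main__":
    for name in ("A", "B", "C"):
        print(name, pendant_edge(TREE, name), equal_splits(TREE, name))
```

With this definition the Equal Splits values of all species add up to the total branch length of the tree, which gives a convenient sanity check.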
The features of the programming languages themselves that are desirable are:
- A good GUI library
- Interface to C/C++
- Cross platform binary compilation (except Java)
- Although that is possible, Perl also doesn't need cross platform binary compilation, does it (I think it's ok to require a perl interpreter)? Besides, for Java it is also possible to create installers (e.g. with install4j) that bundle the JRE and therefore appear just like a native application to the end user. --Tobias.thierer 16:54, 8 May 2007 (EDT)
As far as I can tell Python, Perl and Java all have several options for doing these things.
NB: This section is largely irrelevant if BioJava were chosen
- resulting code possibly faster
- stable for long computations
- producing binaries for several platforms may be a pain
- possible platform dependent bugs
- unsure if incorporating C/C++ code will be problematic
Options for producing binaries
- PerlBin -- no longer actively developed?
- PAR -- possibly the most reliable option.
- perl2EXE -- commercial, US$449
Options include XUL::Node (requires Firefox!) and the Google Web Toolkit (GWT).
- full control over the environment in which the algorithms run
- relatively simple interface construction
- need to find a server to run this program on
- possible stability issues for problems that take a long time to solve
- less control over the interface design
Klaas' concluding thoughts
Gateway to other code
- may not need to implement some algorithms in Perl
- C/C++ algorithms may be faster for large problems
- further work done by others in C/C++ can easily be incorporated
- Need to ensure the C/C++ code is suitably compiled for all target platforms
- May cause problems with cross-platform binaries for the GUI
Timeline of Goals for Klaas
I have modified the timeline from the original outline as I think it is important to get a semi-functional preview of the GUI working earlier. This will allow the GUI to be reviewed by some of the end users. The timeline is relatively ambitious: the core functionality I want to produce from this project should be implemented by the halfway point. Some of the goals listed after this point are optimistic and may not be realistically achievable.
Weeks < 1
- Klaas tells me that wxPerl looks like the most promising widget/GUI library, and that he wants to use PAR for compilation. Some of the Perl packages he intends to use contain C code which needs to be compiled specifically for each platform; at the end of the project we will need to create specific installers for each platform (Windows/MacOS X/Linux). --Tobias.thierer 00:03, 30 May 2007 (EDT)
- wxPerl provides the most OS-consistent interface, and there are several programs that use PAR to provide a standalone GUI. Although these are mostly Windows-focused, this should also be possible on Linux.
- GD::Graph will be used for producing graphs (for comparing indices etc.)
- PAR will be used to create a standalone program. This basically creates a zip file containing a Perl interpreter and all dependencies for that program. The final size should be between 5 and 10MB. A development system is required for each OS for which this will be distributed. I definitely want to target WinXP and Linux (I have build environments for both). OS X may have a default Perl installation with sufficient libraries to make a simpler approach possible; otherwise I'll have to create an OS X build environment too.
- Bio::Phylo provides most of the phylogenetic structure necessary.
- Bio::Tree::Draw should provide sufficient tree drawing capabilities (not a hugely important aspect of this program)
- wxGlade will be used for designing the GUI. This is free, runs under multiple OS's and generates Perl code for the GUI.
- Determine how to integrate the project with BioPerl and Bio::Phylo
- Implement a mock up of the GUI
- Familiarise myself with the Google Code repository (I have not used version control systems before)
- I've gained some familiarity with the Bio packages
- Bio::Phylo::Treedrawer will be used for drawing trees instead of Bio::Tree::Draw; however, this package requires a little work to resolve some bugs
- The mock up of the GUI is well underway and I am starting to get a good understanding of the wxPerl toolkit
- I am starting to use the Google svn
- For clarity I will initially program the algorithms using the Bio::Phylo::Forest::Tree object. For most applications/algorithms this should be sufficiently fast. Algorithms that require better efficiency should probably be implemented in C/C++ anyway. (A gateway to C/C++ algorithms is part of this project)
- Implement any species specific indices not already included in the package
- Produce the C wrapper for existing algorithm implementations
- Contact Fabio regarding making his code compatible with the C wrapper (it presently reads data from files and outputs to stdout)
- Resolve issues with Bio::Phylo::Treedrawer
NB: I decided to delay the interface to Fabio's algorithm as he will be visiting Christchurch in the next week, so I will contact him then.
- Produced a patch for wxGlade (a tool I am using for GUI design), which was not producing correct source code for Perl menus.
- Fixed some issues with Bio::Phylo::Treedrawer. Will shortly pass these back to Rutger for inclusion.
- Created a Wx drawing class for Bio::Phylo::Treedrawer.
- Set up a development environment for Windows and succeeded in compiling Windows binaries.
- Created a SurvivalProbability class that encapsulates linear survival probabilities to be assigned to species
- Implemented the portion of the GUI for the SurvivalProbability class
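To make the idea of a linear survival probability concrete, here is a minimal Python sketch. It is not the project's SurvivalProbability class (which is written in Perl); the class name LinearSurvival, its field names, and the interpretation itself are assumptions for illustration: the probability is taken to rise linearly from a baseline value with no spending to a protected value once the full conservation cost has been spent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearSurvival:
    """Hypothetical linear survival model for one species (illustration only)."""

    baseline: float   # assumed survival probability with no conservation spending
    protected: float  # assumed survival probability once full_cost has been spent
    full_cost: float  # assumed spending needed to reach the protected probability

    def __post_init__(self):
        if not 0.0 <= self.baseline <= self.protected <= 1.0:
            raise ValueError("need 0 <= baseline <= protected <= 1")
        if self.full_cost <= 0:
            raise ValueError("full_cost must be positive")

    def probability(self, spend):
        """Survival probability after spending `spend`, clamped to [0, full_cost]."""
        fraction = min(max(spend / self.full_cost, 0.0), 1.0)
        return self.baseline + (self.protected - self.baseline) * fraction


if __name__ == "__main__":
    kiwi = LinearSurvival(baseline=0.25, protected=0.75, full_cost=8.0)
    for spend in (0.0, 4.0, 8.0, 20.0):
        print(spend, kiwi.probability(spend))
```

Clamping the spend means that over-spending on one species never pushes its probability above the protected value, which keeps the model well defined when an algorithm explores arbitrary budget allocations.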
- Implement greedy algorithms for the NAP
- Create the GUI interface to the greedy algorithms
- Further work on the GUI. I am starting to get a good handle on wxWidgets, so hopefully further GUI programming will be faster.
- I've started implementing the greedy algorithms
- Presented a talk including this project at Evolution 2007 (Christchurch); some interest was expressed
- Discussed inclusion of other algorithms with Fabio and Steffen at Evolution
NB: This is a shortened week due to the Evolution 2007 conference.
- Finish implementing greedy algorithms for the NAP
- Preliminary interface of these with the GUI
- Implemented a class for storing greedy solutions; this handles cases where there are multiple equally good solutions
- Implemented greedy algorithms for the NAP
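For readers unfamiliar with the greedy approach, here is a minimal Python sketch of its simplest form: choosing k species to maximise rooted phylogenetic diversity (PD), the setting discussed by Steel (2005). It is not the project's Perl implementation and it ignores the costs and survival probabilities of the full NAP. The tree representation, the function name greedy_pd, and the tie-breaking rule (first species in the input order wins) are assumptions for this example.

```python
# Hypothetical example tree: node -> (parent, length of the branch above node).
# The root itself has no entry.
TREE = {
    "X": ("root", 2.0),
    "C": ("root", 4.0),
    "A": ("X", 1.0),
    "B": ("X", 1.0),
}


def greedy_pd(tree, leaves, k):
    """Pick up to k leaves, each time adding the leaf that contributes the
    most branch length not yet covered by the leaves already chosen."""
    covered = set()  # nodes whose branch above is already counted
    chosen = []
    total = 0.0
    for _ in range(min(k, len(leaves))):
        best_leaf, best_gain = None, -1.0
        for leaf in leaves:
            if leaf in chosen:
                continue
            # Walk towards the root until we hit an already covered branch.
            gain, node = 0.0, leaf
            while node in tree and node not in covered:
                parent, length = tree[node]
                gain += length
                node = parent
            if gain > best_gain:  # strict '>' keeps the first leaf on ties
                best_leaf, best_gain = leaf, gain
        # Mark the newly covered path.
        node = best_leaf
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
        chosen.append(best_leaf)
        total += best_gain
    return chosen, total


if __name__ == "__main__":
    print(greedy_pd(TREE, ["A", "B", "C"], 2))
```

In this example A and B tie in the second step, and the sketch simply keeps the first one; a solution class that records all equally good choices, as described above, would store both.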
- Incorporate Fabio's algorithm
- Create useful algorithm output for the GUI
- Implemented XS Interface to Fabio's algorithm
- Implemented Data structure for non-greedy solutions
- Implemented XS Interface to wxGraphics* in wxPerl (primarily to learn XS, but wxGraphics* will prove useful for the GUI implementation)
- Work on GUI delayed due to complications with some intricacies of XS
- Work on GUI front-end for algorithm output
- Re-evaluate project goals -- it may be better to implement a smaller number of features in the GUI well than to implement the initial ambitious functionality somewhat poorly. This is also motivated by chatting with people at the Evolution conference -- there seems to be more interest in simple methods than in more complicated alternatives.
- Productive meeting with Tobias. We discussed the interface and have come up with some ideas that I think will make it much more accessible.
- Making good progress towards implementing the above ideas.
Mid term Evaluation (end Week 6)
- Most of the algorithms should be completed and a preliminary GUI should be available
- The algorithms are in place. Some final re-organisation is still necessary.
- The GUI is still in its early stages. My goal of having a mostly functional implementation at this point was probably a bit overambitious (I was aware that the original timeline I proposed was ambitious).
- Respond to any mid term evaluation comments
- Continue GUI implementation
The following outcomes are for greedy algorithms. The extension to non-greedy algorithms is straightforward and simply involves creating the appropriate hooks for that results data structure.
- Basic solution analysis is implemented
- Solution/species specific index comparison is nearly implemented
- Finish work on solution comparison panel
- Create hooks for non-greedy solutions
- Create solution generation GUI panel
- Update cost/benefit and tree selection panels
- Address comments on GUI
- Time permitting, implement some basic sensitivity analysis
- Document project
- Tidy up loose ends
- Document project
- Tidy up loose ends
This work is of great interest to me and I will continue adding to this project. The following features are of particular interest:
- Multiple tree algorithms
- Conservation measures that are not species-specific
- Sensitivity analysis
Steel, M. 2005. Phylogenetic diversity and the greedy algorithm. Systematic Biology 54:527--529.