R Hackathon 1/Mesquite-R Integration SG

From Phyloinformatics

Participants: Hilmar Lapp, Wayne Maddison

Targets:

  1. Mesquite->R:
    • Enable R packages and their functions to be called from within Mesquite, with each R package appearing as a choice for the appropriate operation in the Mesquite UI
  2. R->Mesquite: Make Mesquite classes, methods, and data objects usable from within R in a transparent fashion
    • Methods that execute within Mesquite can be used as otherwise "normal" R functions and objects, and expect "normal" R objects as arguments
    • Mesquite data and result objects appear transparently as "normal" R objects, or can be converted to such using standard R facilities (e.g., as())

Accomplishments:

  1. Mesquite->R:
    • Wrapper library (rmLink), external library for supporting R<->Java communication (JRI), example data, and folder structure, all assembled for download
    • Installation (described in the README included with the download) consists of a series of manual steps, roughly as follows:
      1. Move the rmLink library code to the correct location within the Mesquite folder tree
      2. On platforms other than Mac OSX, download and install JRI. On Mac OSX, move the JRI library to the correct location within the Mesquite folder tree
      3. For Mac OSX, edit ~/.MacOSX/environment.plist to have R_HOME set properly
      4. Install at least ape in R.
    • With successful installation, APE and GEIGER examples should run.
  2. R->Mesquite:
    • Mesquite wrapper library rmLink that simplifies communication with Mesquite and makes processing of result data (trees, matrices, named values) from R more efficient.
    • Installable package RMesquite that has declared dependencies on rJava (which enables using Java classes from within R) and ape, which are therefore installed too. The package also automatically installs the rmLink Mesquite wrapper library (in Java).
    • Ability to convert Newick strings and phylo (package ape) objects to Mesquite Tree objects, and R arrays and matrices to Mesquite character matrices
    • Transparently coerce Mesquite Tree and Character data objects to R phylo and matrix objects, using standard as.matrix() and as.phylo() functions.
    • Read NEXUS files into a Mesquite "Project" and inspect the contents within the project.
    • R methods to execute Mesquite's BiSSE likelihood calculation and ancestral state reconstruction methods, with standard R objects as arguments. There are also generic versions that would allow calling other Mesquite classes that implement NumberForTreeAndCharacter or ancestral state reconstruction.
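
The Mesquite->R installation steps above depend on R_HOME being set so that JRI can locate R. As a minimal sketch of a sanity check that could be run before starting Mesquite, the following Java class reports whether R_HOME is set and points to an existing directory. The class name RHomeCheck and its messages are hypothetical and are not part of rmLink, JRI, or Mesquite.

```java
import java.io.File;

// Hypothetical helper, not part of rmLink, JRI, or Mesquite.
class RHomeCheck {

    // Returns a short human-readable status for a candidate R_HOME value.
    public static String describe(String rHome) {
        if (rHome == null || rHome.trim().isEmpty()) {
            return "R_HOME is not set";
        }
        File dir = new File(rHome);
        if (!dir.isDirectory()) {
            return "R_HOME does not point to a directory: " + rHome;
        }
        return "R_HOME looks usable: " + dir.getAbsolutePath();
    }

    public static void main(String[] args) {
        // Reads the value from the environment of the current process.
        System.out.println(describe(System.getenv("R_HOME")));
    }
}
```

The check only confirms that the directory exists; it does not verify that the directory contains a working R installation.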

Remaining issues & future goals:

  • Vignettes and API documentation in RMesquite are still missing.
  • Provide coercion of Mesquite Tree and Character data objects to phylo4 and phylo4d S4 classes.
  • Provide coercion for Mesquite Project object to an R object or class (e.g., phylo4d?).
  • Provide a way to inspect which "Employees" Mesquite has available for a particular general analysis class to hire, and a way to pass a choice into Mesquite.
  • Starting Mesquite from within R is unstable, at least on Mac OSX 10.4, but seems to work OK on Windows, and probably also on Mac OSX 10.5. On Mac OSX 10.4, if R crashes, it does so soon after Mesquite is started. If it doesn't, it usually remains stable. The issue is most likely an AWT thread concurrency problem, despite using the headless version of Mesquite (which in theory should not be using AWT at all).
    • R can in fact start up the full GUI version of Mesquite if started from the JGR console instead of the terminal version of R or the regular GUI version. This would allow access to the many sophisticated visualization capabilities of Mesquite.
  • Get Mesquite calling R fully working and easily installable on Windows and Linux.
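
For the suspected AWT issue noted above, one way to investigate is to confirm from inside the JVM whether AWT is actually running in headless mode. The following is a minimal sketch using only the standard java.awt.headless system property and GraphicsEnvironment.isHeadless(). The class name HeadlessCheck is hypothetical, and the sketch does not reproduce how Mesquite, rmLink, or rJava configure the JVM.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical diagnostic, not part of Mesquite or rmLink.
class HeadlessCheck {

    // Requests headless mode and reports what AWT then sees.
    // The property is read once and cached, so it has to be set
    // before the first AWT use for the request to take effect.
    public static boolean requestHeadless() {
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("headless = " + requestHeadless());
    }
}
```

If this reports false in a JVM that was meant to be headless, some AWT code was probably initialized before the property was set, which would be consistent with the concurrency suspicion above but does not by itself confirm it.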