R Hackathon 1/Teleconferences

From Phyloinformatics

Revision as of 10:34, 20 November 2007

1st Teleconference 11/16/2007

Please note the planning steps page for additional information on specific items.


Participants (this might not be a complete list; please add yourself if you have been missed, or remove yourself if you have been added erroneously):

Michael Alfaro, Marguerite Butler, Ben Bolker, Richard Despar, Joe Felsenstein, Luke Harmon, Andrew Hipp, Gene Hunt, Steve Kembel, Damien de Vienne, Wayne Maddison, Peter Midford, Brian O’Meara, Emmanuel Paradis, Samantha Price, Brian Sidlauskas, Stacey Smith, Peter Waddell, Todd Vision

1. Introduction - why are we here?

To agree on goals for the hackathon - get into meaningful subgroups - interact. There will be at least one more teleconference to finalise goals before the hackathon.

2. Priorities and subgroups

  • Data representation - Standardizing data formats in R will be quite useful. It would be good to sketch out beforehand what this might look like: rooting, branch lengths, labels, data at tips and nodes, etc. Start this off over the email list and wiki; please edit collaboratively.
  • Sub-groups – please follow the link to create new groups, move yourself etc.
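To help start the data representation discussion, the items listed above (rooting, branch lengths, labels, data at tips and nodes) can be sketched as a single container. This is only an illustrative sketch, not an agreed standard: the class name `Tree`, its field names, and the integer node ids are all hypothetical, and it is written in Python purely as neutral notation rather than as a proposal for the R implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Tree:
    """Hypothetical container for the items under discussion (not a standard)."""
    edges: List[Tuple[int, int]]                   # (parent, child) node ids
    tip_labels: Dict[int, str]                     # node id -> label, tips only
    rooted: bool = True                            # rooting is stored explicitly
    branch_lengths: Optional[List[float]] = None   # one per edge, in edge order
    node_labels: Dict[int, str] = field(default_factory=dict)        # internal nodes
    data: Dict[int, Dict[str, float]] = field(default_factory=dict)  # traits per node

    def tips(self) -> List[int]:
        """Return node ids that never appear as a parent."""
        parents = {p for p, _ in self.edges}
        children = {c for _, c in self.edges}
        return sorted(children - parents)

    def root(self) -> Optional[int]:
        """Return the single node that is never a child, or None if unrooted."""
        if not self.rooted:
            return None
        parents = {p for p, _ in self.edges}
        children = {c for _, c in self.edges}
        candidates = parents - children
        return next(iter(candidates)) if len(candidates) == 1 else None

    def validate(self) -> None:
        """Raise ValueError if the pieces are inconsistent with each other."""
        if self.branch_lengths is not None and len(self.branch_lengths) != len(self.edges):
            raise ValueError("need exactly one branch length per edge")
        if set(self.tips()) != set(self.tip_labels):
            raise ValueError("every tip, and only tips, must have a tip label")
        if not set(self.data) <= {n for edge in self.edges for n in edge}:
            raise ValueError("data attached to a node that is not in the tree")


# Made-up example: the tree ((A,B),C) with node 4 as root and node 5 internal.
example = Tree(
    edges=[(4, 5), (5, 1), (5, 2), (4, 3)],
    tip_labels={1: "A", 2: "B", 3: "C"},
    branch_lengths=[1.0, 0.5, 0.5, 1.5],
    data={1: {"body_size": 2.3}, 5: {"body_size": 2.0}},  # data at a tip and a node
)
example.validate()
```

Writing the representation down this explicitly exposes the questions each package would need to answer, such as whether branch lengths are optional and whether data may be attached to internal nodes.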

3. Preparations by participants

  • Lightning talks - is there a need for quick talks at the start of the hackathon to let everyone know what you are working on? It was generally agreed that, in order to minimize the time spent on presentations, everyone should aim to get as much of this information as possible onto the wiki. To that end, we have set up a wiki page for everyone to populate with overviews of each package, in particular documenting future directions and known problems. The decision on whether to do any lightning talks will be deferred until the next conference call.
  • Brainstorming sessions - Do we need them? A suggestion was made to have an optional session that would include some of the remote participants. No final decision made.
  • Bootcamps - intensive tutorials. Are there technical pieces of knowledge that people need, or will you know everything that you need to know before you arrive? It was generally agreed that we might need bootcamps on documentation and vignette writing. An S4 bootcamp? A Sweave bootcamp? Version control? We need to get together and discuss this on the wiki.

4. Documentation and testing

There will be several 'end-users' at the hackathon (Michael Alfaro, Samantha Price, Brian Sidlauskas, Stacey Smith, Amy Zanne) who have familiarity with comparative methods in R and varying degrees of coding experience. They will be interacting with the programmers to help write documentation and test code. We want to work out the best way of utilising their talents: is it writing vignettes, writing documentation, or testing? What homework will the end-users have to do to achieve these aims? Please discuss these issues here.

5. Source code repository (or repositories)

Do existing packages have repositories? Does anyone use Google Code or SourceForge? No one is currently using them, but is it a good idea to start?

6. Misc. discussion

  • Are people interested in getting R to talk to CIPRES? King and Butler are interested - anyone else?
  • Can Mesquite (Java) serve as a calculation engine for missing info in R? It was generally agreed by the end-users that this would be useful. Wayne Maddison would like to explore this in preparation for the hackathon – please use the mailing list to let him know if you are interested in working with him on it.
  • Sub-groups - do we need to include someone who can work on the post-NEXUS formats (e.g. Rutger Vos)?

7. IT logistics

  • Loaner computers - please let us know if you need one
  • Shared file space - we will set up a WebDAV share so that we can share documents (separate from the source code repositories, Subversion, etc.).


Note for participants: all of the items following the first two are topics posed for discussion, rather than decisions already made. We encourage you to voice any and all feedback, including that pursuing the topic is not desirable (if that is what you feel).

  1. Welcome, review of purpose and general objectives (TJV)
  2. Outline of planning schedule ahead (HL)
  3. Introductions (everyone)
  4. Priorities and subgroups (SK, BCO) (5mins)
  5. Preparations by participants (HL) (25mins)
    • Presentations
      • Lightning (< 5-10mins) talks - purpose(s), topics, and presenters
      • Needs for full-length (> 15min) talks?
      • Brainstorming current or future challenges - useful topics, presenters?
    • Bootcamps - purpose, needs, and presenters
    • Compiling package-specific information on the wiki - e.g., overview, relevant programming info, future goals
    • Tabulating metadata across packages (as started by Brian O'Meara) - methods, supported formats and analysis methods, visualization capabilities
    • Describing internal representation of data from test files in each package on the wiki
    • Reading list - suggestions (both for possible gaps and for recommended reading)
    • Other preparations we or others can facilitate or help with?
  6. Documentation & testing (SP, SK) (5mins)
    • R documentation and vignette writing - who has experience, and would be willing to help train the users? Examples to draw from?
    • Collection of data for testing and validation (such as tree files for testing)
  7. Source code repository (or repositories) (HL) (5mins)
    • Survey of current source code repository and versioning setup for participating packages
    • Assessing need for help with publicly accessible repository
    • Assessing need(s) for NESCent-run repository
    • Code branching needs
    • Other source code-related preparation needs
  8. IT logistics (HL)
    • Computers - will we need loaners?
    • Network access - will anyone need wired network?
    • File share needs
  9. Other homework? (5mins)
  10. Q & A (10mins)


Action Items:

    • begin work on crafting data standards: what do we want, what formats should we consider
    • convert test set of sample trees and datasets to internal representation for each package, summarizing info here.
    • write package overviews, including future goals
    • send help files for functions not already available here to Brian O'Meara
    • learn how to write documentation, including vignettes, for R. This may include learning LaTeX, though there are various programs that make using it easier.
    • start emailing each other for discussions
    • misc.: investigate R->Java (Mesquite) process (see rJava); look into linking to CIPRES, perhaps inviting additional participants for this; start prioritizing which new methods to add; think about source code repositories (rforge? r-forge? SourceForge? Google Code?).