PhyloSoC: Extending Jalview support for handling RNA


Author and Relevant links

Jan Engelhardt

Jalview Homepage

RFAM

VARNA

Abstract

Jalview is an alignment editor widely used on various web pages (e.g. Pfam, Rfam). It can also be used as a stand-alone application. However, like most bioinformatics tools, it was developed with protein sequences in mind and is not yet optimally prepared for use on ncRNAs. The main focus of this year's project will be to embed the VARNA (Visualization Applet for RNA) secondary structure display into Jalview's desktop application. Additionally, other structure features will be added.

Project Goals

  1. Update last year's code
  2. Secondary structure data model
  3. Embedding VARNA
  4. Editing secondary structures
  5. Communication between Jalview and VARNA
  6. VARNA display
  7. Consensus/conservation displays
  8. Vienna Package in JABAWS

Use cases for VARNA in Jalview

  1. View the consensus structure for an alignment. This can now be done, but I think you need to explicitly call this structure 'Consensus' in the side panel and in the title shown in VARNA.
  2. View how the consensus structure fits a particular sequence in the alignment - this is the trimmed WUSS string. Here, you need visual indications of whether the structure fits or not (check for compatible base pairing as well as gaps).
  3. View a sequence's own secondary structure(s). Here, there will probably be some informative label on the annotation row that holds the secondary structure for the sequence (perhaps a special label indicating when this sequence forms this structure). All distinct 'sequence associated' secondary structures need to be distinguished in the VARNA side panel.
  4. Compare two (or more) secondary structures. The 'AlignmentDemo' you mentioned looks like it allows two secondary structures to be compared. This could be used for:
    1. comparing an observed secondary structure for a sequence against the consensus secondary structure for the alignment.
    2. comparing different observed secondary structures for the same sequence.
    3. comparing secondary structures from different sequences.
  5. Edit or create a new secondary structure. In any of the above viewing situations, the user might want to edit or create a new secondary structure entry. There are two situations where this is probably useful:
    1. curating a given secondary structure in the light of new knowledge, or of a comparison with the consensus for an alignment.
    2. transferring a consensus prediction onto a specific sequence. This is essentially the next thing the user might want to do after use case 2 (mapping the consensus onto a specific sequence), so you might need to provide a 'transfer' or 'save structure on sequence' button that saves the trimmed version as a new annotation row specifically associated with the given sequence, and then allows the user to edit the newly created structure.
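
The 'trimmed' structure from use case 2 can be illustrated with a minimal sketch. It assumes a simplified notation with only `(`, `)` and `.` (real WUSS strings use further bracket types), treats `-` and `.` in the sequence as gaps, and uses a hypothetical class name `ConsensusTrimmer` that is not part of Jalview. It only handles gaps; checking for compatible base pairing would be a separate step.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical helper for illustration only; not part of Jalview or VARNA. */
class ConsensusTrimmer {

    /** partner[i] is the column paired with column i, or -1 if column i is unpaired. */
    static int[] pairTable(String structure) {
        int[] partner = new int[structure.length()];
        Arrays.fill(partner, -1);
        Deque<Integer> open = new ArrayDeque<>();
        for (int i = 0; i < structure.length(); i++) {
            char c = structure.charAt(i);
            if (c == '(') {
                open.push(i);
            } else if (c == ')') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')' at column " + i);
                }
                int j = open.pop();
                partner[i] = j;
                partner[j] = i;
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unbalanced '(' at column " + open.peek());
        }
        return partner;
    }

    static boolean isGap(char c) {
        return c == '-' || c == '.';
    }

    /**
     * Maps a consensus structure onto one aligned sequence: gapped columns are
     * dropped, and a bracket whose partner column is a gap becomes unpaired ('.').
     */
    static String trim(String consensus, String alignedSeq) {
        if (consensus.length() != alignedSeq.length()) {
            throw new IllegalArgumentException("Structure and sequence differ in length");
        }
        int[] partner = pairTable(consensus);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < consensus.length(); i++) {
            if (isGap(alignedSeq.charAt(i))) {
                continue; // column absent from this sequence
            }
            int j = partner[i];
            boolean partnerMissing = j >= 0 && isGap(alignedSeq.charAt(j));
            out.append(partnerMissing ? '.' : consensus.charAt(i));
        }
        return out.toString();
    }
}
```

For example, `ConsensusTrimmer.trim("((..))", "GC-AGC")` drops the gapped loop column and gives `((.))`, while `ConsensusTrimmer.trim("((..))", "-CAAGC")` gives `(..).` because the last base has lost its pairing partner.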


Timeline

“Community Bonding Period”

  • Before 5/23 - Subscribe to mailing lists, set up the development environment (Eclipse, version control system). Become familiar with last year's code and the Jalview code.

Gather some test data which can be used to evaluate the proposed goals after their implementation. Naturally, Rfam will be a good data source for that. Start coding early to build up some buffer time.

Achievements: Joined countless mailing lists; created this nice page; IDE (Eclipse+EGit) was set up;
Test sets were downloaded: RF00019 (Y RNA, 116 RNAs), RF00198 (SL1 RNA, 32 RNAs), RF00108 (SNORD 116, 1,434 RNAs), RF00006 (Vault RNA, 75 RNAs)

Official Coding Start, 5/23-5/29

Goal 1: Port Lauren’s GSOC code into the latest codebase.
Test: Check manually if Lauren’s features still work in the new codebase.

Achievements: Lauren's code was ported; functionality was tested with the test sets; got in contact with VARNA developer;

Week 2, 5/30-6/5

Goal 2: Check and fix jalview.io.AnnotationFile to ensure Jalview can export and import RNA secondary structure annotation for a sequence, group or alignment.
Minor Goals: Undo in RNA mode doesn't work; RNA helices coloring is not updated properly
Goal 3: Improved secondary structure data model.
Test: Some large alignment sets will be used to evaluate the performance of the data structure.

Achievements: The existing secondary structure model was tested with large data sets (up to 1,434 sequences). The existing data model seems compatible with VARNA and will be retained. The goal 2 bug is solved.

Week 3, 6/6-6/12

Start embedding a VARNA window.

Achievements: Jmol integration in Jalview was analyzed; new classes for VARNA were created: jalview/ext/varna/JalviewVarnaBinding.java, jalview/ext/varna/VarnaCommands.java, jalview/gui/AppVarna.java, src/jalview/gui/AppVarnaBinding.java, jalview/jbgui/GRnaStructureViewer.java;

Week 4, 6/13-6/19

Finish embedding a VARNA window.
Goal 4: A VARNA window is embedded in the Jalview Swing interface. Necessary menu items are included in the Jalview alignment window.
Test: The embedded window will be evaluated manually using Rfam alignments.

Achievements: The created classes were adapted for VARNA; AppVarnaBinding.java was adapted according to fr.orsay.lri.varna.applications.VARNAGUI.java; the VARNA GUI can be called from within Jalview; 'View RNA structure' method added via PopupMenu.java;

Week 5, 6/20-6/26

Goal 5: Add functionality for editing secondary structures and fix the related issues from last year's code.
Minor Goals: VARNA classes could be more structured;
Test: Edit the structure in the alignment and see if it works.

Achievements: RNA functionality is only displayed for nucleotide alignments; VARNA editing menus were added again to allow communication between Jalview and VARNA in both directions; the current structure can now be used correctly and the necessary methods are implemented in the Sequence.java and Alignment.java classes;

Week 6, 6/27-7/3

Goal 6: Make sure mouse events are transmitted from Jalview to VARNA and vice versa.

  • Use-cases of RNA editing with Jalview have to be clarified (prototype to Paul Gardner)
  • Different sequences should get their own structures
  • New structures should be accessible in a PDB-like manner
  • The issue of including the VARNA window in the Jalview.desktop must be solved
  • Could VARNA be used to edit sequence and structure, too?

Test: Different mouse events have to be tried in both applications to check that the communication works properly.

Achievements: Use cases of RNA editing were discussed with Paul Gardner on the jalview-discussion list; the result can be seen under 'Project goals';
The VARNA window is now fully included in the Jalview desktop;
VARNA cannot easily be used for editing, since its existing components do not edit entries but create new ones; the sequence should not be edited from it; a function to edit the structure has to be implemented;
New structures can be transmitted simply via the Stockholm format; the existing Stockholm parser must be adapted to allow structures for every sequence;

Week 7, 7/4-7/10

Prepare mid-term evaluation.

Jalview to Varna:

  1. send (keyboard-based) structure edits from Jalview to VARNA
    • Create an interface for making structure editing in Jalview possible
    • Allow undo for structure edits
  2. send (keyboard-based) sequence edits from Jalview to VARNA
    • append jalview.gui.AlignViewport.addPropertyChangeListener( .. )
  3. send selection from Jalview to VARNA (AppVarna listens for selection events using structure.SelectionListener)
  4. highlight current mouse position in sequence on 2D structure (highlightSequence method on structure.SequenceListener)
    1. there may be a need to create a sub-interface for listening to sequence highlighting and colouring events.
  5. colour VARNA by sequence in Jalview (an example already exists with Jmol)

Varna to Jalview:

  1. pass structure edits to alignment annotation rows in jalview alignment view
  2. get currently selected positions from VARNA and send them to the Jalview window to select regions of sequence.
  3. get colouring/annotation
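
Passing highlights and selections in either direction needs a translation between alignment columns (which include gaps) and residue positions in the ungapped sequence. The following is a minimal sketch of that mapping only, assuming the 2D view shows the ungapped sequence and that `-` and `.` denote gaps. The class `ColumnMapper` is hypothetical and does not use or represent the Jalview or VARNA listener APIs.

```java
/** Hypothetical helper for illustration only; not part of Jalview or VARNA. */
class ColumnMapper {

    static boolean isGap(char c) {
        return c == '-' || c == '.';
    }

    /** result[col] is the 0-based residue index at that alignment column, or -1 for a gap. */
    static int[] columnToResidue(String aligned) {
        int[] map = new int[aligned.length()];
        int residue = 0;
        for (int col = 0; col < aligned.length(); col++) {
            map[col] = isGap(aligned.charAt(col)) ? -1 : residue++;
        }
        return map;
    }

    /** result[residue] is the 0-based alignment column holding that residue. */
    static int[] residueToColumn(String aligned) {
        int count = 0;
        for (int col = 0; col < aligned.length(); col++) {
            if (!isGap(aligned.charAt(col))) {
                count++;
            }
        }
        int[] map = new int[count];
        int residue = 0;
        for (int col = 0; col < aligned.length(); col++) {
            if (!isGap(aligned.charAt(col))) {
                map[residue++] = col;
            }
        }
        return map;
    }
}
```

A mouse-over at column 3 of `G-CA.U` would map to residue 2 via `columnToResidue`, and a selection of residue 3 in the structure view would map back to column 5 via `residueToColumn`.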


Week 8, 7/11-7/17

Write and submit mid-term evaluation

Goal 8: Include a method for calculating a consensus secondary structure either from the annotated structure or an additional tool (e.g. RNAalifold or R-Coffee using JABAWS).
Test: Compare the consensus structure calculated in Jalview with the consensus structure calculated by the original tool and with the Rfam annotation.


Week 9, 7/18-7/24

Goal 7: Enable the user to Save/Restore a VARNA display to a Jalview project.
Test: A saved Jalview project must have the same VARNA display after reopening.

Achievements:
  • The Stockholm parser was extended so that secondary structures associated with a specific sequence (#=GR) can be processed by Jalview
  • Peter gave me a thorough introduction to JABAWS, which is needed to include parts of the ViennaRNA package in Jalview
  • VARNA state:
    • there is currently no way to save work states in VARNA
    • Yann Ponty (the VARNA developer) assured us that he could implement saving for a basic set of VARNA features
    • we prioritised some VARNA states we would like to get saved
    • since this must be implemented by Yann, it will probably take some time before it can be used in Jalview
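
As an illustration of the per-sequence structure lines mentioned above: in the Stockholm format they have the form `#=GR <seqname> SS <text>`, whereas the alignment-wide consensus uses `#=GC SS_cons <text>`. The sketch below collects the `#=GR ... SS` lines from a Stockholm string. It is not the Jalview parser; the class name `StockholmSsReader` and its method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for illustration only; not the Jalview Stockholm parser. */
class StockholmSsReader {

    /**
     * Returns sequence name -> secondary structure string, built from all
     * "#=GR <seqname> SS <text>" lines. Text from repeated lines for the same
     * sequence (interleaved files) is concatenated in file order.
     */
    static Map<String, String> readPerSequenceStructures(String stockholm) {
        Map<String, StringBuilder> collected = new LinkedHashMap<>();
        for (String line : stockholm.split("\\R")) {
            String[] fields = line.trim().split("\\s+", 4);
            if (fields.length == 4 && fields[0].equals("#=GR") && fields[2].equals("SS")) {
                collected.computeIfAbsent(fields[1], k -> new StringBuilder())
                         .append(fields[3].trim());
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : collected.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }
}
```

The consensus line (`#=GC SS_cons`) is deliberately ignored here, so that only structures tied to a specific sequence are returned.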

Week 10, 7/25-7/31

Goal 9: Implement a 'pair consensus' annotation row based on the consensus secondary structure.
How to do that?

  1. Implement a method to check if a base pair is valid (AT,GC,GT) or not.
  2. Probably extend the RNA data structure to save that information
  3. Implement a method for counting the valid and invalid base pairs in a column
  4. Introduce an additional line in the AnnotationPanel visualizing the 'pair consensus' annotation
  5. Think about introducing different modes of 'RNA helices' coloring, e.g. according to RNAalifold
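
Steps 1 and 3 above can be sketched as follows. This is a minimal illustration, not the planned Jalview implementation: it assumes a consensus structure written with only `(`, `)` and `.`, treats T as U so that AT and GT in the list correspond to A-U and G-U, and uses a hypothetical class name `PairConsensus`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper for illustration only; not part of Jalview. */
class PairConsensus {

    /** True for A-U, G-C and G-U pairs in either orientation; T is treated as U. */
    static boolean isValidPair(char a, char b) {
        String pair = ("" + a + b).toUpperCase().replace('T', 'U');
        switch (pair) {
            case "AU": case "UA":
            case "GC": case "CG":
            case "GU": case "UG":
                return true;
            default:
                return false; // includes gaps and any other symbols
        }
    }

    /**
     * For every paired column of the structure, counts the sequences in which
     * the two paired columns hold a valid base pair. Unpaired columns stay 0.
     */
    static int[] validPairCounts(String structure, String[] alignment) {
        for (String seq : alignment) {
            if (seq.length() != structure.length()) {
                throw new IllegalArgumentException("Sequence and structure differ in length");
            }
        }
        int[] counts = new int[structure.length()];
        Deque<Integer> open = new ArrayDeque<>();
        for (int i = 0; i < structure.length(); i++) {
            char c = structure.charAt(i);
            if (c == '(') {
                open.push(i);
            } else if (c == ')') {
                if (open.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')' at column " + i);
                }
                int j = open.pop();
                int valid = 0;
                for (String seq : alignment) {
                    if (isValidPair(seq.charAt(j), seq.charAt(i))) {
                        valid++;
                    }
                }
                counts[i] = valid;
                counts[j] = valid;
            }
        }
        return counts;
    }
}
```

The number of invalid pairs in a paired column is then the number of sequences minus the valid count, which is one possible basis for the values shown in the annotation row.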

Test: The ‘pair consensus’ has to be checked for different Rfam alignments.

Week 11, 8/1-8/7

Goal 10: Adapt the existing sequence logo to visualize base-pair interactions.
Test: The base-pair interaction logo will be calculated and checked for some test cases.

Week 12, 8/8-8/14

Leftovers

  • base-pair interaction
  • solve the new bugs
  • enable a cursor mode for the secondary structure, analogous to the one for sequences

If everything has gone well, now could be the time for some "nice-to-have" features.
Goal 11: Extend Goal 8 by adding RNAfold/RNAalifold to JABAWS and making them fully usable in Jalview.
Test: Compare the structure predicted in Jalview with the prediction of the stand-alone applications with different test alignments.

Week 13, 8/15-8/21

'Pencils down': "scrub code, do tests, improve documentation, etc." Start writing final evaluation.

Week 14, 8/22-8/28

Finish final evaluation.
Final Goal: Submit final evaluation by 26th August.