PhyloSoC: Apply Machine Learning Algorithm(s) to Ecology Data

From Phyloinformatics
Revision as of 11:43, 7 May 2012 by Zaher14@gmail.com (talk) (Updating From GitHub Wiki)

Author and Relevant links

Student: Abu Zaher Md. Faridee / email: zaher14@gmail.com

Mentors: Primary Mentor: Kathryn Iverson, Secondary Mentor: Sarah Westcott

Project Homepage: PhyloSoC: Apply Machine Learning Algorithm(s) to Ecology Data

Project blog: Wiki on GitHub and Issue Tracker on GitHub

Source Code: Hosted on GitHub

Abstract

Project Goals

Timeline

Interim Period before Acceptance: April 7 - April 22

  • Become Familiar with Microbial Ecology Terms
  • Get mothur up and running with Xcode and gfortran on my Mac OS X setup
  • Follow mothur tutorial from the wiki and get to know the workflow
  • Create a private git branch for mothur on GitHub
  • Take a closer look at the R implementation
  • Create the initial schematics for the command line programs that would be added to mothur ('train' and 'inquire')
  • Review textbooks to refresh my understanding of the classifier algorithms

Week 1: April 23 - April 29 (Community Bonding Period Starts)

  • Register for the Machine Learning course on Coursera
  • Set up the wiki, issue tracker, and other project pages on GitHub

  • Determine Performance Evaluation Criteria - Issue #1
  • Prepare Datasets for The Random Forest Algorithm - Issue #2
  • Classification/Regression or Both! - Issue #3
  • Make a List of Reusable Classes/Functions from Mothur - Issue #4


Week 2: April 30 - May 6

  • Find Mothur's Common Practices - Issue #5
  • Find a Way to Merge 'train' and 'inquire' Commands Into One Single Command - Issue #6
  • Investigate Mothur's Multithreading/Multiprocessing API - Issue #7
  • Investigate The Possible Places of Parallelization in the Random Forest Classifier - Issue #8
  • Create Code Schematics/Pseudocode for the Random Forest Implementation - Issue #9
  • Investigate the Best Practices for Parameter Tuning/Estimation for Random Forest - Issue #10

Week 3: May 7 - May 13

Week 4: May 14 - May 20

Milestone 1 Reached

Week 5: May 21 - May 27 (Start of Official Coding)

Week 6: May 28 - June 3

Week 7: June 4 - June 10

Milestone 2 Reached

Week 8: June 11 - June 17

Week 9: June 18 - June 24

Week 10: June 25 - July 1

Week 11: July 2 - July 8

Mid-term evaluation (from July 9 to July 13)

Status:

Week 12: July 9 - July 15

Week 13: July 16 - July 22

Week 14: July 23 - July 29

Milestone 3 Reached

Week 15: July 30 - August 5

Week 16: August 6 - August 12

August 13: Suggested ‘Pencils Down’

Status:

Week 17: August 13 - August 19

August 20: Firm ‘Pencils Down’

Status:

Final Evaluation

Status:

Further work after Google Summer of Code