Difference between revisions of "PhyloSoC: Apply Machine Learning Algorithm(s) to Ecology Data"

(Further work after Google Summer of Code Has Ended)
(Updating From GitHub Wiki)
Line 17: Line 17:
  
 
=Timeline=
 
=Timeline=
 +
==Interim Period before Acceptance: April 7 - April 22 ==
  
<!-- ==Interim Period before Acceptance: 4/10-4/25 == -->
+
<s>
 +
* Become Familiar with Microbial Ecology Terms
 +
* Get mothur up and running with Xcode and gfortran on my Mac OS X setup
 +
* Follow mothur tutorial from the wiki and get to know the workflow
 +
* Create a private git branch for mothur on GitHub
 +
* Take a closer look at the R implementation
 +
* Create the initial schematics for the command line programs that would be added to mothur ('train' and 'inquire')
 +
* Go through textbooks to refresh my knowledge of the classifier algorithms
 +
</s>
 +
 
 +
==Week 1: April 23 - April 29 (Community Bonding Period Starts) ==
 +
 
 +
<s>
 +
* Register for Machine Learning Course at Coursera <!-- https://www.coursera.org/course/ml -->
 +
* Set up the wiki, issue tracker, and other resources on the GitHub page
 +
</s>
 +
* ''Determine Performance Evaluation Criteria''  - [https://github.com/darthxaher/mothur/issues/1 Issue #1]
 +
* ''Prepare Datasets for The Random Forest Algorithm'' - [https://github.com/darthxaher/mothur/issues/2 Issue #2]
 +
*  ''Classification/Regression or Both!'' - [https://github.com/darthxaher/mothur/issues/3 Issue #3]
 +
* ''Make a List of Reusable Classes/Functions from Mothur'' - [https://github.com/darthxaher/mothur/issues/4 Issue #4]
  
==Community Bonding Period Starts: Week 1: April 23 - April 29 ==
 
===What was done:===
 
  
 
==Week 2: April 30 - May 6==
 
==Week 2: April 30 - May 6==
===What was done:===
+
 
 +
* ''Find Mothur's Common Practices'' - [https://github.com/darthxaher/mothur/issues/5 Issue #5]
 +
* ''Find a Way to Merge 'train' and 'inquire' Commands Into One Single Command'' - [https://github.com/darthxaher/mothur/issues/6 Issue #6]
 +
* ''Investigate Mothur's Multithreading/Multiprocessing API'' - [https://github.com/darthxaher/mothur/issues/7 Issue #7]
 +
* ''Investigate The Possible Places of Parallelization in the Random Forest Classifier'' - [https://github.com/darthxaher/mothur/issues/8 Issue #8]
 +
* ''Create Code Schematics/Pseudocode for the Random Forest Implementation'' -  [https://github.com/darthxaher/mothur/issues/9 Issue #9]
 +
* ''Investigate the Best Practices for Parameter Tuning/Estimation for Random Forest'' -  [https://github.com/darthxaher/mothur/issues/10 Issue #10]
  
 
==Week 3: May 7 - May 13==
 
==Week 3: May 7 - May 13==
===What was done:===
+
 
  
 
==Week 4: May 14 - May 20==
 
==Week 4: May 14 - May 20==
===What was done:===
+
 
  
 
'''Milestone 1 Reached'''
 
'''Milestone 1 Reached'''
  
==Week 5: Official coding start, May 21 - May 27 ==
+
==Week 5: May 21 - May 27 (Start of Official Coding)==
  
===What was done:===
 
  
 
==Week 6: May 28 - June 3==
 
==Week 6: May 28 - June 3==
 
 
<!-- ===More Goals/Goals shifted from other weeks=== -->
 
<!-- ===More Goals/Goals shifted from other weeks=== -->
  
===What was done:===
 
  
 
==Week 7: June 4 - June 10==
 
==Week 7: June 4 - June 10==
  
===What was done:===
 
  
 
'''Milestone 2 Reached'''
 
'''Milestone 2 Reached'''
Line 52: Line 72:
 
==Week 8: June 11 - June 17==
 
==Week 8: June 11 - June 17==
  
===What was done===
 
  
 
==Week 9: June 18 -  June 24==
 
==Week 9: June 18 -  June 24==
  
===What was done===
 
  
 
==Week 10: June 25 - July 1==
 
==Week 10: June 25 - July 1==
  
===What was done===
 
  
 
==Week 11: July 2 - July 8==
 
==Week 11: July 2 - July 8==
  
===What was done===
 
  
 
===Mid-term evaluation (from July 9 to July 13)===
 
===Mid-term evaluation (from July 9 to July 13)===
Line 71: Line 87:
 
==Week 12: July 9 - July 15==
 
==Week 12: July 9 - July 15==
  
===What was done===
+
 
  
 
==Week 13: July 16 - July 22==
 
==Week 13: July 16 - July 22==
  
===What was done===
 
  
 
==Week 14: July 23 - July 29==
 
==Week 14: July 23 - July 29==
  
===What was done:===
 
  
 
'''Milestone 3 Reached'''
 
'''Milestone 3 Reached'''
Line 85: Line 99:
 
==Week 15: July 30 - August 5==
 
==Week 15: July 30 - August 5==
  
===What was done:===
 
  
 
==Week 16: August 6 - August 12==
 
==Week 16: August 6 - August 12==
  
===What was done:===
+
 
===April 13: Suggested ‘Pencils Down’===
+
===August 13: Suggested ‘Pencils Down’===
 
'''Status''':
 
'''Status''':
  
 
==Week 17: August 13 - August 19==
 
==Week 17: August 13 - August 19==
===What was done:===
+
 
  
 
===August 20: Firm ‘Pencils Down’===
 
===August 20: Firm ‘Pencils Down’===
Line 100: Line 113:
  
 
===Final Evaluation===
 
===Final Evaluation===
Status:
+
'''Status''':
  
 
=Further work after Google Summer of Code=
 
=Further work after Google Summer of Code=
  
 
[[Category:PhyloSoC]]
 
[[Category:PhyloSoC]]

Revision as of 11:43, 7 May 2012

Author and Relevant links

Student: Abu Zaher Md. Faridee / email: zaher14@gmail.com

Mentors: Primary Mentor: Kathryn Iverson, Secondary Mentor: Sarah Westcott

Project Homepage: PhyloSoC: Apply Machine Learning Algorithm(s) to Ecology Data

Project blog: Wiki on github and Issue Tracker on github

Source Code: Hosted on github

Abstract

Project Goals

Timeline

Interim Period before Acceptance: April 7 - April 22

  • Become Familiar with Microbial Ecology Terms
  • Get mothur up and running with Xcode and gfortran on my Mac OS X setup
  • Follow mothur tutorial from the wiki and get to know the workflow
  • Create a private git branch for mothur on GitHub
  • Take a closer look at the R implementation
  • Create the initial schematics for the command line programs that would be added to mothur ('train' and 'inquire')
  • Go through textbooks to refresh my knowledge of the classifier algorithms

Week 1: April 23 - April 29 (Community Bonding Period Starts)

  • Register for Machine Learning Course at Coursera
  • Set up the wiki, issue tracker, and other resources on the GitHub page

  • Determine Performance Evaluation Criteria - Issue #1
  • Prepare Datasets for The Random Forest Algorithm - Issue #2
  • Classification/Regression or Both! - Issue #3
  • Make a List of Reusable Classes/Functions from Mothur - Issue #4


Week 2: April 30 - May 6

  • Find Mothur's Common Practices - Issue #5
  • Find a Way to Merge 'train' and 'inquire' Commands Into One Single Command - Issue #6
  • Investigate Mothur's Multithreading/Multiprocessing API - Issue #7
  • Investigate The Possible Places of Parallelization in the Random Forest Classifier - Issue #8
  • Create Code Schematics/Pseudocode for the Random Forest Implementation - Issue #9
  • Investigate the Best Practices for Parameter Tuning/Estimation for Random Forest - Issue #10

Week 3: May 7 - May 13

Week 4: May 14 - May 20

Milestone 1 Reached

Week 5: May 21 - May 27 (Start of Official Coding)

Week 6: May 28 - June 3

Week 7: June 4 - June 10

Milestone 2 Reached

Week 8: June 11 - June 17

Week 9: June 18 - June 24

Week 10: June 25 - July 1

Week 11: July 2 - July 8

Mid-term evaluation (from July 9 to July 13)

Status:

Week 12: July 9 - July 15

Week 13: July 16 - July 22

Week 14: July 23 - July 29

Milestone 3 Reached

Week 15: July 30 - August 5

Week 16: August 6 - August 12

August 13: Suggested ‘Pencils Down’

Status:

Week 17: August 13 - August 19

August 20: Firm ‘Pencils Down’

Status:

Final Evaluation

Status:

Further work after Google Summer of Code