GSoC2013 Coding Challenge

Implementing Machine Learning Algorithms for Classification and Feature Selection in Mothur

My compliments on submitting your applications for this idea. I'm glad to let you know that all of you have submitted really good applications, some of which are outstanding. We have almost 3 weeks before we make the final judgement. In this time we want to give you some interesting things to do so that you can develop a better understanding of the problem we are trying to address. Try any of the following challenges.

  • Using any open source tools (e.g. R, Octave, scikit-learn, libsvm, the Shogun ML toolbox) and scripting, run any one of the ML and feature selection algorithms on the data provided.
  • Write a prototype implementation of SVM or ENET and try it on the data provided. It doesn't have to be a very efficient implementation; a very crude one that serves as a proof of concept will do.
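
To give a feel for how crude a proof of concept can be, here is a minimal sketch of a linear SVM trained with Pegasos-style stochastic subgradient descent on the hinge loss, in plain Python. Everything in it is an assumption made for illustration: the function names, the parameter values, the omission of a bias term, and the tiny toy data set, which is not the data provided for the challenge. The absolute value of each learned weight can serve as a rough importance score for the corresponding feature.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Crude linear SVM (hinge loss + L2 penalty), Pegasos-style updates.

    X: list of feature vectors (lists of floats).
    y: list of labels, each -1 or +1.
    Returns the weight vector w. No bias term, to keep the sketch short.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # If the sample violates the margin, move towards it.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
toy_X = [[2.0, 0.1], [1.5, -0.2], [1.8, 0.0],
         [-2.0, 0.1], [-1.7, -0.1], [-1.9, 0.2]]
toy_y = [1, 1, 1, -1, -1, -1]

weights = train_linear_svm(toy_X, toy_y)
importance = [abs(wj) for wj in weights]  # rough per-feature importance
```

On real OTU abundance data the features would normally be scaled first, and a library implementation such as libsvm or scikit-learn would be a sensible cross-check for the prototype's results.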

The output should be in the following format. The rank is a relative score, not an absolute value: it expresses the importance of each feature relative to the others.

OTU     Rank
Otu0022     5.55
Otu0077     0.93
Otu0840     0.82
Otu0299     0.8
Otu0170     0.79
Otu0566     0.78
Otu0372     0.78
Otu0365     0.77
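
As a rough illustration of producing this format, the sketch below sorts features by the magnitude of their scores and writes one line per OTU. The function name `format_ranking`, the tab separator, and the rounding to two decimals are assumptions made for this example; the scores could come from any method, for instance the absolute weights of a linear model.

```python
def format_ranking(otu_names, scores):
    """Return a ranking table as text, most important feature first.

    otu_names: list of OTU labels, e.g. "Otu0022".
    scores: list of per-feature scores (sign is ignored, only magnitude counts).
    """
    pairs = sorted(zip(otu_names, scores),
                   key=lambda pair: abs(pair[1]),
                   reverse=True)
    lines = ["OTU\tRank"]  # header row, tab-separated (an assumption)
    for name, score in pairs:
        lines.append(f"{name}\t{round(abs(score), 2)}")
    return "\n".join(lines)


# Hypothetical scores, reusing three values from the sample above.
table = format_ranking(["Otu0077", "Otu0022", "Otu0299"], [0.93, -5.55, 0.8])
print(table)
```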

Finishing either of the objectives (or both, if you can) will earn you bonus points towards being selected for this idea.