Saturday, June 12, 2010

machine learning class

I organized a machine learning class at HackerDojo; Mike Bowles and Tricia Hoffman volunteered to teach it. I met them at the Data Mining Camp held at the Dojo six months ago. I was shocked that 300 people showed up for that. They had hundreds more at a follow-up data mining unconference in March.

I am surprised at the quality of the students. Tyler Neylon came up with a way to make k-nearest neighbors linear in time and map-reducible, versus the current n^2 implementation. His paper was accepted and presented. He is going to present on 6/17, our last class meeting, in front of ~100 people. Kudos also to Peter Harrington for learning map reduce so quickly and porting lwlr (locally weighted linear regression) to it.
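For anyone unfamiliar with why the baseline is n^2, here is a minimal sketch of brute-force k-nearest neighbors. It is only the naive approach, not Tyler's method, and the function name `brute_force_knn` and the sample points are made up for illustration. Every point is compared against every other point, which is where the quadratic cost comes from.

```python
import heapq
import math


def brute_force_knn(points, k):
    """For each point, return the indices of its k nearest other points.

    Hypothetical helper for illustration: the outer loop runs n times and
    the inner comprehension does n - 1 distance computations, so the total
    work grows as n^2.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Distance from point i to every other point j.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        # Keep only the k closest, nearest first.
        neighbors.append([j for _, j in heapq.nsmallest(k, dists)])
    return neighbors


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
print(brute_force_knn(points, k=2))
```

Doubling the number of points roughly quadruples the number of distance computations, which is why a linear-time, map-reducible version matters at big-data scale.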

We are working on one-hour lectures for people interested in learning about map reduce. The goal is to get more people interested enough to come down to the Dojo and learn.

I think the big deal with machine learning is both the algorithms part and the systems part. In the classroom environment, the material never covers the systems part. So when we introduce Hadoop and Map Reduce to a machine learning class, we get intense interest. Machine learning is as much about the algorithms as it is about building a system to store and generate reports on big data. It's cool to come up with a new algorithm like Tyler did, but you have to build a big data system also. Look at the contrast between Yahoo/eBay and Google. Yahoo and eBay are both 5-10 years behind Google.
