Course project

Summary

There are three graded components to the course project: the proposal, the presentation, and the final report.

Timeline

Mar 8 - Apr 20 Work on projects!
Provide feedback.
Apr 21 (T)
  • Joey Ruberti and Michael Church
  • Zhaochong Liu
Apr 22 (W)
  • Bita Kazemi and Alekhya Chennupati
Apr 23 (R)
  • Roi Ceren, Will Richardson, Muthukumaran Chandrasekaran
  • Ankita Joshi, Bahaa AlAila, Manish Ranjan
May 1 (F) Project reports due by 11:59:59pm.

Proposal

Only one proposal is required per team (the proposal should list all the team members). The proposal should briefly describe the project, touching specifically on 1) what the goal of the project is, 2) the software framework that will be used, 3) the data that will be investigated, and 4) how this fits into the area of "big data" / distributed computing / scalable machine learning.

Deadline: Friday, March 6 by 11:59:59pm.

Other

You may work alone or in a team of 2.

You may use Hadoop, Spark, or any other framework you'd like.

Your final report must be in NIPS format; you can find the templates here.

Your source code is not a required deliverable for this project; only your proposal, presentation, and final report count towards your project grade. Still, contributing to open source is a worthwhile endeavor, so I encourage everyone to post their code in the public domain. Furthermore, if you are working on a team of 2 or more, I highly recommend using version control software such as git to manage your codebase.