What is this course about?
CS341 is an advanced, project-based course, framed as the natural continuation of CS246 - Mining Massive Data Sets. Students will work on data mining and machine learning algorithms for analyzing very large amounts of data. The course staff and mentors will provide both interesting datasets and computational infrastructure (Google Cloud).
Students are expected to be familiar with the concepts covered in CS246 or similar classes (Hadoop, Spark, large-scale data mining and machine learning algorithms, etc.). Other courses that might be helpful are CS221, CS224N, CS224W, CS228, CS229, CS276, and EE364A.
The following text is useful but not required. It can be downloaded for free or purchased from Cambridge University Press:
Leskovec-Rajaraman-Ullman: Mining of Massive Datasets