Mining Massive Data Sets
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process such data.
- 1/9: The first class will be held at 3pm on Tuesday, January 10, in NVIDIA Auditorium, Huang Engineering Center.
- 1/9: HW0 (Hadoop tutorial) is out, due on January 19 at 11:59pm.
- 1/10: GHW1 has been assigned on Gradiance, due on January 19 at 11:59pm.
- 1/10: Time and location of linear algebra review session: January 13, 3:00pm to 4:20pm in Gates B03.
- 1/10: Time and location of probability and statistics review session: January 20, 3:00pm to 4:20pm in Gates B03.
- 1/10 [for SCPD students]: We'd like to clarify that SCPD students should use Gradescope (for assignments) and Gradiance (for automated quizzes) just like the on-campus students. Please don't submit your work through SCPD. For more details, please see the course information page.
- 1/12: We are organizing a VM clinic to help students set up their VMs. Daniel Templeton will be at the session, assisted by several other TAs. Time and Location: January 16 (coming Monday), 6PM to 9PM in Gates 415.
- 1/12: HW1 is out, due on January 26 at 11:59pm.
Tuesday & Thursday 3:00pm - 4:20pm in NVIDIA Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD. Stanford students can see them here.
Office: 425 Gates
Email: lastname @ gmail.com
Office Hours: Tuesday 1:00-2:30pm, Friday 10:30am-12:00pm
Companion course CS246H:
There is a companion course, CS246H, which is completely independent from CS246 and covers Hadoop programming. It meets Wednesdays 11:30am - 1:20pm in Skilling Auditorium.
|Name||Day||Office Hours||Location|
|Jeff Ullman||Tuesday||1:15pm-2:30pm||425 Gates|
|Jeff Ullman||Friday||10:30am-12:00pm||425 Gates|
|Michael Zhu||Monday||10:00am-12:00pm||Huang Basement|
|Naveen Arivazhagan||Monday||1:15pm-3:15pm||Huang Basement|
|Rishabh Bhargava||Monday||3:00pm-5:00pm||Huang Basement|
|Jessica Su||Monday||5:00pm-7:00pm||Huang Basement|
|Anthony Kim||Tuesday||10:00am-12:00pm||Huang Basement|
|Nihit Desai||Tuesday||4:30pm-6:30pm||Huang Basement|
|Leon Yao||Wednesday||12:00pm-2:00pm||Huang Basement|
|Yixin Wang||Wednesday||2:00pm-4:00pm||Huang Basement|
|Vinaya Polamreddi||Thursday||10:30am-12:30pm||Huang Basement|
|Yixin Cai||Thursday||1:00pm-3:00pm||Huang Basement|
|Junwei Yang||Friday||9:00am-11:00am||Huang Basement|
|Sachin Padmanabhan||Friday||11:00am-1:00pm||Huang Basement|
|Luda Zhao||Friday||1:00pm-3:00pm||Huang Basement|
Automated Quizzes: We will be using Gradiance. Everyone (on-campus as well as SCPD students) should create an account there (passwords must have at least 10 characters, letters and digits, with at least one of each) and enter the class code 380CE054. Please use your real first and last name, with the standard capitalization, e.g., "Jeffrey Ullman". Also, please register using your Stanford email or the same email you used for Gradescope so we can match your Gradiance score report to other class grades.
Books: Leskovec-Rajaraman-Ullman: Mining of Massive Datasets can be downloaded for free. It can be purchased from Cambridge University Press, but you are not required to do so.
MOOC: You can watch videos from a past Coursera MOOC (similar to this course) on YouTube.
Piazza: Piazza Discussion Group for this class (access code "mmds").
Course handouts: Available here.
Staff Email: You can reach us at email@example.com
Previous versions of the course
CS246: Winter 2016
CS246: Winter 2015
CS246: Winter 2014
CS246: Winter 2013
CS246: Winter 2012
CS246: Winter 2011
CS345a: Winter 2010