Mining Massive Data Sets
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process data at this scale.
Important course information will be posted on this web page and announced in class. You are responsible for all material that appears here and should check this page for updates frequently.
- 1/4: The first class will be held at 9am on Tuesday, January 5, in NVIDIA Auditorium, Huang Engineering Center.
- 1/4: HW0 (Hadoop tutorial) is out, due on January 12 at 11:59pm.
- 1/5: GHW1 has been assigned on Gradiance, due on January 14 at 11:59pm.
- 1/6: Daniel Templeton (CS246H) will be holding additional office hours to help students with VM/Hadoop setup and questions on Friday (1/8) and Monday (1/11). Time: 7pm-9pm. Location: Gates B28.
- 1/7: HW1 is out, due on January 21 at 11:59pm. Find submission templates on the course handouts page.
- 1/18: Reminder: GHW2 has been assigned on Gradiance, due on January 21 at 11:59pm. See the course info page for a complete schedule of Gradiance homeworks.
- 1/21: GHW3 has been assigned on Gradiance, due on January 28 at 11:59pm.
- 1/21: HW2 is out, due on February 4 at 11:59pm. Find submission templates on the course handouts page.
- 1/26: GHW4 has been assigned on Gradiance, due on February 4 at 11:59pm.
- 2/2: GHW5 has been assigned on Gradiance, due on February 11 at 11:59pm.
- 2/4: HW3 is out, due on February 18 at 11:59pm. Find submission templates on the course handouts page.
- 2/9: GHW6 has been assigned on Gradiance, due on February 18 at 11:59pm.
Tuesday & Thursday 9AM - 10:20AM in NVIDIA Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD. Stanford students can see them here.
Office: 425 Gates
Email: lastname @ gmail.com
Office Hours: Tuesday 10:30AM-Noon, Friday 10:30AM-Noon
Companion course CS246H:
There is a companion course, CS246H, which is completely independent of CS246 and covers Hadoop programming. It meets Tuesdays 3PM - 4:20PM, also in NVIDIA Auditorium.
Note: Jeff Ullman will not hold office hours on 1/26, 1/29, and 2/9.
| Name | Day | Time | Location |
|---|---|---|---|
| Jeff Ullman | Tuesday | 10:30AM-noon | 425 Gates |
| Jeff Ullman | Friday | 10:30AM-noon | 425 Gates |
| Caroline Suen | Monday | 1:30PM-3:30PM | 414 Gates |
| Duyun Chen | Monday | 5PM-7PM | Huang Basement |
| Shubham Gupta | Tuesday | 10:30AM-11:30AM | Huang Basement |
| Ivaylo Bahtchevanov | Tuesday | 1PM-3PM | Huang Basement |
| Jacky Wang | Tuesday | 4PM-6PM | Huang Basement |
| Leon Yao | Wednesday | 11AM-1PM | Huang Basement |
| Himabindu Lakkaraju | Wednesday | 2PM-4PM | 448 Gates |
| Jeff Hwang | Wednesday | 6PM-8PM | Huang Basement |
| Shubham Gupta | Thursday | 10:30AM-11:30AM | Huang Basement |
| Tim Althoff | Thursday | 3PM-5PM | 414 Gates |
| Sameep Bagadia | Friday | 3PM-5PM | Huang Basement |
| You Zhou | Friday | 9AM-11AM | Huang Basement |
| Nihit Desai | Friday | 1PM-3PM | Huang Basement |
Automated Quizzes: We will be using Gradiance. Everyone should create an account there (passwords are at least 10 letters and digits with at least one of each) and enter the class code 62B99A55. Please use your real first and last name, with the standard capitalization, e.g., "Jeffrey Ullman", so we can match your Gradiance score report to other class grades.
Books: Leskovec-Rajaraman-Ullman: Mining of Massive Datasets can be downloaded for free. It can be purchased from Cambridge University Press, but you are not required to do so.
MOOC: There is a Coursera MOOC that is similar to this course. You may find it useful to view some of the videos there.
Piazza: Piazza Discussion Group for this class (access code "mmds").
Course handouts: Available here.