Mining Massive Data Sets
Winter 2017
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process very large amounts of data.
In Spring 2017, we will be offering a project-based course in which students will apply data mining and machine learning techniques to real-world datasets: CS341: Project in Mining Massive Data Sets.


Course information:


Tuesday & Thursday 3:00pm - 4:20pm in NVIDIA Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD. Stanford students can see them here.


Jeff Ullman
Office: 425 Gates
Email: lastname @ gmail.com
Office Hours: Tuesday 1:00-2:30pm, Friday 10:30am-12:00pm

Companion course CS246H:

There is a companion course, CS246H, which is completely independent of CS246 and covers Hadoop programming. It meets Wednesdays 11:30am - 1:20pm in Skilling Auditorium.

Office hours:

SCPD students can join office hours via Google Hangouts. The Hangouts link is available on Piazza.

Note: Jeff Ullman will not be able to hold office hours on 3/14, as he will be traveling. On 3/17, his office hours are by appointment only. He will hold extra office hours on the morning of the final exam (3/21) from 10am to noon, by appointment.

Name                Day        Time             Location
Jeff Ullman         Tuesday    1:15pm-2:30pm    425 Gates
Jeff Ullman         Friday     10:30am-12:00pm  425 Gates
Michael Zhu         Monday     10:00am-12:00pm  Huang Basement
Rishabh Bhargava    Monday     3:00pm-5:00pm    Huang Basement
Jessica Su          Monday     5:00pm-7:00pm    Huang Basement
Naveen Arivazhagan  Monday     7:00pm-9:00pm    Huang Basement
Anthony Kim         Tuesday    9:30am-11:30am   Huang Basement
Nihit Desai         Tuesday    4:30pm-6:30pm    Huang Basement
Leon Yao            Wednesday  12:00pm-2:00pm   Huang Basement
Yixin Wang          Wednesday  2:00pm-4:00pm    Huang Basement
Vinaya Polamreddi   Thursday   10:30am-12:30pm  Huang Basement
Yixin Cai           Thursday   1:00pm-3:00pm    Huang Basement
Junwei Yang         Friday     9:00am-11:00am   Huang Basement
Sachin Padmanabhan  Friday     11:00am-1:00pm   Huang Basement
Luda Zhao           Friday     1:00pm-3:00pm    Huang Basement

Course materials:

Automated Quizzes: We will be using Gradiance. Everyone (on-campus as well as SCPD students) should create an account there and enter the class code 380CE054. Passwords must contain at least 10 letters and digits, with at least one of each. Please use your real first and last name with standard capitalization, e.g., "Jeffrey Ullman". Also, please register using your Stanford email or the same email you used for Gradescope so that we can match your Gradiance score report to your other class grades.

Books: Leskovec-Rajaraman-Ullman: Mining of Massive Datasets can be downloaded for free. It can be purchased from Cambridge University Press, but you are not required to do so.

MOOC: You can watch videos from a past Coursera MOOC (similar to this course) on YouTube.

Piazza: Piazza Discussion Group for this class (access code "mmds").

Course handouts: Available here.

Staff Email: You can reach us at cs246-win1617-staff@lists.stanford.edu

Previous versions of the course:

CS246: Winter 2016

CS246: Winter 2015

CS246: Winter 2014

CS246: Winter 2013

CS246: Winter 2012

CS246: Winter 2011

CS345a: Winter 2010