MS&E 317: Algorithms for Modern Data Models

Spring 2014, Stanford University
Mon, Wed 2:15 PM - 3:30 PM at Meyer 143

Instructors:
Ashish Goel | Reza Zadeh

We traditionally think of algorithms as running on data available in a single location, typically main memory. In many modern applications including web analytics, search and data mining, computational biology, finance, and scientific computing, the data is often too large to reside in a single location, is arriving incrementally over time, is noisy/uncertain, or all of the above.

Paradigms such as map-reduce, streaming, sketching, Distributed Hash Tables, Resilient Distributed Datasets, and random walks have proved useful for these applications. This course will provide an introduction to the design and analysis of algorithms for these modern data models. The class will be largely theoretical, but also involve at least two hands-on programming exercises.

Pre-requisites: Targeting Doctorate students having taken Algorithms at the level of CS 261. Being able to competently program in any main-stream high level language (C, C++, Ruby, Java, Scala, Python, Perl).

There will be 3 homeworks, one scribed lecture, and a take-home final exam. Students can substitute the take-home final with a project. Students taking the class for credit/no credit instead of letter grade can skip the final.

Students are strongly encouraged to attend the Spark Workshop.

Optional textbooks:
Algorithm Design by Kleinberg and Tardos [KT]
Randomized Algorithms by Rajeev Motwani and Prabakhar Raghavan [MR]

Homework

Homework 1 [pdf]
Homework 2 [pdf]
Homework 3 [pdf]

Scribe template

Final Exam Due June 9th by midnight

Lectures and References

Contact

Ashish: ashishg at stanford.edu

Reza: rezab at stanford.edu