MS&E 317: Algorithms for Modern Data ModelsSpring 2015, Stanford University
Mon, Wed 2:15 PM - 3:30 PM at Sequoia Hall 200. The first class will be on Wed Apr 1.
Instructor: Ashish Goel. Office hours: Wed 4-5pm, HEC 359. Cell: 650 eight one four 1478.
We traditionally think of algorithms as running on data available in a single location, typically main memory. In many modern applications including web analytics, search and data mining, computational biology, finance, and scientific computing, the data is often too large to reside in a single location, is arriving incrementally over time, is noisy/uncertain, or all of the above.
Paradigms such as map-reduce, streaming, sketching, Distributed Hash Tables, Resilient Distributed Datasets, metric embeddings, and random walks have proved useful for these applications. This course will provide an introduction to the design and analysis of algorithms for these modern data models. The class will be largely theoretical, but also involve at least two hands-on programming exercises.
Pre-requisites: Targeting Doctorate students having taken Algorithms at the level of CS 161. Being able to competently program in any main-stream high level language (eg. C, C++, Ruby, Java, Scala, Python, Perl).
There will be 3 homeworks, one scribed lecture, and a take-home final exam. Students can substitute the take-home final with a project on distributed computation based on sketches. If you would like to do a project, please let the class staff know before the end of the second week. Students taking the class for credit/no credit instead of letter grade can skip the final.