The Workshop on Algorithms for Modern Massive Data Sets (MMDS 2012) addressed algorithmic and statistical challenges in modern large-scale data analysis. The goals of this workshop series are to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote the cross-fertilization of ideas.

Tuesday, July 10, 2012. Theme: Data Analysis and Statistical Data Analysis

8:00 - 10:00 Breakfast and Registration -- outside Cubberley Auditorium (at the Stanford School of Education, just off the Main Quad)
9:45 - 10:00 Welcome and Opening Remarks -- in Cubberley Auditorium
10:00 - 11:00 Tutorial: Jiawei Han
A Meta Path-Based Approach for Similarity Search and Mining of Heterogeneous Information Networks
11:00 - 11:30 Alexander Gray
Faster Learning for Massive Datasets
11:30 - 12:00 Christopher Re
Hazy: Making Data-driven Statistical Applications Easier to Build and Maintain
2:00 - 3:00 Tutorial: Peter Bartlett
Model Selection and Recent Results for Large Scale Problems
3:00 - 3:30 Noureddine El Karoui
On Robust Regression Estimators in High-dimension
3:30 - 4:00 Jure Leskovec
Affiliation Network Models for Densely Overlapping Communities in Networks
4:30 - 5:00 Haesun Park
Nonnegative Matrix Factorizations for Clustering
5:00 - 5:30 Fan Chung Graham
Vectorized Laplacians for Dealing with High-dimensional Data Sets
5:30 - 6:00 Joydeep Ghosh
Actionable Mining of Large, Multi-relational Data using Localized Predictive Models

Wednesday, July 11, 2012. Theme: Industrial and Scientific Applications

9:00 - 10:00 Tutorial: DJ Patil
When Algorithms Go Wrong: How Product Design Can Save Algorithmic Limitations
10:00 - 10:30 Sean Fahey
Big Data and Analytics for National Security
11:00 - 11:30 Petros Drineas
Leverage Scores, the Column Subset Selection Problem, and Least-squares Problems
11:30 - 12:00 David Woodruff
Low Rank Approximation and Regression in Input Sparsity Time
12:00 - 12:30 Michael W. Mahoney
Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments
2:30 - 3:30 Tutorial: Rick Stevens
The Biological, Algorithmic and Computational Challenges of Systems Biology
3:30 - 4:00 Tiankai Tu
Fault-Tolerant Parallel Analysis of Millisecond-Scale Molecular Dynamics Trajectories
4:30 - 5:00 Alexander Szalay
Current Statistical Challenges in Large Astronomical Surveys
5:00 - 5:30 Joseph Richards
Astronomical Time Series Analysis for the Synoptic Survey Era
5:30 - 6:00 Tony Cass
Data Handling for LHC: Plans and Reality

Thursday, July 12, 2012. Theme: Novel Algorithmic Approaches

9:00 - 10:00 Tutorial: Michael Mitzenmacher
Peeling Arguments: Invertible Bloom Lookup Tables and Biff Codes
10:00 - 10:30 Frederic Chazal
Detection and Approximation of Linear Structures in Metric Spaces
11:00 - 11:30 Ping Li
Probabilistic Hashing for Efficient Search and Learning on Massive Data
11:30 - 12:00 Ashish Goel
Real Time Social Search and Related Problems
12:00 - 12:30 Andrew Goldberg
Hub Labels in Databases: Shortest Paths for the Masses
2:30 - 3:00 Theodore Johnson
Data Stream Warehousing
3:00 - 3:30 Josh Wills
Experimenting at Scale
3:30 - 4:00 Hang Li
Large Scale Machine Learning for Query Document Matching in Web Search
4:30 - 4:50 Blair Sullivan
Branching Out: Quantifying Tree-like Structure in Complex Networks
4:50 - 5:10 Mahdi Soltanolkotabi
A Geometric Analysis of Subspace Clustering with Outliers
5:10 - 5:30 Bahman Bahmani
Scalable K-Means++
5:30 - 6:00 Steve Bartel
Analytics at Dropbox

Friday, July 13, 2012. Theme: Novel Matrix and Graph Methods

9:00 - 10:00 Tutorial: Yi Ma
The Pursuit of Low-dimensional Structures in High-dimensional Data
10:00 - 10:30 Edoardo Airoldi
Graphlets Decomposition of a Weighted Network
11:00 - 11:30 Yiannis Koutis
SDD Solvers: Bridging the Gap Between Theory and Practice
11:30 - 12:00 Art Owen
Bootstrapping r-fold Tensor Data
12:00 - 12:30 Kamesh Madduri
Algorithms and Tools for Scalable Graph Analytics
2:30 - 3:00 Shaowei Lin
Studying Model Asymptotics with Singular Learning Theory
3:00 - 3:30 David Bindel
Communities, Spectral Clustering, and Random Walks
3:30 - 4:00 Ali Pinar
The Block Two-Level Erdos-Renyi (BTER) Graph Model
4:30 - 5:00 Xiao-Li Meng (presented by Alexander Blocker)
Preprocessing, Multiphase Inference, and Massive Data in Theory and Practice
5:00 - 5:30 Alfred Hero
Hub Discovery in Large Correlation Networks
5:30 - 6:00 Dan Feldman
Google Your Life: Learning Sensors Data

Organizing Committee:
Michael Mahoney (chair), Alex Shkolnik, Gunnar Carlsson, Petros Drineas

The MMDS 2012 Organizers and the MMDS Foundation would like to thank their institutional sponsors for their generous support.

