MMDS 2012. Workshop on Algorithms for Modern Massive Data Sets

Stanford University
July 10–13, 2012


The Workshop on Algorithms for Modern Massive Data Sets (MMDS 2012) addressed algorithmic and statistical challenges in modern large-scale data analysis. The goals of this series of workshops are to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly structured scientific and Internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote the cross-fertilization of ideas.

MMDS 2012 Schedule

Tuesday, July 10, 2012. Theme: Data Analysis and Statistical Data Analysis

Time Talk
8:00 - 10:00 Breakfast and Registration -- outside Cubberley Auditorium (at the Stanford School of Education, just off the Main Quad)
9:45 - 10:00 Welcome and Opening Remarks -- in Cubberley Auditorium
10:00 - 11:00 Tutorial: Jiawei Han
A Meta Path-Based Approach for Similarity Search and Mining of Heterogeneous Information Networks
11:00 - 11:30 Alexander Gray
Faster Learning for Massive Datasets
11:30 - 12:00 Christopher Re
Hazy: Making Data-driven Statistical Applications Easier to Build and Maintain
2:00 - 3:00 Tutorial: Peter Bartlett
Model Selection and Recent Results for Large Scale Problems
3:00 - 3:30 Noureddine El Karoui
On Robust Regression Estimators in High-dimension
3:30 - 4:00 Jure Leskovec
Affiliation Network Models for Densely Overlapping Communities in Networks
4:30 - 5:00 Haesun Park
Nonnegative Matrix Factorizations for Clustering
5:00 - 5:30 Fan Chung Graham
Vectorized Laplacians for Dealing with High-dimensional Data Sets
5:30 - 6:00 Joydeep Ghosh
Actionable Mining of Large, Multi-relational Data using Localized Predictive Models

Wednesday, July 11, 2012. Theme: Industrial and Scientific Applications

Time Talk
9:00 - 10:00 Tutorial: DJ Patil
When Algorithms Go Wrong: How Product Design Can Save Algorithmic Limitations
Book PDFs: Building Data Science Teams, Data Jujitsu
10:00 - 10:30 Sean Fahey
Big Data and Analytics for National Security
11:00 - 11:30 Petros Drineas
Leverage Scores, the Column Subset Selection Problem, and Least-squares Problems
11:30 - 12:00 David Woodruff
Low Rank Approximation and Regression in Input Sparsity Time
12:00 - 12:30 Michael W. Mahoney
Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments
2:30 - 3:30 Tutorial: Rick Stevens
The Biological, Algorithmic and Computational Challenges of Systems Biology
3:30 - 4:00 Tiankai Tu
Fault-Tolerant Parallel Analysis of Millisecond-Scale Molecular Dynamics Trajectories
4:30 - 5:00 Alexander Szalay
Current Statistical Challenges in Large Astronomical Surveys
5:00 - 5:30 Joseph Richards
Astronomical Time Series Analysis for the Synoptic Survey Era
5:30 - 6:00 Tony Cass
Data Handling for LHC: Plans and Reality

Thursday, July 12, 2012. Theme: Novel Algorithmic Approaches

Time Talk
9:00 - 10:00 Tutorial: Michael Mitzenmacher
Peeling Arguments: Invertible Bloom Lookup Tables and Biff Codes
10:00 - 10:30 Frederic Chazal
Detection and Approximation of Linear Structures in Metric Spaces
11:00 - 11:30 Ping Li
Probabilistic Hashing for Efficient Search and Learning on Massive Data
11:30 - 12:00 Ashish Goel
Real Time Social Search and Related Problems
12:00 - 12:30 Andrew Goldberg
Hub Labels in Databases: Shortest Paths for the Masses
2:30 - 3:00 Theodore Johnson
Data Stream Warehousing
3:00 - 3:30 Josh Wills
Experimenting at Scale
3:30 - 4:00 Hang Li
Large Scale Machine Learning for Query Document Matching in Web Search
4:30 - 4:50 Blair Sullivan
Branching Out: Quantifying Tree-like Structure in Complex Networks
4:50 - 5:10 Mahdi Soltanolkotabi
A Geometric Analysis of Subspace Clustering with Outliers
5:10 - 5:30 Bahman Bahmani
Scalable K-Means++
5:30 - 6:00 Steve Bartel
Analytics at Dropbox

Friday, July 13, 2012. Theme: Novel Matrix and Graph Methods

Time Talk
9:00 - 10:00 Tutorial: Yi Ma
The Pursuit of Low-dimensional Structures in High-dimensional Data
10:00 - 10:30 Edoardo Airoldi
Graphlets Decomposition of a Weighted Network
11:00 - 11:30 Yiannis Koutis
SDD Solvers: Bridging the Gap Between Theory and Practice
11:30 - 12:00 Art Owen
Bootstrapping r-fold Tensor Data
12:00 - 12:30 Kamesh Madduri
Algorithms and Tools for Scalable Graph Analytics
2:30 - 3:00 Shaowei Lin
Studying Model Asymptotics with Singular Learning Theory
3:00 - 3:30 David Bindel
Communities, Spectral Clustering, and Random Walks
3:30 - 4:00 Ali Pinar
The Block Two-Level Erdos-Renyi (BTER) Graph Model
4:30 - 5:00 Xiao-Li Meng (presented by Alexander Blocker)
Preprocessing, Multiphase Inference, and Massive Data in Theory and Practice
5:00 - 5:30 Alfred Hero
Hub Discovery in Large Correlation Networks
5:30 - 6:00 Dan Feldman
Google Your Life: Learning Sensors Data

MMDS 2012 Organizers

Organizing Committee:
Michael Mahoney (chair), Alex Shkolnik, Gunnar Carlsson, Petros Drineas

MMDS 2012 Sponsors

The MMDS 2012 Organizers and the MMDS Foundation would like to thank the following institutional sponsors for their generous support:

eBay, Dropbox, AFOSR, LBNL, Stanford

Past MMDS events

MMDS 2010: Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 15–18, 2010.

MMDS 2008: Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 25–28, 2008.

MMDS 2006: Workshop on Algorithms for Modern Massive Data Sets, Stanford, CA, June 21–24, 2006.