Christina's photo


"I know but one freedom and that is the freedom of the mind"
                        Antoine de Saint-Exupéry

I am interested in computer architecture, systems, and applied data mining.

My PhD work has focused on improving the resource efficiency of large-scale datacenters. Since traditional scaling techniques, e.g., commodity computing or Dennard scaling, are reaching the point of diminishing returns, we must focus on using existing systems more efficiently. Specifically, during my PhD I have designed and built practical and scalable scheduling systems that are both QoS-aware and resource-efficient.
My approach is that system management must become an integral part of the architecture and must account for the resource preferences of different applications. Unfortunately, obtaining this information through profiling is too expensive. To make the system practical, I have leveraged efficient data mining techniques that take advantage of existing system knowledge to quickly make high-quality scheduling decisions. Here is a list of projects I am currently working on or have worked on in the past.

Quasar: Traditionally, datacenters have been plagued by low utilization, primarily due to users overprovisioning resource reservations to side-step performance unpredictability. Quasar is a cluster manager that adopts a different interface between system and user. Instead of specifying raw resources, the user only specifies a performance target a job must meet. Quasar leverages efficient data mining techniques that find similarities between previous and new applications to translate this performance target to resources, much like a movie recommendation system finds similarities between previous and new users to recommend movies that they are likely to enjoy. Quasar achieves both high cluster utilization and high per-application performance.
[ASPLOS'14 paper] [demo] [press]

Paragon: Paragon is a QoS-aware datacenter scheduler that accounts for both interference between co-scheduled workloads and platform heterogeneity when assigning applications to servers. The scheduler leverages fast classification techniques to determine the interference and heterogeneity preferences of incoming applications; these techniques introduce only minimal scheduling overhead. In a 1,000-server EC2 cluster, Paragon improves system utilization by 47% compared to a traditional least-loaded scheduler and achieves 96% of optimal performance, while being scalable and lightweight.
[ASPLOS'13 paper] [TopPicks'14 paper] [TOCS'13 paper] [webpage]

iBench: Paragon and Quasar need to know the sensitivity of an incoming application to various types of interference. iBench is a benchmark suite that consists of a set of microbenchmarks, each of which puts pressure on a specific shared resource. iBench enables fast and practical characterization of both the interference an application tolerates in various resources and the interference it generates itself.
[IISWC'13 paper]

ARQ: Admission control is needed during periods of high load to prevent cluster overloading. ARQ is a multi-class admission control protocol that ensures fast application dispatching and low head-of-line blocking.
[ICAC'13 paper]

Datacenter Application Modeling: Previously, I worked on characterizing and modeling the behavior of large-scale datacenter applications. I designed and implemented ECHO, a concise analytical model that captures and recreates the network traffic of distributed datacenter applications. I also developed a modeling framework for storage workloads, which generates synthetic load patterns similar to those of the original applications. Both modeling frameworks were validated against real datacenter applications from Microsoft and were used in a series of efficiency and cost optimization studies.
[IISWC'12 paper] [IISWC'11 paper] [CAL'12 paper] [TPCTC'11 paper]


I also enjoy teaching and mentoring students.