Antoine Saint Exupery
I am interested in computer architecture, systems, and applied data mining.
My PhD work has focused on improving the resource efficiency of large-scale datacenters. Since traditional scaling techniques, e.g., commodity computing or Dennard scaling, are reaching the point of diminishing returns,
we must focus on using existing systems more efficiently.
Specifically, during my PhD I have designed and built practical and scalable scheduling systems that are both QoS-aware and resource-efficient.
My approach is that system management must become an integral part of the architecture and must account for the resource preferences of different applications. Unfortunately, obtaining this information through profiling is too expensive. To make the system practical, I have leveraged efficient data mining techniques that take advantage of existing system knowledge to quickly make high-quality scheduling decisions.
Here is a list of projects I am currently working on or have worked on in the past.
Quasar: Traditionally, datacenters have been plagued by low utilization, primarily due to users overprovisioning resource reservations to side-step performance unpredictability. Quasar is a cluster manager that adopts a different interface between system and user. Instead of specifying raw resources, the user only specifies a performance target a job must meet. Quasar leverages efficient data mining techniques that find similarities between previous and new applications to translate this performance target to resources, much like a movie recommendation system finds similarities between previous and new users to recommend movies that they are likely to enjoy. Quasar achieves both high cluster utilization and high per-application performance.
[ASPLOS'14 paper] [demo] [press]
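The recommendation-system analogy can be illustrated with a minimal sketch. This is not Quasar's implementation: it swaps in a simple nearest-neighbor match for the actual data mining techniques, and the application names, core counts, throughput numbers, and function names are all hypothetical. It assumes a new application has been measured briefly at one or two allocation sizes, finds the previous application with the most similar scaling behavior, and uses that application's history to estimate the smallest allocation that meets the performance target.

```python
# Hypothetical history: throughput of previously scheduled applications at each
# allocation size. Names and numbers are invented for illustration.
CORES = [1, 2, 4, 8, 16]
HISTORY = {
    "app_a": [100, 180, 300, 420, 500],   # sublinear scaling
    "app_b": [50, 60, 65, 66, 66],        # saturates early
    "app_c": [120, 240, 470, 900, 1700],  # near-linear scaling
}


def most_similar(samples):
    """Find the previous application whose scaling behavior best matches the
    few (cores -> throughput) measurements taken for a new application.

    Returns (name, scale), where scale maps the previous application's
    throughput onto the new application's range.
    """
    columns = [CORES.index(c) for c in sorted(samples)]
    observed = [samples[CORES[i]] for i in columns]
    best = None
    for name, row in HISTORY.items():
        seen = [row[i] for i in columns]
        scale = sum(observed) / sum(seen)
        # Squared error after rescaling: small when the curves share a shape.
        error = sum((o - scale * s) ** 2 for o, s in zip(observed, seen))
        if best is None or error < best[0]:
            best = (error, name, scale)
    return best[1], best[2]


def cores_for_target(samples, target):
    """Translate a performance target into the smallest core count whose
    measured or estimated throughput meets it, or None if none does."""
    name, scale = most_similar(samples)
    for cores, past in zip(CORES, HISTORY[name]):
        # Prefer a real measurement; otherwise borrow from the similar app.
        estimate = samples.get(cores, scale * past)
        if estimate >= target:
            return cores
    return None


print(cores_for_target({1: 60, 2: 118}, target=400))  # prints 8
```

In this toy setup the user supplies only the target (400), and the allocation (8 cores) is derived from the history of a similar application rather than from a user-provided reservation.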
Paragon: Paragon is a QoS-aware datacenter scheduler that accounts for both interference between co-scheduled workloads
and platform heterogeneity when assigning applications to servers. The scheduler leverages fast classification techniques to
determine the interference and heterogeneity preferences of incoming applications while introducing only minimal scheduling overheads.
In a 1,000-server EC2 cluster, Paragon improves system utilization by 47% compared to a traditional least-loaded scheduler and
achieves 96% of optimal performance, while being scalable and lightweight.
[ASPLOS'13 paper] [TopPicks'14 paper] [TOCS'13 paper] [webpage]
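As a rough illustration of what interference- and heterogeneity-aware placement means, here is a minimal sketch. It is not Paragon's algorithm: it assumes the classification step has already produced per-application scores, and every name, score, and platform label below is hypothetical. It also simplifies by checking only whether the new application tolerates the pressure already on a server, not whether the applications already running there tolerate the newcomer.

```python
# Hypothetical shared resources and pressure scores in the range 0..100.
RESOURCES = ("cache", "memory_bw", "network")

SERVERS = [
    {"name": "s1", "platform": "big",
     "pressure": {"cache": 80, "memory_bw": 20, "network": 10}},
    {"name": "s2", "platform": "small",
     "pressure": {"cache": 10, "memory_bw": 10, "network": 10}},
    {"name": "s3", "platform": "big",
     "pressure": {"cache": 30, "memory_bw": 10, "network": 10}},
]

APP = {
    "tolerates": {"cache": 40, "memory_bw": 50, "network": 50},
    "causes": {"cache": 20, "memory_bw": 10, "network": 5},
    # Assumed output of classification: relative speed on each platform.
    "platform_score": {"big": 1.0, "small": 0.4},
}


def fits(app, server):
    """True if current pressure on every shared resource is within what the
    application is estimated to tolerate."""
    return all(server["pressure"][r] <= app["tolerates"][r] for r in RESOURCES)


def place(app, servers):
    """Pick the server with the best platform among those where the
    application fits, record the pressure it adds, and return the server
    name (or None if no server fits)."""
    candidates = [s for s in servers if fits(app, s)]
    if not candidates:
        return None
    best = max(candidates, key=lambda s: app["platform_score"][s["platform"]])
    for r in RESOURCES:
        best["pressure"][r] += app["causes"][r]
    return best["name"]
```

A least-loaded scheduler would send the first copy of `APP` to `s2` because it has the lowest pressure; this sketch instead picks `s3`, the best platform on which the application still fits, and only falls back to `s2` once `s3` becomes too contended.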
iBench: Paragon and Quasar need to know the sensitivity of an incoming application to various types of interference. iBench is a benchmark suite that consists
of a set of microbenchmarks, each of which puts pressure on a specific shared resource. iBench enables fast and practical characterization of the interference an
application tolerates in various resources and the interference it itself generates.
ARQ: Admission control is needed during periods of high load to prevent cluster overloading. ARQ is a multi-class admission control protocol that ensures fast
application dispatching and low head-of-line blocking.
Datacenter Application Modeling: Previously, I worked on characterizing and modeling the behavior of large-scale datacenter applications.
I designed and implemented ECHO, a concise analytical model that captures and recreates the network traffic of
distributed datacenter applications. I also developed a modeling framework for storage workloads, which generates synthetic load patterns similar to the
original applications. Both modeling frameworks were validated against real datacenter applications from Microsoft, and were used in a series of efficiency
and cost optimization studies.
[IISWC'12 paper] [IISWC'11 paper] [CAL'12 paper] [TPCTC'11 paper]
I also enjoy teaching and mentoring students.
- In Fall 2014 I am co-teaching CS316 (Advanced Multicore Systems).
- In Spring 2014 I co-taught EE282 (Computer Architecture).
- In Spring 2013 I was a TA for EE282 (Computer Architecture), teaching some of the lectures and a weekly recitation.
- I am mentoring Sammy Steele (Summer 2014-present). Sammy is working with us on porting techniques that improve resource efficiency to Mesos.
- In Fall 2013, I mentored several quarter-long projects for CS316 (Advanced Processor Architecture) related to heterogeneous CMP scheduling and datacenter server provisioning.