Advisor: Christos Kozyrakis
- The growing number of applications using machine learning has led to a need for low-latency, high-accuracy inference serving frameworks. Existing serving and deployment frameworks provide low latency and high accuracy by using persistent model containers on long-lived nodes and by leveraging dedicated resources. I am working on INFaaS, an inference-as-a-service system that makes inference accessible and easy to use by abstracting resource management and model selection. Users simply specify their inference task along with any performance and accuracy requirements for queries. [Publication]
- Previously, I worked on gg: a highly scalable function-as-a-service orchestration and management framework for applications that normally run on a laptop or on a cluster that can stay idle for long periods of time, such as video decoding and compilation. [Publication]
Advisor: David Cheriton
In-memory databases have strict latency and quality-of-service requirements, and when the database does not fit in memory, performance suffers. We studied the impact of paging and context switches on in-memory databases using a custom file system kernel module and a multi-threaded B-tree benchmark. Our studies show that, to keep performance degradation minimal when only parts of the B-tree fit in memory, page faults would have to be "close to free". Using low-latency secondary storage technologies, we developed a method that allows service providers to characterize their dataset based on throughput requirements and determine which storage backend technology is most cost-effective for their needs. We demonstrate this using Intel's 3DXP and NAND flash, and show that, on average, service providers only need to provision small amounts of DRAM across multiple servers to meet their desired quality-of-service constraints. [arXiv]
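The "close to free" observation can be illustrated with the textbook effective-access-time model. The sketch below is only a first-order approximation, not the methodology of the study: the function names and all latency figures are hypothetical, and it ignores concurrency and context-switch overlap.

```python
def effective_access_ns(hit_ratio, mem_ns, fault_ns):
    """Average access latency when a fraction `hit_ratio` of accesses hit DRAM
    and the remainder pay the page-fault service time `fault_ns`."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be in [0, 1]")
    return hit_ratio * mem_ns + (1.0 - hit_ratio) * fault_ns


def slowdown(hit_ratio, mem_ns, fault_ns):
    """Latency relative to the case where every access hits DRAM."""
    return effective_access_ns(hit_ratio, mem_ns, fault_ns) / mem_ns


# Hypothetical figures: 100 ns per DRAM access, 10 us per page fault.
MEM_NS = 100.0
FAULT_NS = 10_000.0

for hit_ratio in (1.0, 0.999, 0.99, 0.9):
    factor = slowdown(hit_ratio, MEM_NS, FAULT_NS)
    print(f"hit ratio {hit_ratio:.3f}: {factor:.2f}x slower than all-DRAM")
```

Under these assumed numbers, a 1% miss rate already roughly doubles the average access latency, which is why the fault cost has to approach the DRAM access cost before partial residency becomes cheap.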
Advisor: Christina Delimitrou
Scheduling applications on heterogeneous machines remains challenging, in particular because the machines share various hardware resources. This becomes especially important as servers are progressively replaced and upgraded during a datacenter's lifetime. We developed Mage, a runtime system that leverages online machine learning techniques to determine the performance of co-scheduled applications in a heterogeneous system under any application-to-core mapping. Mage monitors application performance and reacts to discrepancies between predicted and actual performance within a few milliseconds. Across 350 application mixes on a heterogeneous CMP, Mage improves performance by 38% and up to 2x compared to a greedy scheduler. Across 160 mixes on a heterogeneous cluster, Mage improves performance by 30% on average over the greedy scheduler, and by 11% over Paragon. [Publication]
Advisor: Shrikanth Narayanan
In psychotherapy, the language used by therapists is critical in influencing the overall session. Being able to predict and measure the level of empathy in a particular session can therefore assist in judging efficacy. Using features inspired by psycholinguistic norms on a corpus of motivational interviews from real patients, we demonstrate the ability to predict empathy with a maximum of 75.28% unweighted average recall (UAR). [Publication]
We also leveraged the multiple instance learning (MIL) paradigm for automatically estimating human behavioral patterns and gaining insight into human behavioral data. [Publication]
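For readers unfamiliar with the UAR metric reported for empathy prediction, the following is a minimal sketch of how it is computed: the mean of the per-class recalls, so each class counts equally regardless of how many sessions it contains. The labels below are hypothetical and unrelated to the actual corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical session labels: four high-empathy sessions, two low-empathy.
y_true = ["high", "high", "high", "high", "low", "low"]
y_pred = ["high", "high", "high", "low", "low", "high"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy example the recalls are 3/4 for "high" and 1/2 for "low", so UAR is 0.625 while plain accuracy is about 0.667. The gap shows why UAR is often preferred when classes are imbalanced.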
Advisor: Murali Annavaram
As GPUs become more general purpose, graph applications have emerged as popular candidates for the hardware. However, several bottlenecks and stalls hinder performance on GPUs, in particular at the warp level. We studied over 30 graph benchmarks and found bottlenecks such as long memory latencies and synchronization barriers to be major factors in the applications' performance degradation. These results aided in developing resource management and scheduling techniques to reduce the major stalls.
Advisor: John Wawrzynek
To construct a high-frequency wireless system simulator, we first needed to understand the reflection power and characteristics of different types of materials. To collect this data, we constructed a mechanical setup onto which two antennas (transmitter and receiver) could be mounted and positioned at different angles from each other. Materials such as a classroom door, composite wall, opaque glass, and a desktop monitor were tested. We found the glass door and the classroom door to follow Fresnel's equations most closely; these equations describe how an electromagnetic wave is reflected and transmitted at an interface between two media.
This work was done as a part of the 2014 University of California, Berkeley SUPERB-ITS REU. It was presented at the 2014 SACNAS National Conference in Los Angeles, California, where it won a best poster award.
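As an illustration of the Fresnel equations referenced in the reflection study, the sketch below computes the power reflectance for s- and p-polarized waves at a planar interface between two lossless, non-magnetic media. It is not the analysis code used in the project, and the refractive indices are hypothetical placeholders rather than measured material properties.

```python
import cmath
import math


def fresnel_reflectance(n1, n2, theta_i):
    """Return (R_s, R_p), the power reflectances for s and p polarization.

    n1, n2  : refractive indices of the incident and transmitting media
    theta_i : angle of incidence in radians, measured from the normal
    """
    cos_i = math.cos(theta_i)
    # Snell's law; cmath.sqrt keeps the result valid under total internal reflection.
    sin_t = (n1 / n2) * math.sin(theta_i)
    cos_t = cmath.sqrt(1 - sin_t ** 2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    return abs(r_s) ** 2, abs(r_p) ** 2


# Hypothetical air-to-dielectric interface.
N_AIR, N_MATERIAL = 1.0, 1.5

for degrees in (0, 30, 60, 80):
    R_s, R_p = fresnel_reflectance(N_AIR, N_MATERIAL, math.radians(degrees))
    print(f"{degrees:2d} deg: R_s = {R_s:.4f}, R_p = {R_p:.4f}")
```

Comparing curves like these against reflected power measured at several antenna angles is one way to judge how closely a material follows the ideal model. Real building materials are lossy and layered, so their effective indices would be complex and frequency-dependent.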