I am a Master's student in Computer Science, Artificial Intelligence track at Stanford University. I currently do research with Chris Ré at HazyResearch on robustness in video-based machine learning, working with Sharon Li (Asst. Prof., University of Wisconsin-Madison) and Dan Fu (PhD Candidate, Stanford). After graduation, I hope to pursue a PhD in computer science. I am most interested in studies of machine learning model robustness and fairness in real-world situations. I am passionate about ethical AI, and am always searching for connections between my undergraduate field -- ethnic studies -- and responsible, fair AI development.
NEWS
GPA: 4.05
GPA: 3.97
Studies in the robustness of video machine learning models under bit-level network and file corruptions in data. We find that video corruptions cause up to a 77% drop in accuracy on the action recognition task using 3D-ResNets. Furthermore, corrupted videos that the model predicted incorrectly were up to 1.6 times more perturbed (L2 distance) than those correctly predicted. Baseline model-side defenses for dealing with corrupted data (out-of-distribution detection, adversarial training, training with data augmented by corruptions) fail to restore performance on clean data, though augmentation shows some promise. Initial robustness study presented at ECCV 2020 (first-author), Workshop on Adversarial Robustness in the Real World. You can view my talk here.
Studies in the viability of multi-label problem transformation techniques in the few-shot domain adaptation case, for plug-and-play training for Memory Augmented Neural Networks, Model-Agnostic Meta-Learning, and Prototypical Networks. Evaluated utility of powerset labels vs. binary relevance labels (multi-headed approach), the latter requiring architectural modifications. On BigEarthNet dataset, class-balanced sampling technique achieved top performance (86.3 F1, 3-label case, MANN), outweighing choice of label representation. Ablations suggest exponential growth of problem difficulty with increase in label set cardinality, as well as linear gains from exponential support set growth. Future work points to better scaling for these techniques.
Under supervision of Henwei Huang (Postdoc, MIT), in Giovanni Traverso's (Asst. Prof., MIT Mechanical Engineering) lab. Investigated practical methods for single-channel EEG-based drowsiness detection. Previous work has used frequency features (e.g. power spectral density, entropy) to predict mental/neurological states, but practical methods for single-channel drowsiness detection remain underexplored. Developed a data collection and ML model pipeline using the OpenBCI headset and a Jetson Nano. Between classical (e.g. SVM, decision trees) and deep ML architectures, we found a Fourier-domain KNN search yielded the best accuracy.
Bounding box regression and character-level classification on over 3000 scanned, expert-labeled images of Japanese calligraphy pages. Objective is OCR on Japanese calligraphy. Studied multiple architectures in the image-to-text domain, from R-CNN techniques to image captioning architectures. Baseline model is a CNN with VGG-19 feature extraction that serves as an encoder, which is then fed into an LSTM decoder with self-attention. Baseline results have approx. 40% training classification accuracy.
Implemented multiple classification algorithms and signal processing techniques on the UCI Epilepsy dataset. Baseline models included softmax regression and k-nearest neighbors, which achieved moderate accuracy. Best model was a 1D CNN, borrowing from a text sequence classification architecture. Algorithms were tested on raw time-series data as well as frequency data extracted by the Fourier transform and the spectral entropy of the signal. My contribution to the group was proposing and implementing the Fourier transform as a feature extraction technique, and creating a Hidden Markov Model for classification.
Member of Stanford Alexa Grand Challenge Prize Team. Investigating super secret cool NLP stuff (sorry, it's all internal for now 😊).
Led robustness study of video action recognition networks against naturally-occurring network and file corruptions. See above description under "Projects" for more details.
Examined methods for drowsiness detection in single-channel EEGs. See above description under "Projects."
Developed and taught AI Ethics curriculum, a project-based introduction to mathematical definitions of model fairness using the COMPAS ProPublica dataset. Delivered intro lecture. Taught advanced high school cohort introductory machine learning concepts (linear regression, logistic regression, neural networks).
Led 2 labs of ~20 high school students each in the Artificial Intelligence course at Stanford Pre-Collegiate Studies, reviewing concepts like search algorithms, game-playing algorithms (e.g. minimax), and reinforcement learning. Supervised and advised projects in computer vision, price prediction, sentiment analysis, and more. Wrote problem set and code solutions (Python and Unity C#) for student reference.
When not thinking about AI, I enjoy making music. I am a jazz pianist and songwriter, and am heavily involved in the music side of musical theatre at Stanford, where I am piloting an initiative for student-taught workshops in the arts as a member of the Ram's Head Theatrical Society Board of Directors.
My past musical theatre credits include Gaieties 2020: Unprecedented Times (Recording Engineer), Gaieties 2019: Midterm Impossible (Composer, Lyricist, and Music Director), Cabaret (Pianist), The Addams Family (Pianist), Gaieties 2018: Jane Stanford and the Chamber of Secrets (Pianist), The Wiz (Music Director and Pianist), Gaieties 2017: Bearanormal Activity (Pianist), Ragtime (Rehearsal Pianist), and Pippin (Assistant Producer).
Other interests include bullet chess (find me @tchainzzz), watching football, and getting 8 hours of sleep every night -- it's really good for productivity!