I am a postdoctoral researcher working with Dan Jurafsky and Daniel McFarland at Stanford University. I earned my Ph.D. from the Machine Learning Department at Carnegie Mellon University, where I was advised by Noah Smith (now at the University of Washington).

I work at the intersection of machine learning, natural language processing, and computational social science. Broadly speaking, I am interested in what we can learn about society from what people write: in the news, on social media, in historical documents, and more!

Updates


  • September 2019: I am moving to Stanford to start as a postdoc working with Dan Jurafsky and Daniel McFarland!
  • September 2019: New WIRED article about our "Show Your Work" paper and the broader issue of reproducibility.
  • August 2019: New paper -- Show Your Work -- to be published at EMNLP 2019!
  • June 2019: I will be attending The Summer Institute on AI and Society in Edmonton, Alberta.
  • May 2019: Two papers accepted for publication at ACL 2019!

Selected Publications



Show Your Work: Improved Reporting of Experimental Results
Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.

Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs best. We argue for reporting additional details, especially performance on validation data obtained during model development. We present a novel technique for doing so: expected validation performance of the best-found model as a function of computation budget (i.e., the number of hyperparameter search trials or the overall training time). Using our approach, we find multiple recent model comparisons where authors would have reached a different conclusion if they had used more (or less) computation. Our approach also allows us to estimate the amount of computation required to obtain a given accuracy; applying it to several recently published results yields massive variation across papers, from hours to weeks. We conclude with a set of best practices for reporting experimental results which allow for robust future comparisons, and provide code to allow researchers to use our technique.
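As a rough illustration of the "expected validation performance as a function of computation budget" idea, here is a minimal sketch. It assumes that hyperparameter trials are independent draws from the empirical distribution of observed validation scores; the function name and the example scores are hypothetical, and this is not the released code.

```python
import numpy as np

def expected_max_performance(scores, budget):
    """Expected best validation score among `budget` trials.

    Assumption: trials are i.i.d. draws (with replacement) from the
    empirical distribution of the observed `scores`.
    """
    v = np.sort(np.asarray(scores, dtype=float))
    n = len(v)
    # Empirical CDF just at, and just below, each sorted score.
    cdf = np.arange(1, n + 1) / n
    cdf_prev = np.arange(0, n) / n
    # P(the max of `budget` draws lands on the i-th sorted score).
    prob_is_max = cdf ** budget - cdf_prev ** budget
    return float(np.sum(v * prob_is_max))

# Hypothetical validation accuracies from a small random search.
observed = [0.5, 0.7, 0.9]
curve = [expected_max_performance(observed, k) for k in range(1, 6)]
```

Plotting `curve` against the budget gives one line per model; where two such lines cross, the model that looks better depends on how much computation was spent.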




Variational Pretraining for Semi-supervised Text Classification
Suchin Gururangan, Tam Dang, Dallas Card, Noah A. Smith
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.

We introduce VAMPIRE, a lightweight pretraining framework for effective text classification when data and computing resources are limited. We pretrain a unigram document model as a variational autoencoder on in-domain, unlabeled data and use its internal states as features in a downstream classifier. Empirically, we show the relative strength of VAMPIRE against computationally expensive contextual embeddings and other popular semi-supervised baselines under low resource settings. We also find that fine-tuning to in-domain data is crucial to achieving decent performance from contextual embeddings when working with limited supervision. We accompany this paper with code to pretrain and use VAMPIRE embeddings in downstream tasks.




The Risk of Racial Bias in Hate Speech Detection
Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. Smith
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.

We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. We first uncover unexpected correlations between surface markers of African American English (AAE) and ratings of toxicity in several widely-used hate speech datasets. Then, we show that models trained on these corpora acquire and propagate these biases, such that AAE tweets and tweets by self-identified African Americans are up to two times more likely to be labelled as offensive compared to others. Finally, we propose dialect and race priming as ways to reduce the racial bias in annotation, showing that when annotators are made explicitly aware of an AAE tweet's dialect they are significantly less likely to label the tweet as offensive.




Deep Weighted Averaging Classifiers
Dallas Card, Michael Zhang, Noah A. Smith
In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*), 2019.

Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text. Despite these gains, however, concerns have been raised about the calibration, robustness, and interpretability of these models. In this paper we propose a simple way to modify any conventional deep architecture to automatically provide more transparent explanations for classification decisions, as well as an intuitive notion of the credibility of each prediction. Specifically, we draw on ideas from nonparametric kernel regression, and propose to predict labels based on a weighted sum of training instances, where the weights are determined by distance in a learned instance-embedding space. Working within the framework of conformal methods, we propose a new measure of nonconformity suggested by our model, and experimentally validate the accompanying theoretical expectations, demonstrating improved transparency, controlled error rates, and robustness to out-of-domain data, without compromising on accuracy or calibration.
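To make the "weighted sum of training instances" idea concrete, here is a minimal sketch. It assumes the instance embeddings have already been computed (in the paper they come from a learned deep network) and uses a Gaussian kernel as an illustrative choice; the function name, the `bandwidth` parameter, and the toy data are hypothetical.

```python
import numpy as np

def weighted_average_predict(query, train_embeddings, train_labels,
                             n_classes, bandwidth=1.0):
    """Predict class probabilities as a kernel-weighted vote of training points.

    Assumption: a Gaussian kernel over squared Euclidean distance.
    """
    dists_sq = np.sum((train_embeddings - query) ** 2, axis=1)
    weights = np.exp(-dists_sq / (2 * bandwidth ** 2))
    # Sum the weights of the training instances belonging to each class.
    class_scores = np.zeros(n_classes)
    np.add.at(class_scores, train_labels, weights)
    return class_scores / class_scores.sum(), weights

# Toy embeddings: two nearby points of class 0, one distant point of class 1.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
train_y = np.array([0, 0, 1])
probs, weights = weighted_average_predict(np.array([0.0, 0.5]), train_x, train_y, 2)
```

The returned `weights` show which training instances drove the prediction, which is where the transparency comes from; very small total weight would suggest the query is far from all training data.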




Neural Models for Documents with Metadata
Dallas Card, Chenhao Tan, Noah A. Smith
In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), 2018.

Most real-world document collections involve various types of metadata, such as author, source, and date, and yet the most commonly-used approaches to modeling text corpora ignore this information. While specialized models have been developed for particular applications, few are widely used in practice, as customization typically requires derivation of a custom inference algorithm. In this paper, we build on recent advances in variational inference methods and propose a general neural framework, based on topic models, to enable flexible incorporation of metadata and allow for rapid exploration of alternative models. Our approach achieves strong performance, with a manageable tradeoff between perplexity, coherence, and sparsity. Finally, we demonstrate the potential of our framework through an exploration of a corpus of articles about US immigration.




The Importance of Calibration for Estimating Proportions from Annotations
Dallas Card, Noah A. Smith
In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2018.

Estimating label proportions in a target corpus is a type of measurement that is useful for answering certain types of social-scientific questions. While past work has described a number of relevant approaches, nearly all are based on an assumption which we argue is invalid for many problems, particularly when dealing with human annotations. In this paper, we identify and differentiate between two relevant data generating scenarios (intrinsic vs. extrinsic labels), introduce a simple but novel method which emphasizes the importance of calibration, and then analyze and experimentally validate the appropriateness of various methods for each of the two scenarios.




Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
Chenhao Tan, Dallas Card, Noah A. Smith
In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017.

Understanding how ideas relate to each other is a fundamental question in many domains, ranging from intellectual history to public communication. Because ideas are naturally embedded in texts, we propose the first framework to systematically characterize the relations between ideas based on their occurrence in a corpus of documents, independent of how these ideas are represented. Combining two statistics (cooccurrence within documents and prevalence correlation over time), our approach reveals a number of different ways in which ideas can cooperate and compete. For instance, two ideas can closely track each other's prevalence over time, and yet rarely cooccur, almost like a "cold war" scenario. We observe that pairwise cooccurrence and prevalence correlation exhibit different distributions. We further demonstrate that our approach is able to uncover intriguing relations between ideas through in-depth case studies on news articles and research papers.
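The two statistics can be sketched as follows. This is a minimal illustration that assumes pointwise mutual information for cooccurrence and Pearson correlation for prevalence over time; each document is represented as a set of ideas with a time step, and all names and the toy corpus are hypothetical.

```python
import math
import numpy as np

def cooccurrence_pmi(docs, idea_a, idea_b):
    """PMI of two ideas appearing in the same document (`docs` is a list of sets)."""
    n = len(docs)
    p_a = sum(idea_a in d for d in docs) / n
    p_b = sum(idea_b in d for d in docs) / n
    p_ab = sum(idea_a in d and idea_b in d for d in docs) / n
    if p_ab == 0:
        return float("-inf")  # the two ideas never cooccur
    return math.log(p_ab / (p_a * p_b))

def prevalence_correlation(docs, times, idea_a, idea_b):
    """Pearson correlation of the two ideas' per-time-step document frequencies."""
    steps = sorted(set(times))

    def prevalence(idea):
        return [np.mean([idea in d for d, t in zip(docs, times) if t == s])
                for s in steps]

    return float(np.corrcoef(prevalence(idea_a), prevalence(idea_b))[0, 1])

# Toy "cold war" corpus: both ideas rise and fall together but never share a document.
docs = [{"a"}, {"b"}, set(), set(),
        {"a"}, {"a"}, {"b"}, {"b"},
        set(), set(), set(), set()]
times = [0] * 4 + [1] * 4 + [2] * 4
pmi = cooccurrence_pmi(docs, "a", "b")
corr = prevalence_correlation(docs, times, "a", "b")
```

Here `corr` is high while `pmi` is very low, which is the combination the abstract describes as a "cold war" relation.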




Analyzing Framing through the Casts of Characters in the News
Dallas Card, Justin H. Gross, Amber E. Boydstun, Noah A. Smith
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016.

We present an unsupervised model for the discovery and clustering of latent "personas" (characterizations of entities). Our model simultaneously clusters documents featuring similar collections of personas. We evaluate this model on a collection of news articles about immigration, showing that personas help predict the coarse-grained framing annotations in the Media Frames Corpus. We also introduce automated model selection as a fair and robust form of feature evaluation.




The Media Frames Corpus: Annotations of Frames Across Issues
Dallas Card, Amber E. Boydstun, Justin H. Gross, Philip Resnik, Noah A. Smith
In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), 2015.

We describe the first version of the Media Frames Corpus: several thousand news articles on three policy issues, annotated in terms of media framing. We motivate framing as a phenomenon of study for computational linguistics and describe our annotation process.



About me

I'm originally from Winnipeg, but I have also lived in Toronto, Waterloo, Halifax, Sydney, Kampala, Pittsburgh, Seattle, and now Palo Alto!

I am an occasional guest on The Reality Check podcast! You can hear me in episodes #466 (biased algorithms), #382 (deep learning), #362 (Simpson's paradox), and #227 (fMRI and vegetative states).

I love to travel and sometimes I write about it.

