Ben Newman

blnewman at

Github | Resume | Google Scholar

Hello! I'm Ben Newman, a senior at Stanford studying Computer Science and working with the Stanford NLP Group. My interests include computer science, cognitive science, linguistics, education, and misinformation.

I'm interested in understanding how NLP systems learn and process language, and the role these systems play in society at large. I've worked on projects analyzing models' ability to extrapolate, evaluating their ability to communicate, and tracking misinformation in the run-up to the 2020 election. At Stanford Splash I co-teach courses in Introductory Linguistics and Computing Fundamentals.


Refining Targeted Syntactic Evaluation

Benjamin Newman, Kai-Siang Ang, Julia Gong, John Hewitt

NAACL 2021

[pdf] [code] [cite] [blog]

How should we evaluate the syntactic understanding of our NLP models? We build on a body of work that uses minimal pairs for evaluation and argue that we should be evaluating models' likely behavior and systematicity. We adapt minimal pair evaluation to address these goals, finding that models preferentially conjugate verbs they deem likely.

The EOS Decision and Length Extrapolation

Benjamin Newman, John Hewitt, Percy Liang, and Christopher Manning

Blackbox NLP@EMNLP 2020 (Outstanding Paper)

[pdf] [code] [cite]

Why do sequence models struggle to extrapolate? For many reasons, but the decision to train models with end-of-sequence (EOS) tokens appended to training examples is one of them. We investigate and visualize the effect that this decision has on neural models' extrapolative abilities.

Communication-based Evaluation for Natural Language Generation

Benjamin Newman, Reuben Cohn-Gordon, and Christopher Potts

Society for Computation in Linguistics@LSA 2020

[pdf] [code] [cite]

Do n-gram overlap metrics like BLEU capture whether the models are successful communicators? Not really, so we created our own way of evaluating communicative effectiveness based on the Rational Speech Acts framework.

Conducted during CS224U and a summer internship at the Center for the Study of Language and Information (CSLI).
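The evaluation above builds on the Rational Speech Acts (RSA) framework. The paper's own setup isn't reproduced here, but the core RSA recursion can be sketched with a toy reference game (the lexicon, referent names, and utterances below are hypothetical stand-ins):

```python
import math

# Hypothetical toy lexicon: which utterances are literally true of which referents.
lexicon = {
    "glasses": {"r1": 1, "r2": 1},  # both referents wear glasses
    "hat":     {"r1": 0, "r2": 1},  # only r2 wears a hat
}
referents = ["r1", "r2"]

def literal_listener(utterance):
    """L0: uniform distribution over referents consistent with the utterance."""
    truth = lexicon[utterance]
    total = sum(truth[r] for r in referents)
    return {r: truth[r] / total for r in referents}

def pragmatic_speaker(referent, alpha=1.0):
    """S1: picks utterances in proportion to exp(alpha * log L0(referent | u))."""
    scores = {}
    for u in lexicon:
        l0 = literal_listener(u)[referent]
        scores[u] = math.exp(alpha * math.log(l0)) if l0 > 0 else 0.0
    total = sum(scores.values())
    return {u: s / total for u, s in scores.items()}
```

Because "hat" uniquely identifies r2 while "glasses" is ambiguous, the pragmatic speaker prefers "hat" when referring to r2 — the kind of communicative success that n-gram overlap metrics don't measure.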


EIP: Election Integrity Partnership

A coalition of research groups that tracked misinformation in the run-up to the 2020 US election. [site]

Unsupervised Recovery of Tree Metrics from Learned Representations

Representations from pretrained language models likely incorporate syntax, but can we access it without training supervised probes? [pdf]

CS229: Machine Learning. Final Project (2019).

English-Chinese Name Machine Transliteration Using Search and Neural Networks

What's your name in Chinese? Name transliteration differs from standard MT: it is grounded in phonetics and lacks large corpora. We explore two approaches here. [pdf] [code]

CS221: Artificial Intelligence: Principles and Techniques: Final Project with Julia Gong (2018).

Using POMDPs to Learn Language in a Spatial Reference Game

How can you teach computational agents to follow directions without defining what each instruction means? POMDPs! [pdf] [code]

CS238: Decision Making under Uncertainty: Final Project with Suvir Mirchandani and Levi Lian (2018)

Swear Analysis

What can we learn about people's use of swears by looking at their word2vec and GloVe embeddings? [pdf]

Linguist 130A: Semantics and Pragmatics: Final Project with Julia Gong (2018)

Zero Width Space Encrypter

Hiding secret messages in HTML using zero-width space characters. Demo here!
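The demo itself isn't reproduced here, but the general technique can be sketched in a few lines. This sketch assumes a simple binary scheme (U+200B for 0, U+200C for 1 — an illustrative encoding choice, not necessarily the demo's):

```python
ZW0 = "\u200b"  # zero-width space        -> bit 0
ZW1 = "\u200c"  # zero-width non-joiner   -> bit 1

def encode(secret: str) -> str:
    """Turn a secret string into an invisible run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    return "".join(ZW1 if bit == "1" else ZW0 for bit in bits)

def decode(carrier: str) -> str:
    """Extract the hidden message, ignoring all visible characters."""
    bits = "".join("1" if ch == ZW1 else "0"
                   for ch in carrier if ch in (ZW0, ZW1))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")
```

Because the inserted characters render with zero width, `"hello " + encode("secret") + "world"` looks identical to `"hello world"` in a browser, yet `decode` still recovers the hidden message.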

Class Notes