Before joining as a PhD student, I spent two years pursuing a master's degree in Artificial Intelligence at Stanford, broadening my understanding through a variety of courses and conducting research with Prof. Jure Leskovec, Robert West and Austin Benson.
( PUBLICATIONS ) Human-like informative conversations via conditional mutual information
10 Mar 2021
The goal of this work is to build a dialogue agent that can weave new factual content into conversations as naturally as humans. We draw insights from linguistic principles of...
( NEWS ) 2nd place in Alexa Prize Socialbot Competition (2020)
04 Jul 2020
Our team, Chirpy Cardinal, placed 2nd in the Alexa Prize 2020. Check out our technical paper and get in touch if you want to know more!
( PUBLICATIONS ) Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations
03 Jul 2020
We present Chirpy Cardinal, an open-domain dialogue agent, as a research platform for the 2019 Alexa Prize competition. Building an open-domain socialbot that talks to real people is challenging...
Human-like informative conversations via conditional mutual information
Ashwin Paranjape and Christopher D. Manning
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
The goal of this work is to build a dialogue agent that can weave new factual content into conversations as naturally as humans. We draw insights from linguistic principles of conversational analysis and annotate human-human conversations from the Switchboard Dialog Act Corpus, examining how humans apply strategies for acknowledgement, transition, detail selection and presentation.
However, when current chatbots (explicitly provided with new factual content) introduce facts in a conversation, their generated responses do not acknowledge the prior turns. This is because, while current methods are trained with two contexts, new factual content and conversational history, we show that their generated responses are not simultaneously specific to both contexts and, in particular, lack specificity w.r.t. conversational history.
We propose using pointwise conditional mutual information (pcmi) to measure specificity w.r.t. conversational history. We show that responses that have a higher pcmi_h are judged by human evaluators to be better at acknowledgement 74% of the time.
To show its utility in improving overall quality, we compare baseline responses that maximize pointwise mutual information (Max. PMI) with our alternative responses (Fused-PCMI) that trade off pmi for pcmi_h and find that human evaluators prefer Fused-PCMI 60% of the time.
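As a rough illustration of the quantities named above, the sketch below computes pointwise mutual information and pointwise conditional mutual information from response log-probabilities. It uses the standard definition pcmi(y; h | k) = log p(y | h, k) - log p(y | k), where y is the response, h the conversational history and k the new factual content. The function names and toy probabilities are assumptions made for illustration; in practice the log-probabilities would come from language models conditioned on different contexts, which this sketch does not include.

```python
import math


def pmi(logp_y_given_h_k: float, logp_y: float) -> float:
    """Pointwise mutual information between response y and both contexts (h, k)."""
    return logp_y_given_h_k - logp_y


def pcmi_h(logp_y_given_h_k: float, logp_y_given_k: float) -> float:
    """Pointwise conditional mutual information between y and history h, given content k.

    Positive values mean the history makes the response more likely than the
    factual content alone does, i.e. the response is specific to the history.
    """
    return logp_y_given_h_k - logp_y_given_k


# Toy probabilities (hypothetical, not taken from the paper).
logp_full = math.log(0.02)      # log p(y | h, k)
logp_content = math.log(0.005)  # log p(y | k)
logp_prior = math.log(0.001)    # log p(y)

score_h = pcmi_h(logp_full, logp_content)  # log(0.02 / 0.005) = log 4
score_pmi = pmi(logp_full, logp_prior)     # log(0.02 / 0.001) = log 20
```

A response that ignores the prior turns would have p(y | h, k) close to p(y | k), giving a pcmi_h near zero.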
Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations
Ashwin Paranjape*, Abigail See*, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning
Alexa Prize Proceedings 2020
We present Chirpy Cardinal, an open-domain dialogue agent, as a research platform for the 2019 Alexa Prize competition. Building an open-domain socialbot that talks to real people is challenging – such a system must meet multiple user expectations such as broad world knowledge, conversational style, and emotional connection. Our socialbot engages users on their terms – prioritizing their interests, feelings and autonomy. As a result, our socialbot provides a responsive, personalized user experience, capable of talking knowledgeably about a wide variety of topics, as well as chatting empathetically about ordinary life. Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone. At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over 12 minutes.
Importance sampling for unbiased on-demand evaluation of knowledge base population
Arun Tejasvi Chaganty*, Ashwin Pradeep Paranjape*, Percy Liang and Christopher D. Manning
Empirical Methods in Natural Language Processing (EMNLP) 2017
Knowledge base population (KBP) systems take in a large document corpus and extract entities and their relations. Thus far, KBP evaluation has relied on judgements on the pooled predictions of existing systems. We show that this evaluation is problematic: when a new system predicts a previously unseen relation, it is penalized even if it is correct. This leads to significant bias against new systems, which counterproductively discourages innovation in the field. Our first contribution is a new importance-sampling based evaluation which corrects for this bias by annotating a new system’s predictions on-demand via crowdsourcing. We show this eliminates bias and reduces variance using data from the 2015 TAC KBP task. Our second contribution is an implementation of our method made publicly available as an online KBP evaluation service. We pilot the service by testing diverse state-of-the-art systems on the TAC KBP 2016 corpus and obtain accurate scores in a cost effective manner.
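The core idea of importance sampling can be sketched as follows: labels are collected for items drawn from one distribution (the proposal), then reweighted to estimate a quantity under another distribution (the target). This is a generic textbook estimator, not the estimator from the paper; the toy items, labels and probabilities are all hypothetical.

```python
import random


def importance_sampling_estimate(samples, target_prob, proposal_prob, f):
    """Unbiased estimate of E_target[f(x)] using samples drawn from the proposal.

    Each sample is reweighted by target_prob(x) / proposal_prob(x); this requires
    proposal_prob(x) > 0 wherever target_prob(x) > 0.
    """
    total = 0.0
    for x in samples:
        total += target_prob(x) / proposal_prob(x) * f(x)
    return total / len(samples)


# Toy setup (hypothetical): four predictions, three of which are correct.
items = [0, 1, 2, 3]
is_correct = {0: 1.0, 1: 0.0, 2: 1.0, 3: 1.0}
target = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}  # new system: uniform over its predictions
proposal = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}    # distribution the labelled samples came from

rng = random.Random(0)
samples = rng.choices(items, weights=[proposal[i] for i in items], k=20000)

estimated_precision = importance_sampling_estimate(
    samples, target.get, proposal.get, is_correct.get
)
true_precision = sum(target[i] * is_correct[i] for i in items)  # 0.75
```

Because the weights correct for the mismatch between the two distributions, the estimate converges to the target system's true precision even though the labelled items were not sampled from that system.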
Stanford at TAC KBP 2017: Building a Trilingual Relational Knowledge Graph
Arun Tejasvi Chaganty*, Ashwin Paranjape*, Jason Bolton*, Matthew Lamm*, Jinhao Lei*, Abigail See*, Kevin Clark, Yuhao Zhang, Peng Qi, Christopher D Manning
Text Analysis Conference (TAC) 2017
We describe Stanford’s entries in the TAC KBP 2017 Cold Start Knowledge Base Population and Slot Filling challenges. Our biggest contribution is an entirely new Spanish entity detection and relation extraction system for the cross-lingual relation extraction tracks. This new Spanish system is a simple system that uses CRF-based entity recognition supplemented by gazettes followed by several rules-based relation extractors, some using syntactic structure. We make further improvements to our systems for other languages, including improved named entity recognition, a new neural relation extractor, and better support for nested mentions and discussion forum documents. We also experimented with data fusion with entity linking systems from entrants in the TAC KBP Entity Discovery and Linking challenge. Under the official 2017 macro-averaged MAP all hops score measure, Stanford’s 2017 English, Chinese, Spanish and cross-lingual submissions achieved overall scores of 0.202, 0.124, 0.123, and 0.073, respectively. Under the macro-averaged LDC-MEAN all hops F1 measure used in previous years, the corresponding scores were 0.254, 0.188, 0.186, and 0.117, respectively.
Motifs in Temporal Networks.
Ashwin Paranjape*, Austin Benson*, Jure Leskovec
Tenth ACM International Conference on Web Search and Data Mining (WSDM), 2017.
Networks are a fundamental tool for modeling complex systems in a variety of domains including social and communication networks as well as biology and neuroscience. Small subgraph patterns in networks, called network motifs, are crucial to understanding the structure and function of these systems. However, the role of network motifs in temporal networks, which contain many timestamped links between the nodes, is not yet well understood.
Here we develop a notion of a temporal network motif as an elementary unit of temporal networks and provide a general methodology for counting such motifs. We define temporal network motifs as induced subgraphs on sequences of temporal edges, design fast algorithms for counting temporal motifs, and prove their runtime complexity. Our fast algorithms achieve up to 56.5x speedup compared to a baseline method. Furthermore, we use our algorithms to count temporal motifs in a variety of networks. Results show that networks from different domains have significantly different motif counts, whereas networks from the same domain tend to have similar motif counts. We also find that different motifs occur at different time scales, which provides further insights into structure and function of temporal networks.
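To make the counting task concrete, here is a naive brute-force sketch that counts time-ordered edge sequences matching a small pattern. It is not one of the fast algorithms from the paper: it enumerates every combination of edges and is only practical for tiny inputs. The time window `delta`, the function name and the toy edge list are assumptions made for this illustration; the abstract does not spell out these details.

```python
from itertools import combinations


def count_temporal_motif(edges, motif, delta):
    """Count edge sequences that match `motif` in time order within a window.

    edges: list of (source, target, timestamp) tuples.
    motif: list of (a, b) pattern edges, in the order they must occur.
    delta: maximum allowed gap between the first and last matched timestamps.
    """
    edges = sorted(edges, key=lambda e: e[2])
    k = len(motif)
    count = 0
    # combinations() keeps the sorted order, so each combo is non-decreasing in time.
    for combo in combinations(edges, k):
        times = [t for _, _, t in combo]
        if times[-1] - times[0] > delta:
            continue
        if any(times[i] >= times[i + 1] for i in range(k - 1)):
            continue  # require strictly increasing timestamps
        mapping = {}  # pattern node -> graph node
        used = set()  # graph nodes already assigned, so the mapping stays one-to-one
        ok = True
        for (a, b), (u, v, _) in zip(motif, combo):
            for pattern_node, graph_node in ((a, u), (b, v)):
                if pattern_node in mapping:
                    if mapping[pattern_node] != graph_node:
                        ok = False
                elif graph_node in used:
                    ok = False
                else:
                    mapping[pattern_node] = graph_node
                    used.add(graph_node)
            if not ok:
                break
        if ok:
            count += 1
    return count


# Toy temporal network (hypothetical): a directed 3-cycle followed by a late repeat edge.
toy_edges = [("a", "b", 1), ("b", "c", 3), ("c", "a", 5), ("a", "b", 20)]
cycle = [(0, 1), (1, 2), (2, 0)]

cycles_wide = count_temporal_motif(toy_edges, cycle, delta=10)
cycles_narrow = count_temporal_motif(toy_edges, cycle, delta=3)
```

With m temporal edges and a k-edge pattern this enumeration examines on the order of m^k combinations, which is the kind of cost that motivates faster counting algorithms.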
Stanford at TAC KBP 2016: Sealing Pipeline Leaks and Understanding Chinese
Yuhao Zhang*, Arun Tejasvi Chaganty*, Ashwin Paranjape*, Danqi Chen*, Jason Bolton*, Peng Qi*, Christopher D Manning
Text Analysis Conference (TAC) 2016
We describe Stanford’s entries in the TAC KBP 2016 Cold Start Slot Filling and Knowledge Base Population challenge. Our biggest contribution is an entirely new Chinese entity detection and relation extraction system for the new Chinese and cross-lingual relation extraction tracks. This new system consists of several rules-based relation extractors and a distantly supervised extractor. We also analyze errors produced by our existing mature English KBP system, which leads to several fixes, notably improvements to our patterns-based extractor and neural network model, support for nested mentions and inferred relations. Stanford’s 2016 English, Chinese and cross-lingual submissions achieved an overall (macro-averaged LDC-MEAN) F1 of 22.0, 14.2, and 11.2 respectively on the 2016 evaluation data, performing well above the median entries, at 7.5, 13.2 and 8.3 respectively.
Improving Website Hyperlink Structure Using Server Logs
Ashwin Paranjape*, Robert West*, Jure Leskovec, Leila Zia
9th ACM International Conference on Web Search and Data Mining (WSDM), 2016
Good websites should be easy to navigate via hyperlinks, yet maintaining a link structure of high quality is difficult. Identifying pairs of pages that should be linked may be hard for human editors, especially if the site is large and changes are frequent. Further, given a set of useful link candidates, the task of incorporating them into the site can be expensive, since it typically involves humans editing pages. In the light of these challenges, it is desirable to develop data-driven methods for partly automating the link placement task. Here we develop an approach for automatically finding useful hyperlinks to add to a website. We show that passively collected server logs, beyond telling us which existing links are useful, also contain implicit signals indicating which nonexistent links would be useful if they were to be introduced. We leverage these signals to model the future usefulness of as yet nonexistent links. Based on our model, we define the problem of link placement under budget constraints and propose an efficient algorithm for solving it. We demonstrate the effectiveness of our approach by evaluating it on Wikipedia, a large website for which we have access to both server logs (used for finding useful new links) and the complete revision history (used as ground truth). As our method is based exclusively on standard server logs, it may also be applied to any other website, as we show at the example of the biomedical research site Simtk.
Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Robert West, Ashwin Paranjape, and Jure Leskovec
24th International World Wide Web Conference (WWW'15), pp. 1242–1252, Florence, Italy, 2015.
Here we propose a novel approach to identifying missing links in Wikipedia. We build on the fact that the ultimate purpose of Wikipedia links is to aid navigation. Rather than merely suggesting new links that are in tune with the structure of existing links, our method finds missing links that would immediately enhance Wikipedia’s navigability. We leverage data sets of navigation paths collected through a Wikipedia-based human-computation game in which users must find a short path from a start to a target article by only clicking links encountered along the way. We harness human navigational traces to identify a set of candidates for missing links and then rank these candidates. Experiments show that our procedure identifies missing links of high quality.
Unsupervised Word Sense Disambiguation Using Markov Random Field and Dependency Parser.
Devendra Chaplot, Pushpak Bhattacharyya and Ashwin Paranjape
29th AAAI Conference on Artificial Intelligence (AAAI), pp. 2217-2223, 2015.
Word Sense Disambiguation is a difficult problem to solve in the unsupervised setting. This is because in this setting inference becomes more dependent on the interplay between different senses in the context due to unavailability of learning resources. Using two basic ideas, sense dependency and selective dependency, we model the WSD problem as a Maximum A Posteriori (MAP) Inference Query on a Markov Random Field (MRF) built using WordNet and Link Parser or Stanford Parser.
To the best of our knowledge, this combination of dependency and MRF is novel, and our graph-based unsupervised WSD system beats the state-of-the-art system on the SensEval-2, SensEval-3 and SemEval-2007 English all-words datasets while being over 35 times faster.