Christopher Potts +> Code & Data

Linguistic sentiment analysis

The Cards corpus of collaborative task-oriented dialogues

A highly structured corpus of 744 task-oriented dialogues collected with the goal of informing models of pragmatics and discourse. The corpus distribution includes Python and R code for working with the corpus as well as a slideshow documenting its properties and reporting on some pilot studies. [bibtex]

PragBank

Extends FactBank with reader-based veridicality annotations at the level of utterance meaning. [bibtex]

Linguistic Oddities

This collection of examples consists mostly of oddities I found while reading. The emphasis is on example-types that would be very hard to find using standard search techniques. Includes a form for submitting new examples! [bibtex]

Embedded appositives

An annotated collection of 278 sentences containing appositives embedded syntactically in the complement of propositional attitude predicates and verbs of saying, drawn from 177 million words of novels, newspaper articles, and TV transcripts. Intended to inform work on appositives, conventional implicatures, and textual entailment. Includes a JavaScript interface, an XML corpus, and a short write-up describing the data and their theoretical relevance. [bibtex]

Wait a minute! What kind of discourse strategy is this?

A lightly annotated collection of 439 examples of the phrase "Wait a minute", drawn from 77 million words of CNN television transcripts. Intended to inform work on presuppositions. Includes a JavaScript interface, an XML corpus, and a short write-up describing the data and their theoretical relevance. [bibtex]

Computational phonology

Calculators