Expressive content and the semantics of contexts
NSF Grant No. BCS-0642752
- UMass Amherst Linguistics Sentiment Corpora
- N-gram counts extracted from over 700,000 online product reviews in Chinese, English, German, and Japanese. The files are UTF-8 encoded text. They are formatted to be read in as R data frames, but they can easily be manipulated with other tools.
- Embedded appositives
- An annotated collection of 278 sentences containing appositives embedded syntactically in the complement of propositional attitude predicates and verbs of saying, drawn from 177 million words of novels, newspaper articles, and TV transcripts. Intended to inform work on appositives, conventional implicatures, and textual entailment.
- Wait a minute! What kind of discourse strategy is this?
- A lightly annotated collection of 439 examples involving the phrase "Wait a minute", drawn from 77 million words of CNN television transcripts.
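
The sentiment corpora entry notes that the n-gram files are UTF-8 text that can be handled with tools other than R. As a minimal sketch of what that might look like in Python, the example below reads a small table and sums counts per word. The tab delimiter, the header row, and the column names `Word`, `Rating`, and `Count` are all assumptions made for illustration, not the documented layout of the files, so check them against the actual data before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for one corpus file. The delimiter and
# the column names (Word, Rating, Count) are assumptions, not the real format.
SAMPLE = "Word\tRating\tCount\ngood\t5\t120\ngood\t1\t15\nawful\t1\t40\n"


def load_counts(handle, delimiter="\t"):
    """Read a delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def total_by_word(rows):
    """Sum the (assumed) Count column for each value of the Word column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Word"]] += int(row["Count"])
    return dict(totals)


rows = load_counts(io.StringIO(SAMPLE))
totals = total_by_word(rows)
print(totals)
```

For a file on disk, the same `load_counts` function could be given a handle from `open(path, encoding="utf-8", newline="")`, which keeps the Chinese, German, and Japanese text intact.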