Calendar

Week of 4/3:   Wed 4/5 - Lecture 1: Intro
Week of 4/10:  Mon 4/10 - Lecture 2: N-gram Models;  Wed 4/12 - Lecture 3: StatMT;  Fri 4/14 - Section 1: Smoothing
Week of 4/17:  Mon 4/17 - Lecture 4: StatMT & EM;  Wed 4/19 - Lecture 5: StatMT Systems (PA1 due);  Fri 4/21 - Section 2: EM
Week of 4/24:  Mon 4/24 - Lecture 6: WSD & NB Models;  Wed 4/26 - Lecture 7: MaxEnt Classifiers;  Fri 4/28 - Section 3: MaxEnt
Week of 5/1:   Mon 5/1 - Lecture 8: MaxEnt Classifiers II;  Wed 5/3 - Lecture 9: CFG Parsing (PA2 due);  Fri 5/5 - Section 4: Corpora
Week of 5/8:   Mon 5/8 - Lecture 10: DPs for Parsing;  Wed 5/10 - Lecture 11: PCFGs;  Fri 5/12 - Section 5: Parsing & PCFGs
Week of 5/15:  Mon 5/15 - Lecture 12: StatParsers;  Wed 5/17 - Lecture 13: POS tagging (PA3 due)
Week of 5/22:  Mon 5/22 - Lecture 14: NER & IE;  Wed 5/24 - Lecture 15: ComSem
Week of 5/29:  Mon 5/29 - Memorial Day (no class);  Wed 5/31 - Lecture 16: ComSem II
Week of 6/5:   Mon 6/5 - Lecture 17: QA Systems;  Wed 6/7 - Lecture 18: Dialog & Discourse (Final project due)
Week of 6/12:  Wed 6/14, 8:30am - 11:30am - Final project presentations


Syllabus

Lecture 1
Wed
4/5/06
Introduction [slides: pdf, ps]
Overview of NLP. Statistical machine translation. Language models and their role in speech processing. Course introduction and administration.
Good background reading: M&S 1.0-1.3, 4.1-4.2; see also the Collaboration Policy
Optional reading: Ken Church's tutorial Unix for Poets [ps, pdf]
(If your knowledge of probability theory is limited, also read M&S 2.0-2.1.7. If that's too condensed, read the probability chapter of an intro statistics textbook, e.g. Rice, Mathematical Statistics and Data Analysis, ch. 1.)
Distributed today: Programming Assignment 1
Lecture 2
Mon
4/10/06
N-gram Language Models and Information Theory [slides: ps, pdf] [MegaHal]
n-gram models. Entropy, relative entropy, cross entropy, mutual information, perplexity. Statistical estimation and smoothing for language models.
Assigned reading: M&S 1.4, 2.2, ch. 6.
Optional reading: Joshua Goodman (2001), A Bit of Progress in Language Modeling, Extended Version [pdf, ps]
Optional reading: Stanley Chen and Joshua Goodman (1998), An empirical study of smoothing techniques for language modeling [pdf, ps]
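For concreteness, here is a minimal sketch (not course code) of the kind of model this lecture introduces: a bigram language model with add-one (Laplace) smoothing, evaluated by perplexity on a held-out string. The class name, toy corpus, and test sentence are invented; the smoothing methods compared in the Chen & Goodman reading work much better than add-one.

    import java.util.*;

    // Toy add-one-smoothed bigram model with perplexity; illustration only.
    public class BigramDemo {
        static Map<String, Integer> unigram = new HashMap<>();
        static Map<String, Integer> bigram = new HashMap<>();
        static Set<String> vocab = new HashSet<>();

        static void train(String[] tokens) {
            for (int i = 0; i < tokens.length; i++) {
                vocab.add(tokens[i]);
                unigram.merge(tokens[i], 1, Integer::sum);
                if (i + 1 < tokens.length)
                    bigram.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
            }
        }

        // P(w2 | w1) with add-one smoothing: (c(w1 w2) + 1) / (c(w1) + |V|)
        static double prob(String w1, String w2) {
            int b = bigram.getOrDefault(w1 + " " + w2, 0);
            return (b + 1.0) / (unigram.getOrDefault(w1, 0) + vocab.size());
        }

        public static void main(String[] args) {
            train("<s> the cat sat on the mat </s> <s> the dog sat </s>".split(" "));
            String[] test = "<s> the dog sat on the mat </s>".split(" ");
            double logProb = 0.0;
            for (int i = 1; i < test.length; i++)        // score each test bigram
                logProb += Math.log(prob(test[i - 1], test[i]));
            // Perplexity = exp(-(1/N) * sum of log probabilities)
            System.out.println("Perplexity: " + Math.exp(-logProb / (test.length - 1)));
        }
    }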
Lecture 3
Wed
4/12/06
Statistical Machine Translation (MT), Alignment Models [slides: ppt, pdf, ps]
Assigned reading: Kevin Knight, A Statistical MT Tutorial Workbook [rtf]. MS., August 1999. (see also the relevant FAQ)
Further reading: M&S 13
Section 1
Fri
4/14/06
Smoothing [notes: xls]
Smoothing: absolute discounting, proving you have a proper probability distribution, Good-Turing implementation. Information theory examples and intuitions. Java implementation issues.
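As a rough illustration of absolute discounting (not the section's actual notes): subtract a fixed discount D from every observed bigram count and hand the freed mass to a unigram back-off distribution. The counts, discount value, and class name below are invented; note that the resulting estimates sum to one, which is exactly the "proper probability distribution" check mentioned above.

    import java.util.*;

    // Illustrative absolute discounting for a bigram model: subtract a fixed
    // discount D from every seen bigram count and back off to the unigram
    // distribution with the leftover mass.  Toy data, invented names.
    public class AbsoluteDiscountDemo {
        public static void main(String[] args) {
            double D = 0.75;
            // Counts for the context "the": c(the, cat)=2, c(the, dog)=1
            Map<String, Integer> counts = Map.of("cat", 2, "dog", 1);
            int contextCount = 3;                      // c(the) = sum of bigram counts
            Map<String, Double> unigram = Map.of("cat", 0.4, "dog", 0.4, "fish", 0.2);

            // Mass freed by discounting = D * (distinct continuations) / c(the)
            double alpha = D * counts.size() / contextCount;

            for (String w : unigram.keySet()) {
                double discounted = Math.max(counts.getOrDefault(w, 0) - D, 0) / contextCount;
                double p = discounted + alpha * unigram.get(w);   // interpolated estimate
                System.out.printf("P(%s | the) = %.4f%n", w, p);
            }
            // The three probabilities sum to 1 because the unigram distribution does.
        }
    }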
Lecture 4
Mon
4/17/06
Statistical Alignment Models and Expectation Maximization (EM) [slides: pdf, spreadsheet: xls]
EM and its use in statistical MT alignment models.
Reference reading: Geoffrey J. McLachlan and Thriyambakam Krishnan. 1997. The EM Algorithm and Extensions. Wiley.
Further reading: Moore, Robert C. 2005. Association-Based Bilingual Word Alignment. In Proceedings, Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, Ann Arbor, Michigan, pp. 1-8.
Moore, Robert C. 2004. Improving IBM Word Alignment Model 1. In Proceedings, 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, pp. 519-526.
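A stripped-down sketch of EM for word alignment in the spirit of IBM Model 1, as walked through in Knight's tutorial workbook: the E-step collects fractional co-occurrence counts under the current t(f|e) estimates, and the M-step renormalizes them. The two-sentence corpus, uniform initialization, and omission of the NULL word are simplifying assumptions.

    import java.util.*;

    // EM for IBM-Model-1-style word translation probabilities t(f|e),
    // on a two-sentence toy corpus.  Heavily simplified: no NULL word.
    public class Model1EMDemo {
        public static void main(String[] args) {
            String[][] eSents = {{"the", "house"}, {"the", "book"}};
            String[][] fSents = {{"la", "maison"}, {"le", "livre"}};

            Map<String, Double> t = new HashMap<>();          // key: f + "|" + e
            for (int s = 0; s < eSents.length; s++)
                for (String e : eSents[s]) for (String f : fSents[s])
                    t.put(f + "|" + e, 0.25);                 // uniform initialization

            for (int iter = 0; iter < 10; iter++) {
                Map<String, Double> count = new HashMap<>();  // expected counts c(f,e)
                Map<String, Double> total = new HashMap<>();  // expected counts c(e)
                // E-step: fractional counts from current t(f|e)
                for (int s = 0; s < eSents.length; s++) {
                    for (String f : fSents[s]) {
                        double z = 0;
                        for (String e : eSents[s]) z += t.get(f + "|" + e);
                        for (String e : eSents[s]) {
                            double c = t.get(f + "|" + e) / z;
                            count.merge(f + "|" + e, c, Double::sum);
                            total.merge(e, c, Double::sum);
                        }
                    }
                }
                // M-step: renormalize to get new t(f|e)
                for (String key : t.keySet())
                    t.put(key, count.getOrDefault(key, 0.0) / total.get(key.split("\\|")[1]));
            }
            t.forEach((k, v) -> System.out.printf("t(%s) = %.3f%n", k, v));
        }
    }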
Lecture 5
Wed
4/19/06
Putting together a complete statistical MT system [slides: pdf]
Decoding and A* Search. Recent work in statistical MT.
Further reading: Brown, Della Pietra, Della Pietra, and Mercer, The Mathematics of Statistical Machine Translation: Parameter Estimation [pdf, pdf]. Computational Linguistics.
Ulrich Germann, Michael Jahr, Kevin Knight, Daniel Marcu, and Kenji Yamada. 2001. Fast Decoding and Optimal Decoding for Machine Translation. ACL.
K. Yamada and K. Knight. 2002. A Decoder for Syntax-Based Statistical MT. ACL.
David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. ACL 2005, pages 263-270.
Due today: Programming Assignment 1
Distributed today: Programming Assignment 2
Section 2
Fri
4/21/06
The EM algorithm [notes: xls]
Lecture 6
Mon
4/24/06
Word Sense Disambiguation (WSD) and Naïve Bayes (NB) Models [slides: pdf]
Information sources, performance bounds, dictionary methods, supervised machine learning methods, Naïve Bayes classifiers.
Assigned Reading: M&S Ch. 7.
Reference: Computational Linguistics 24(1), 1998. Special issue on Word Sense Disambiguation.
Proceedings of Senseval-3: The Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text
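To make the model concrete, a bare-bones Naive Bayes sense classifier over bag-of-context-words features, with add-one smoothing and log-space scoring. The two senses of "bass" and all of the counts are invented for illustration.

    import java.util.*;

    // Toy Naive Bayes disambiguator for "bass" (FISH vs. MUSIC) from context words.
    // Invented counts; add-one smoothing; decisions made in log space.
    public class NaiveBayesWSDDemo {
        public static void main(String[] args) {
            String[] senses = {"FISH", "MUSIC"};
            double[] prior = {0.5, 0.5};
            // Context-word counts per sense (made up for illustration)
            Map<String, int[]> counts = new HashMap<>();
            counts.put("river",  new int[]{10, 0});
            counts.put("guitar", new int[]{0, 12});
            counts.put("play",   new int[]{1, 8});
            counts.put("catch",  new int[]{7, 1});
            int[] senseTotals = {18, 21};      // total context tokens seen per sense
            int vocabSize = counts.size();

            String[] context = {"play", "river", "catch"};
            for (int s = 0; s < senses.length; s++) {
                double score = Math.log(prior[s]);
                for (String w : context) {
                    int c = counts.containsKey(w) ? counts.get(w)[s] : 0;
                    score += Math.log((c + 1.0) / (senseTotals[s] + vocabSize));   // add-one
                }
                System.out.printf("log P(%s, context) = %.3f%n", senses[s], score);
            }
            // The sense with the higher score wins; here the counts favor FISH.
        }
    }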
Lecture 7
Wed
4/26/06
Maximum Entropy Classifiers [slides: pdf]
Assigned Reading: class slides.
Other references: Adwait Ratnaparkhi. A Simple Introduction to Maximum Entropy Models for Natural Language Processing. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
M&S section 16.2
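The classification rule itself is simple once the model is trained: a maximum entropy classifier sets P(c | x) proportional to exp of the summed weights of the features active for class c. The sketch below does only this scoring step, with invented feature weights; estimating the weights is what the lecture and readings cover.

    import java.util.*;

    // Scoring with a (pretend, already-trained) maximum entropy classifier:
    // P(c | x) = exp(sum_i lambda_i * f_i(x, c)) / Z(x).  Weights are invented.
    public class MaxentScoreDemo {
        public static void main(String[] args) {
            String[] classes = {"PERSON", "LOCATION"};
            // Feature weights lambda_i, keyed by "featureName&class"
            Map<String, Double> lambda = Map.of(
                "word=Washington&PERSON",   0.3,
                "word=Washington&LOCATION", 0.9,
                "prevWord=in&LOCATION",     1.2,
                "prevWord=in&PERSON",      -0.4);

            // Active (binary) features for one datum: the word and its left neighbor
            String[] activeFeatures = {"word=Washington", "prevWord=in"};

            double[] score = new double[classes.length];
            for (int c = 0; c < classes.length; c++)
                for (String f : activeFeatures)
                    score[c] += lambda.getOrDefault(f + "&" + classes[c], 0.0);

            double z = 0;                                    // normalizer Z(x)
            for (double s : score) z += Math.exp(s);
            for (int c = 0; c < classes.length; c++)
                System.out.printf("P(%s | x) = %.3f%n", classes[c], Math.exp(score[c]) / z);
        }
    }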
Section 3
Fri
4/28/06
Maximum entropy models [notes: pdf, xls]
Lecture 8
Mon
5/1/06
Maximum Entropy Classifiers, Part II [slides: pdf]
Assigned Reading: class slides.
Other references: Adwait Ratnaparkhi. A Simple Introduction to Maximum Entropy Models for Natural Language Processing. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
M&S section 16.2
Adam Berger, A Brief Maxent Tutorial
Distributed today: Final project guide
Lecture 9
Wed
5/3/06
Parsing for Context-Free Grammars (CFGs) [slides: pdf]
Top-down parsing, bottom-up parsing, empty constituents, left recursion.
Background reading: M&S 3 (if you haven't done any linguistics courses) or J&M ch. 9
Optional reading: J&M ch. 10
Due today: Programming Assignment 2
Distributed today: Programming Assignment 3
Section 4
Fri
5/5/06
Corpora and other resources [notes: txt]
Lecture 10
Mon
5/8/06
Dynamic Programming for Parsing [handout: pdf]
Dynamic programming methods, chart parsing, the CKY algorithm.
Optional reading: J&M ch. 10
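A minimal CKY recognizer for a toy grammar in Chomsky normal form, to make the chart concrete: each cell holds the nonterminals derivable over one span of the sentence. The grammar, sentence, and class name are invented, and no probabilities or backpointers are kept.

    import java.util.*;

    // CKY recognition for a toy CNF grammar.  chart[i][j] holds the set of
    // nonterminals that can derive words i..j-1 (0-based, half-open spans).
    public class CKYDemo {
        public static void main(String[] args) {
            // Binary rules A -> B C, stored as "B C" -> parents
            Map<String, List<String>> binary = Map.of(
                "NP VP", List.of("S"),
                "V NP",  List.of("VP"),
                "Det N", List.of("NP"));
            // Lexical rules A -> word
            Map<String, List<String>> lexical = Map.of(
                "the", List.of("Det"), "dog", List.of("N"),
                "cat", List.of("N"),   "saw", List.of("V"));

            String[] words = {"the", "dog", "saw", "the", "cat"};
            int n = words.length;
            Set<String>[][] chart = new HashSet[n + 1][n + 1];
            for (int i = 0; i <= n; i++)
                for (int j = 0; j <= n; j++) chart[i][j] = new HashSet<>();

            for (int i = 0; i < n; i++)                      // fill in width-1 spans
                chart[i][i + 1].addAll(lexical.getOrDefault(words[i], List.of()));

            for (int width = 2; width <= n; width++)         // longer spans, bottom up
                for (int i = 0; i + width <= n; i++)
                    for (int k = i + 1; k < i + width; k++)  // split point
                        for (String b : chart[i][k])
                            for (String c : chart[k][i + width])
                                chart[i][i + width].addAll(
                                    binary.getOrDefault(b + " " + c, List.of()));

            System.out.println("Sentence grammatical? " + chart[0][n].contains("S"));
        }
    }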
Lecture 11
Wed
5/10/06
Probabilistic Context-Free Grammars (PCFGs) [slides: pdf (probparse), pdf (search), pdf (unlexicalized)]
PCFGs, finding the most likely parse, refining PCFGs. Other questions for PCFGs: the inside-outside algorithm, and learning PCFGs.
Assigned reading: M&S Ch. 11
Due today: final project proposals
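One concrete point from the lecture: under a PCFG, a tree's probability is the product of the probabilities of the rules it uses, so finding the most likely parse means maximizing that product over derivations. The sketch below compares two hand-written derivations of a PP-attachment ambiguity; the grammar and all numbers are invented.

    import java.util.*;

    // Probability of a parse tree under a PCFG = product of its rule probabilities.
    // Two hand-written derivations of a PP-attachment ambiguity are compared.
    public class PCFGTreeProbDemo {
        static Map<String, Double> ruleProb = Map.of(
            "S -> NP VP",  1.0,
            "VP -> V NP",  0.6,
            "VP -> VP PP", 0.4,
            "NP -> NP PP", 0.2,
            "NP -> Det N", 0.8,
            "PP -> P NP",  1.0);

        static double treeProb(String[] rulesUsed) {
            double p = 1.0;
            for (String r : rulesUsed) p *= ruleProb.get(r);
            return p;
        }

        public static void main(String[] args) {
            // "saw the dog with the telescope": attach the PP to the VP or to the object NP
            String[] vpAttach = {"S -> NP VP", "VP -> VP PP", "VP -> V NP",
                                 "NP -> Det N", "PP -> P NP", "NP -> Det N"};
            String[] npAttach = {"S -> NP VP", "VP -> V NP", "NP -> NP PP",
                                 "NP -> Det N", "PP -> P NP", "NP -> Det N"};
            System.out.printf("P(VP attachment) = %.4f%n", treeProb(vpAttach));
            System.out.printf("P(NP attachment) = %.4f%n", treeProb(npAttach));
            // The most likely parse is simply the derivation with the higher product
            // (lexical rule probabilities are omitted here for brevity).
        }
    }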
Section 5
Fri
5/12/06
Parsing, PCFGs [notes: pdf]
Lecture 12
Mon
5/15/06
Modern Statistical Parsers [slides: see last time, and pdf]
Parsing for disambiguation, weakening independence assumptions, lexicalization, search methods, Charniak's parser, probabilistic left corner grammars, parser evaluation.
Assigned reading: M&S 8.3, 12
Optional readings:
Lecture 13
Wed
5/17/06
Part of Speech Tagging and Sequence Inference [slides: pdf]
Parts of speech and the tagging problem: sources of evidence; easy and difficult cases. Probabilistic sequence inference: Hidden Markov Models (HMMs), Conditional Markov Models (CMMs), and the Viterbi algorithm.
Assigned reading: M&S Ch. 10, pp. 341-356.
Further reading on HMMs: M&S Ch. 9.
HMM POS tagger: Thorsten Brants, TnT - A Statistical Part-of-Speech Tagger, ANLP 2000.
CMM POS tagger: Kristina Toutanova and Christopher D. Manning. 2000. Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger. EMNLP 2000.
Due today: Programming Assignment 3
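A compact sketch of the Viterbi algorithm for a two-tag HMM: delta[i][j] is the best score of any tag sequence ending in tag j at word i, and backpointers recover the argmax sequence. All probabilities are invented, and a real tagger would estimate them from a treebank and compute in log space.

    // Viterbi decoding for a tiny two-tag HMM POS tagger.  All probabilities are
    // invented; real taggers estimate them from a treebank and work in log space.
    public class ViterbiDemo {
        public static void main(String[] args) {
            String[] tags = {"DT", "NN"};
            String[] words = {"the", "dog", "barks"};   // pretend "barks" is a noun here
            double[] start = {0.8, 0.2};                 // P(tag at position 0)
            double[][] trans = {{0.1, 0.9},              // P(next tag | DT)
                                {0.4, 0.6}};             // P(next tag | NN)
            // Emission P(words[i] | tag j), indexed [tag][position] for this sentence
            double[][] emit = {{0.7, 0.0, 0.0},
                               {0.0, 0.5, 0.3}};

            int n = words.length, k = tags.length;
            double[][] delta = new double[n][k];         // best score ending in tag j at i
            int[][] back = new int[n][k];                // backpointers

            for (int j = 0; j < k; j++) delta[0][j] = start[j] * emit[j][0];
            for (int i = 1; i < n; i++)
                for (int j = 0; j < k; j++)
                    for (int p = 0; p < k; p++) {
                        double score = delta[i - 1][p] * trans[p][j] * emit[j][i];
                        if (score > delta[i][j]) { delta[i][j] = score; back[i][j] = p; }
                    }

            // Follow backpointers from the best final state (two tags, so a simple compare)
            int best = delta[n - 1][0] > delta[n - 1][1] ? 0 : 1;
            String[] out = new String[n];
            for (int i = n - 1; i >= 0; i--) { out[i] = tags[best]; best = back[i][best]; }
            System.out.println(String.join(" ", out));   // expected: DT NN NN
        }
    }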
Lecture 14
Mon
5/22/06
Named Entity Recognition (NER) and Information Extraction (IE) [slides: pdf]
Evaluation reading: M&S 8.1
HMMs for IE reading: Dayne Freitag and Andrew McCallum (2000), Information Extraction with HMM Structures Learned by Stochastic Optimization, AAAI-2000
Maxent NER reading: Jenny Finkel et al., 2005. Exploring the Boundaries: Gene and Protein Identification in Biomedical Text
Background IE reading: Ion Muslea (1999), Extraction Patterns for Information Extraction Tasks: A Survey [pdf, ps], AAAI-99 Workshop on Machine Learning for Information Extraction.
Background IE reading: Douglas E. Appelt. 1999. Introduction to Information Extraction Technology
Lecture 15
Wed
5/24/06
Compositional Semantics [slides: pdf]
Semantic representations, lambda calculus, compositionality, syntax/semantics interfaces, logical reasoning.
Assigned reading: An Informal but Respectable Approach to Computational Semantics [pdf, ps]
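A small worked derivation of the sort this lecture discusses, in standard lambda-calculus notation; the lexical entries are simplified assumptions:

    [[Mary]]            = mary
    [[sleeps]]          = λx. sleeps(x)
    [[loves]]           = λy. λx. loves(x, y)

    [[loves Mary]]      = (λy. λx. loves(x, y))(mary)   beta-reduces to   λx. loves(x, mary)
    [[John loves Mary]] = (λx. loves(x, mary))(john)    beta-reduces to   loves(john, mary)

Each syntactic combination step is function application followed by beta-reduction, which is what makes the semantics compositional.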
Mon
5/29/06
Memorial Day
no class
Lecture 16
Wed
5/31/06
Compositional Semantics, Part II [slides: see last time]
Further reading: I. Androutsopoulos et al., Language Interfaces to Databases
Luke S. Zettlemoyer and Michael Collins. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. In Proceedings of the Twenty First Conference on Uncertainty in Artificial Intelligence (UAI-05), 2005.
Lecture 17
Mon
6/5/06
Question Answering (QA) [handout: pdf]
TREC-style robust QA, natural language database interfaces
Assigned reading: Marius Pasca, Sanda M. Harabagiu. High Performance Question/Answering. SIGIR 2001: 366-374.
Lecture 18
Wed
6/7/06
Dialog & Discourse Systems [handout: pdf]
Rhetorical structure, planning and requests.
Assigned reading: handout
Optional reading: Gazdar & Mellish ch. 10
Due today: Final project reports
Wednesday
6/14/06
8:30am - 11:30am
Final Project Presentations
Students will give short (~5 min) presentations on their final projects during the time slot allocated for a final exam.