Calendar

Mon | Tue | Wed | Thu | Fri
3/29 PA1 out; Lecture 1: Introduction | 3/30 | 3/31 Lecture 2: N-gram Models | 4/1 | 4/2
4/5 Lecture 3: LMs and StatMT | 4/6 | 4/7 Lecture 4: EM (and StatMT) | 4/8 | 4/9 Section 1: Smoothing
4/12 PA1 due; PA2 out; Lecture 5: StatMT Systems | 4/13 | 4/14 Lecture 6: Phrase-based & syntactic MT | 4/15 | 4/16 Section 2: EM
4/19 Lecture 7: IE/NER & NB Models | 4/20 | 4/21 Lecture 8: MaxEnt Classifiers | 4/22 | 4/23 Section 3: Corpora
4/26 PA2 due; PA3 out; Lecture 9: Sequence classifiers & IE | 4/27 | 4/28 Lecture 10: Syntax & Parsing | 4/29 | 4/30 Section 4: MaxEnt
5/3 Final project proposal due; Lecture 11: DPs for Parsing | 5/4 | 5/5 Lecture 12: LPCFGs | 5/6 | 5/7 Section 5: Parsing & PCFGs
5/10 Lecture 13: Statistical Parsers | 5/11 PA3 due | 5/12 Lecture 14: Grammar induction | 5/13 | 5/14
5/17 Lecture 15: Semantic Role Labeling | 5/18 | 5/19 Lecture 16: ComSem | 5/20 | 5/21
5/24 Lecture 17: ComSem II | 5/25 | 5/26 Lecture 18: Lexical Semantics | 5/27 | 5/28
5/31 Memorial Day | 6/1 | 6/2 Final project due; Lecture 19: QA & Inference | 6/3 | 6/4
6/7 | 6/8 | 6/9 9:00am - 12:00pm; Final project presentations | 6/10 | 6/11


Syllabus

Lecture 1
Mon
3/29/10
Introduction [slides: pdf; pdf1up] Overview of NLP. Statistical machine translation. Language models and their role in speech processing. Course introduction and administration.
No required reading.
Optional good background reading: J&M Ch. 1; M&S 1.0-1.3, 4.1-4.2; Collaboration Policy
Optional reading on Unix text manipulation (useful skill!): Ken Church's tutorial Unix for Poets [ps, pdf]
Background for MT video [fun read!]: The IBM 701 translator (1954)
(If your knowledge of probability theory is limited, also read M&S 2.0-2.1.7. If that's too condensed, read the probability chapter of an intro statistics textbook, e.g. Rice, Mathematical Statistics and Data Analysis, ch. 1.)
Distributed today: Programming Assignment 1
Lecture 2
Wed
3/31/10
N-gram Language Models and Information Theory [slides: pdf; pdf1up; MegaHal: html]
n-gram models. Statistical estimation and smoothing for language models. Entropy, cross entropy, mutual information, perplexity.
Assigned reading: J&M ch. 4
Alternative reading: M&S 1.4, 2.2, ch. 6.
Tutorial reading: Kevin Knight. A Statistical MT Tutorial Workbook [pdf] [rtf]. MS., August 1999. Sections 1-14.
Optional advanced reading: Joshua Goodman (2001), A Bit of Progress in Language Modeling, Extended Version [pdf, ps]
Optional advanced reading: (older but shorter) Stanley Chen and Joshua Goodman (1998), An empirical study of smoothing techniques for language modeling [pdf, ps]
Optional very advanced reading: Teh, Yee Whye. 2006. A Hierarchical Bayesian Language Model based on Pitman-Yor Processes. COLING/ACL 2006. [pdf]
Lecture 3
Mon
4/5/10
Statistical Machine Translation (MT), Alignment Models (and LMs continued) [slides: pdf; pdf-1up]
Assigned reading: J&M ch. 25, sections 25.0-25.5, 25.11.
Lecture 4
Wed
4/7/10
Expectation Maximization (EM) and Statistical Alignment Models [quiz question: pdf, slides: pdf, pdf-1up, spreadsheet: xls]
EM and its use in statistical MT alignment models.
Assigned reading: Kevin Knight. A Statistical MT Tutorial Workbook [pdf] [rtf]. MS., August 1999. Sections 15-37 (get the free beer!).
(read also the relevant Knight Workbook FAQ)
Reference reading: Geoffrey J. McLachlan and Thriyambakam Krishnan. 1997. The EM Algorithm and Extensions. Wiley
Optional further reading: M&S 13.
Moore, Robert C. 2005. Association-Based Bilingual Word Alignment. In Proceedings, Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, Ann Arbor, Michigan, pp. 1-8.
Moore, Robert C. 2004. Improving IBM Word Alignment Model 1. In Proceedings, 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, pp. 519-526.
Section 1
Fri
4/9/10
Smoothing [notes: ppt used in the section; original xls ]
Smoothing: absolute discounting, proving you have a proper probability distribution, Good-Turing implementation. Information theory examples and intuitions. Java implementation issues.
Lecture 5
Mon
4/12/10
Putting together a complete statistical MT system [6-up slides: pdf] [1-up slides: pdf]
IBM Word alignment models. MT evaluation. Decoding and Search.
Required reading: J&M, secs 25.7-10, 25.12.
Reference: "Seminal" background reading: Brown, Della Pietra, Della Pietra, and Mercer, 1993, The Mathematics of Statistical Machine Translation: Parameter Estimation [pdf, pdf]. Computational Linguistics.
[After their work in speech and language technology, the team turned to finance.... (the original article from Bloomberg has long since disappeared...)]
Further references:
Ulrich Germann, Michael Jahr, Kevin Knight, Daniel Marcu, and Kenji Yamada. 2001. Fast Decoding and Optimal Decoding for Machine Translation. ACL.
Due today: Programming Assignment 1
Distributed today: Programming Assignment 2
Lecture 6
Wed
4/14/10
MT systems. Decoding. Phrase-based and syntactic MT. Real world MT. [6-up slides: pdf] [1-up slides: pdf]
Decoding. Recent work in statistical MT: statistical phrase based systems and syntax in MT. MT in practice.
Required reading: J&M, secs 25.7-10, 25.12.
Further references:
Franz Josef Och, Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics 30(4): 417-449.
K. Yamada and K. Knight. 2002. A Decoder for Syntax-Based Statistical MT. ACL.
David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. ACL 2005, pages 263-270.
Section 2
Fri
4/16/10
The EM algorithm [notes: ppt xls k-means example soft k-means example]
Lecture 7
Mon
4/19/10
Information Extraction (IE) and Named Entity Recognition (NER). [6-up slides: pdf] [1-up slides: pdf]
Information sources, rule-based methods, evaluation (recall, precision). Introduction to supervised machine learning methods. Naïve Bayes (NB) classifiers for entity classification.
Assigned reading:
J&M secs 22.0-22.1 (intro to IE and NER).
J&M secs. 5.5 and 5.7 (introduce HMMs, Viterbi algorithm, and experimental technique). If you're not familiar with supervised classification and Naive Bayes, read J&M sec 20.2 before the parts of ch. 5.
Alternative reading: M&S 8.1 (evaluation), 7.1 (experimental methodology), 7.2.1 (Naive Bayes), 10.2-10.3 (HMMs and Viterbi)
Background IE reading:
Recent Wired article on Google's search result ranking (but don't completely swallow the hype: click through on the mike siwek lawyer mi query, and read a couple of the top hits in the search results).
Sunita Sarawagi. 2008. Information Extraction. Foundations and Trends in Databases 1(3): 261-377. http://dx.doi.org/10.1561/1900000003
Peter Jackson and Isabelle Moulinier. 2007. Natural Language Processing for Online Applications: Text Retrieval, Extraction and Categorization. John Benjamins. 2nd edition. Ch. 3.
Ion Muslea (1999), Extraction Patterns for Information Extraction Tasks: A Survey [pdf, ps], AAAI-99 Workshop on Machine Learning for Information Extraction.
Douglas E. Appelt. 1999. Introduction to Information Extraction Technology
Lecture 8
Wed
4/21/10
Maximum Entropy Classifiers [slides: pdf, pdf1up]
Assigned Reading:
class slides.
J&M secs 6.6-7 (maximum entropy models)
Additional references:
M&S section 16.2
Adwait Ratnaparkhi. A Simple Introduction to Maximum Entropy Models for Natural Language Processing. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
Section 3
Fri
4/23/10
Corpora and other resources [notes: ppt, pdf(2008), txt(2006)]
Lecture 9
Mon
4/26/10
Maximum Entropy Sequence Classifiers and Information Extraction [slides: 6-up pdf] [slides: 1-up pdf]
Assigned Reading:
class slides.
J&M secs. 6.0-6.4 and 6.8-6.9 (HMMs in detail and then MEMMs), and 22.2, 22.4 (IE).
Other references: Adwait Ratnaparkhi. A Simple Introduction to Maximum Entropy Models for Natural Language Processing. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
Adam Berger, A Brief Maxent Tutorial
HMMs for IE reading: Dayne Freitag and Andrew McCallum (2000), Information Extraction with HMM Structures Learned by Stochastic Optimization, AAAI-2000
Maxent NER reading: Jenny Finkel et al., 2005. Exploring the Boundaries: Gene and Protein Identification in Biomedical Text
Distributed today: Final project guide
Due today: Programming Assignment 2
Distributed today: Programming Assignment 3
Lecture 10
Wed
4/28/10

Syntax and Parsing for Context-Free Grammars (CFGs) [slides: 6-up pdf] [slides: 1-up pdf]
Parsing, treebanks, attachment ambiguities. Context-free grammars. Top-down and bottom-up parsing, empty constituents, left recursion, and repeated work. Probabilistic CFGs.
Assigned reading: J&M ch. 13, secs. 13.0-13.3.
Background reading: J&M ch. 9 (or M&S ch. 3). This is especially if you haven't done any linguistics courses, but even if you have, there's useful information on treebanks and part-of-speech tag sets used in NLP.
Section 4
Fri
4/30/10
Maximum entropy sequence models [notes: pdf, xls]
Lecture 11
Mon
5/3/10
Dynamic Programming for Parsing [slides: 6-up pdf] [slides: 1-up pdf]
Dynamic programming for parsing. The CKY algorithm. Accurate unlexicalized PCFG parsing.
Assigned reading: J&M sec. 13.4
Additional information: Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. ACL 2003, pp. 423-430.
Due today: final project proposals
Lecture 12
Wed
5/5/10
Lexicalized Probabilistic Context-Free Grammars (LPCFGs) [6-up slides: pdf] [1-up slides: pdf]
Lexicalization and lexicalized parsing. The Charniak, Collins/Bikel, and Petrov & Klein parsers.
Assigned reading: J&M ch. 14 (you can stop at the end of sec. 14.7, if you'd like!)
Alternative reading: M&S Ch. 11
Optional readings:
Section 5
Fri
5/7/10
Parsing, PCFGs [notes: pdf]
Lecture 13
Mon
5/10/10
Modern Statistical Parsers [6-up slides: pdf] [1-up slides: pdf] [quiz submission guide: txt]
Search methods in parsing: Agenda-based chart, A*, and "best-first" parsing. Dependency parsing. Discriminative parsing.
Assigned reading: J&M ch. 14 (you can stop at the end of sec. 14.7, if you'd like!)
Alternative, less up-to-date reading: M&S 8.3, 12
Due tomorrow: Programming Assignment 3
Lecture 14
Wed
5/12/10
Grammar Induction [6-up slides: pdf] [1-up slides: pdf]
No quiz question today.
Background reading:
Lecture 15
Mon
5/17/10
Semantic Role Labeling [slides: pdf 1up-pdf]
Assigned reading: J&M secs. 19.4, 20.9
Further reading:
Daniel Gildea and Daniel Jurafsky. 2002. Automatic Labeling of Semantic Roles. Computational Linguistics 28:3, 245-288.
Kristina Toutanova, Aria Haghighi, and Christopher D. Manning, 2005. Joint Learning Improves Semantic Role Labeling. Proceedings of 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), pp. 589-596.
Pradhan, S., Ward, W., Hacioglu, K., Martin, J., Jurafsky, D., "Semantic Role Labeling Using Different Syntactic Views", in Proceedings of the Association for Computational Linguistics 43rd annual meeting (ACL 2005), Ann Arbor, MI, June 25-30, 2005.
V. Punyakanok, D. Roth, and W. Yih, The Necessity of Syntactic Parsing for Semantic Role Labeling. Proc. of the International Joint Conference on Artificial Intelligence (IJCAI) (2005) pp. 1117-1123.
Lecture 16
Wed
5/19/10
Computational Semantics
[slides: pdf] [1-up slides: pdf]
Semantic representations, lambda calculus, compositionality, syntax/semantics interfaces, logical reasoning.
Assigned reading:
An Informal but Respectable Approach to Computational Semantics [pdf, ps]
J&M ch. 18 (you can skip secs. 18.4 and 18.6-end, if you wish).
Lecture 17
Mon
5/24/10
Compositional Semantics II [Slides: 6-up-pdf 1-up-pdf]
Semantic representations, lambda calculus, compositionality, syntax/semantics interfaces, logical reasoning.
Assigned reading:
An Informal but Respectable Approach to Computational Semantics [pdf, ps]
J&M ch. 18 (you can skip secs. 18.4 and 18.6-end, if you wish).
Further reading:
I. Androutsopoulos et al., Language Interfaces to Databases
Luke S. Zettlemoyer and Michael Collins. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. In Proceedings of the Twenty First Conference on Uncertainty in Artificial Intelligence (UAI-05), 2005.
Lecture 18
Wed
5/26/10
Lexical Semantics [6-up slides pdf] [1-up slides pdf]
Reading: (Okay, I'm not so naive as to think that you'll actually read this in week 9 of the quarter....) J&M secs. 19.0-19.3.
Further reading: J&M secs 20.0-20.8
Mon
5/31/10
Memorial Day
no class
Lecture 19
Wed
6/2/10
Question Answering (QA) [1-up slides: pdf]
TREC-style robust QA, textual inference
Assigned reading: J&M secs 23.0, 23.2
Further reading: Marius Pasca, Sanda M. Harabagiu. High Performance Question/Answering. SIGIR 2001: 366-374.
Due today: Final project reports
Wed
6/9/10
9:00-12:00
Final Project Presentations
Braun Auditorium. Students will give short (~3 min) presentations on their final projects during the time slot allocated for a final exam.