CS 224N -- Ling 237
Course Syllabus
(updated 4/03/2003)

Each class entry below gives the date, the topic, and any assignment handed out (Out) or due (Due) that day.

Week 1

Wednesday, 2 Apr 03 -- What is NLP? History; current applications and topics. Why does Chris pronounce 'parsing' funny?
Topics: Course introduction and administration. What is NLP? A brief history and a discussion of current topics, approaches, and applications. The need for language understanding. Rule-based approaches to linguistic structure. How to find sentence structure: parsing as search.
Reading (optional intro): M&S Sec 1.0-1.3.

Week 2

Monday, 7 Apr 03 -- Parsing as search; dynamic programming approaches to parsing. Out: HW #1.
Readings: handout; Gazdar and Mellish (1989) pp. 143-155; M&S Ch. 3 [if you haven't done any linguistics courses] or J&M Ch. 9.
References: J&M Ch. 10.
Topics: top-down parsing, bottom-up parsing; empty constituents and left-recursive rules.

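To make the "parsing as search" idea above concrete, here is a minimal sketch (not from the course materials) of a top-down, depth-first recognizer; the toy grammar and lexicon are invented for illustration.

```python
# Parsing as search: a top-down, depth-first recognizer for a toy CFG.
# Grammar and lexicon are illustrative, not from the course handouts.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "saw": "V"}

def parse(symbols, words):
    """Can the symbol sequence derive exactly the word sequence?"""
    if not symbols:
        return not words                      # success iff all words consumed
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                      # nonterminal: try each expansion
        return any(parse(exp + rest, words) for exp in GRAMMAR[first])
    # terminal (a part-of-speech tag): must match the next word's category
    return bool(words) and LEXICON.get(words[0]) == first and parse(rest, words[1:])

print(parse(["S"], "the dog saw the cat".split()))  # True
```

Note that this naive search loops forever on left-recursive rules such as NP -> NP PP, one of the problems named in the topics list above.
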
Wednesday, 9 Apr 03 -- Dynamic programming methods of parsing; weighted grammar rule parsing. Out: PP #1.
Readings: handout; Gazdar and Mellish (1989) pp. 179-199.
References: J&M Ch. 10.

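As a sketch of the dynamic-programming idea behind this lecture, here is a CKY recognizer for a grammar in Chomsky normal form; the grammar is the same invented toy as above, not course code.

```python
from itertools import product

# CKY recognition: dynamic programming over spans, for a CNF grammar.
# Binary rules map a pair of child categories to parent categories;
# the unary table maps words to their lexical categories.
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
UNARY  = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cky_recognize(words):
    n = len(words)
    # chart[i][j] = set of categories spanning words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(UNARY.get(w, ()))
    for length in range(2, n + 1):           # widen spans bottom-up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):        # try every split point
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((b, c), set())
    return "S" in chart[0][n]

print(cky_recognize("the dog saw the cat".split()))  # True
```

Unlike the backtracking search, the chart guarantees each span is analyzed once, giving O(n^3) time in the sentence length.
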
Section -- Parsing algorithms.

Week 3

Monday, 14 Apr 03 -- n-gram models of language. Due: HW #1.
Readings: M&S Section 1.4.0-1.4.3 and Chapter 6 [really it'd be good to glance through all of it, but pay particular attention to things we covered in class!]. If you are rusty or have little knowledge of probability theory, also read Ch. 2, Sec 2.0-2.1.7. If that's too condensed, read the probability chapter of an intro statistics textbook, for instance Rice, Mathematical Statistics and Data Analysis, Ch. 1. Your dormmate probably has a copy.
Stanley Chen and Joshua Goodman. 1998. An Empirical Study of Smoothing Techniques for Language Modeling. Technical Report TR-10-98, Harvard University, August 1998.

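As a minimal sketch of the n-gram topic above: a bigram model with add-one (Laplace) smoothing, the simplest of the smoothing methods surveyed by Chen and Goodman. The three-sentence corpus is invented for illustration.

```python
from collections import Counter

# A bigram language model with add-one (Laplace) smoothing.
# The toy corpus is made up for illustration.
corpus = [["<s>", "the", "dog", "barks", "</s>"],
          ["<s>", "the", "cat", "meows", "</s>"],
          ["<s>", "the", "dog", "meows", "</s>"]]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams  = Counter((a, b) for sent in corpus for a, b in zip(sent, sent[1:]))
V = len(unigrams)  # vocabulary size, used in smoothing

def bigram_prob(a, b):
    """P(b | a) with add-one smoothing: (count(a,b) + 1) / (count(a) + V)."""
    return (bigrams[(a, b)] + 1) / (unigrams[a] + V)

def sentence_prob(words):
    """Probability of a sentence as a product of bigram probabilities."""
    p = 1.0
    for a, b in zip(words, words[1:]):
        p *= bigram_prob(a, b)
    return p

print(sentence_prob(["<s>", "the", "dog", "barks", "</s>"]))
```

Add-one is a poor smoother in practice; the Chen and Goodman report listed above compares it against Katz, Witten-Bell, and Kneser-Ney methods.
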
Wednesday, 16 Apr 03 -- Word sense disambiguation: Naïve Bayes methods.
Readings: Tom Mitchell, Machine Learning, pp. 177-184; M&S Sec 7.0-7.3 and Sec 7.5.

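A minimal sketch of Naïve Bayes word sense disambiguation as covered in the readings: pick the sense s maximizing log P(s) plus the sum of log P(w | s) over context words. The training sentences for two senses of "bank" are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Naive Bayes WSD: argmax over senses of log P(s) + sum_w log P(w | s).
# The tiny labeled corpus below is hypothetical.
train = [
    ("finance", "he deposited money at the bank yesterday".split()),
    ("finance", "the bank raised its interest rates".split()),
    ("river",   "they fished from the grassy bank of the river".split()),
    ("river",   "the river bank was muddy and steep".split()),
]

sense_counts = Counter(sense for sense, _ in train)
word_counts = defaultdict(Counter)
vocab = set()
for sense, words in train:
    word_counts[sense].update(words)
    vocab.update(words)

def classify(context):
    def score(sense):
        total = sum(word_counts[sense].values())
        s = math.log(sense_counts[sense] / len(train))   # log prior
        for w in context:  # add-one smoothing over the training vocabulary
            s += math.log((word_counts[sense][w] + 1) / (total + len(vocab)))
        return s
    return max(sense_counts, key=score)

print(classify("money in the bank".split()))
print(classify("the muddy bank of the river".split()))
```
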
Section -- Accessing corpora at Stanford; linguistic annotation; Unix text tools.

Week 4

Monday, 21 Apr 03 -- POS tagging and Hidden Markov Models. Due: PP #1.
Readings: M&S Sec 10.0-10.2; Sec 9.0-9.3.2.
Topics: Part-of-speech tagging. Available information sources. Markov models. Fundamental algorithms for hidden Markov models: determining the probability of an observed sequence, and the maximum-probability state sequence (the Viterbi algorithm).

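The Viterbi algorithm named in the topics above can be sketched in a few lines; the HMM parameters here are an invented toy tagger, not figures from M&S.

```python
# The Viterbi algorithm: the maximum-probability tag sequence for an
# observed word sequence under an HMM. The tiny model is illustrative.
states = ["Det", "N", "V"]
start  = {"Det": 0.8, "N": 0.1, "V": 0.1}
trans  = {"Det": {"Det": 0.1, "N": 0.8, "V": 0.1},
          "N":   {"Det": 0.1, "N": 0.2, "V": 0.7},
          "V":   {"Det": 0.6, "N": 0.3, "V": 0.1}}
emit   = {"Det": {"the": 0.9, "dog": 0.0, "barks": 0.0, "walks": 0.1},
          "N":   {"the": 0.0, "dog": 0.6, "barks": 0.2, "walks": 0.2},
          "V":   {"the": 0.0, "dog": 0.1, "barks": 0.5, "walks": 0.4}}

def viterbi(words):
    # delta[s] = probability of the best path ending in state s;
    # back[t][s] = the predecessor state on that path.
    delta = {s: start[s] * emit[s][words[0]] for s in states}
    back = []
    for w in words[1:]:
        prev = {s: max(states, key=lambda r: delta[r] * trans[r][s]) for s in states}
        delta = {s: delta[prev[s]] * trans[prev[s]][s] * emit[s][w] for s in states}
        back.append(prev)
    best = max(states, key=lambda s: delta[s])
    path = [best]
    for prev in reversed(back):          # follow backpointers to recover path
        path.append(prev[path[-1]])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # ['Det', 'N', 'V']
```

Real taggers work with log probabilities to avoid underflow on long sequences; multiplying raw probabilities is fine only for a toy like this.
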
Wednesday, 23 Apr 03 -- Named entity recognition, information extraction, and Hidden Markov Models. Out: HW #2.
Readings: Dayne Freitag and Andrew McCallum. 2000. Information Extraction with HMM Structures Learned by Stochastic Optimization. AAAI-2000.
Topics: extracting semantic tokens (names of people, companies, prices, times, etc.) from text; use of cascades; identifying collocations and terminological phrases. Machine learning methods for IE over annotated data. AutoSlog and HMM-based techniques.
Reference: Ion Muslea. "Extraction Patterns for Information Extraction Tasks: A Survey." AAAI-99 Workshop on Machine Learning for Information Extraction.

Section -- Hidden Markov Models workshop.
Topics: working through HMMs.

Week 5

Monday, 28 Apr 03 -- POS tagging and similar sequence problems, continued. Out: PP #2.
Readings: M&S Sec 9.3.3-9.5.
Topics: Other approaches to, and issues that arise in, part-of-speech tagging. Unknown words. Different tagsets. Baum-Welch re-estimation of HMM parameters; its limited usefulness in part-of-speech tagging and successful use in IE. EM as data clustering.

Wednesday, 30 Apr 03 -- Conditional/discriminative models applied to sequence tasks. Due: HW #2.
Topics: conditional Markov model, maximum entropy model, and other discriminative sequence-model techniques applied to part-of-speech tagging and named entity recognition.

Section -- Information extraction for the web: wrapper induction and related techniques.

Week 6

Monday, 5 May 03 -- Probabilistic Context-Free Grammars. Out: FinalP.
Readings: M&S Chapter 11 through Section 11.3.3.

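A PCFG attaches a probability to each rule, and the CKY chart from Week 2 extends directly to find the most probable (Viterbi) parse. A minimal sketch, with invented rule probabilities:

```python
from itertools import product

# Probabilistic CKY: probability of the most likely parse under a toy
# PCFG in Chomsky normal form. All rule probabilities are invented.
BINARY = {("NP", "VP"): [("S", 1.0)],
          ("Det", "N"): [("NP", 1.0)],
          ("V", "NP"):  [("VP", 1.0)]}
LEXICAL = {"the": [("Det", 1.0)],
           "dog": [("N", 0.6)], "cat": [("N", 0.4)],
           "saw": [("V", 1.0)]}

def best_parse_prob(words):
    n = len(words)
    # chart[i][j][X] = probability of the best X spanning words[i:j]
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for cat, p in LEXICAL.get(w, []):
            chart[i][i + 1][cat] = p
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (b, pb), (c, pc) in product(chart[i][k].items(), chart[k][j].items()):
                    for parent, pr in BINARY.get((b, c), []):
                        p = pr * pb * pc       # rule prob times child probs
                        if p > chart[i][j].get(parent, 0.0):
                            chart[i][j][parent] = p
    return chart[0][n].get("S", 0.0)

print(best_parse_prob("the dog saw the cat".split()))  # ≈ 0.24
```

Keeping backpointers alongside the probabilities would recover the parse tree itself, not just its probability.
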
Wednesday, 7 May 03 -- Probabilistic parsing and attachment ambiguities. Due: PP #2.
Readings: M&S Chapter 11 from Section 11.3.4; Chapter 12 through Section 12.1.7; Sec 8.3.
References: Eugene Charniak. A Maximum-Entropy-Inspired Parser. Proceedings of NAACL-2000. Eugene Charniak. Statistical Techniques for Natural Language Parsing. AI Magazine (1997). Eugene Charniak. Statistical Parsing with a Context-Free Grammar and Word Statistics. Proceedings of the Fourteenth National Conference on Artificial Intelligence, AAAI Press/MIT Press, Menlo Park (1997).

Section -- Project discussion.

Week 7

Monday, 12 May 03 -- Building semantic representations (1). Out: HW #3. Due: FinalP abstract.
Readings: handout.
Reference: J&M Ch. 15.

Wednesday, 14 May 03 -- Building semantic representations (2).
Readings: handout.
Reference: I. Androutsopoulos et al. Natural Language Interfaces to Databases. http://citeseer.nj.nec.com/androutsopoulos95natural.html

Section -- Semantic representations and logical reasoning.

Week 8

Monday, 19 May 03 -- Building semantic representations (3). Due: HW #3.
Topics: interfaces to knowledge representations; lexical semantics: WordNet.

Wednesday, 21 May 03 -- Dialogue and discourse systems; planning and requests.
Readings: handout.
Reference: Gazdar & Mellish, Ch. 10.

Section -- none.

Week 9

Monday, 26 May 03 -- Memorial Day holiday; no class.

Wednesday, 28 May 03 -- Machine translation: rule-based and statistical approaches; sentence alignment.
Readings: M&S Sec 13.1-13.2.

Section -- none.

Week 10

Monday, 2 Jun 03 -- Statistical machine translation. Due: FinalP.
Readings: M&S Sec 13.3; Kevin Knight. A Statistical MT Tutorial Workbook. Ms., August 1999.

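In the spirit of Knight's tutorial workbook, here is a sketch of EM training for IBM Model 1 word-translation probabilities t(f | e); the two-sentence English-French parallel corpus is a toy example, not from the course.

```python
from collections import defaultdict

# EM training of IBM Model 1 translation probabilities t(f | e)
# on a toy parallel corpus (English, French).
corpus = [("the house".split(), "la maison".split()),
          ("the".split(), "la".split())]

e_vocab = {e for es, _ in corpus for e in es}
f_vocab = {f for _, fs in corpus for f in fs}
t = {e: {f: 1 / len(f_vocab) for f in f_vocab} for e in e_vocab}  # uniform init

for _ in range(10):                       # EM iterations
    count = defaultdict(lambda: defaultdict(float))
    for es, fs in corpus:
        for f in fs:                      # E-step: fractional alignment counts
            z = sum(t[e][f] for e in es)
            for e in es:
                count[e][f] += t[e][f] / z
    for e in e_vocab:                     # M-step: renormalize per English word
        total = sum(count[e].values())
        t[e] = {f: count[e][f] / total for f in f_vocab}

print(round(t["the"]["la"], 2))
```

Even with only two sentence pairs, EM uses the short pair to pin "the" to "la", which in turn pushes the probability mass for "house" onto "maison".
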
Wednesday, 4 Jun 03 -- Project mini-presentations.

Finals Period -- time to visit the beach!