Grammar Engineering
Ling 187/287
Logistics
Instructors: | Ron Kaplan | ron "dot" kaplan "at" microsoft "dot" com | |
| Martin Forst | mforst "at" parc "dot" com | 650 812-4788 |
| Tracy Holloway King | Tracy "dot" King "at" microsoft "dot" com | 415 848-7276 |
Class time: | Monday 2:15-5:05 (Winter 2009) |
Class location: | 260-002 (NOTE: first class in 200-015) |
Office hours: | by appointment |
| Also, you can ask questions after class on Monday or by email or phone. |
Course Description
Grammar Engineering -- Hands-on introduction to basic techniques for implementing large-scale linguistic grammars drawing on a combination of sound grammatical theory and engineering principles. Morphological and syntactic specifications within a description-based lexicalist framework. Integration of shallow and deep parsing techniques. Engineering issues in multilingual parallel grammar development. Students will incrementally extend a small grammar for English.
Prerequisite: basic knowledge of syntactic theory or Ling120. No prior programming skills required.
Weekly topics and assignments
Final short papers (due March 20):
No late papers will be accepted since grades are due March 24. If you are taking the course for reduced credit (e.g. 2 units), you do not need to do the papers.
January 12 (week 1)
Introduction to grammar engineering
Introduction to LFG and XLE
Formal devices: equations, lexicons
- Slides and Handouts:
Slides
XLE howto
Basic emacs
- Assignment (assignments are available on Monday, due Friday at noon, returned on the next Monday in class)
- Readings:
Background on LFG for those who need it:
No class on January 19 - Martin Luther King, Jr., Day
A special office hour will be held on Tuesday, January 20, from 4:00 p.m. to 5:45 p.m.
January 26 (week 2)
Engineering and linguistic generalizations
Formal devices:
equations (cont'd),
lexicons (cont'd),
templates,
lexical rules,
configurations,
metarulemacro
February 2 (week 3)
Coordination and Functional Uncertainty
- Slides
- Assignment
- Readings
- Ron Kaplan and John T. Maxwell III. 1995.
Constituent Coordination in Lexical-Functional Grammar. (pdf) In M. Dalrymple,
R. M. Kaplan, J. T. Maxwell, and A. Zaenen (eds.), Formal Issues in
Lexical-Functional Grammar, Stanford, CA: CSLI Publications. (Originally
appeared in Proceedings of COLING-88, vol. 1 (Budapest, 1988),
303--305.)
- Functional Uncertainty: Ronald M. Kaplan and Annie Zaenen. 1989. Long-distance dependencies,
constituent structure, and functional uncertainty. (ps) In Mark Baltin and
Anthony Kroch (editors), Alternative Conceptions of Phrase Structure,
pp. 17-42. University of Chicago Press. Reprinted in Dalrymple et
al. (editors), Formal Issues in Lexical-Functional Grammar. CSLI, 1995.
February 9 (week 4)
Ambiguity and Robustness:
OT marks, Fragments, Performance Settings
No class on February 16 - Presidents' Day
February 23 (week 5)
Data-driven Methods in Grammar Development:
Using Shallow Markup, Parsebanking, C-structure Pruning, Stochastic Disambiguation, Testing, Evaluation
- Slides
- Assignment
- Readings:
- Mark Johnson and Stefan Riezler. 2001. Statistical Models of Language Learning and Use.
Cognitive Science 26:3.
- Stefan Riezler, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell, and Mark Johnson. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques.
Proceedings of the Workshop on Combining Shallow and Deep Processing for
NLP. ESSLLI.
- Richard Crouch, Ronald M. Kaplan, Tracy H. King, and Stefan Riezler. 2002. A Comparison of Evaluation Metrics for a Broad Coverage Stochastic Parser. Proceedings of the Workshop on "Parseval and Beyond" at the 3rd International Conference on Language Resources and Evaluation (LREC'02).
- Generation: Ronald M. Kaplan and Juergen Wedekind. 2000.
- Optional: Stefan Riezler and Alexander Vasserman. 2004. Incremental Feature Selection and l1 Regularization for Relaxed Maximum-Entropy Modeling. Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP'04).
- Optional: Ron Kaplan, Stefan Riezler, Tracy King, John Maxwell, Alexander Vasserman, and Richard Crouch. 2004.
Speed and Accuracy in Shallow and Deep Stochastic Parsing.
Proceedings of the Human Language Technology Conference and the 4th Annual Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL'04).
March 2 (week 6)
Machine Translation - Generation, Transfer/Rewrite System, Morphology
March 9 (week 7)
Search as an Example of a Real-world Application Where a Grammar Is Used
- Slides
- Assignment
- Readings:
- D. Crouch and T.H. King. 2006. Semantics via F-Structure Rewriting. Proceedings of LFG06, CSLI On-line Publications, pp. 145-165.
- D. G. Bobrow, B. Cheslow, C. Condoravdi, L. Karttunen, T.H. King, R. Nairn, V. de Paiva, C. Price, and A. Zaenen. 2007. PARC's Bridge and Question Answering System. Proceedings of the Grammar Engineering Across Frameworks (GEAF07) Workshop, pp. 46-66, CSLI Publications.
Grading
Grades will be based on the seven weekly assignments, two short papers, and class participation. There is no final exam.
If you are taking the course for 2 credits instead of 4, your grade will be determined based on the seven weekly assignments.
Class materials
Assigned readings will be available directly from this page.
Two recommended books of interest are: