Announcements

5/13/09 Answers to all of the quizzes are posted online here, and will be updated as needed.
4/29/09 We used the following checklist as a guide in grading the submissions for Programming Assignment 1: PA1 grading basis.
You need a SUNet ID to access this link.
4/13/09 If you are submitting your quiz solutions after class, please use Quiz Submissions.
You need a SUNet ID to access this link. From now on, we will accept submissions mailed to the staff list only if you do not have a SUNet ID.
3/30/09 The website has been updated for the Spring 2009 edition of the course. Links to the previous year's notes are still available but are shown with a strikethrough; the strikethrough on a link will be removed once that link has been updated.


Course Description

This course is designed to introduce students to the fundamental concepts and ideas in natural language processing (NLP), and to get them up to speed with current research in the area. It develops an in-depth understanding of both the algorithms available for the processing of linguistic information and the underlying computational properties of natural languages. Word-level, syntactic, and semantic processing are considered from both a linguistic and an algorithmic perspective. The focus is on modern quantitative techniques in NLP: using large corpora and statistical models for acquisition, disambiguation, and parsing. The course also examines and constructs representative systems.

Prerequisites

  • Adequate experience with programming and formal structures (e.g., CS106B/X and CS103B/X).
  • Programming projects will be written in Java 1.5, so knowledge of Java (or a willingness to learn on your own) is required.
  • Knowledge of standard concepts in artificial intelligence and/or computational linguistics (e.g., CS121/221 or Ling 180).
  • Basic familiarity with logic, vector spaces, and probability.

Intended Audience

Graduate students and advanced undergraduates specializing in computer science, linguistics, or symbolic systems.

Textbook and Readings

The required text is:

  • Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Second Edition. Prentice Hall.

It's at the bookstore (and other purveyors of fine books). Of course, I'm also fond of:

  • Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press.
    Buy it at the Stanford Bookstore or Amazon ($64 new).
    You can read the text online on a Stanford network computer! It's referred to as M&S in the syllabus. While a bit older, it also has good and often distinct coverage of many topics. Please see http://nlp.stanford.edu/fsnlp/ for supplementary information about the text, including errata, and pointers to online resources.

Other useful reference texts for NLP are:

  • James Allen. 1995. Natural Language Understanding. Benjamin/Cummings, 2ed.
  • Gerald Gazdar and Chris Mellish. 1989. Natural Language Processing in X. Addison-Wesley. [Where X = Prolog, Lisp, or, I think, Snobol.]
  • Frederick Jelinek. 1998. Statistical Methods for Speech Recognition. MIT Press.

Other papers with relevant material will occasionally be posted or distributed for appropriate class lectures.

Copies of in-class handouts, such as readings and programming assignments, will be posted on the syllabus. Hard copies will also be available outside Gates 158 (in front of Prof. Manning's office) while supplies last.

Assignments and Grading

There will be three substantial programming assignments, each exploring a core NLP task. They are a chance to see real, close to state-of-the-art tools and techniques in action, and they are where students learn much of the material of the class.

There will be a final programming project on a topic of your own choosing.

Finally, there will be simple in-class quizzes based on the day's lecture, which aim to check that you are paying attention to what you hear and read.

Course grades will be based 60% on programming assignments (20% each), 6% on the quizzes, and 34% on the final project.
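
As an illustration of how these weights combine, here is a minimal Java sketch. It is not the staff's grading code: the class name CourseGrade, the method overall, and the 0-100 score scale are assumptions made for the example, and late days and regrades are not modeled.

```java
public class CourseGrade {
    // Weights taken from the grading breakdown above.
    static final double ASSIGNMENT_WEIGHT = 0.20; // each of the three assignments
    static final double QUIZ_WEIGHT = 0.06;       // all quizzes combined
    static final double PROJECT_WEIGHT = 0.34;    // final project

    // Hypothetical helper: every score is assumed to be on a 0-100 scale.
    static double overall(double[] assignments, double quizzes, double project) {
        if (assignments.length != 3) {
            throw new IllegalArgumentException("expected exactly three assignment scores");
        }
        double total = 0.0;
        for (double score : assignments) {
            total += ASSIGNMENT_WEIGHT * score;
        }
        total += QUIZ_WEIGHT * quizzes + PROJECT_WEIGHT * project;
        return total;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        double[] assignments = {90.0, 80.0, 85.0};
        System.out.printf("%.2f%n", overall(assignments, 100.0, 88.0));
    }
}
```

The weights sum to 1.0 (3 x 0.20 + 0.06 + 0.34), so a student with full marks on every component would come out at 100 under this sketch.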

Be sure to read the policies on late days and collaboration.

Section

Sections will be held most weeks to go over background material, or to address issues related to the programming assignments. Sections are optional, but students are encouraged to attend for a better understanding of background material and the assignments.

Course Information


Lectures: MW 11:00-12:15
Location: Gates B03
Section: F 1:15-2:05
Location: Skilling Auditorium
Professor: Chris Manning

Electronic Communications

Web: http://cs224n.stanford.edu/

Facebook: CS224N Group
Post less time-critical questions, meet your classmates, find partners, etc. here.

Staff mailing list:
cs224n-spr0809-staff@lists.stanford.edu

Announcements mailing list:
cs224n-spr0809-students@lists.stanford.edu

Enrolled students are automatically subscribed.
Others wishing to receive announcements should
go to mailman.stanford.edu, and subscribe to
"cs224n-spr0809-guests".

Assignments

Assignment 1 (due 4/15/09)
Assignment 2 (due 4/29/09)
Assignment 3 (due 5/13/09)
Final project

Collaboration Policy
Late Day Policy
Regrading Policy

Links

The Stanford NLP Group
Linguistic Corpora at Stanford
Statistical NLP links
Probabilistic parser links
Java 1.5 Overview
Java 1.5 New Features