CS 224N -- Ling 237: Natural Language Processing
Spring 2003 FAQs
/afs/ir/class/cs224n/src/morph-1.5
contains the Penn XTAG project
morphological analyzer. It does a fairly good job of determining and
returning the base form (lemma) and inflection of English words.
It is written in C.
/afs/ir/class/cs224n/src/
contains the Brill and icopost taggers that you can use.
/afs/ir/class/cs224n/src/icopost-0.9.1/data/english-wsj-train-0-18.lex
This file contains a list of all tokens in a particular corpus, and
the list of part-of-speech tags that were observed for each. Counts are
also kept for the total number of occurrences and the number of occurrences
as each tag.
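
If you want to read the lexicon from your own code, a small parser along the following lines may be enough. This is only a sketch: the exact line layout is not described here, so the code ASSUMES each line looks like `token total tag1 count1 tag2 count2 ...`, separated by whitespace. Check a few lines of the real file before relying on it. The function names `parse_lexicon_line` and `most_frequent_tag` and the sample line are made up for illustration.

```python
from typing import Dict, Tuple


def parse_lexicon_line(line: str) -> Tuple[str, int, Dict[str, int]]:
    """Parse one lexicon entry.

    ASSUMED layout (not confirmed against the real file):
        token total tag1 count1 tag2 count2 ...
    """
    fields = line.split()
    # Need a token, a total, and at least one (tag, count) pair.
    if len(fields) < 4 or len(fields) % 2 != 0:
        raise ValueError(f"unexpected lexicon line: {line!r}")
    token = fields[0]
    total = int(fields[1])
    # Remaining fields alternate between a tag and its count.
    tag_counts = {fields[i]: int(fields[i + 1])
                  for i in range(2, len(fields), 2)}
    return token, total, tag_counts


def most_frequent_tag(tag_counts: Dict[str, int]) -> str:
    """Return the tag with the highest count for a token."""
    return max(tag_counts, key=tag_counts.get)


if __name__ == "__main__":
    # Invented sample line, not taken from the actual lexicon.
    token, total, tags = parse_lexicon_line("run 10 VB 7 NN 3")
    print(token, total, tags, most_frequent_tag(tags))
```

Picking the most frequent tag for each token in this way gives a simple baseline tagger to compare the Brill and icopost taggers against.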
Det->the
by the rule Det->an. The correction
has been made on the homework version available on the
handouts page.