Research in Machine Learning of Natural Language in Dan Jurafsky's Lab

One of the most exciting open research areas in computational linguistics is the automatic induction of natural language structure. Together with Dan Gildea and Pat Schone, I have been working on this area in a number of projects.

Induction of Phonological Rules: Dan Gildea and I are interested in the induction of phonological rules, and particularly in the way that a learning bias can make phonological rules easier to learn in models of two-level phonology. See Gildea and Jurafsky (1996).

Machine Learning of Morphological Structure: Patrick Schone's dissertation focuses on automatically building dictionaries based only on unlabeled input corpora. Two recent papers, for example, address the automatic induction of morphology using a semantic model based on Latent Semantic Analysis. See Schone and Jurafsky (2000) (PDF) or Schone and Jurafsky (2001) (PDF). Here is Pat's dissertation.