TnT - Thorsten Brants's part-of-speech tagger

TnT is a part-of-speech tagger (POS tagger) that can be used to prepare corpora for search tools that expect POS-tagged input (e.g. Gsearch, tgrep). It comes pretrained to tag English and German newspaper text, but it can be trained on any other tagged corpus. The Unix and Windows versions can be found on AFS at:

/afs/ir/data/linguistic-data/lib/tnt
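
Since the tagged text usually goes on to a search tool, it can help to load the tagger's output into a simple structure first. The following is a minimal Python sketch, not part of TnT itself. It assumes a column layout that this page does not spell out: one token per line followed by whitespace and its tag, comment lines starting with `%%`, and a blank line between sentences. The function name `read_tagged_lines` and the sample data are hypothetical. Check the layout against the documentation in the TnT directory before relying on it.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def read_tagged_lines(lines: Iterable[str]) -> List[Sentence]:
    """Group (token, tag) pairs into sentences.

    Assumed layout (an assumption, not confirmed by this page):
    'token<whitespace>tag' on each line, '%%' starts a comment line,
    and a blank line ends a sentence.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("%%"):
            continue  # skip comment lines
        if not line:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # token without a tag: ignore it in this sketch
        current.append((parts[0], parts[1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample in the assumed layout.
sample = (
    "%% hypothetical tagger output\n"
    "The\tDT\n"
    "tagger\tNN\n"
    "works\tVBZ\n"
    ".\t.\n"
    "\n"
    "It\tPRP\n"
    "runs\tVBZ\n"
)
sentences = read_tagged_lines(sample.splitlines())
```

Each sentence comes back as a list of (token, tag) pairs, which is easy to convert into whatever input format a given search tool needs.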

Tip - Jeanette Pettibone has provided us with her presentation on part-of-speech taggers.