•Currently we have five different search tools for syntactically annotated corpora
–CorpusSearch (needs a grammar fragment)
–tgrep & tgrep2 (need specific format)
–TIGERSearch (needs specific format)
–Linguist’s Search Engine (doesn’t need either, but is slow)
–Roger’s tgrep (coming soon)
•Syntactically annotated corpora come in their own annotation formats (PENN, Negra, TigerXML, etc.)
–The original corpus files are stored with each corpus on AFS.
–If you want to use tgrep, tgrep2, or TIGERSearch, the corpora have to be converted into the right format.
–Here is where you can find which corpus is available for which search tool (if the corpus you need is not in the right format, ask the corpus TA to convert it for you).
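
To make the idea of searching a syntactically annotated corpus concrete, here is a minimal Python sketch. It is not how any of the tools above work internally. It assumes a single tree in Penn-style bracketing where every bracket carries a label, and the sample sentence and the helper names (`parse_tree`, `find_label`, `words`) are hypothetical.

```python
# Hypothetical illustration only; none of the listed search tools is used here.

def parse_tree(text):
    """Parse one Penn-style bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]  # assumes every bracket is followed by a label
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1  # skip the closing ")"
        return (label, children)

    return read_node()


def find_label(node, label):
    """Yield every subtree whose label equals `label`, in pre-order."""
    if isinstance(node, str):
        return
    if node[0] == label:
        yield node
    for child in node[1]:
        yield from find_label(child, label)


def words(node):
    """Collect the leaf words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1] for w in words(child)]


# Made-up sample sentence, not taken from any corpus.
sample = "(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))"
tree = parse_tree(sample)
noun_phrases = [" ".join(words(np)) for np in find_label(tree, "NP")]
print(noun_phrases)
```

The dedicated tools express such queries in their own pattern languages and work on converted or indexed corpora, which is why the format requirements above matter.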