Formats, compatible tools, and where to find them
• Currently we have five different search tools for syntactically annotated corpora:
–CorpusSearch (needs a grammar fragment)
–tgrep & tgrep2 (need specific format)
–TIGERSearch (needs specific format)
–Linguist’s Search Engine (needs neither a grammar fragment nor a specific format, but is slow)
–Roger’s tgrep – coming soon
• Syntactically annotated corpora each come in their own annotation format (PENN, Negra, TigerXML, etc.)
–The original corpus files are stored with each corpus on AFS.
–If you want to use tgrep, tgrep2, or TIGERSearch, the corpora have to be converted into the right format.
–Here is which corpus is available for which search tool (if the corpus you need is not in the right format, ask the corpus TA to convert it for you).