Setting up your account

Corpora@Stanford

Basic corpora access instruction

Below you will find step-by-step instructions for your journey to the land of corpora. This page assumes that you have read the introduction page or are already familiar with what AFS is.

You can access the AFS corpora from your own Unix machine or from your personal Windows PC or Mac. You can also use the Corpus PC, which is already set up to access the AFS corpora. If you still have questions after reading this page, or if you have problems following the instructions, contact the Corpus TA.

Note: You do not have write permission on AFS, so any attempt to write to it will fail.

Unix

By Susanne Riehemann (ed. by Florian Jaeger): The AFS corpora are stored under:

    /afs/ir/data/linguistic-data/

You can access them from any machine on campus that runs AFS (all Sweet Hall machines, and many others, such as csli machines, nlp, CS, ...). To access that directory you need Kerberos authentication, so on non-Sweet Hall machines you normally must run:

    /usr/pubsw/bin/kinit -t

The command also accepts your leland username as an argument; if it is the same as your login name, you can omit it. You can't omit the -t. While not documented on any kinit man page I've looked at, it turns out to be vital :-)
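To see whether authentication worked, it can help to test whether the corpora directory is actually listable. The following is a minimal sketch, not part of any Stanford tooling: the helper names corpora_accessible and report_access are hypothetical, and the only assumption taken from this page is the corpora path.

```shell
#!/bin/sh
# Default location of the corpora, as given on this page.
CORPORA_ROOT="/afs/ir/data/linguistic-data"

# corpora_accessible [DIR]
# Succeeds (exit status 0) if DIR exists and its contents can be listed.
# Hypothetical helper for illustration only.
corpora_accessible() {
    dir="${1:-$CORPORA_ROOT}"
    [ -d "$dir" ] && ls "$dir" >/dev/null 2>&1
}

# report_access [DIR]
# Prints a one-line status message for DIR.
report_access() {
    dir="${1:-$CORPORA_ROOT}"
    if corpora_accessible "$dir"; then
        echo "ok: $dir is readable"
    else
        echo "no access: $dir (authenticate with kinit, then retry)"
    fi
}
```

Calling report_access with no argument checks the default corpora path; passing a directory checks that directory instead.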

Once there, feel free to poke around. But if you want to use things "for real" it is a good idea to check the top-level "readme" file to see whether the corpus that you're interested in is actually complete.
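Because the exact name and capitalization of the top-level readme file is not given here, one way to locate it is to search the directory listing case-insensitively. This is a rough sketch; list_readmes is a hypothetical helper name, and the default path is the one from this page.

```shell
#!/bin/sh
# list_readmes [DIR]
# Prints the names of top-level entries in DIR whose names contain
# "readme" in any capitalization, one per line.
# Returns non-zero if no such entry exists.
list_readmes() {
    dir="${1:-/afs/ir/data/linguistic-data}"
    ls "$dir" | grep -i 'readme'
}
```

Once you know the file name, you can page through it with a pager such as more or less.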

Another helpful thing to do (if you're using tcsh) is to put

    set symlinks=ignore

into your .cshrc file. The effect is that after you have followed a symbolic link and "go back up" with "cd ..", you end up back in the directory where the symbolic link was located. So you will not get lost in our 25 mounted partitions :)
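The difference this setting makes can be tried out safely in a scratch directory. The sketch below uses a POSIX shell rather than tcsh: cd -L (logical mode) behaves roughly like the tcsh setting above, while cd -P (physical mode) shows the "getting lost" behaviour. The function name demo_symlink_cd and the directory names top and partition7 are made up for illustration.

```shell
#!/bin/sh
# demo_symlink_cd SCRATCH
# Builds a small tree with a symbolic link inside SCRATCH (which must be
# an absolute path) and prints where "cd .." ends up in each mode.
demo_symlink_cd() {
    scratch="$1"
    mkdir -p "$scratch/partition7/corpus" "$scratch/top"
    ln -s "$scratch/partition7/corpus" "$scratch/top/corpus"

    # Logical mode (-L): ".." is resolved against the path as typed,
    # so we return to the directory that holds the link.
    ( cd -L "$scratch/top/corpus" && cd -L .. && echo "logical:  $(basename "$PWD")" )

    # Physical mode (-P): ".." is resolved against the real location,
    # so we land in the parent of the link target instead.
    ( cd -P "$scratch/top/corpus" && cd -P .. && echo "physical: $(basename "$PWD")" )
}
```

Run as demo_symlink_cd "$(mktemp -d)", this prints "logical:  top" followed by "physical: partition7".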

You don't have to limit yourself to just text. Provided your machine has audio, you should be able to do:

    cd /afs/ir/data/linguistic-data/Santa-Barbara/sbc1_1/speech/
    xanim sbc0001.wav

and (after about two minutes' pause for file transmission) be listening to a conversation on horse veterinary school [not recommended for right before lunch].
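If xanim is not installed on your machine, a small wrapper can fall back to whatever player is available. This is only a sketch: first_available and play_corpus_audio are hypothetical helper names, and the fallback players listed (play, aplay, afplay) are assumptions about what might be installed, not something this page guarantees.

```shell
#!/bin/sh
# first_available CMD...
# Prints the first of the given command names that is found on PATH and
# returns 0; returns 1 if none of them is installed.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}

# play_corpus_audio FILE
# Plays FILE with xanim if present, otherwise with the first fallback
# player found. The fallback list is an assumption; adjust as needed.
play_corpus_audio() {
    player=$(first_available xanim play aplay afplay) || {
        echo "no audio player found" >&2
        return 1
    }
    "$player" "$1"
}
```

For example, after changing into the speech directory shown above, play_corpus_audio sbc0001.wav would hand the file to the first player found.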

Mac and PC

You can also access AFS via PC-leland (Windows 2000 and XP only) or MacLeland. This Kerberos software can be downloaded from the ITSS Essential Stanford Software pages. The following explanations assume that you have downloaded and installed PC-leland or MacLeland.

If you are not already logged in, select the Leland menu at the top right (a two-headed arrow above a colonnade), and either log in or do a secondary login. Enter your SUNet ID and password. From the same Leland menu (right-click the Leland icon in your taskbar or wherever you find it), choose

    Mount and then
    Full path...

Type (or copy and paste) the AFS path into the box:

    /afs/ir/data/linguistic-data

This will open the AFS space, and you will be able to navigate it as if it were a hard drive on your computer. You can use any software on your computer to access the files on AFS.