CS276A / SYMBSYS 239I / LING 239I
This page contains information on possible project ideas and pointers to various freely available resources.
The course project is 40% of the final grade. The project should be done in teams of two. The choice of project is mostly open-ended. You may explore any topic covered in the class, although you should pick something of reasonable scope that can comfortably be accomplished in one quarter. Don't reinvent the wheel; use what already exists to build something new and innovative. You will be required to submit a final writeup describing your project, results, and conclusions.
We have installed a variety of libraries and datasets locally for use in class projects. An explanation of the libraries you may want to use is given here. Some of these libraries are required for accessing locally installed datasets, such as a crawl of Stanford.EDU. The raw datasets themselves are located in /afs/ir/class/cs276a/data1/. You can find the documentation for the APIs for the libraries here.
Libraries for both Solaris and Linux have been installed. Possible machines you can use are the tree and elaine machines (Solaris) or the firebird and raptor machines (Linux). For instance, ssh firebird will log you in to one of the firebird machines.
A sample application, using several of these libraries, is described here. It is located at /afs/ir/class/cs276a/software/examples/cs276a/examples/titleindexer/.
The following are just some of the many freely available resources you might find helpful in doing your class projects. Some of these have already been installed and have local documentation. If you know of other resources that other students could benefit from, send mail to the TA so that they can be added to the following list.