Course Description

Information retrieval is the process through which a computer system can respond to a user's query for text-based information on a specific topic. IR was one of the first and remains one of the most important problems in the domain of natural language processing (NLP). Web search is the application of information retrieval techniques to the largest corpus of text anywhere -- the web -- and it is the area in which most people interact with IR systems most frequently.

In this course, we will cover basic and advanced techniques for building text-based information systems, including the following topics:

  • Efficient text indexing
  • Boolean and vector-space retrieval models
  • Evaluation and interface issues
  • IR techniques for the web, including crawling, link-based algorithms, and metadata usage
  • Document clustering and classification
  • Traditional and machine learning-based ranking approaches

Class time & location

Spring quarter 2017
Lecture times: Tues/Thurs, 4:30-5:50pm, April 4 - June 6
Location: NVIDIA Auditorium

Grading & course policies

See the course policies page for details on grading, late days, and other policies.

Required textbook

Introduction to Information Retrieval, by C. Manning, P. Raghavan, and H. Schütze (Cambridge University Press, 2008).

This book is available from Amazon, the Stanford bookstore, or your favorite book purveyor. You can also download and print chapters for free at the book website. (We’d appreciate any reports of typos or of higher-level problems for the third printing.)

This book will be referred to as IIR in the reading assignments listed on the course schedule page.

Other useful references

  • (MG) Managing Gigabytes, by I. Witten, A. Moffat, and T. Bell.
  • (IRAH) Information Retrieval: Algorithms and Heuristics, by D. Grossman and O. Frieder.
  • (MIR) Modern Information Retrieval, by R. Baeza-Yates and B. Ribeiro-Neto.
  • (FSNLP) Foundations of Statistical Natural Language Processing, by C. Manning and H. Schütze.
  • (SE) Search Engines: Information Retrieval in Practice, by B. Croft, D. Metzler, and T. Strohman.
  • (IRIE) Information Retrieval: Implementing and Evaluating Search Engines, by S. Büttcher, C. Clarke, and G. Cormack.

Prerequisites

FAQ

Can I take this course on a credit/no credit basis?
Yes. Credit will be given to those who would have otherwise earned a C- or above.
Can I audit or sit in?
In general, we are happy to have guests sit in if you are a member of the Stanford community (registered student, staff, and/or faculty). As a courtesy, please email us or talk to the instructor after the first class you attend.
I have a question about the class. What is the best way to reach the course staff?
In general, we ask students to use the Piazza forum for our class so that other students may benefit from your questions and our answers. If you have a personal matter that you believe is not appropriate to share on Piazza (even in a private post), you may email the course staff at cs276-spr1617-staff@lists.stanford.edu. We may NOT be able to reply to emails sent to individual instructors or TAs regarding the class.
As an SCPD student, how do I take the final exam?
If you are local, you're encouraged to come to Stanford for one of the on-campus exams. If you are not local or can't make it to an on-campus exam, you need to line up an exam monitor (usually your manager or a co-worker at your company) and submit the form specifying this person to SCPD in advance. You won't get an exam if you don't have an exam monitor on file. You will then be able to take the final exam during any 3-hour period that works for you and your exam monitor between approximately Monday noon Pacific time (from whenever the monitor gets the exam from SCPD) and Wednesday 6:30pm Pacific time. You need to make sure we get the exam back promptly, by Wednesday 7:30pm Pacific time (the monitor should scan it and email it directly to us). We need to grade exams that evening in order to turn grades in on time.
Will there be virtual office hours for SCPD students?
We will be sure to join a Google hangout for at least some office hours. We'll share more detail about office hours by the second week of class.