We have just posted a schedule of final project presentations with dates, times, and room numbers. You're welcome to stop by any presentations you'd like, as long as they're not for the same data structure that you're covering.
We're looking forward to seeing your presentations! Feel free to email us if you have any questions in the meantime.
We have just posted the requirements for the final project to the course website. In particular, please note that your project writeup will be due 24 hours before you give your presentation.
We have some updates to the midterm logistics. I've sent out an email containing this information, but wanted to reprint it here in case anyone missed it.
The midterm will be held on Wednesday, May 21 from 7PM – 10PM. Although the initial plan was for the exam to be closed-book and closed-note, we have revised this policy. You may now have one double-sided sheet of notes with you on 8.5" × 11" paper.
The midterm location has changed. If your last name is A – S, you should go to Meyer 124 in the Meyer Library. If your last name is T – Z, you should go to Meyer 147.
We have released a practice midterm exam that you can use to prepare for the upcoming exam. We'll release solutions in hardcopy on Monday.
Keith will be holding a review session on Monday, May 19 from 7:30PM – 9:30PM in Gates 104. Please feel free to stop by if you have any questions!
Keith will be holding two sets of office hours this week: Monday, 3:30PM – 4:45PM in Gates 178, and Tuesday, 1:30PM – 2:30PM in Gates 178.
We've finalized our office hours times! Keith will be holding office hours from 3:30PM – 5:30PM on Mondays in Gates 178, and the TAs will be holding office hours from 7:30PM – 9:30PM on Thursdays in room 160-318.
There's an error in Q2.iii of Problem Set 3 that makes the problem impossible to solve. I've posted an updated version of the problem set handout here on the course website. Sorry about that!
I will tentatively be holding my Monday office hours from 3:30PM – 5:30PM in Hewlett 201 (our lecture hall). Feel free to stop by with questions!
Welcome to CS166, a new course in the design, analysis, and implementation of data structures. We've got an exciting quarter ahead of us: the data structures we'll investigate are some of the most beautiful constructs I've ever come across, and I hope you're able to join us.
CS166 has two prerequisites: CS107 and CS161. From CS107, we'll assume that you're comfortable working from the command line; designing, testing, and debugging nontrivial programs; manipulating pointers and arrays; using bitwise operators; and reasoning about the memory hierarchy. From CS161, we'll assume you're comfortable designing and analyzing nontrivial algorithms; using O, o, Θ, Ω, and ω notation; solving recurrences; working through standard graph and sequence algorithms; and structuring proofs of correctness.
We'll update this site with more information as we get closer to the start of the quarter. In the meantime, feel free to email me at htiek@cs.stanford.edu if you have any questions about the class!
This syllabus is still under construction and is subject to change as we fine-tune the course. Stay tuned for more information and updates!
Monday / Wednesday

Range Minimum Queries
March 31
The range minimum query problem is the following: given an array, preprocess it so that you can efficiently determine the smallest value in a variety of subranges. This problem has applications all over CS and there are many strategies to solve it. What are they? How do they work? Readings: 
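One of the classic tradeoffs here is the sparse table, which answers queries in O(1) after O(n log n) preprocessing by precomputing minima of power-of-two-length blocks. A rough sketch of the idea (the function names are my own, not from the course materials):

```python
def build_sparse_table(arr):
    """Precompute range minima for every power-of-two length: O(n log n) time/space."""
    n = len(arr)
    table = [list(arr)]  # table[j][i] = min of arr[i : i + 2**j]
    j = 1
    while (1 << j) <= n:
        prev, half = table[j - 1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def range_min(table, lo, hi):
    """Minimum of arr[lo..hi] inclusive, in O(1): cover the range with two blocks."""
    k = (hi - lo + 1).bit_length() - 1
    return min(table[k][lo], table[k][hi - (1 << k) + 1])
```

The O(1) query works because any range can be covered by two (possibly overlapping) power-of-two blocks, and overlap doesn't hurt a min.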
Cartesian Trees and RMQ
April 2
There is a close connection between a special data structure called a Cartesian tree and the RMQ problem. By exploiting this similarity and using a technique called the Method of Four Russians, we can get both theoretical and practical speedups in RMQ computations. Readings:
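The linear-time, stack-based construction of a (min-)Cartesian tree is short enough to sketch here; this is just an illustration with my own naming, not the full treatment from lecture:

```python
def cartesian_tree_parents(arr):
    """Build a min-Cartesian tree in O(n) with a stack; return parent indices (-1 = root)."""
    parent = [-1] * len(arr)
    stack = []  # indices along the tree's right spine, values increasing
    for i, x in enumerate(arr):
        last = -1
        while stack and arr[stack[-1]] > x:
            last = stack.pop()
        if last != -1:
            parent[last] = i       # the popped chain becomes i's left subtree
        if stack:
            parent[i] = stack[-1]  # i hangs off the new top of the right spine
        stack.append(i)
    return parent
```

Each index is pushed and popped at most once, which is where the O(n) bound comes from.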
Handout 01: Mathematical Terms and Identities
Handout 02: Problem Set Policies
Balanced Search Trees I
April 7
Balanced search trees are among the most versatile and flexible data structures. They're used extensively in theory and in practice. What sorts of balanced trees exist? How would you design them? And what can you do with them? Readings:

Balanced Search Trees II
April 9
Balanced search trees support many operations besides insertions and lookups. Additionally, they can be augmented to support more advanced types of queries. These techniques, when combined, make balanced search trees powerful building blocks for later data structures. Readings:
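As a tiny taste of augmentation, here is an order-statistic ("select the k-th smallest") query on a BST whose nodes track subtree sizes. For brevity this sketch uses a plain unbalanced BST; a balanced tree would maintain the same size fields through its rotations. All names here are illustrative:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.size = key, None, None, 1

def size(node):
    return node.size if node else 0

def insert(root, key):
    """Standard BST insert, maintaining subtree sizes on the way back up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size = 1 + size(root.left) + size(root.right)
    return root

def select(root, k):
    """Return the k-th smallest key (0-indexed), guided by subtree sizes."""
    left = size(root.left)
    if k < left:
        return select(root.left, k)
    if k == left:
        return root.key
    return select(root.right, k - left - 1)
```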

Euler Tour Trees
April 14
Euler tour trees are a data structure for maintaining information about collections of trees in a graph. They're also used as building blocks for more powerful data structures like dynamic graphs and as subroutines in many algorithms. Understanding how Euler tour trees work will give you a much better feel for some of the techniques that go into advanced data structures. Readings: 
Intro to Amortization
April 16
In many cases we only care about the total time required to process a set of data. In those cases, we can design data structures that make some operations more expensive in order to lower the total cost of all aggregate operations. How do you analyze these structures? Readings:
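A classic warm-up example is a binary counter: a single increment can flip many bits, but n increments flip fewer than 2n bits in total, so each increment is amortized O(1). A minimal sketch (my own example, not a course handout):

```python
def increment(bits):
    """Increment a little-endian binary counter in place; return the number of bit flips."""
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0          # clear the run of trailing ones
        flips += 1
        i += 1
    if i == len(bits):
        bits.append(1)       # counter grew by one bit
    else:
        bits[i] = 1
    return flips + 1

bits = []
total = sum(increment(bits) for _ in range(1000))
# A single increment may flip up to log n bits, yet the total stays under 2n.
assert total < 2 * 1000
```

The accounting argument: every bit that gets cleared was set by some earlier increment, so charging each increment two "credits" pays for everything.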

Binomial Heaps
April 21
Binomial heaps are a simple and flexible priority queue structure that supports efficient melding of priority queues. The intuition behind binomial heaps is particularly elegant, and they'll serve as a building block toward the more complex Fibonacci heap data structure that we'll talk about on Wednesday. Readings:
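The basic building block is linking two binomial trees of equal rank into one tree of the next rank, in O(1) time, keeping the smaller key at the root. A sketch under my own naming:

```python
class BinTree:
    """One binomial tree: a node's children are binomial trees of ranks rank-1 ... 0."""
    def __init__(self, key):
        self.key = key
        self.rank = 0
        self.children = []

def link(a, b):
    """Combine two binomial trees of equal rank into one of rank + 1 (min at the root)."""
    assert a.rank == b.rank
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    a.rank += 1
    return a
```

Melding two heaps then works like binary addition over lists of these trees, with `link` playing the role of the carry.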

Fibonacci Heaps
April 23
Fibonacci heaps are a type of priority queue that efficiently supports decrease-key, an operation used as a subroutine in many graph algorithms (Dijkstra's algorithm, Prim's algorithm, the Stoer–Wagner min cut algorithm, etc.). They're formed by a clever transformation on a lazy binomial heap. Although Fibonacci heaps have a reputation for being ferociously complicated, they're a lot less scary than they might seem! Readings:

Splay Trees
April 28
A perfectly balanced BST might not be the best BST for a particular data set if the accesses aren't uniform. What happens if we try to optimize a BST by moving frequently-accessed elements toward the top? Readings:

Aho-Corasick String Matching
April 30
How can you search a huge text for a bunch of shorter strings? Using a combination of tries and finite automata, you can do so about as efficiently as is asymptotically possible. Readings:
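For a feel of the construction, here is a compact, unoptimized sketch of Aho-Corasick: build a trie of the patterns, compute failure links by BFS, then scan the text once. All identifiers are my own:

```python
from collections import deque

def aho_corasick(patterns):
    """Build the automaton; return (goto, fail, output) tables indexed by state."""
    goto, out = [{}], [set()]
    for p in patterns:                      # trie of all patterns
        s = 0
        for c in p:
            if c not in goto[s]:
                goto[s][c] = len(goto)
                goto.append({})
                out.append(set())
            s = goto[s][c]
        out[s].add(p)
    fail = [0] * len(goto)
    q = deque(goto[0].values())             # depth-1 states fail to the root
    while q:                                # BFS: fail[s] is ready before s's children
        s = q.popleft()
        for c, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(c, 0)
            out[t] |= out[fail[t]]          # inherit matches ending at the fail state
    return goto, fail, out

def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = aho_corasick(patterns)
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]
        s = goto[s].get(c, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

The total work is linear in the text plus the patterns plus the number of matches, which is the point of the automaton.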

Suffix Trees
May 5
Suffix trees are an incredibly versatile data structure for solving problems on strings. What is a suffix tree? Where does it come from? And why is it so useful? Readings: 
Suffix Arrays
May 7
Suffix arrays are a space-efficient alternative to suffix trees. They're closely related to suffix trees: given a suffix array and an auxiliary data structure called an LCP array, we can construct a suffix tree in linear time. This lecture explores suffix arrays, LCP arrays, and how to construct them. Readings:
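As a preview, here is a deliberately naive suffix-array construction (fast construction algorithms are what the lecture is about) paired with Kasai's linear-time LCP computation. All names are illustrative:

```python
def suffix_array(s):
    """Naive O(n^2 log n) construction: sort suffix start indices by the suffixes."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def lcp_array(s, sa):
    """Kasai's algorithm: lcp[i] = LCP of suffixes sa[i] and sa[i-1], in O(n)."""
    n = len(s)
    rank = [0] * n
    for i, p in enumerate(sa):
        rank[p] = i
    lcp = [0] * n
    h = 0
    for i in range(n):                  # walk suffixes in text order
        if rank[i] > 0:
            j = sa[rank[i] - 1]         # previous suffix in sorted order
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h:
                h -= 1                  # dropping one character loses at most one match
        else:
            h = 0
    return lcp
```

Kasai's key observation is that `h` only decreases by one per step, so the total comparison work telescopes to O(n).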

Count-Min and Count Sketches
May 12
How can Google keep track of frequent search queries without storing all the queries it gets in memory? How can you estimate frequently occurring tweets without storing every tweet in RAM? As long as you're willing to trade off accuracy for space, you can get excellent approximations. Readings:
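The count-min sketch itself fits in a few lines: d hash rows of width w, increment one cell per row on each arrival, and report the minimum over the rows. Estimates never undercount, and they overcount by a bounded additive error with high probability. A sketch with hypothetical parameter choices:

```python
import random

class CountMinSketch:
    """Approximate frequency counts in O(width * depth) space, never underestimating."""
    def __init__(self, width=272, depth=5, seed=166):
        rng = random.Random(seed)
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]
        # Per-row salts stand in for a proper independent hash family.
        self.salts = [rng.getrandbits(64) for _ in range(depth)]

    def _cells(self, item):
        return ((i, hash((salt, item)) % self.width)
                for i, salt in enumerate(self.salts))

    def add(self, item, count=1):
        for i, j in self._cells(item):
            self.rows[i][j] += count

    def estimate(self, item):
        # Collisions only inflate cells, so the minimum row is the best guess.
        return min(self.rows[i][j] for i, j in self._cells(item))
```

Shrinking `width` and `depth` trades accuracy for space, which is exactly the tradeoff the lecture quantifies.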

Cuckoo Hashing
May 14
Most hash tables give expected O(1) lookups. Can we make hash tables with no collisions at all, and if so, can we do it efficiently? Amazingly, the answer is yes. There are many schemes for achieving this, one of which, Cuckoo hashing, is surprisingly simple to implement, though the mathematical analysis is decidedly nontrivial. Readings:
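Here is a bare-bones sketch of the scheme: two tables and two hash functions, with each insertion possibly evicting a chain of prior keys, and a rebuild when a cycle is suspected. This is illustrative only (my own names, and Python's built-in `hash` standing in for a real hash family):

```python
class CuckooTable:
    """Two tables + two hash functions: every lookup probes at most two cells."""
    def __init__(self, size=16):
        self.size = size
        self.tables = [[None] * size, [None] * size]

    def _slot(self, which, key):
        return hash((which, key)) % self.size

    def lookup(self, key):
        """Worst-case O(1): the key can only live in one of two places."""
        return any(self.tables[w][self._slot(w, key)] == key for w in (0, 1))

    def insert(self, key):
        if self.lookup(key):
            return
        which = 0
        for _ in range(32):                     # bounded eviction chain
            slot = self._slot(which, key)
            key, self.tables[which][slot] = self.tables[which][slot], key
            if key is None:
                return
            which ^= 1                          # evicted key tries the other table
        self._rehash()                          # suspected cycle: grow and rebuild
        self.insert(key)

    def _rehash(self):
        old = [k for t in self.tables for k in t if k is not None]
        self.size *= 2
        self.tables = [[None] * self.size, [None] * self.size]
        for k in old:
            self.insert(k)
```

The hard part, as the blurb says, isn't the code; it's proving that eviction chains stay short and rebuilds stay rare.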

van Emde Boas Trees
May 19
Binary search trees assume that the keys being stored are totally ordered, but otherwise make no assumptions about them. If you know for a fact that the keys will be integers from a fixed range, you can harness properties of numbers to exponentially speed up BST operations. Doing so requires some tricks reminiscent of those we used in RMQ and will lead to a surprisingly elegant and efficient data structure. Readings:

x-Fast and y-Fast Tries
May 21
van Emde Boas trees have excellent runtimes for each operation, yet use a huge amount of space. The y-fast trie matches the runtime bounds of van Emde Boas trees, but uses only linear space. More importantly, though, y-fast tries bring together all of the topics from the course so far: blocking, balanced BSTs, amortization, tries, and randomization. My hope is that they're a fitting pre-midterm send-off! Readings:
Memorial Day
May 26
No class. 
Disjoint-Set Forests
May 28
When we covered Euler tour trees earlier in the quarter, we saw how to solve the fully dynamic connectivity problem in forests with cost O(log n) per operation. In this lecture, we'll explore a data structure called the disjoint-set forest, which solves the incremental connectivity problem (in which edges are successively added to a graph) on general graphs. The resulting data structure is blindingly fast and has a very clever analysis. Readings:
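The core structure fits in a few lines: union by rank plus path compression (here, the path-halving variant) gives near-constant amortized time per operation. A minimal sketch with illustrative names:

```python
class DisjointSetForest:
    """Incremental connectivity: union by rank + path halving."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        """Walk to the root, making every other node point to its grandparent."""
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        """Merge the sets containing x and y; return False if already connected."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx            # hang the shorter tree under the taller one
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

The clever part referenced in the blurb is the analysis: these two heuristics together bound the amortized cost by the inverse Ackermann function.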

Dynamic Graphs
June 2
In real-world networks, links come online and offline all the time. Euler-tour trees can maintain connectivity in such networks as long as those networks are forests. In this lecture, we'll consider two other cases: decremental connectivity in trees, and fully dynamic connectivity in undirected graphs. Readings:
Where to Go from Here
June 4
There are so many types of data structures out there: geometric data structures, persistent data structures, parallel data structures, etc. What else is out there? How are the skills from CS166 useful in a broader context? Readings: