This course covers the architecture of modern data storage and processing systems, including relational databases, cluster computing systems, streaming and machine learning systems. Topics include database system architecture, storage, query optimization, transaction management, fault recovery, and parallel processing, with a focus on the key design ideas shared across many types of data-intensive systems.
Question: What proof points does the paper give that Catalyst achieves its goal as an extensible optimizer? Can you think of any limitations to Catalyst's approach for supporting external extensions?
Students should ideally have taken CS 145 and CS 161, or their equivalent courses.
In particular, we expect students to be familiar with SQL syntax.
You can take a basic SQL tutorial for an overview of SQL if needed.
Assignments and Exams
We will have three programming assignments, a midterm and a final. The programming assignments are designed to be runnable on your personal machine and should be submitted through GradeScope.
Exams are open-notes and "open-laptop" (you can bring any material you want on paper or on your laptop), except that network access is not allowed during exams. Exams will cover material in the lectures, readings and assignments.
Readings
We have occasional readings for the lectures.
We expect students to complete these and think about the respective questions on their own (you do not need to turn in answers).
Reading material can appear on the exams.
Optional Textbook
Database Systems: The Complete Book (2nd Edition), by Garcia-Molina, Ullman and Widom, covers a lot of the technical material in the course and may be helpful as a study guide. We focus on chapters 13-20. We will also cover the material in lectures, but this book is a good source of additional information.
Grading
Assignments: 15% each (total: 45%)
Midterm: 25%
Final: 30%
Late Policy
Students each have up to 3 late days that they may use during the quarter.
Assignments submitted after these late days have been used up will incur a penalty of 10% per additional day late.
SCPD Lecture Recording Notice
Video cameras located in the back of the room will capture the instructor presentations in this course. For your convenience, you can access these recordings by logging into the course Canvas site. These recordings might be reused in other Stanford courses, viewed by other Stanford students, faculty, or staff, or used for other education and research purposes. Note that while the cameras are positioned with the intention of recording only the instructor, occasionally a part of your image or voice might be incidentally captured. If you have questions, please contact a member of the teaching team.
Feedback
Please post public questions about the class on Piazza.
For private questions to the staff, please open a private post on Piazza. You can also email Professor Zaharia at matei@cs.stanford.edu.