The focal point of the course is the RedBase project. RedBase
stands for Relational Database, and also alludes to Stanford's
color. (We know, Stanford's color is really Cardinal, but
CardBase doesn't have as much of a ring to it.) RedBase is a
complete single-user relational database management system. It
involves a significant amount of coding, and the project must be
completed by each individual student -- teams are not permitted. The
project is highly structured, but there is enough slack in the
specification so that creativity is both allowed and required. The
basic project is divided into four parts:
1. The Record Management (RM) Component: In this part you
will implement a set of functions for managing unordered files of
database records. This component will rely on a Paged File (PF)
component that we will provide. The Paged File component performs
low-level file I/O at the granularity of pages.
2. The Indexing (IX) Component: In this part you will
implement a facility for building indexes on records stored in
unordered files. Your indexing facility will be based on B+ trees.
The Indexing component will rely on the Paged File component.
3. The System Management (SM) Component: In this part you will
implement various database and system utilities, including data
definition commands and catalog management. The System Management
component will rely on the Record Management and Indexing components
from Parts 1 and 2. It also will use a command-line parser, which we
will provide.
4. The Query Language (QL) Component: In this part you will
implement RQL -- the RedBase Query Language. RQL consists of
user-level data manipulation commands, both queries and updates. The
Query Language component will rely on the three components from Parts
1-3, and it will use the command-line parser that we are providing.
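To give a feel for what "file I/O at the granularity of pages" means in
Part 1, here is a minimal sketch. It is a toy, not the Paged File
interface provided in the course: the class name ToyPagedFile, the
4096-byte page size, and the helper roundTripDemo are all assumptions
made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>

// Toy illustration only: NOT the PF interface provided in the course.
constexpr std::size_t kPageSize = 4096;  // assumed page size
using Page = std::array<char, kPageSize>;

class ToyPagedFile {
public:
    // Works on any seekable read/write stream (a file or an in-memory buffer).
    explicit ToyPagedFile(std::iostream& stream) : stream_(stream) {
        assert(stream.good());
    }

    // Writes one whole page at byte offset pageNum * kPageSize.
    bool writePage(std::size_t pageNum, const Page& page) {
        stream_.clear();  // reset any earlier eof/fail state
        stream_.seekp(static_cast<std::streamoff>(pageNum * kPageSize));
        stream_.write(page.data(), static_cast<std::streamsize>(kPageSize));
        stream_.flush();
        return static_cast<bool>(stream_);
    }

    // Reads one whole page; returns false if a full page is not available.
    bool readPage(std::size_t pageNum, Page& page) {
        stream_.clear();
        stream_.seekg(static_cast<std::streamoff>(pageNum * kPageSize));
        stream_.read(page.data(), static_cast<std::streamsize>(kPageSize));
        return stream_.gcount() == static_cast<std::streamsize>(kPageSize);
    }

private:
    std::iostream& stream_;
};

// Hypothetical demo: writes two pages to an in-memory stream, reads them back.
bool roundTripDemo() {
    std::stringstream backing(std::ios::in | std::ios::out | std::ios::binary);
    ToyPagedFile file(backing);
    Page a, b, out;
    a.fill('A');
    b.fill('B');
    if (!file.writePage(0, a) || !file.writePage(1, b)) return false;
    if (!file.readPage(1, out) || out.front() != 'B' || out.back() != 'B') return false;
    if (!file.readPage(0, out) || out.front() != 'A') return false;
    return !file.readPage(2, out);  // page 2 was never written
}
```

The point of the sketch is that callers never read or write individual
bytes of the file; every transfer is a whole page addressed by page
number. The demo writes its pages in order because an in-memory stream
cannot seek past its current end.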
In addition to the basic project, each student will design and
implement a significant extension to RedBase. We expect that students
will get ideas about extensions as the course progresses.
Possibilities include aspects of record management, long fields
(BLOBs), object management, text management, sorting, indexing, join
algorithms, clustering, statistics and query optimization, query
language extensions, OLAP, XML, concurrency control, recovery,
security and authorization, compression, networking, versioning,
external functions, stored procedures, views, integrity constraints,
triggers, user and application interfaces, web integration, etc.
(We're certainly open to additional suggestions.) Each student will
submit a proposal for their project extension. Students will get
feedback on their proposal, then will implement their extension as the
fifth and final part of the project. Complete projects will be
demonstrated to the instructors during finals week.
RedBase I/O Efficiency Contest
As the old saying goes, the three most important aspects of a database
management system are efficiency, efficiency, and efficiency. To
encourage you to take efficiency into consideration as you develop
your RedBase system, we will be conducting a RedBase Efficiency
Contest when the QL component is complete. While there are
several important efficiency measures in a DBMS, we will focus on I/O
performance. We will measure each student's RedBase system on a set
of benchmark queries and updates in the RQL language and will count
the number of I/Os -- the fewer the better, of course. All students
enter the contest automatically when they submit their QL component,
unless they prefer to be excluded. The prizes are:
First prize: 10% boost in overall project score and an
official Stanford InfoLab T-shirt.
Second prize: 5% boost in overall project score and an
official Stanford InfoLab T-shirt.
Third prize: An official Stanford InfoLab T-shirt.
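To make "counting the number of I/Os" concrete, the following sketch
models one way page reads can be counted. It is an illustrative model
only and not how the contest is actually instrumented: it assumes pages
are cached in a fixed-size buffer pool with least-recently-used
replacement, and the names CountingBufferPool and countReads are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <vector>

// Illustrative model only; not the course's actual I/O instrumentation.
class CountingBufferPool {
public:
    explicit CountingBufferPool(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Requests a page. A miss stands in for one page read from disk.
    void access(int pageNum) {
        auto it = where_.find(pageNum);
        if (it != where_.end()) {
            // Hit: mark the page as most recently used, no I/O counted.
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        ++reads_;
        if (lru_.size() == capacity_) {
            // Pool is full: evict the least recently used page.
            where_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(pageNum);
        where_[pageNum] = lru_.begin();
    }

    std::size_t reads() const { return reads_; }

private:
    std::size_t capacity_;
    std::size_t reads_ = 0;
    std::list<int> lru_;  // front = most recently used
    std::unordered_map<int, std::list<int>::iterator> where_;
};

// Counts modeled page reads for a sequence of page requests.
std::size_t countReads(const std::vector<int>& requests, std::size_t capacity) {
    CountingBufferPool pool(capacity);
    for (int pageNum : requests) pool.access(pageNum);
    return pool.reads();
}
```

Under this model, with room for two pages, the request sequence
0, 1, 0, 1, 0, 1 costs 2 reads, while 0, 1, 2, 0, 1, 2 costs 6, because
each request evicts the page that is needed next. The order in which a
query touches pages can matter as much as how many distinct pages it
touches.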
Late Policy
The late policy follows. There will be absolutely no exceptions to
this late policy, so please don't even ask! It's crucial that
students stay on schedule in this course -- RedBase is a very big
project.
The due dates for project parts 1-4 are on Sundays. Project
parts are due at 11:59 PM on the due date. For project parts submitted
after 11:59 PM on the due date, there is a 1% penalty applied to that
project part score for each hour late. For example, if you submit
your project part on Tuesday at 8:00 AM, it will be 32 hours late and
will incur a 32% penalty on that part.
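The arithmetic in this example can be sketched as follows. The helper
names are hypothetical, and two details are assumptions rather than
stated policy: only whole elapsed hours are counted (which matches the
32-hour example), and the penalty is capped at 100%.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helpers, not part of any provided RedBase code.

// Minutes elapsed from the Sunday 11:59 PM deadline to a later time in the
// following week. dayOffset: 1 = Monday, 2 = Tuesday, and so on.
long minutesAfterSundayDeadline(int dayOffset, int hour, int minute) {
    assert(dayOffset >= 1 && hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    // Sunday 11:59 PM is one minute before Monday 00:00.
    return 1 + (dayOffset - 1) * 24L * 60 + hour * 60L + minute;
}

// Penalty in percent of the part's score: 1% per hour late.
// Assumptions: only whole elapsed hours count, and the cap is 100%.
int latePenaltyPercent(long minutesLate) {
    assert(minutesLate >= 0);
    long wholeHours = minutesLate / 60;
    return static_cast<int>(std::min<long>(wholeHours, 100));
}
```

For the Tuesday 8:00 AM submission,
latePenaltyPercent(minutesAfterSundayDeadline(2, 8, 0)) evaluates to 32.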
Your project extension (Part 5) proposal is due on a Wednesday at
11:59 PM. Like project parts, proposals submitted late incur a 1%
penalty on the proposal score for each hour late.
We realize that occasionally dire circumstances arise. To cover
emergencies, each student is allocated 2 free late days and 12 free late hours for the
entire course with no penalty. Once the free days and hours have been used,
penalties begin to be applied as described above.
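As a rough illustration of how the free allowance interacts with the
hourly penalty, the sketch below tracks a per-student budget of
2 * 24 + 12 = 60 free hours. The struct LateBudget is hypothetical, and
it assumes that free time is consumed hour by hour in submission order
and that the penalty is capped at 100%; the policy above does not spell
out either detail.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical bookkeeping sketch; names are illustrative only.
struct LateBudget {
    // 2 free late days + 12 free late hours = 60 hours for the whole course.
    int freeHoursLeft = 2 * 24 + 12;

    // Charges hoursLate against the remaining free hours and returns the
    // percent penalty on the uncovered remainder (1% per hour, capped at 100).
    // Assumption: free time is consumed hour by hour, in submission order.
    int submit(int hoursLate) {
        assert(hoursLate >= 0);
        int covered = std::min(hoursLate, freeHoursLeft);
        freeHoursLeft -= covered;
        return std::min(hoursLate - covered, 100);
    }
};
```

Under these assumptions, a student who submits one part 32 hours late
pays no penalty and has 28 free hours left; a later part submitted 40
hours late would then incur a 12% penalty.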
The final demonstrations of your project will be scheduled in
half-hour slots on the Thursday and Friday of the week before finals
week. All students must complete their projects by demo time.
Computer Accounts
You will implement RedBase using the Linux machines in the FarmShare
cluster (the corn, cardinal, and myth machines, etc.).
The directory /usr/class/cs346 will contain files and subdirectories
for the class.
Students with access to their own workstations or Linux PCs are
welcome to try to use them, but you will need to copy all provided
software from the Stanford FarmShare machines. While we will do our best to
ensure that the code we provide is portable, we cannot guarantee
portability across all platforms. Likewise, while we may attempt
to help with platform-specific problems, our focus will be on the
Linux machines.
Your programs will be submitted electronically and they will be
tested on a corn machine. It will be your
responsibility to ensure that your programs compile and run correctly
on that platform before submitting them.
More on Programming
We will provide code for the Paged File (PF) component of RedBase and
for some commonly used routines in other components. We will also
provide a command-line parser that you will use for Parts 3 and 4 of
the project. Specifications for the code we provide, along with
specifications for each component that you will implement, will be
given as object-oriented interfaces in the C++ programming language.
We will help you get started with your programming by providing sample
Makefiles, header files, etc. In addition, for some of the project
parts we will provide test suites in advance of the due date, although
these tests will not be comprehensive.