CS346 - Spring 2014
Database System Implementation
This document contains important logistical information for the
RedBase project.
There are many ways to organize the source code for your RedBase
system. Many of you will be tempted to break your source code up
across directories -- for instance one directory for the RM component,
one for IX, and so on. However, our experience shows that, both for
building the system and for TA grading, it is vastly preferable to
keep all of your RedBase source code in a single directory. Please
organize your code this way throughout the project.
To get you started, we are providing the Paged File
(PF) component of the RedBase system. The functionality of
this component is described in detail in the PF
Component document. Later we will be providing a few additional
routines and a command-line parser.
We have written a script that automates the process of getting all
necessary files to support your RedBase code development. You will
need to invoke the script once for each of the four basic components
of the RedBase system: each invocation retrieves the files needed for
that component. The first invocation will fetch the entire PF
component, template rm.h and rm_rid.h files, and an
RM "tester" file (rm_test.cc).
Here is the sequence of steps you need to follow.
Note: If you plan to develop RedBase on a non-Stanford workstation,
you will need to follow a different procedure.
Please contact the TA.
- Log in as usual and create a directory for the RedBase system:
mkdir redbase
- Change into the directory:
cd redbase
- Check that nobody else can read the directory by listing its
access control list (ACL):
fs la .
If other users (such as system:anyuser) have read and/or write permissions,
remove them from the ACL (see
Setting Permissions with UNIX).
- Make a symbolic link to the RedBase setup script by typing:
ln -s /usr/class/cs346/redbase/scripts/setup .
- To get information about the setup script execute:
./setup -h
- Now execute the setup script. It will transfer or create symbolic
links to all files necessary for you to complete the first project
part. You can use the "-v" switch in order to see exactly
what is happening. Don't forget to include the argument "1".
./setup 1
- The setup script will create a src directory. Change into
the src directory:
cd src
- Make a symbolic link to the RedBase submit script by
typing:
ln -s /usr/class/cs346/redbase/scripts/submit .
The submit script should be stored in the same directory as all of your
source code.
- An initial Makefile is provided for you. You should use
this makefile as a basis for the makefile that will manage all of your
RedBase source code. Typing "make" alone will create the
library for the PF component. Typing "make testers" will
create the PF library and three PF component test programs. Type:
make testers
pf_test1, pf_test2, and pf_test3 are executables that test the
PF component. Feel free to try out the tests if you want.
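For the permissions step in the list above, one way to spot unwanted ACL entries is to filter the output of fs la. The following is only a minimal sketch: the function name foreign_acl_entries is hypothetical, and it assumes that fs la prints each entry as an indented "name rights" pair on its own line and that entries for system:administrators should be left alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of the course scripts): read `fs la` output
# on stdin and print every ACL entry that belongs neither to the given user
# nor to system:administrators.
# Assumption: each entry is an indented "name rights" pair on its own line;
# header lines such as "Normal rights:" are not indented.
foreign_acl_entries() {
    awk -v owner="$1" '
        /^[ \t]+/ && NF == 2 && $1 != owner && $1 != "system:administrators" {
            print $1, $2
        }
    '
}

# Example use from inside the redbase directory:
#   fs la . | foreign_acl_entries "$USER"
```

If the helper prints anything, those entries still have some access. On AFS an entry can typically be cleared with a command such as fs sa . system:anyuser none, but check the Setting Permissions with UNIX page before changing your ACL.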
After submitting each component of your RedBase system (see Submission Process below), you will need to get
the supporting files for the next component. Use the same setup
script, passing the number of the next component as its
argument: "2" for IX, "3" for SM, and
"4" for QL.
Testing
Testing is extraordinarily important for RedBase. Year after
year, diligent CS346 students are sure they've coded "bullet-proof"
components, fail to test them sufficiently, and are caught by
surprise with low functionality scores. The tests supplied for each
component should be used only as a basis for building your own more
extensive tests -- the tests run by the TA will be much more
comprehensive than the ones you receive. Your documentation is
expected to include a description of your testing process, and this
description will form a (relatively small) fraction of your grade.
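As your own collection of tests grows, it helps to run all of them with one command. The sketch below shows one possible way to do that. The function name run_testers is hypothetical, and it assumes that each tester reports failure through a non-zero exit status, which you should confirm for the testers you write.

```shell
#!/bin/sh
# Hypothetical helper: run every tester command given as an argument and
# print PASS or FAIL for each one.
# Assumption: a tester signals failure with a non-zero exit status.
# Tester output is discarded here; redirect it to a log file if you need it.
run_testers() {
    failed=0
    for t in "$@"; do
        if "$t" >/dev/null 2>&1; then
            echo "PASS $t"
        else
            echo "FAIL $t"
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every tester succeeded.
    [ "$failed" -eq 0 ]
}

# Example use after "make testers":
#   run_testers ./pf_test1 ./pf_test2 ./pf_test3
```

A runner like this makes it cheap to rerun every test after each change, which is when regressions are easiest to find.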
It is within the Honor Code for students to share tests and data, as
long as the material is first made available to everyone in the class.
Sharing tests or data without making them available publicly first
is a violation of the Honor Code. Despite those harsh warnings, we
do very much encourage test and data sharing. We have created central
directories where students should place their shared tests and data.
The directories are /usr/class/cs346/redbase/testers/ and
/usr/class/cs346/redbase/data/, and each tester or data file
should include the student's name and email contact. We have already
placed in these directories a few of our own tests and data files, and
some contributions from students in past offerings of the course.
Submission Process
Documentation file
When you have finished coding and testing your component, you will
need to create a 1-2 page document describing:
- Your overall design
- The key data structures used in your component
- Your testing process
- Any known bugs
- Any assistance you received (from the
instructor, TA, other students in the course, or anyone else) in
program design or debugging, as per the Honor Code statement provided in the Administrative Information page
This document should be in plain text and should be named
Component-Initials_DOC, where
Component-Initials is the abbreviation (rm,
ix, sm, or ql) for the component that you
are submitting, e.g., rm_DOC or ix_DOC.
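The naming rule above can be checked mechanically before you submit. The following sketch assumes the component numbering used elsewhere in this document (1 for RM, 2 for IX, 3 for SM, 4 for QL). The function names component_initials and check_doc_file are hypothetical and are not part of the provided scripts.

```shell
#!/bin/sh
# Hypothetical helper: map a component number (1-4) to the abbreviation
# used in file names such as rm_DOC.
component_initials() {
    case "$1" in
        1) echo rm ;;
        2) echo ix ;;
        3) echo sm ;;
        4) echo ql ;;
        *) echo "unknown component number: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: check that the documentation file for the given
# component number exists in the current directory.
check_doc_file() {
    initials=$(component_initials "$1") || return 1
    doc="${initials}_DOC"
    if [ -f "$doc" ]; then
        echo "found $doc"
    else
        echo "missing $doc" >&2
        return 1
    fi
}

# Example use from the src directory:
#   check_doc_file 1
```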
Submitting
We have created a submit script, which is a Unix shell script
that gathers together the necessary files and submits them to the TA.
After following the directions in the Setting Up
section above, you should have the submit script in your main
RedBase source directory. Running submit with no arguments prints
the following usage message:
Usage: submit <-c or -s> HW#
c - Collect the files that are to be submitted.
This argument causes the file submit.Component-Initials to be
created and displayed containing a list of the files that will get
submitted by this script.
s - Submit the files.
This argument causes the files listed in submit.Component-Initials
to be submitted. If submit.Component-Initials does not exist, it is
created as for argument c.
It is possible to call this script with argument "-c", edit the
temporary file in case it does not contain the correct files,
and then call the script with argument "-s" to submit the files.
"HW#" should be the homework number for the current
submission. The submit program uses the homework number to
automatically locate most of the files in the submission.
The number should be:
1 - RM (Record Management)
2 - IX (Index Manager)
3 - SM (System Management)
4 - QL (Query Language)
5 - EX (Extension)
HW# is required every time that submit is run.
As can be seen, submitting your code is a two-step process. First,
you will need to run "submit -c HW#",
which gathers the list of files to be sent to the TA and saves this
list in the file "submit.Component-Initials". The submission
script attempts to list all the relevant files, but you should take a
look at the list to ensure that all necessary files are there. If
needed, you can manually edit the list to add any missing
files. (Some information about which files are necessary for each
component is included in the individual component specifications.)
After ensuring that all necessary files are listed in the file
"submit.Component-Initials", type "submit -s
HW#" to send those files to the TA.
NOTE: It is a common mistake to run "submit -c
HW#" and then forget to run "submit -s
HW#". Running "submit -c
HW#" alone will not submit anything!
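Between the two steps it is worth checking the generated list against what is actually on disk. The sketch below is one way to do so. The function name missing_from_list is hypothetical, and the sketch assumes that the list file holds one path per line, relative to the current directory. The real format may differ, so inspect the file that submit -c produces first.

```shell
#!/bin/sh
# Hypothetical helper: print every path named in a list file that does not
# exist on disk.
# Assumption: the list holds one path per line, relative to the current
# directory; blank lines are skipped.
missing_from_list() {
    while IFS= read -r f; do
        [ -z "$f" ] && continue
        [ -e "$f" ] || echo "$f"
    done < "$1"
}

# Example use for the RM component, assuming the list file is named
# submit.rm as the naming rule above suggests:
#   ./submit -c 1
#   missing_from_list submit.rm
#   ./submit -s 1
```

The check only catches listed files that do not exist. Files that exist but were never listed still have to be found by reading the list yourself.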
Multiple Submissions
Students are often tempted to submit their project parts multiple
times, e.g., if the component is improved after the deadline, or
because of a sudden realization that something was omitted. You are
free to submit multiple times; however, each submission completely
erases the previous one. In other words, we will grade only your last
submission, and we will use it both for content and for
calculating any late penalty.
Two Final Notes
- The Makefile provided for you at the start of the project
automatically compiles all files with the -DPF_STATS flag.
Please do not remove this flag from the Makefile as it is necessary
for grading your assignment. See the PF Component
document for a description of the statistics tracking that occurs
when this flag is set.
- All students -- no matter where they are developing their RedBase
system -- must ensure that their code compiles and runs correctly on
the FarmShare machines before submitting each project part.
(Past experience suggests setting aside at least 1-2 days for the
"port" to the Stanford machines.) In addition, since we require
the use of Valgrind during code development, your programs must run
correctly under Valgrind. By "correctly" we mean that Valgrind reports
"ERROR SUMMARY: 0 errors from 0 contexts".
In general, you will find that it is quite difficult to get perfect
scores on all of the components. A perfect score requires
bullet-proof code, good documentation, and smart design choices. Each
of the first four components of the RedBase project -- the RM, IX, SM,
and QL components -- will be graded out of 40 points. For most
components a score above 35 is considered very solid work. The
scoring for these components is broken into four main parts:
Functionality
Correct functionality means that your program passes all of the tests
in the TA's test suite. Your score on functionality will be directly
proportional to the number of tests passed. Some of the tests will
try to break your code, so don't be surprised if we uncover
bugs you didn't think about. We provide some sample tests, but these
tests are generally much simpler than the "best" of the TA's tests.
It is your responsibility to fully test your component (see Testing above).
Documentation
As mentioned above, you must submit a 1-2 page description of
your design, key data structures, and testing strategy for your
component. This document should focus on any novel features in your
implementation, and on the modularity you should have used in coding
the component (see Modularity/Correctness below). In addition,
we expect that you will sufficiently comment your code. By
"sufficiently" we mean that it will be possible for the TA to read a
small, specific part of the code and understand how it works. As a
rule of thumb, a comment on every line is too much, while leaving large
chunks of code with no comments at all is too little. More details are
provided in the RM Component document.
Design Choices
Within each component there are certain design choices that you will
need to make. Usually, some of the options require less work and
result in less efficient code. Other options take more work but result
in code that is more stable, runs faster, or wastes less space.
For each component the TA will choose some number of design issues
that were left unspecified in the component description. You will
receive an email message within a few days of submitting the
component, asking you to describe how you implemented those aspects of
the component. Your descriptions should be at most 1-2 paragraphs
long, and they should include references to relevant file names and
line numbers within those files. Based on your response -- which the
TA will verify by looking at your code -- you will be assigned a
"design choices" score. The score will be based on the choices that
you made, whether your code properly implements those choices, and the
clarity and accuracy of your description. You will have 48 hours to
send a reply to the TA's questions; failure to respond within this
time will result in a zero score in the design choices category.
Modularity/Correctness
How elegant or modular a design is remains a somewhat subjective
measure. Most approaches will earn full credit. There are, of
course, some very bad designs that will lose points. In addition, a
component can pass all of the functionality tests but not be
implemented correctly. For example, an implementation with a
one-megabyte memory leak is considered incorrect.
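The Valgrind requirement stated in the final notes can be checked from a saved log. The sketch below is only an illustration: the function name valgrind_clean and the log file name vg.log are hypothetical, and the check simply searches the log for the clean summary line quoted earlier in this document.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a saved Valgrind log contains the
# clean summary line "ERROR SUMMARY: 0 errors from 0 contexts".
valgrind_clean() {
    grep -q 'ERROR SUMMARY: 0 errors from 0 contexts' "$1"
}

# Example use with one of the supplied PF testers (vg.log is an arbitrary
# file name chosen for this sketch):
#   valgrind --leak-check=full --log-file=vg.log ./pf_test1
#   valgrind_clean vg.log && echo "clean" || echo "Valgrind reported errors"
```

Saving the log with --log-file keeps Valgrind's report separate from the program's own output, so the summary line is easy to find.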