CS346 - Spring 2014
Database System Implementation

RedBase Logistics

This document contains important logistical information for the RedBase project.

Setting Up

There are many ways to organize the source code for your RedBase system. Many of you will be tempted to break your source code up across directories -- for instance one directory for the RM component, one for IX, and so on. However, our experience dictates that for building the system and for TA grading, it is vastly preferable that you keep all of your RedBase source code in a single directory. Please structure your code management in this way throughout the project.

To get you started, we are providing the Paged File (PF) component of the RedBase system. The functionality of this component is described in detail in the PF Component document. Later we will be providing a few additional routines and a command-line parser.

We have written a script that automates the process of getting all necessary files to support your RedBase code development. You will need to invoke the script once for each of the four basic components of the RedBase system: each invocation retrieves the files needed for that component. The first invocation will fetch the entire PF component, template rm.h and rm_rid.h files, and an RM "tester" file (rm_test.cc). Here is the sequence of steps you need to follow.

Note: If you plan to develop RedBase on a non-Stanford workstation, you will need to follow a different procedure. Please contact the TA.

Log in as usual and create a directory for the RedBase system:
```
   mkdir redbase
```
Change into the directory:
```
   cd redbase
```
To check that nobody else can read the directory, list its access control list (ACL):
```
   fs la .
```
If other users (such as system:anyuser) have read or write permissions, remove them from the ACL (see Setting Permissions with UNIX).
Make a symbolic link to the RedBase setup script by typing:
```
   ln -s /usr/class/cs346/redbase/scripts/setup .
```
To get information about the setup script execute:
```
   ./setup -h
```
Now execute the setup script. It will transfer or create symbolic links to all files necessary for you to complete the first project part. You can use the "-v" switch in order to see exactly what is happening. Don't forget to include the argument "1".
```
   ./setup 1
```
The setup script will create a src directory. Change into the src directory:
```
   cd src
```
Make a symbolic link to the RedBase submit script by typing:
```
   ln -s /usr/class/cs346/redbase/scripts/submit .
```
The submit script should be stored in the same directory as all of your source code.
An initial Makefile is provided for you. You should use this Makefile as the basis for the one that will manage all of your RedBase source code. Typing "make" alone will create the library for the PF component. Typing "make testers" will create the PF library and three PF component test programs. Type:
```
   make testers
```
pf_test1, pf_test2, and pf_test3 are executables that test the PF component. Feel free to try out the tests if you want.
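
The permissions step above says to remove unwanted entries from the ACL but does not show a command. One possible way to do it, assuming the standard OpenAFS `fs` client is installed, is sketched below. The function name `restrict_acl` is hypothetical and is not part of the course scripts.

```shell
# Sketch only: drop all rights for one ACL entry on an AFS directory.
# Assumes the OpenAFS "fs" command is available; restrict_acl is a hypothetical name.
restrict_acl() {
    # $1 = directory, $2 = ACL entry (user or group)
    fs setacl -dir "$1" -acl "$2" none   # "none" removes every right for that entry
    fs listacl "$1"                      # print the resulting ACL for review
}

# Example, run inside your redbase directory:
#   restrict_acl . system:anyuser
```

After running it, check the listing to confirm that only you (and any administrative entries) remain.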

After submitting each component of your RedBase system (see Submission Process below) you will need to get supporting files for the next component. Use the same setup script, but substitute as an argument to the script the correct component number: use "2" for IX, "3" for SM, and "4" for QL.
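
If you find the component numbers hard to remember, a small lookup function can help. This is only an illustrative sketch: `component_number` is a hypothetical helper, and the mapping of RM to 1 is inferred from the first invocation (`./setup 1`) retrieving the RM files.

```shell
# Hypothetical helper: map a component abbreviation to its setup argument.
component_number() {
    case "$1" in
        rm) echo 1 ;;
        ix) echo 2 ;;
        sm) echo 3 ;;
        ql) echo 4 ;;
        *)  echo "unknown component: $1" >&2; return 1 ;;
    esac
}

# Example, run from the directory that holds the setup link:
#   ./setup "$(component_number ix)"
```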

Testing

Testing is extraordinarily important for RedBase. Year after year, diligent CS346 students are sure they've coded "bullet-proof" components, fail to test them sufficiently, and are caught by surprise by low functionality scores. The tests supplied for each component should be used only as a basis for building your own more extensive tests -- the tests run by the TA will be much more comprehensive than what you will receive. Your documentation is expected to include a description of your testing process, and this description will form a (relatively small) fraction of your grade.

It is within the Honor Code for students to share tests and data, as long as the material is first made available to everyone in the class. Sharing tests or data without making them available publicly first is a violation of the Honor Code. Despite those harsh warnings, we do very much encourage test and data sharing. We have created central directories where students should place their shared tests and data. The directories are /usr/class/cs346/redbase/testers/ and /usr/class/cs346/redbase/data/, and each tester or data file should include the student's name and email contact. We have already placed in these directories a few of our own tests and data files, and some contributions from students in past offerings of the course.

Submission Process

Documentation file

When you have finished coding and testing your component, you will need to create a 1-2 page document describing:

Your overall design

The key data structures used in your component

Your testing process

Any known bugs

Any assistance you received (from the instructor, TA, other students in the course, or anyone else) in program design or debugging, as per the Honor Code statement provided in the Administrative Information page

This document should be in plain text and should be named Component-Initials_DOC, where Component-Initials is the abbreviation (rm, ix, sm, or ql) for the component that you are submitting, e.g., rm_DOC or ix_DOC.
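
As a starting point, you could generate an outline with the five required sections. The sketch below is one way to do that. The helper name `make_doc_skeleton` and the exact section titles are illustrative assumptions, and the file it writes is only an outline to fill in.

```shell
# Hypothetical helper: write an outline for <component>_DOC unless it already exists.
make_doc_skeleton() {
    file="$1_DOC"
    if [ -e "$file" ]; then
        echo "$file already exists; not overwriting" >&2
        return 1
    fi
    cat > "$file" <<EOF
$1 component documentation

1. Overall design

2. Key data structures

3. Testing process

4. Known bugs

5. Assistance received
EOF
}

# Example, run in your source directory:
#   make_doc_skeleton rm     # creates rm_DOC
```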

Submitting

We have created a submit script, which is a Unix shell script that gathers together the necessary files and submits them to the TA. After following the directions in the Setting Up section above, you should have the submit script in your main RedBase source directory. Running submit with no arguments prints the following usage message:

Usage: submit <-c or -s> HW#

   c - Collect the files that are to be submitted.  
       This argument causes the file submit.Component-Initials to be 
       created and displayed containing a list of the files that will get
       submitted by this script.

   s - Submit the files. 
       This argument causes the files listed in submit.Component-Initials
       to be submitted.  If submit.Component-Initials does not exist, it is
       created as for argument c.

       It is possible to call this script with argument "-c", edit the
       temporary file in case it does not contain the correct files,
       and then call the script with argument "-s" to submit the files.

   "HW#" should be the homework number for the current 
   submission.  The submit program uses the homework number to 
   automatically locate most of the files in the submission.

   The number should be:
     1 - RM (Record Management)
     2 - IX (Index Manager)
     3 - SM (System Management)
     4 - QL (Query Language)
     5 - EX (Extension)

   HW# is required every time that submit is run.

As can be seen, submitting your code is a two-step process. First, you will need to run "submit -c HW#", which gathers the list of files to be sent to the TA and saves this list in the file "submit.Component-Initials". The submission script attempts to list all the relevant files, but you should take a look at the list to ensure that all necessary files are there. If needed, you can manually edit the list to add any missing files. (Some information about which files are necessary for each component is included in the individual component specifications.)

After ensuring that all necessary files are listed in the file "submit.Component-Initials", type "submit -s HW#" to send those files to the TA.

NOTE: It is a common mistake to run "submit -c HW#" and then forget to run "submit -s HW#". Running "submit -c HW#" alone will not submit anything!
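
To make the second step harder to forget, the two steps can be wrapped in one function that stops for a review in between. This is a sketch under stated assumptions: `submit_two_step` is a hypothetical wrapper, it is run from the directory holding the `./submit` link, and the list file is named `submit.` followed by the component abbreviation (for example `submit.rm`).

```shell
# Sketch only: submit_two_step is a hypothetical wrapper, not part of the course scripts.
submit_two_step() {
    hw="$1"          # homework number, e.g. 1
    initials="$2"    # component abbreviation, e.g. rm
    list="submit.$initials"

    ./submit -c "$hw"                  # step 1: collect the file list
    if [ ! -s "$list" ]; then
        echo "$list is missing or empty; nothing submitted" >&2
        return 1
    fi

    cat "$list"                        # review the list before sending it
    printf 'Submit these files? [y/N] '
    read -r answer
    if [ "$answer" != "y" ]; then
        echo "Nothing submitted. Edit $list, then run ./submit -s $hw yourself." >&2
        return 1
    fi

    ./submit -s "$hw"                  # step 2: actually submit
}

# Example:
#   submit_two_step 1 rm
```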

Multiple Submissions

Students are often tempted to submit their project parts multiple times, e.g., if the component is improved after the deadline, or because of a sudden realization that something was omitted. You are free to submit multiple times; however, each submission completely erases the previous one. In other words, we will grade only the last submission, both in terms of content and for calculating any late penalty.

Two Final Notes

The Makefile provided for you at the start of the project automatically compiles all files with the -DPF_STATS flag. Please do not remove this flag from the Makefile as it is necessary for grading your assignment. See the PF Component document for a description of the statistics tracking that occurs when this flag is set.
All students -- no matter where they are developing their RedBase system -- must ensure that their code compiles and runs correctly on the FarmShare machines before submitting each project part. (Past experience suggests setting aside at least 1-2 days for the "port" to the Stanford machines.) In addition, since we are requiring the use of Valgrind during code development, your programs must run correctly under Valgrind. By "correctly" we mean that "ERROR SUMMARY: 0 errors from 0 contexts" is reported.
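
A quick way to check for the required summary line is to save Valgrind's output to a log and search it. The sketch below assumes a tester executable named `rm_test` (the name is a guess based on rm_test.cc). `valgrind_clean` is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the log contains the zero-error summary line.
valgrind_clean() {
    grep -q 'ERROR SUMMARY: 0 errors from 0 contexts' "$1"
}

# Example (the executable name rm_test is an assumption):
#   valgrind --leak-check=full --log-file=valgrind.log ./rm_test
#   if valgrind_clean valgrind.log; then echo "clean"; else echo "Valgrind reported errors"; fi
```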

Grading

In general, you will find that it is quite difficult to get perfect scores on all of the components. A perfect score requires bullet-proof code, good documentation, and smart design choices. Each of the first four components of the RedBase project -- the RM, IX, SM, and QL components -- will be graded out of 40 points. For most components a score above 35 is considered very solid work. The scoring for these components is broken into four main parts:

Functionality

Correct functionality means that your program passes all of the tests in the TA's test suite. Your score on functionality will be directly proportional to the number of tests passed. Some of the tests will try to break your code, so don't be surprised if we uncover bugs you didn't think about. We provide some sample tests, but these tests are generally much simpler than the "best" of the TA's tests. It is your responsibility to fully test your component (see Testing above).

Documentation

As mentioned above, you must submit a 1-2 page description of your design, key data structures, and testing strategy for your component. This document should focus on any novel features in your implementation, and on the modularity you should have used in coding the component (see Modularity/Correctness below). In addition, we expect that you will sufficiently comment your code. By "sufficiently" we mean that it will be possible for the TA to read a small, specific part of the code and understand how it works. As a rule of thumb, a comment on every line is too much, while large chunks of code with no comments at all are too little. More details are provided in the RM Component document.

Design Choices

Within each component there are certain design choices that you will need to make. Usually, some of the options require less work and result in less efficient code. Other options are more work but result in code that is more stable, faster, or less wasteful of space. For each component the TA will choose some number of design issues that were left unspecified in the component description. You will receive an email message within a few days of submitting the component, asking you to describe how you implemented those aspects of the component. Your descriptions should be at most 1-2 paragraphs long, and they should include references to relevant file names and line numbers within those files. Based on your response -- which the TA will verify by looking at your code -- you will be assigned a "design choices" score. The score will be based on the choices that you made, whether your code properly implements those choices, and the clarity and accuracy of your description. You will have 48 hours to send a reply to the TA's questions; failure to respond within this time will result in a zero score in the design choices category.

Modularity/Correctness

An elegant or modular design of the code is a somewhat subjective measure. Most approaches will earn full credit. There are, of course, some very bad designs that will lose points. In addition, a component can pass all of the functionality tests but not be implemented correctly. For example, a one-megabyte memory leak is considered an incorrect implementation.