Stanford University
Computer Science 244b: Spring 2014

Assignment #2: Distributed Replicated Files

Dates

Intro

The goal of this assignment is to implement client and server prototypes for a distributed file system in which files are replicated. Its purpose is to explore a service-specific protocol that relies on transactions, rather than conventional transport techniques, for reliable delivery; the prototype you create will focus on this aspect of the file system. The primary invariant that must be maintained is that all available copies of a file are identical. This is useful in systems that support read-any, i.e., where clients may read from any available replica and know they are getting the latest version.

For this assignment, you will be working individually.

Machine Compatibility

Your program must run on the myth Linux machines in Gates B08 (myth*.stanford.edu). You should use multicast packets to disseminate state, as in the previous Mazewar assignment. You are not allowed to use physical broadcast packets to disseminate state. You can reuse the multicast setup code from your first assignment. The multicast group address is the same as in the first assignment, namely 0xe0010101.

Structure, Interfaces, and Implementation

Your goal is to provide us with the following:

We are providing you with:

The framework code we provide is in /usr/class/cs244b/replFs.
Applications that wish to use the distributed filesystem link to the client-side library and use the client-side interface to make write calls to replicated files. Of course, the replication and the use of the network must be entirely transparent to the application. Your most important tasks are to design and implement the client/server protocol.

Required Client API (Application Programming Interface)

The application interface to the client MUST include the following:

int InitReplFs(unsigned short portNum, int packetLoss, int numServers);
int OpenFile(char *name);
int WriteBlock(int fd, char *buffer, int byteOffset, int blockSize);
int Commit(int fd);
int Abort(int fd);
int CloseFile(int fd);

Prototypes for these are already provided in the starter source code. You must not change the function signatures.
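
As an illustration of how an application might drive this interface, the sketch below opens a file, stages one write, commits, and closes. The stub bodies exist only so the example compiles on its own; they are placeholders, not the real library. The return conventions (a negative value on failure, a non-negative descriptor from OpenFile) and the helper write_and_commit are assumptions made for this example.

```c
#include <string.h>

/* Placeholder stubs so the sketch is self-contained. The real versions
 * live in your client library; the return values here are assumed. */
int InitReplFs(unsigned short portNum, int packetLoss, int numServers)
{
    (void)portNum; (void)packetLoss; (void)numServers;
    return 0;
}
int OpenFile(char *name) { (void)name; return 3; }
int WriteBlock(int fd, char *buffer, int byteOffset, int blockSize)
{
    (void)fd; (void)buffer; (void)byteOffset;
    return blockSize;
}
int Commit(int fd) { (void)fd; return 0; }
int Abort(int fd) { (void)fd; return 0; }
int CloseFile(int fd) { (void)fd; return 0; }

/* Hypothetical application helper: write `text` at offset 0 of `name`
 * and commit it. Returns 0 on success, -1 on any failure. */
int write_and_commit(char *name, char *text)
{
    int fd = OpenFile(name);
    if (fd < 0)
        return -1;

    if (WriteBlock(fd, text, 0, (int)strlen(text)) < 0) {
        Abort(fd);      /* discard the staged write */
        CloseFile(fd);
        return -1;
    }
    if (Commit(fd) < 0) {
        CloseFile(fd);
        return -1;
    }
    return CloseFile(fd) < 0 ? -1 : 0;
}
```

An application would call InitReplFs once at startup and then use a helper like this for each file it updates.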

Conspicuous by its absence from this API is a ReadBlock() call. Reading is not required for this assignment. We will use the files that your server stores locally to validate committed writes.

Protocol Skeleton

In providing the API listed above, the library code at the client and the file servers conspire to provide distributed replicated files. While we have provided an outline of how you should do this, your first job should be to flesh out this skeleton.

A brief outline of the steps involved with accessing a file:

  1. The client opens the file by multicasting to the servers and collecting responses. All reachable & running servers must be accounted for.
  2. The client multicasts writes across the net. Upon receiving these writes, the servers buffer the changes to the file. Note specifically that there are no ACKs at this stage.
  3. If the application commits, two-phase commit is used to ensure that all updates from the previous step were received. (Any updates that were not received must somehow be retransmitted.) When preparing to commit changes to a file, the client must identify to the servers, in some way, the list of updates to commit, and must make sure that every server has all of the changes before sending a commit message. If all of the servers return an "OK to commit" message, the client then sends out a commit message. Alternatively, the client may send an abort message, whereupon the servers revert the file to its previous state.
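
One possible way to make step 3 concrete is sketched below; it is only an illustration, and the protocol design remains yours. It assumes the client numbers the writes in a transaction 0, 1, 2, and so on, that the prepare message carries the count of writes sent, and that a server votes "OK to commit" only when nothing is missing. The names staged_writes, staged_record, staged_missing, can_commit, and MAX_WRITES are all hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_WRITES 128 /* assumed cap on writes per transaction */

/* Server side: which numbered writes of the open transaction have arrived. */
struct staged_writes {
    bool received[MAX_WRITES];
};

void staged_reset(struct staged_writes *s)
{
    memset(s, 0, sizeof(*s));
}

/* Record the arrival of write number `seq`. Duplicates are harmless. */
void staged_record(struct staged_writes *s, int seq)
{
    if (seq >= 0 && seq < MAX_WRITES)
        s->received[seq] = true;
}

/* Prepare phase on a server: the client says it sent writes
 * 0..num_writes-1. Fill `missing` with the numbers not yet received and
 * return how many there are. A result of 0 means "OK to commit". */
int staged_missing(const struct staged_writes *s, int num_writes, int *missing)
{
    int n = 0;
    for (int seq = 0; seq < num_writes && seq < MAX_WRITES; seq++) {
        if (!s->received[seq])
            missing[n++] = seq;
    }
    return n;
}

/* Client side: send the commit message only if every server voted OK. */
bool can_commit(const bool *voted_ok, int num_servers)
{
    for (int i = 0; i < num_servers; i++) {
        if (!voted_ok[i])
            return false;
    }
    return true;
}
```

In this sketch a server that reports missing writes lets the client retransmit just those writes and ask again, which keeps the write phase free of ACKs as the outline requires.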

Assignment Assumptions

For this assignment, you are allowed and required to assume the following:

System-Wide Details

Server Details

Client Details

Testing Criteria

We will be using an automated test application to test your filesystem. It will attempt to stress your system in a number of ways. We will run tests with varying transaction size, with a varying number of servers, etc. A subset of these tests will be supplied to you in advance, and you should run them to make sure your project works with the testing harness. You can find the details on Piazza.

Report

The writeup is intended to be an insightful explication and analysis of your work.

The following sections should be included:

  1. Protocol Specification: Document your protocol by specifying packet formats, sequencing and semantics of packets and protocol events. Your protocol specification MAY include optional fields and behavior to support future directions; if so, please clearly document what subset you implemented.
  2. Evaluation: Discuss the merits and disadvantages of this approach to replication versus using conventional reliable transport.
  3. Future Directions: Discuss extensions, refinements, and modifications to your protocol and implementation that would be required for real deployment. An answer to this discussion necessarily includes consideration of large scale systems and files.
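
When documenting packet formats for the Protocol Specification, it can help to pin down the exact byte layout in code. The following is a sketch of one way to do that, not a required format: the header fields, their sizes, and the names pkt_header, pkt_pack, pkt_unpack, and PKT_HEADER_SIZE are all invented for illustration. Fields are written one at a time in network byte order so that the wire format does not depend on compiler struct padding.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed header; every field here is an assumption. */
struct pkt_header {
    uint8_t  type;    /* message kind, e.g. write or prepare */
    uint16_t length;  /* payload bytes that follow the header */
    uint32_t txn_id;  /* transaction this packet belongs to */
    uint32_t seq;     /* write number within the transaction */
};

#define PKT_HEADER_SIZE 12 /* bytes on the wire: 1 + 1 reserved + 2 + 4 + 4 */

/* Serialize `h` into `buf`, which must hold PKT_HEADER_SIZE bytes. */
void pkt_pack(const struct pkt_header *h, uint8_t *buf)
{
    uint16_t len = htons(h->length);
    uint32_t txn = htonl(h->txn_id);
    uint32_t seq = htonl(h->seq);

    buf[0] = h->type;
    buf[1] = 0; /* reserved byte, always zero */
    memcpy(buf + 2, &len, sizeof(len));
    memcpy(buf + 4, &txn, sizeof(txn));
    memcpy(buf + 8, &seq, sizeof(seq));
}

/* Parse PKT_HEADER_SIZE bytes from `buf` back into `h`. */
void pkt_unpack(const uint8_t *buf, struct pkt_header *h)
{
    uint16_t len;
    uint32_t txn;
    uint32_t seq;

    memcpy(&len, buf + 2, sizeof(len));
    memcpy(&txn, buf + 4, sizeof(txn));
    memcpy(&seq, buf + 8, sizeof(seq));
    h->type = buf[0];
    h->length = ntohs(len);
    h->txn_id = ntohl(txn);
    h->seq = ntohl(seq);
}
```

A layout like this maps directly onto a field table in the report, with one row per field giving its offset, size, and meaning.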

What to turn in

Run make clean, delete any extraneous folders/files, then run the submit script /usr/class/cs244b/bin/submit from the directory that contains all the source files, the makefile, and the report file. This is assignment/project #2.

Tentative Rubric

70% of your grade will be based on your implementation, including the tests run on it. The other 30% of your grade is based upon the report.