Project 1: Raft Leader Election

In Projects 1 and 2 you and a teammate will implement the Raft consensus protocol. In Project 1 you will implement enough of the protocol to perform leader election (everything but the log); you will also implement a simple remote shell. In Project 2 you will add the log to create a replicated state machine that executes shell commands.

For an introduction to Raft, please read the extended version of the Raft paper. For this course you will need to understand Sections 1-5 and the first part of Section 8 (finding the leader); you do not need to understand Sections 6 and 7 in detail. Over the course of Projects 1 and 2 you will implement all of the features described in Section 5.

Features

For Project 1, you must implement enough of the Raft protocol to start a cluster of servers, elect a leader, maintain leadership with heartbeats, and elect a new leader if the current leader fails or can't communicate with the other servers. Specifically, you must:

  • Create a mechanism for network communication between machines that is sufficient for the needs of your Raft implementation. The mechanism must support both communication between servers and communication between clients and servers. You may not use any existing library for network communication, such as gRPC. You should build your mechanism directly on the basic socket system calls such as socket, bind, accept, and connect. Note: clients and servers use network communication in different ways; be sure to think carefully about what facilities each of them needs, and try to find one API that works well for both.

  • Implement the RequestVote and AppendEntries requests described in the Raft paper. You won't implement the actual log until Project 2, so for this project AppendEntries requests will not include any log entries (they will be used only for heartbeats). Implement the requests as if each server has a log that is empty.

  • Implement the timeout-based leader election mechanism described in the Raft paper. Once a leader gets elected, it must issue AppendEntries requests to preserve its leadership. If the current leader crashes or fails to send AppendEntries requests for any reason, then other servers must start a new election. The election mechanism must handle split votes as described in the Raft paper. For this project you will not be able to incorporate log information into the voting process as described in Section 5.4.1 of the Raft paper, but you should be able to implement all of the other election features.

  • Implement a simple persistent storage mechanism to preserve each server's current term and vote for that term, if any. The current values for this information must be stored safely in a file before a server sends a request or response, and the server must recover this information from the file when it restarts. You may not use an existing database system or storage system for this: you must build your functionality on the basic C/C++ file I/O facilities such as fopen, std::iostream, or std::fstream. If you are not sure whether it's OK to use a particular mechanism in your implementation, check with me.

  • Store the persistent data for each server in the current working directory where that server is invoked. It must be possible to run several servers simultaneously on a single laptop, each storing its data in a different directory and listening on a different TCP port. You will find this useful for testing.

  • Ensure that your cluster can elect and maintain a stable leader as long as a majority of the servers in the cluster are running and respond promptly to requests. For example, if one server is down or responds very slowly to requests, it must still be possible for the other servers to elect a leader. This means that each server must be capable of issuing requests concurrently to all of the other servers in the cluster. You cannot wait for a RequestVote request to one server to complete before issuing a RequestVote request to another; if you do, a slow response to the first request could prevent the election from completing successfully. If a server crashes and restarts, the existing leader must reconnect to that server before its election timeout expires, so the restarting server doesn't trigger a new election.

  • Write a simple client application that executes a loop where each iteration (a) reads a one-line shell command from standard input, (b) sends the command to the cluster leader, and (c) prints on standard output the results returned by the leader. When the leader receives a command from a client, it executes the command as a shell command by invoking bash with the -c option. Once the command is completed, the server returns to the client any output generated by the command. If a follower receives a client command, it must reject it and the client must retry with the leader (you must provide a mechanism that allows clients to identify the leader). If the leader crashes, the client must find the new leader, once it has been elected, and retry the request with that leader. Note that leaders can crash in the middle of executing a client request as well as between requests.
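
As a rough illustration of the persistent storage requirement above, here is a minimal sketch built on std::fstream. The names PersistentState, saveState, and loadState, the one-line text file format, and the use of -1 for "no vote" are all assumptions made for this example; they are not part of the assignment. The sketch relies on POSIX rename semantics (an existing file is replaced atomically), and it only flushes data to the operating system rather than forcing it onto disk.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical container for the two values a server must persist:
// its current term and the server it voted for in that term.
struct PersistentState {
    std::uint64_t currentTerm = 0;
    std::int64_t votedFor = -1;  // -1 means "no vote cast in currentTerm".
};

// Writes the state to a temporary file and then renames it over the real
// file, so a crash in the middle of a save leaves the old file intact.
// Returns true on success. This flushes to the operating system but does
// not force the data to disk; a stricter version would also sync the file.
bool saveState(const std::string& path, const PersistentState& state) {
    assert(!path.empty());
    const std::string tmpPath = path + ".tmp";
    {
        std::ofstream out(tmpPath, std::ios::trunc);
        if (!out) {
            return false;
        }
        out << state.currentTerm << ' ' << state.votedFor << '\n';
        out.flush();
        if (!out) {
            return false;
        }
    }  // The stream is closed here, before the rename.
    return std::rename(tmpPath.c_str(), path.c_str()) == 0;
}

// Reads the state back from the file. A missing file is treated as a
// first start: the state is reset to its defaults and true is returned.
// Returns false (leaving state untouched) if the file cannot be parsed.
bool loadState(const std::string& path, PersistentState& state) {
    std::ifstream in(path);
    if (!in) {
        state = PersistentState();
        return true;
    }
    PersistentState loaded;
    if (!(in >> loaded.currentTerm >> loaded.votedFor)) {
        return false;
    }
    state = loaded;
    return true;
}
```

In this sketch a server would call loadState once at startup, and call saveState after changing its term or vote but before sending any request or response that reflects the change.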

I recommend that you read through the Project 2 description before you start on Project 1. Knowing what you will need to implement in Project 2 may change how you implement Project 1, which will save you time later.

Additional Notes and Requirements

  • You must implement the project in C++. Use the compiler options "-Wall -Werror": -Wall enables a broad set of warnings, and -Werror turns every warning into an error, so the compiler refuses to compile your program until you fix the issues. For standard Makefiles, you can do this by adding the following line to your Makefile (CXXFLAGS is the variable that make's built-in rules use for C++ files; CFLAGS applies only to C files):

    CXXFLAGS += -Wall -Werror

  • You must work in teams of two (one team of three is OK if we end up with an odd number of students). This is important for three reasons: first, by working in teams you can attack a larger project, which leads to more interesting design issues; second, the team approach means that you have someone with whom you can discuss your designs; and third, it reduces the number of projects that I have to read, which permits a better tradeoff between class size and instructor sanity.

  • Your most important goal is to create a clean, simple, and elegant code structure. Although I expect your code to work, and it must implement the features described above, I will be judging it primarily on its structure; it's better to spend time cleaning up the structure and documentation than fixing minor bugs.

  • When designing your interfaces, you must work from scratch, without using or consulting any existing code that offers similar functionality, such as existing communication libraries or implementations of Raft. This is important so that you can make design decisions on your own. In addition, many existing packages have bad interfaces, so they may not serve as good models. You may use any of the C++ std:: classes, such as std::unordered_map. If you have any questions about what existing packages it is OK to use, please ask me.

  • You may use an existing package for translating between in-memory data structures and network messages; I recommend using protocol buffers.

  • I will create a private GitHub repository for your team to use and send you information about this repository. Create a branch in this repo named project1 and use this branch for all of your work on the project.

  • For at least one (non-trivial) source file, write the top-level declarations and interface comments before you fill in any of the method bodies. Create a commit on the project1 branch whose only change is the skeletal version of this file, and tag that commit commentsBeforeCode1. Make sure that the commit message also includes the name of this file. Some or all of this initial information will probably change in later commits; that's OK. I recommend that you write comments before code for all your files, but I will only require it for one file. To add a tag to the current commit, invoke the following commands:

    git tag -a commentsBeforeCode1
    git push origin commentsBeforeCode1

    The first command creates a new tag; by default the tag will only be in your local repo. The second command pushes the tag out to the GitHub repo so it will also be visible in any clones of that repo (such as mine). The command git tag will display all of the tags that currently exist in the repo.

  • Debugging distributed systems like this one is hard. For example, you can't always use breakpoints to debug: if a server enters the debugger, it no longer responds to requests, and this can lead to timeouts elsewhere in the cluster. Good logging is crucial: write out messages to the console or to a file that describe all of the major actions of the system, such as receiving requests, granting votes, becoming leader, etc. You can then compare the logs for different servers to understand the system's behavior and track down problems.

  • Your implementation must allow sockets to be reused for multiple requests. This is more difficult than discarding sockets after each request, but it provides significantly better performance and would be a requirement for any production implementation of Raft. Sockets may still get closed, such as when machines crash; when this happens you must reopen them.

  • Overall, your implementation should provide "reasonable" performance, meaning that it shouldn't be dramatically slower than the fastest possible implementation.

  • Be sure to think about memory management issues: who is responsible for dynamically allocated memory, and when does it get freed? You may find the classes std::unique_ptr and std::shared_ptr helpful.

  • Each client and server in a Raft cluster needs to know the network addresses of the servers in the cluster. You do not need to implement the cluster membership mechanism from the Raft paper. You can pick a simple approach such as providing a list of server names on the command line when you invoke a client or server.

  • Your implementation must allow servers to execute on different machines, or all on a single machine.

  • Your command execution mechanism must support multiple client machines at once.

  • I recommend that you choose an election timeout in the range of 5-10 seconds; this is short enough that you won't have to wait a long time during testing, but long enough so that you don't get overwhelmed with log output from frequent elections.

  • Figure 2 in the Raft paper provides a very precise specification of the behavior of the system, and you should follow this religiously. Even small deviations are likely to result in bugs. You might be interested in reading the advice given to MIT students implementing Raft in a distributed systems class.

  • Section 5.1 of the Raft paper says "Servers retry RPCs if they do not receive a response in a timely manner" but in fact Raft has other retry mechanisms that make it unnecessary to retry at this level. If an AppendEntries request or response is lost, it will be retried during the next heartbeat, and if a RequestVote request or response is lost, the worst that will happen is for the election to time out, in which case the RequestVote will be retried in the next election.

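  • I recommend that you invoke setsockopt as follows before binding a socket on which you will listen for incoming connections (the option has no effect on a bind call that has already happened):

    int enable = 1;
    setsockopt(socketfd, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(enable));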

    This call allows the listening address and port to be reused immediately. Without this call, if your program crashes and you restart it, it won't be able to bind to the same port until a timeout period of 30-120 seconds has elapsed.
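
To show where the SO_REUSEADDR call from the last note fits among the basic socket system calls, here is a minimal sketch of setting up a listening TCP socket. The helper names openListeningSocket and boundPort are hypothetical, and listening on all local interfaces over IPv4 is an assumption made for this example. The option is set before bind, because it affects how bind treats an address that was in use recently.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a TCP socket that listens on the given
// port on all local IPv4 interfaces. Passing port 0 lets the kernel pick
// a free port. Returns the file descriptor, or -1 on error.
int openListeningSocket(std::uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    // Allow the port to be rebound right after a crash and restart.
    // This must happen before bind.
    int enable = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR,
                   &enable, sizeof(enable)) < 0) {
        close(fd);
        return -1;
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: returns the port that a bound socket is using
// (handy when port 0 was requested), or 0 on error.
std::uint16_t boundPort(int fd) {
    assert(fd >= 0);
    sockaddr_in addr{};
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        return 0;
    }
    return ntohs(addr.sin_port);
}
```

A server in this sketch would call openListeningSocket once with its configured port and then accept connections on the returned descriptor; running several servers on one laptop only requires giving each a different port.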

Development Environment

I strongly recommend using an integrated development environment (IDE) for the projects in this class. For example, Eclipse has good support for C++. Please configure your development environment so that indent widths are 4 spaces, and only space characters are stored in files, never tabs (this will make it easier for me to review your code: tab characters and/or 8-space indents result in very long lines in the code review tool).

I believe that you can use the following instructions to configure Eclipse for this:

  • Go to Window->Preferences->C/C++->Code Style->Formatter
  • Select "Edit..."
  • Set "Indentation size" to 4, "Tab size" to 4, and "Tab policy" to "Spaces only", then save this (you may need to create a new named profile to save this information).
  • If you have already created some files using different indentation, you can reformat them by selecting the files in the Package Explorer, then right-clicking one of the files and selecting Source->Format.

Please keep your lines to at most 80 characters. Long lines make code harder to read.

Submitting Your Project

To submit your project, push all of your changes to GitHub on a branch named project1. Then create a pull request on GitHub. The base for the pull request should be your master branch (which has nothing on it except your initial commit) and the comparison branch should be the head of your project1 branch. Use "Project 1" as the title for your pull request. If your project is not completely functional at the time you submit, describe what is and isn't working in the comments for the pull request.

If you are planning to use late days for this project (or any project) please send me an email before the project deadline so that I know how many late days you plan to use.

README file

I will download your GitHub repo and I may try running your code on my laptop. Please create a README file in the top-level directory of your project with instructions for how to build and run your server (e.g., how does each server know the names of the other servers?).