Project 1: A Simple Shell
In this project you will implement a simple shell program that supports
I/O redirection and pipelines. The goal of this
project is for you to learn about the fork, execvp, and waitpid
system calls, as well as I/O redirection.
Commands
Your shell will read commands from its standard input and execute them. Each command consists of one or more words separated by white space (spaces or tabs). The first word specifies a program to run. The shell invokes this program as a child process and all passes of the words (including the program name) to it as arguments. For example, consider this command:
find . -name shell.ccIt has four words: find, ., -name, and shell.c. The first word,
find, is the name of the program to run.
I/O Redirection
Normally, the shell's child processes use the same standard input and
standard output files as the shell.
However, the characters < and > can be used to change the standard
input and/or output of a command.
For example, the command
grep std::string shell.cc > lineswill write all of the lines of the file shell.cc that contain the
string std::string
to the file lines instead of the terminal, and the command
wc < shell.ccwill read shell.cc and output statistics such as the number of lines.
The characters < and > can appear anywhere in a command; they
terminate the current word, if any, and the next word is taken as
the name of the file.
The redirection information is not included in the arguments passed to
the command.
For example, consider the following command:
grep<shell.cc>match.txt -HThe grep command will receive two arguments, grep and -H; its
standard input will come from the file shell.cc and its output
will be written to the file match.txt.
Pipelines
If the character | appears in a command line, it ends one command
and starts another; the standard output of the first command will be
passed as input to the second command using a pipe.
This is called a pipeline.
For example,
grep std::string shell.cc | wcwill arrange for the lines of shell.cc containing the string
std::string to be passed as standard input to wc, where they
will be counted. It has the same effect as the following
commands
grep std::string shell.cc > tmp
wc < tmp
rm tmpHowever, in the pipeline case the grep and wc commands will run
concurrently, so they can potentially finish more quickly.
If a pipeline also includes I/O redirection, < and > apply to
their respective commands in the pipeline and redirections take
priority over pipes (see Other Requirements for details).
Implementation
For this project you will need to use the following system calls:
fork: creates a new process, which is a duplicate of the parentexecvp: after a new child process has been forked, invoke this in the child to start a new program in that process. This kernel call uses the PATH environment variable to find the executable file corresponding to a command; this provides easy access to the standard system programs.waitpid: after creating one or more child processes, use this system call to wait for each child to exit. If you prefer, you may use thewaitkernel call, which waits for any child to exit, rather than waiting for a specific child.open: open a file that is going to be used for I/O redirectionpipe: creates a new pipe for pipelinesdup2: once you have opened a file or pipe, this can be used to make a copy of its file decscriptor in descriptor 0 (for standard input or 1 (for standard output) for a child processclose: to close files and pipes when they are no longer needed
You can learn more about these system calls with the man program.
For example,
man execvpwill print information about the execvp system call.
Development and Testing
Do your work for this project on the Myth cluster. To get started, invoke the following command:
git clone http://web.stanford.edu/class/cs111/starters/shell.git cs111_p1This will create a new directory cs111_p1. Do your work for the
project in this directory. You can use Git to manage your revisions
to the project.
The directory will contain a skeleton
file sh111.cc in which we have provided code to parse command lines.
The parser creates one cmd struct describing each suprocess that must be
invoked;
each cmd contains the words that must be passed to that command, plus
zero or more redirect structs that describe I/O redirection for that
command. See the code for detailed documentation of these structures.
Once a command line has been parsed, the function run_pipeline will be
invoked to create the child processes and wait for them to complete.
You must implement the body of this function, creating children as
described by the argument to run_pipeline. If more than one command
is specified, you must create pipes between consecutive commands in
the pipeline.
The directory will also contain a Makefile. If you type make,
it will compile sh111.cc, creating an executable file sh111.
You can run your shell by typing the following command:
./sh111A prompt should appear
and you should be able to invoke shell commands. Most simple
shell commands should work just as well in sh111 as in your normal
shell. However, a few commands, such as cd and exit, won't work
in sh111. These are built-in commands: normal shells such as
bash execute them directly, without invoking a child process.
Commands like cd need to be
built-in because executing them in a child process won't be useful.
For example, if a child process changes its working directory, that
won't have any effect on the shell or on future commands it invokes,
and if a child process exits, that won't cause the parent shell to
exit. sh111 doesn't support built-in commands, so these commands
won't work.
You can run a series of basic tests on your shell
by invoking the command ./run_tests. It will print error
messages if tests fail; you should then be able to go back and run
the failing tests manually (each is just a series of shell commands)
to debug the problem.
We do not guarantee that these tests are exhaustive, so passing
all of the tests is not necessarily sufficient to ensure a perfect
score (CAs may discover other problems in reading through your code).
Order of Implementation
We recommend implementing the project in stages, so that you can see simple things working before trying more complex things. Here is one possible order:
- When
run_pipelineis invoked, ignore its argument. Justforka child and have the child print a message and then exit immediately, without invokingexecvp. The parent waits for the child to exit. - Get a single command to run in the child process and wait on it.
This should allow commands like
ls -landsleep 5to run just as they would in any standard shell. - Get two commands in a pipeline to run simultaneously in two separate
child processes, with the first piping its output to the second.
This means the command
ls | wc -lwill run and print out the total number of files in your current directory. - Get an arbitrary number of commands in a pipeline to run simultaneously,
each in its own distinct child process. This means the command
ls | grep sh111 | wc -lwill print out the number of files in the current directory that contain the substring "sh111". This task is probably the most difficult, so work carefully! - Finally, incorporate file redirection into each of the commands.
Other Requirements
- You must use pipes to pass data between processes in a pipeline (for example, it is not OK to run the processes one at a time, redirecting one process's output to a file and then redirecting the next process's input from the same file). This allows all of the processes in the pipeline to run concurrently.
- If I/O redirection is also specified for a command in a pipeline, redirection
takes priority over pipes. For example, consider the following command:
The I/O redirection will causegrep std::string shell.cc > grep.out | wcgrep's output to go to the filegrep.outinstead of the pipe connecting towc. The pipe will still exist, andwcwill read from it, but the writing end of the pipe will have been closed, sowcwill immediately read an end-of-file condition. (If the redirection were specified after the pipe symbol, it would have applied towcinstead ofgrep) - You must ensure that each child process starts with only three open
file descriptors: standard input, standard output, and standard error.
If you have opened any other descriptors for files or pipes, you must
close them before
execing the child.
Submitting Your Work
To submit your solutions, cd to the directory containing your
work, then invoke the command
./submitIf you discover problems or improvements after you have submitted, you may resubmit; only the latest submit will be graded.