Lab Solution 2: Multiprocessing and Unix Tools


This lab was created by Jerry Cain.

The first three problems are questions that could easily appear on a midterm or final exam, and they'll be the focus of discussion section. The exercises beyond that are designed to help you master the development tools even more so than you already have, and you'll benefit by working through those. The lab checkoff sheet for all students can be found right here.

Before starting, go ahead and clone the lab2 folder, which contains a working solution to Problem 1 and fodder for some of the later problems that have you experiment with some new development tools.

myth62:~$ git clone /usr/class/cs110/repos/lab2/shared lab2
myth62:~$ cd lab2
myth62:~$ make

Problem 1: Implementing exargs

exargs (designed here to emulate the xargs builtin--type man xargs for the full read) reads tokens from standard input (delimited by spaces and newlines), and executes the command with any initial arguments followed by the tokens read from standard input. exargs (or xargs) is useful when one program is needed to programmatically generate the argument vector for a second. (Understand that the builtin xargs is much more sophisticated than our exargs; xargs has all kinds of parsing options and flags, but our exargs has no such bells and whistles.)

To illustrate the basic idea, consider the built-in factor program (located at /usr/bin/factor), which prints out the prime factorizations of all of its numeric arguments, as with:

myth62:~$ factor 720
720: 2 2 2 2 3 3 5
myth62:~$ factor 9 16 2047 870037764750
9: 3 3
16: 2 2 2 2
2047: 23 89
870037764750: 2 3 3 5 5 5 7 7 7 7 11 11 11 11 11
myth62:~$

To see how exargs works, check this out:

myth62:~$ printf "720" | ./exargs factor
720: 2 2 2 2 3 3 5
myth62:~$ printf "2047 1000\n870037764750" | ./exargs factor 9 16
9: 3 3
16: 2 2 2 2
2047: 23 89
1000: 2 2 2 5 5 5
870037764750: 2 3 3 5 5 5 7 7 7 7 11 11 11 11 11
myth62:~$

Notice how this ultimately runs the factor program, with 9 and 16 as the first two arguments, and followed by the arguments supplied via stdin to exargs. Here, the first process in the pipeline--the printf--is a brute force representative of an executable capable of supplying or extending the argument vector of a second executable--in this case, factor--through exargs. Of course, the two executables needn't be printf or factor; they can be anything that works. If, for example, you wanted to see the amount of code in the lab2 files, you could use exargs to do this within your own lab2 folder:

myth62:~/lab2$ ls *.cc | ./exargs wc --chars --lines --max-line-length
  60 1809  102 buggy-exargs.cc
  64 1939  102 exargs.cc
  24  546   66 nonsense.cc
 148 4294  102 total

As a whiteboard exercise, construct the entire implementation of the exargs program, relying on the following 2 provided utility functions (implemented in the starter code) to parse all of standard input around newlines and spaces, and convert C++ string vectors to C string arrays:

/* Reads as much text as possible from the given stream
 * (e.g. cin, cout, or another input stream) and populates
 * the given vector with the tokens in that input, tokenized
 * by spaces and newlines.
 */
static void pullAllTokens(istream& in, vector<string>& tokens);

/* Converts the given vector of strings to C strings and adds
 * them to the array of C strings starting at the given address.
 */
static void addCPPStringsToCStringArray(vector<string>& strings, char **stringArr);

You need not perform any error checking on user input, and you can assume that all system calls succeed. Implement the entire program to return 0 if and only if the command executed by exargs exits normally with status code 0, and return 1 otherwise.

int main(int argc, char *argv[]) {
  // your code here
}

Solution

int main(int argc, char *argv[]) {
  // Read tokens from standard input
  vector<string> tokens;
  pullAllTokens(cin, tokens);

  pid_t pid = fork();
  if (pid == 0) {
    // In the child, create the arguments array for execvp
    char *exargsv[argc + tokens.size()];

    // Copy in our arguments first
    memcpy(exargsv, argv + 1, (argc - 1) * sizeof(char *));

    // Next, copy in the tokens from stdin and end with a NULL
    addCPPStringsToCStringArray(tokens, exargsv + argc - 1);
    exargsv[argc + tokens.size() - 1] = NULL;

    execvp(exargsv[0], exargsv);
    cerr << exargsv[0] << ": command not found" << endl;
    exit(0);
  }

  int status;
  waitpid(pid, &status, 0);

  // trivia: if all of status is 0, then child exited normally with code 0
  return status == 0 ? 0 : 1; 
}

Problem 2: Incorrect Output File Redirection

The publish user program takes an arbitrary number of filenames as arguments and attempts to publish the date and time (via the date executable that ships with all versions of Unix and Linux). publish is built from the following source:

static void publish(const char *name) {
   printf("Publishing date and time to file named \"%s\".\n", name);
   int outfile = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
   dup2(outfile, STDOUT_FILENO);
   close(outfile);
   if (fork() > 0) return;
   char *argv[] = { "date", NULL };
   execvp(argv[0], argv);
}

int main(int argc, char *argv[]) {
   for (size_t i = 1; i < argc; i++) publish(argv[i]);
   return 0;
}

At first glance, one might expect the program to have printed something like this:

myth62:~$ ./publish one two three four
Publishing date and time to file named "one".
Publishing date and time to file named "two".
Publishing date and time to file named "three".
Publishing date and time to file named "four".

However, that's not what happens. Questions:

Solution

myth62:~$ ./publish one two three four 
Publishing date and time to file named "one".
Publishing date and time to file named "two". 
Sun Apr 15 10:14:14 PDT 2018

Contents of four:

Sun Apr 15 10:14:14 PDT 2018
static void publish(const char *name) {
  printf("Publishing date and time to file named \"%s\".\n", name); 
  if (fork() > 0) return;
  int outfile = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644); 
  dup2(outfile, STDOUT_FILENO);
  close(outfile);
  char *argv[] = { "date", NULL };
  execvp(argv[0], argv);
}

Problem 3: Short Answer Questions

Here are a bunch of short answer questions that have appeared on past CS110 midterms and final exams.

Solution

Problem 4: Experimenting with strace

strace is an advanced development tool that programmers use to determine what system calls contribute to the execution of a program (generally because the program is malfunctioning, and they're curious if any failing system calls are the root cause of the problem). If, for instance, you want to see how sleep 10 works behind the scenes, you gain a lot by typing this in and following along:

myth62:~$ strace sleep 10
execve("/bin/sleep", ["sleep", "10"], [/* 28 vars */]) = 0
brk(NULL)                               = 0x1e35000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=143245, ...}) = 0
mmap(NULL, 143245, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f638fc15000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

// lots of calls omitted in the name of brevity

fstat(3, {st_mode=S_IFREG|0644, st_size=4289456, ...}) = 0
mmap(NULL, 4289456, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f638f231000
close(3)                                = 0
nanosleep({10, 0}, NULL)                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

A typical strace run starts with an execve (which the shell in the output above uses instead of execvp), and then works through all of these systemsy things to load C++ libraries, build the heap segment, etc., until it reaches the crucial nanosleep call, which is the call that halts the process for 10 seconds. You see gestures to system calls that have come up in CS107, CS110, and your first two assignments: execve, access, mmap, open, and close.

If you're interested only in a particular subset of system calls, you can identify those of interest when invoking strace using the -e trace=, as with this:

myth62:~$ strace -e trace=read ls /usr/class/cs110
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260Z\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\25\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\r\0\0\0\0\0\0"..., 832) = 832
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260`\0\0\0\0\0\0"..., 832) = 832
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 434
read(3, "", 1024)                       = 0
cgi-bin  include  lecture-examples  lib  local    private_data  repos  samples  staff  tools  WWW
+++ exited with 0 +++
myth62:~$ strace -e trace=mmap,munmap ls /usr/class/cs110
mmap(NULL, 143245, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf170d5000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf170d4000
mmap(NULL, 2234080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf16cb1000
mmap(0x7faf16ecf000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e000) = 0x7faf16ecf000
mmap(0x7faf16ed1000, 5856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7faf16ed1000
mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7faf168e7000
mmap(0x7faf16ca7000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7faf16ca7000

// lots of calls omitted in the name of brevity

mmap(0x7faf1646f000, 13352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7faf1646f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf170d2000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf170d0000
munmap(0x7faf170d5000, 143245)          = 0
mmap(NULL, 4289456, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7faf15e3e000
cgi-bin  include  lecture-examples  lib  local    private_data  repos  samples  staff  tools  WWW
+++ exited with 0 +++

Take some time to experiment with strace, as it'll help you with your Assignment 3 work (because you implement a light version of strace as part of it--surprise!). Type in each of these commands from any directory and see what you get:

Note that calls to fork are reframed as calls to clone, and calls to execvp are reframed as calls to execve.

Rather than list out specific system calls via -e trace=, strace allows you to specify a family of system calls, as per strace's man page, e.g. -e trace=file, or -e trace=memory.

Now type these commands from any directory and see what you get:

See what happens when you try to launch something that can't be launched, because the specified file isn't an executable:

Just for fun, try strace strace, with:

Finally, descend into your lab2 folder and trace a few things there:

Scan strace's man page for information about the -f flag, and be sure you understand why the output of the second command includes so many calls to execve.

Solution

Rather than list the output, it's important to clarify that the first command (with trace=file) should confine itself to execve, access, open, statfs ,stat, and openat. Every single one of these system calls has one or more arguments that is the name of some file (e.g. "ls", "/usr/class/cs110", "/proc/ filesystems"). Note that read, write and close don't show up, since those are descriptor-related, not file-related.

Including trace=memory lists all operations that introduce and configured new memory segments: brk, mmap, mprotect, and munmap. And trace=desc lists anything that returns a descriptor and/or accepts one or more as arguments: mmap (again), open, fstat, close, read, openat, getdents, and write.

For the files that aren't executables, look at this, and see how execvp fails! And note that the error messages are printed to file descriptor 2 (e.g. STDERR_FILENO).

myth62:~$ strace /usr/class/cs110/WWW
execve("/usr/class/cs110/WWW", ["/usr/class/cs110/WWW"], [/* 22 vars */]) = -1 EACCES (Permission denied)
write(2, "strace: exec: Permission denied\n", 32strace: exec: Permission denied
) = 32
exit_group(1)                           = ?
+++ exited with 1 +++
myth62:~$ strace /usr/class/cs110/WWW/index.html
execve("/usr/class/cs110/WWW/index.html", ["/usr/class/cs110/WWW/index.html"], [/* 22 vars */]) = -1 EACCES (Permission denied)
write(2, "strace: exec: Permission denied\n", 32strace: exec: Permission denied
) = 32
exit_group(1)                           = ?
+++ exited with 1 +++

Running strace strace ls lists all of the system calls called by strace, but not those associated with the grandchild ls-running process.

Here's the output for the last set of commands to run:

myth62:~$ printf "4 6 8" | strace -e trace=clone,execve ./exargs factorexecve("./exargs", ["./exargs", "factor"], [/* 22 vars */]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f63e627fa10) = 4346
4: 2 2
6: 2 3
8: 2 2 2
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=4346, si_uid=13364, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
myth62:~$ printf "4 6 8" | strace -f -e trace=clone,execve ./exargs factor
printf "4 6 8" | strace -f -e trace=clone,execve ./exargs factor
execve("./exargs", ["./exargs", "factor"], [/* 22 vars */]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa3976d7a10) = 5159
strace: Process 5159 attached
[pid  5159] execve("/usr/local/sbin/factor", ["factor", "4", "6", "8"], [/* 22 vars */]) = -1 ENOENT (No such file or directory)
[pid  5159] execve("/usr/local/bin/factor", ["factor", "4", "6", "8"], [/* 22 vars */]) = -1 ENOENT (No such file or directory)
[pid  5159] execve("/usr/sbin/factor", ["factor", "4", "6", "8"], [/* 22 vars */]) = -1 ENOENT (No such file or directory)
[pid  5159] execve("/usr/bin/factor", ["factor", "4", "6", "8"], [/* 22 vars */]) = 0
4: 2 2
6: 2 3
8: 2 2 2
[pid  5159] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5159, si_uid=13364, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++

You should note a few things:

Problem 5: Using valgrind and gdb with fork

You know by this point that we've provided a solution to Problem 1 in the lab2 starter code. We're going to use this problem as a vehicle for learning how to use gdb to step through processes that split into multiple ones.

This exercise is framed in terms of an intentionally buggy version of exargs.cc, which we've placed in buggy-exargs.cc. We've done bad things in buggy-exargs.cc to make sure that buggy-exargs fails miserably. Here's a diff between the buggy version and the correct one:

myth62:$ diff exargs.cc buggy-exargs.cc
60c60
<     char *exargsv[argc + tokens.size()];
---
>     char **exargsv = NULL;

Run the following commands to reaffirm that the correct version runs clean under valgrind (aside from the suppressed errors we spoke of in the Assignment 1 handout):

myth62:$ printf "1 2 3 4" | ./exargs factor
1:
2: 2
3: 3
4: 2 2
myth62:$ printf "1 2 3 4" | valgrind ./exargs factor
==17627== Memcheck, a memory error detector
==17627== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==17627== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==17627== Command: ./exargs factor
==17627== 
1:
2: 2
3: 3
4: 2 2
==17627== 
==17627== HEAP SUMMARY:
==17627==     in use at exit: 72,704 bytes in 1 blocks
==17627==   total heap usage: 5 allocs, 4 frees, 77,024 bytes allocated
==17627== 
==17627== LEAK SUMMARY:
==17627==    definitely lost: 0 bytes in 0 blocks
==17627==    indirectly lost: 0 bytes in 0 blocks
==17627==      possibly lost: 0 bytes in 0 blocks
==17627==    still reachable: 0 bytes in 0 blocks
==17627==         suppressed: 72,704 bytes in 1 blocks
==17627== 
==17627== For counts of detected and suppressed errors, rerun with: -v
==17627== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Not surprisingly, things don't go so well with the buggy version:

myth62:$ printf "1 2 3 4" | ./buggy-exargs factor
myth62:$ printf "1 2 3 4" | valgrind ./buggy-exargs factor
==17797== Memcheck, a memory error detector
==17797== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==17797== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==17797== Command: ./buggy-exargs factor
==17797== 
==17798== Invalid write of size 8
==17798==    at 0x4C326CB: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17798==    by 0x40160A: main (buggy-exargs.cc:61)
==17798==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==17798== 
==17798== 
==17798== Process terminating with default action of signal 11 (SIGSEGV)
==17798==  Access not within mapped region at address 0x0
==17798==    at 0x4C326CB: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17798==    by 0x40160A: main (buggy-exargs.cc:61)

... continuing with reports of still reachable memory resulting from premature exit ...

valgrind tells us that memcpy is trying to write memory to an invalid address (0x0 is as NULL an address as they come).

We'll play dumb and pretend we don't know why it's segfaulting, even if valgrind does tell us that line 61 of buggy-exargs.cc seems to be the culprit. If we want to use gdb to set a breakpoint in buggy-exargs.cc, line 61, it's totally possible. You just do so like this:

myth62:$ gdb buggy-exargs

... startup preamble omitted for brevity ...

Reading symbols from ./buggy-exargs...done.
(gdb) list 61
56    vector<string> tokens;
57    pullAllTokens(cin, tokens);
58    pid_t pid = fork();
59    if (pid == 0) {
60      char **exargsv = NULL;
61      memcpy(exargsv, argv + 1, (argc - 1) * sizeof(char *));
62      addCPPStringsToCStringArray(tokens, exargsv + argc - 1);
63      exargsv[argc + tokens.size() - 1] = NULL;
64      execvp(exargsv[0], exargsv);
65      cerr << exargsv[0] << ": command not found" << endl;
(gdb) b 61
Breakpoint 1 at 0x4015e4: file buggy-exargs.cc, line 61.
(gdb)

However, according to this, by default gdb doesn't track execution of additional processes spawned off by the primary. If we were to type run right now, gdb would proceed and fully circumvent the child block and miss our breakpoint entirely. Sadness.

Let's confirm by having you execute the following:

Here are some questions for you:

Relaunch gdb as you did before, but this time configure it to follow the child process instead of the parent. Confirm that you break on line 61 as intended, and further confirm that exargsv is NULL as if you've never noticed it before and you've found your bug.

Solution

Checkoff Questions Solutions

myth62:~$ ./publish one two three four 
Publishing date and time to file named "one".

Website design based on a design by Chris Piech
Icons by Piotr Kwiatkowski