CS110: Principle of Computer Systems, Winter 2022

Based on documents by Lisa Yan, Jerry Cain, Julie Zelenski, Nick Bowman, and others

Assessment 2 Practice

Below, you can find links to practice materials - these are full practice exams, but we have noted the questions relevant to the topics on this assessment. Understand that we are trying to give you a sense of what some CS110 exam problems have looked like in the past, but we may not replicate the structure of these practice midterms in the actual assessment. In particular, some of the material on past exams is different from the material from this quarter, and these exams were written for different environments (perhaps closed book, written for longer times - for instance, the final exams were usually 3 hours). Note that some of the problems from the practice exams have been cannibalized to contribute to the section handouts.

Also note that some of these pratice problems speak of a function called sigsuspend, which is just like sigwait in that it blocks execution until a signal arrives. The primary difference is that sigsuspend doesn't actually tell you what signal was delivered and instead assumes the signal was asynchronously managed via some signal handler. We won't test your understanding of sigsuspend, but you are expected to understand sigwait. We dropped sigsuspend from the course this quarter and started using sigwait instead.

  • Practice Midterm 1: PDF | Solutions
    • Problems 2, 3 and 4d all cover multiprocessing
  • Practice Midterm 2: PDF | Solutions
    • Problems 1 and 2g-j all cover multiprocessing
  • Practice Midterm 3: PDF | Solutions
    • Problems 1, 2c-e and 2g all cover multiprocessing
  • Practice Midterm 4: PDF | Solutions
    • Problems 1, 2, 3c-e all cover multiprocessing
  • Practice Final 1: PDF | Solutions
    • Problem 1 covers multiprocessing
  • Practice Final 2: PDF | Solutions
    • Problems 1 and 2a cover multiprocessing
  • Practice Final 3: PDF | Solutions
    • Problems 1, 4b and 4d cover multiprocessing

Spring 2019 Midterm, Selected Questions

This midterm was not available in PDF format, so we've included the relevant questions below, all covering multiprocessing.

Question 1c/1d

For this problem, you will write a program that takes stdin and sends the input to two other programs (provided on the command line) as their input. For example, here is a run using sort and wc:

$ ./split-output /usr/bin/sort /usr/bin/wc
this
is
not
sorted   <--user typed ctrl-D after this line 
is
      4       4      19
not
sorted
this

Notice that the output of the two programs is interleaved -- your program will not handle de-interleaving the output, but you will discuss it in Part (d) of this problem.

For the following questions, assume you have a function available, dualEcho, that reads from fdIn and writes to both fdOut1 and fdOut2 until fdIn closes. If either fdOut1 or fdOut2 has a value of -1, it does not write any data to that file descriptor.

/* Function: dualEcho
 * ------------------
 * Writes all the input from fdIn to both fdOut1 and fdOut2
 * @param fdIn - the file descriptor to read from (until EOF)
 * @param fdOut1 - the first output file descriptor (ignore if -1)
 * @param fdOut2 - the second output file descriptor (ignore if -1)
 */
void dualEcho(int fdIn, int fdOut1, int fdOut2);

c) Complete the main function, which starts two programs and sends stdin to both, using the provided dualEcho function. You should not change stdout of either program.

int main(int argc, char *argv[]) {
    if (argc < 3) {
        printf("Usage:\n\t %s prog1 prog2\n", argv[0]);
        return 0;
    }
    char *progA[] = {argv[1], NULL};
    char *progB[] = {argv[2], NULL};

    // your code here

    return 0;
}

int fdsSupplyA[2];
int fdsSupplyB[2];
pid_t pids[2];
pipe2(fdsSupplyA, O_CLOEXEC);
pids[0] = fork();
if (pids[0] == 0) { // child
    // for supply
    // child will read from pipe
    close(fdsSupplyA[1]); // no need to write    
    // duplicate STDIN
    dup2(fdsSupplyA[0], STDIN_FILENO);
    close(fdsSupplyA[0]); // no need for this any more
    execvp(progA[0], progA);
    printf("Execvp for progA failed!\n");
    exit(0);
}
pipe2(fdsSupplyB, O_CLOEXEC);
pids[1] = fork();
if (pids[1] == 0) { // child
    // for supply
    // child will read from pipe
    close(fdsSupplyB[1]); // no need to write   
    // duplicate STDIN
    dup2(fdsSupplyB[0], STDIN_FILENO);
    close(fdsSupplyB[0]); // no need for this any more
    execvp(progB[0], progB);
    printf("Execvp for progB failed!\n");
    exit(0);
}
// parent will write to supply
close(fdsSupplyA[0]); // no need to read
close(fdsSupplyB[0]); // no need to read
dualEcho(STDIN_FILENO, fdsSupplyA[1], fdsSupplyB[1]);
close(fdsSupplyA[1]);
close(fdsSupplyB[1]);
waitpid(pids[0], NULL, 0);
waitpid(pids[1], NULL, 0);

d) As noted in the original question, your program does not handle de-interleaving the two outputs (i.e., one program does not print all of its output before the other program). Let's assume we actually want program A to print all of its data before program B prints its data.

One potential solution (let's call it "Solution A") would be to create two additional pipes and map stdout of each program to the write end of the pipes. Then, your program could read from the read end of each pipe, first reading program A and writing all of its output to the terminal stdout, and then repeating for program B.

Another potential solution ("Solution B") might have the second child (the one that will execvp program B) raise a SIGSTOP on itself before the execvp, and then have the parent process waitpid for program A to finish before sending a SIGCONT signal to program B.

It turns out that pipes have a maximum internal buffer (64KB on current Linux versions), such that write calls will block when a write pipe buffer is at its capacity, and would not unblock until some data has been read from the read buffer. In 100 words or less, answer the following question: Would either Solution A or Solution B presented above work by modifying our program as described? If both would work, explain which option would be better. If only one would work, explain why it succeeds and the other doesn't. If neither would work, explain the underlying problem that affects both. Again, assume you can not predict how much data will be presented to stdin for your program. [2 points]

Neither works. For solution A: If a child produces more than 64KB output its output pipe fills up. The child’s next write() blocks (until parent consumes). Since child is blocked, it cannot read() from its stdin and its pipe buffer fills causing parent to block on write(). Since parent writes its full stdin to the children before consuming from the children's output pipes, parent will get hung indefinitely on write(). For solution B: The self-stopped program will not read from it's stdin. Eventually it's pipe buffer will fill up and then the parent's write() calls will block causing the parent process to hang indefinitely.

Question 2

Consider the following C program and its execution. Assume all processes run to completion, all system and printf calls succeed, and that all calls to printf are atomic. Assume nothing about scheduling or time slice durations.

static pid_t pid; // global
void signalHandler(int signal) {
    if (pid == 0) {
        printf("child!\n");
    } else {
        printf("parent!\n");
        exit(0);
    }
}

int main(int argc, char *argv[]) {
    signal(SIGUSR1, signalHandler);
    for (int i = 0; i < 3; i++) {
        pid = fork();
        if (pid == 0) { // child
            if (i % 2 == 1) {
                raise(SIGUSR1);
            } else {
                kill(getppid(), SIGUSR1);
            }
            return 0;
        }
    }
    printf("I made it here!\n");
    for (int i = 0; i < 3; i++) {
        waitpid(-1, NULL, 0);
    }
    printf("And I made it here!\n");
    return 0;
}

a) Write down all possible outputs from the program.

Seven possible outputs:
*******************
I made it here!
child!
parent!
*******************
parent!
child!
******************
child!
parent!
******************
child!
I made it here!
parent!
*****************
I made it here!
parent!
child!
*****************
I made it here!
parent!
******************
parent!
******************

b) If the exit 0 line was removed from the signal handler, what are two significant differences that you would always see in the output that you did not see in part (a)?

If exit(0) was removed, there would be another “parent!” printed, and there would also be a “And I made it here!” printed.


Question 3

Believe it or not, sorting is still an active research field, and computer scientists develop new sorting algorithms frequently. You saw merge sort in lab. Some sorting algorithms are better than others. For this problem, you will write sleep sort, a sorting algorithm literally first posted on an online internet forum.

Here is how the sleep sort algorithm works: the function loops through an unsorted vector of integers, forks a process, and sleeps, in seconds, equal to the integer itself. After the sleep is complete, it prints the number to stdout. Convince yourself that this will, indeed, print out a sorted list of integers, and then convince yourself that it is an incredibly inefficient way to sort a list of integers.

a) Write the sleepsort function below. Ensure that the function properly waits for its children to complete before returning.

void sleepsort(vector<int> numbers) {

void sleepsort(vector<int> numbers) {
    for (int n : numbers) {
        pid_t pid = fork();
        if (pid == 0) {
            sleep(n);
            //usleep(n);
            printf("%d\n", n);
            exit(0);
        } 
    }
    for (size_t i=0; i < numbers.size(); i++) {
        waitpid(-1, NULL, 0);
    } 
}

b) sleep sort is sometimes referred to as "one giant race condition that happens to do something useful". In 100 words or less, explain what is meant by this phrase.

This is a giant race condition because all of the sleeps are happening at the same time, and except for the fact that they are different amounts of seconds, you don’t really know when each one will stop first. Because of the granularity of the sleep() value in seconds, the program will work as expected.

c) A clever online forum member commented that you could use usleep, which sleeps for microseconds instead of seconds, to speed up the sort by a factor of a million, instantly. It would be the world's best performance increase resulting from a single-character change in a program. In practice, though, this doesn't work. In 100 words or less: what would cause this amazing speedup to break the sort?

Possible reasons for usleep() to make the program fail:
  • The looping will take more than some number of microseconds, and for small integers, there just won’t be enough granularity to have distinct sleep times
  • Even if the microsecond timing is perfect, there are still very tight race conditions, and things like printf will take enough microseconds to destroy timing granularity.

Question 5

When you type ctrl-C while running a program, a SIGINT ("Interrupt") signal is sent to the program, which usually terminates the program. This is not particularly nice, and the program might be in the middle of something critical. Using a signal handler, we can capture SIGINT, and then set a flag that lets the program know that the user wants to terminate it. If the program checks the flag often, it can handle the termination, but do it gracefully, after it has closed everything down. Here is an example:

static bool okay_to_quit = false;

void sigint_handler(int sig) {
    okay_to_quit = true;
}

int main(int argc, char *argv[]) {
    signal(SIGINT, sigint_handler);
    int fds = open("importantFile.txt", O_WRONLY | O_CREAT | O_EXCL, 0644);
    while (true) {
        printf("I'm doing something important!\n");
        dprintf(fds,"Important data.\n");
        sleep(1);
        if (okay_to_quit) break;
    }
    printf("Safely Quitting!\n");
    close(fds);

    return 0;
}

We've now got a relatively safe way to handle SIGINT. But, what if we want to make things even safer? Sometimes a user can accidentally type ctrl-C. But users will very rarely type two ctrl-Cs in a row, one after the other. Modify the signal handler presented above to only set okay_to_quit to true if the user types ctrl-C twice in a 500ms time frame. Assume you have the following function:

/* Function: get_ms_time
 * This function returns the current time in milliseconds.
 * A duration of time in milliseconds can be calculated by
 * calling this function twice, and subtracting the first
 * value from the second.
 */
int get_ms_time();

We have provided you with two static function variables, which persist between calls to the function. Recall from CS107 that a static function variable is declared once and then retains its value for the duration of the program. If the function is called again, the value is the same as it was the last time the function was run.

/* Function: sigint_handler(int sig)
 * Sets the global okay_to_quit variable to true if
 * a user types ctrl-C twice within kTimeThreshold
 * number of milliseconds.
 */
 void sigint_handler(int sig) {
    const int kTimeThreshold = 500; // ms

    // first_sigint: variable to indicate that we have seen
    // the first of two SIGINTs
    static bool first_sigint = false;

    // start: variable to hold the starter time between calls to this function
    static int start;

    // your code here
}
 }

Robust solution (will handle three in a row if the last two are within 500ms of each other):

void sigint_handler(int sig) {
    // start: variable to hold the start time between calls to this function
    static int start;
    if (!first_sigint) {
        first_sigint = true;
        start = get_ms_time();
    } else {
        int end = get_ms_time();
        int duration = (end - start);
        if (duration < kTimeThreshold) {
            okay_to_quit = true;
        } else {
            start = end; 
        }
    }
}

Other solution (will capture two in a row, but not three):

void sigint_handler(int sig) {
    // start: variable to hold the start time between calls to this function
    static int start;
    if (!first_sigint) {
        first_sigint = true;
        start = get_ms_time();
    } else {
        int duration = (get_ms_time() - start);
        first_sigint = false;
        if (duration < kTimeThreshold) {
            okay_to_quit = true;
        }
    }
}