Problem 1: Double Trouble

12 points

This question has four parts

For this problem, you will write a program that takes stdin and sends the input to two other programs (provided on the command line) as their input. For example, here is a run using sort and wc:

$ ./split-output /usr/bin/sort /usr/bin/wc
this
is
not
sorted   <--user typed ctrl-D after this line 
is
      4       4      19
not
sorted
this

Notice that the output of the two programs is interleaved -- your program will not handle de-interleaving the output, but you will discuss it in Part (d) of this problem.

a) Write a function, writeBuf that uses the write function to print an entire buffer to a file descriptor (you are not allowed to use the dprintf function for this question). [2 points]

/* Function: writeBuf
 * ------------------
 * Writes the contents of buf with size len to the file descriptor fdOut
 * @param fdOut - the file descriptor where the data will be written
 * @param buf - the buffer to write
 * @param len - the number of bytes in the buffer
 */
void writeBuf(int fdOut, char *buf, ssize_t len);

b) Write a function, dualEcho that reads from fdIn and writes to both fdOut1 and fdOut2 until fdIn closes. If either fdOut1 or fdOut2 has a value of -1, you should not write any data to that file descriptor. You can assume that your writeBuf function from part (a) works perfectly. You are not allowed to use any dynamic memory for this function, and you cannot presume to know how much data will be read from fdIn. [3 points]

/* Function: dualEcho
 * ------------------
 * Writes the all input from fdIn to both fdOut1 and fdOut2
 * @param fdIn - the file descriptor to read from (until EOF)
 * @param fdOut1 - the first output file descriptor (ignore if -1)
 * @param fdOut2 - the second output file descriptor (ignore if -1)
 */
void dualEcho(int fdIn, int fdOut1, int fdOut2);

c) Complete the main function, which starts two programs and sends stdin to both, using your (assumably perfect) dualEcho function. You should not change stdout of either program. [5 points]

  int main(int argc, char **argv) {
    if (argc < 3) {
        printf("Usage:\n\t %s prog1 prog2\n",argv[0]);
        return 0;
    }
    char *progA[] = {argv[1], NULL};
    char *progB[] = {argv[2], NULL};

    // your code here:


























    return 0;
}

d) As noted in the original question, your program does not handle de-interleaving the two outputs (i.e., one program does not print all of its output before the other program). Let's assume we actually want program A to print all of its data before program B prints its data.

One potential solution (let's call it "Solution A") would be to create two additional pipes and map stdout of each program to the write end of the pipes. Then, your program could read from the read end of each pipe, first reading program A and writing all of its output to the terminal stdout, and then repeating for program B.

Another potential solution ("Solution B") might have the second child (the one that will execvp program B) raise a SIGSTOP on itself before the execvp, and then have the parent process waitpid for program A to finish before sending a SIGCONT signal to program B.

It turns out that pipes have a maximum internal buffer (64KB on current Linux versions), such that write calls will block when a write pipe buffer is at its capacity, and would not unblock until some data has been read from the read buffer. In 100 words or less, answer the following question: Would either Solution A or Solution B presented above work by modifying our program as described? If both would work, explain which option would be better. If only one would work, explain why it succeeds and the other doesn't. If neither would work, explain the underlying problem that affects both. Again, assume you can not predict how much data will be presented to stdin for your program. [2 points]

Problem 2: Signal Puzzle

6 points

This question has 2 parts.

Consider the following C program and its execution. Assume all processes run to completion, all system and printf calls succeed, and that all calls to printf are atomic. Assume nothing about scheduling or time slice durations.

 static pid_t pid; // global
void signalHandler(int signal) {
    if (pid == 0) {
        printf("child!\n");
    } else {
        printf("parent!\n");
        exit(0);
    }
}

int main(int argc, char *argv[]) {
    signal(SIGUSR1, signalHandler);
    for (int i = 0; i < 3; i++) {
        pid = fork();
        if (pid == 0) { // child
            if (i % 2 == 1) {
                raise(SIGUSR1);
            } else {
                kill(getppid(), SIGUSR1);
            }
            return 0;
        }
    }
    printf("I made it here!\n");
    for (int i = 0; i < 3; i++) {
        waitpid(-1, NULL, 0);
    }
    printf("And I made it here!\n");
    return 0;
}

a) Write down all possible outputs from the program. [4 points]

b) If the exit 0 line was removed from the signal handler, what are two significant differences that you would always see in the output that you did not see in part (a)? [2 points]

Problem 3: Sleep Sort

5 points

This problem has 3 parts.

Believe it or not, sorting is still an active research field, and computer scientists develop new sorting algorithms frequently. You saw merge sort in lab last week. Some sorting algorithms are better than others. For this problem, you will write sleep sort, a sorting algorithm literally first posted on the 4chan Internet forum, known for its anonymous, meme-creating, troll-happy users who seem to be there only for the lulz.

Here is how the sleep sort algorithm works: the function loops through an unsorted vector of integers, forks a process, and sleeps, in seconds, equal to the integer itself. After the sleep is complete, it prints the number to stdout. Convince yourself that this will, indeed, print out a sorted list of integers, and then convince yourself that it is an incredibly stupid way to sort a list of integers (though arguably admirable, in a 4chan sense).

a) Write the sleepsort function below. Ensure that the function properly waits for its children to complete before returning. [3 points]

void sleepsort(vector<int> numbers) {

b) sleep sort is sometimes referred to as one giant race condition that happens to do something useful. In 100 words or less, explain what is meant by this phrase. [1 point]

c) A clever 4chaner commented that you could use usleep, which sleeps for microseconds instead of seconds, to speed up the sort by a factor of a million, instantly. It would be the world's best performance increase resulting from a single-character change in a program. In practice, though, this doesn't work. In 100 words or less: what would cause this amazing speedup to break the sort? [1 point]

Problem 4: File Systems

7 points

This problem has 4 parts.

Recall that the Unix v6 file system has 512 byte sector sizes, with a 32-byte inode per file, 2-byte block id numbers, and 16-byte struct direntv6 entries. Use this information to answer the following questions. All answers must be 100 words or less for credit.

a) Newer hard drives and file systems sometimes define the sector size to be 4096 bytes. Describe the pros and cons of a larger sector size. [2 points]

b) Assume a newer file system changed the size of an inode to 64 bytes. Explain how you would use the extra 32 bytes for the greatest benefit of the file system. List one negative of having a larger inode size. [2 points]

c) What is the main limitation to a 2-byte block id number? Name a tradeoff to having a 4-byte block number. [2 points]

d) Explain how a directory is "just a regular file" in terms of the Unix v6 file system. [1 point]

Problem 5: Safer SIGINT

3 points

When you type ctrl-C while running a program, a SIGINT ("Interrupt") signal is sent to the program, which usually terminates the program. This is not particularly nice, and the program might be in the middle of something critical. Using a signal handler, we can capture SIGINT, and then set a flag that lets the program know that the user wants to terminate it. If the program checks the flag often, it can handle the termination, but do it gracefully, after it has closed everything down. Here is an example:

static bool okay_to_quit = false;

void sigint_handler(int sig) {
    okay_to_quit = true;
}

int main(int argc, char *argv[]) {
    signal(SIGINT, sigint_handler);
    int fds = open("importantFile.txt", O_WRONLY | O_CREAT | O_EXCL, 0644);
    while (true) {
        printf("I'm doing something important!\n");
        dprintf(fds,"Important data.\n");
        sleep(1);
        if (okay_to_quit) break;
    }
    printf("Safely Quitting!\n");
    close(fds);

    return 0;
}

We've now got a relatively safe way to handle SIGINT. But, what if we want to make things even safer? Sometimes a user can accidentally type ctrl-C. But users will very rarely type two ctrl-Cs in a row, one after the other. Modify the signal handler presented above to only set okay_to_quit to true if the user types ctrl-C twice in a 500ms time frame. Assume you have the following function:

/* Function: get_ms_time
 * This function returns the current time in milliseconds.
 * A duration of time in milliseconds can be calculated by
 * calling this function twice, and subtracting the second
 * value from the first.
 */
int get_ms_time();

We have provided you with two static function variables, which persist between calls to the function. Recall from CS107 that a static function variable is declared once and then retains its value for the duration of the program. If the function is called again, the value is the same as it was the last time the function was run.

/* Function: sigint_handler(int sig)
 * Sets the global okay_to_quit variable to true if
 * if a user types ctrl-C twice within kTimeThreshold
 * number of milliseconds
 */
void sigint_handler(int sig) {
    const int kTimeThreshold = 500; // ms

    // first_sigint: variable to indicate that we have seen
    // the first of two SIGINTs
    static bool first_sigint = false;

    // start: variable to hold the start time between calls to this function
    static int start;