Lab 6 Extra Problem (Solution)

Solutions by Nick Troccoli and Lisa Yan

Go back to this week's extra problem (without solutions) here.

Go back to this week's lab solutions here.

These optional problems are an opportunity to further exercise your understanding of assembly and the runtime stack.

1) Recursion

About 15 min. The factorial program that we saw in lecture runs a classic recursive factorial function. Let's review it and take a closer look at the assembly level to understand how it works.

  • Run ./factorial 5 6 7 8 9 to compute some small factorials. Try some larger values: ./factorial 10 11 12 13 14 15 16. Something fishy seems to be going on. The factorial function grows at an alarming rate -- it doesn't take long before overflowing the range of an integer!
  • Look over the code for the factorial function and trace out what you think will happen if you try to compute the factorial of a negative number. Now try it: ./factorial -1. Did you get what you expected? Q: How is an invalid number causing a memory problem? Run factorial under gdb on argument -1. Use backtrace to find out where the program was executing at the time of the crash. Does this shed some light on what has gone wrong?

    The program should crash with -1 because of a stack overflow - there were too many stack frames, and so the program ran out of stack memory. This is because the parameter is an unsigned int, so a negative number is cast to a positive one, and bypasses the initial check!

  • Get into gdb and figure out how big each factorial stack frame is. Here are two different strategies; try both!

    • Disassemble factorial and scan its instructions. You are looking for those instructions that change %rsp, since this is what grows and shrinks the stack. Each pushq and callq instruction copies an 8-byte value to the stack and decrements %rsp. %rsp can also be explicitly moved to open up space for local variables or scratch results. (In this case, however, factorial exhibits no manual adjustment of %rsp. The only data stored in its frame are saved registers placed there by push/call instructions.) Count the push/call instructions in factorial and and multiply by 8 to get the total number of bytes used per stack frame.
    • Set a breakpoint on factorial, let a few calls through, and then use the gdb commands info frame 1 and info frame 2 to examine the top two stack frames. Look for the frame addresses labeled Stack frame at and subtract the two to compute the size of a single frame.
    • Q: These two approaches should give you the same answer. Do they?

      Both approaches should give a stack frame size of 16 bytes. There is one push and one call instruction, for example. And the difference between two frame addresses should be 16 bytes (0x10).

  • In the OS on myth, the stack segment is configured at program start to a particular fixed limit and cannot grow beyond that. The bash shell built-in ulimit -a (or limit for csh shells) displays various process limits. Q: What does it report is the process limit for stack size?

    8192KB.

  • Divide the stack limit by the size of a factorial frame (info: 1 Megabyte = 1024 Kilobytes, 1 Kilobyte = 1024 Bytes). This estimates the maximum stack depth for factorial. Try factorial of -1 under gdb again. When it crashes, use backtrace -10 to see the stack top to determine the stack depth. Q: How close is that depth to the maximum you estimated?

    The math should say 8192KB / 16B = 8192 * 1024 / 16 = 524,288 frames. This is pretty close to the topmost stack number, 524190!

  • Edit the Makefile to change the optimization level used to compile the program. Find the line that reads factorial: CFLAGS += -Og. The current setting is -Og, a relatively modest level of optimization. Change -Og to -O2 to apply more aggressive optimizations. Make clean and make, then run ./factorial -1again. Woah! What happened? Q: Did the compiler "fix" the infinite recursion? Disassemble the optimized factorial to see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination". (Note: gdb may not work as well with higher optimization levels, so you may have trouble e.g. setting breakpoints in the optimized version).

    It's subtle, but the recursion was replaced by a loop! It removes any need for stack space at all (no more push or call) so the function can run as long as needed to compute an answer - in this case, for a negative number it will be cast to a large positive number and continue executing - it overflows from the multiplication and thus eventually evaluates to 0.

2) "Channeling" Information Between Functions

About 15 min. The channel program was inspired by a bug encountered in a past offering of CS107. The code in question was "working fine" until adding a print statement, which caused a incommensurate amount of grief. Removing the print statement "fixed" the problem - very mysterious!

  • Review the code in channel.c. The init_array and sum_array function each declare a stack array. The init function sets the array values, the sum function adds the array values. Neither init nor sum takes any arguments. There is a warning from the compiler about sum_array, what is it?
  • The program calls init_array() then sum_array(). Despite no explicit pass/return between the two calls, the array seems to be magically passed as intended. Q: How are these functions communicating? The term for this is "channeling". (Hint: when data is popped off the stack, it isn't cleared out - it just increments %rsp to indicate that memory area can be reused).

    sum_array is able to read the contents of the stack frame that init_array left behind because sum_array is called immediately afterwards by the same caller, so both functions' stack frames are put at the same place in memory. Additionally, they have the exact same local variable nums, that overlays right on top of nums from init_array, so even though sum_array doesn't have any parameters, it "sees" the array that init_array just left behind and is able to read it as though it were a parameter.

  • The program's second attempt to channel the array from init to sum fails. Q: What is different about this case?

    printf is called in between init_array and sum_array. This means that printf uses that stack space before sum_array, and so it may overwrite or change the memory contents there to no longer be what init_array left behind.

Tell your group a story about a previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code -- does this exercise shed light on a possible explanation?

3) Beyond Pluto

The earlier problem about stack smashing with pluto was a well-crafted example that only worked because of the way pluto was written. Experiment with writing your own functions that exhibit buffer overflow issues and experiment with how you can craft cases that let you jump to arbitrary functions, or even arbitrary parts of functions. What do you find?