Solutions by Nick Troccoli and Lisa Yan
Go back to this week's extra problem (without solutions) here.
Go back to this week's lab solutions here.
These optional problems are an opportunity to further exercise your understanding of assembly and the runtime stack.
1) Recursion
About 15 min.
The factorial program that we saw in lecture runs a classic recursive factorial function. Let's review it and take a closer look at the assembly level to understand how it works.
- Run
./factorial 5 6 7 8 9to compute some small factorials. Try some larger values:./factorial 10 11 12 13 14 15 16. Something fishy seems to be going on. The factorial function grows at an alarming rate -- it doesn't take long before overflowing the range of an integer! - Look over the code for the
factorialfunction and trace out what you think will happen if you try to compute the factorial of a negative number. Now try it:./factorial -1. Did you get what you expected? Q: How is an invalid number causing a memory problem? Runfactorialunder gdb on argument -1. Usebacktraceto find out where the program was executing at the time of the crash. Does this shed some light on what has gone wrong?The program should crash with -1 because of a stack overflow - there were too many stack frames, and so the program ran out of stack memory. This is because the parameter is an unsigned int, so a negative number is cast to a positive one, and bypasses the initial check!
-
Get into
gdband figure out how big eachfactorialstack frame is. Here are two different strategies; try both!- Disassemble
factorialand scan its instructions. You are looking for those instructions that change%rsp, since this is what grows and shrinks the stack. Eachpushqandcallqinstruction copies an 8-byte value to the stack and decrements%rsp.%rspcan also be explicitly moved to open up space for local variables or scratch results. (In this case, however, factorial exhibits no manual adjustment of%rsp. The only data stored in its frame are saved registers placed there by push/call instructions.) Count the push/call instructions infactorialand and multiply by 8 to get the total number of bytes used per stack frame. - Set a breakpoint on
factorial, let a few calls through, and then use the gdb commandsinfo frame 1andinfo frame 2to examine the top two stack frames. Look for the frame addresses labeledStack frame atand subtract the two to compute the size of a single frame. - Q: These two approaches should give you the same answer. Do they?
Both approaches should give a stack frame size of 16 bytes. There is one
pushand onecallinstruction, for example. And the difference between two frame addresses should be 16 bytes (0x10).
- Disassemble
-
In the OS on
myth, the stack segment is configured at program start to a particular fixed limit and cannot grow beyond that. The bash shell built-inulimit -a(orlimitfor csh shells) displays various process limits. Q: What does it report is the process limit for stack size?8192KB.
- Divide the stack limit by the size of a
factorialframe (info: 1 Megabyte = 1024 Kilobytes, 1 Kilobyte = 1024 Bytes). This estimates the maximum stack depth for factorial. Try factorial of -1 under gdb again. When it crashes, usebacktrace -10to see the stack top to determine the stack depth. Q: How close is that depth to the maximum you estimated?The math should say 8192KB / 16B = 8192 * 1024 / 16 = 524,288 frames. This is pretty close to the topmost stack number, 524190!
- Edit the Makefile to change the optimization level used to compile the program. Find the line that reads
factorial: CFLAGS += -Og. The current setting is-Og, a relatively modest level of optimization. Change-Ogto-O2to apply more aggressive optimizations. Make clean and make, then run./factorial -1again. Woah! What happened? Q: Did the compiler "fix" the infinite recursion? Disassemble the optimizedfactorialto see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination". (Note: gdb may not work as well with higher optimization levels, so you may have trouble e.g. setting breakpoints in the optimized version).It's subtle, but the recursion was replaced by a loop! It removes any need for stack space at all (no more
pushorcall) so the function can run as long as needed to compute an answer - in this case, for a negative number it will be cast to a large positive number and continue executing - it overflows from the multiplication and thus eventually evaluates to 0.
2) "Channeling" Information Between Functions
About 15 min.
The channel program was inspired by a bug encountered in a past offering of CS107. The code in question was "working fine" until adding a print statement, which caused a incommensurate amount of grief. Removing the print statement "fixed" the problem - very mysterious!
- Review the code in
channel.c. Theinit_arrayandsum_arrayfunction each declare a stack array. Theinitfunction sets the array values, the sum function adds the array values. Neither init nor sum takes any arguments. There is a warning from the compiler aboutsum_array, what is it? - The program calls
init_array()thensum_array(). Despite no explicit pass/return between the two calls, the array seems to be magically passed as intended. Q: How are these functions communicating? The term for this is "channeling". (Hint: when data is popped off the stack, it isn't cleared out - it just increments%rspto indicate that memory area can be reused).sum_arrayis able to read the contents of the stack frame thatinit_arrayleft behind becausesum_arrayis called immediately afterwards by the same caller, so both functions' stack frames are put at the same place in memory. Additionally, they have the exact same local variablenums, that overlays right on top ofnumsfrominit_array, so even thoughsum_arraydoesn't have any parameters, it "sees" the array thatinit_arrayjust left behind and is able to read it as though it were a parameter. - The program's second attempt to channel the array from
inittosumfails. Q: What is different about this case?printfis called in betweeninit_arrayandsum_array. This means thatprintfuses that stack space beforesum_array, and so it may overwrite or change the memory contents there to no longer be whatinit_arrayleft behind.
Tell your group a story about a previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code -- does this exercise shed light on a possible explanation?
3) Beyond Pluto
The earlier problem about stack smashing with pluto was a well-crafted example that only worked because of the way pluto was written. Experiment with writing your own functions that exhibit buffer overflow issues and experiment with how you can craft cases that let you jump to arbitrary functions, or even arbitrary parts of functions. What do you find?