These optional problems are an opportunity to further exercise your understanding of assembly and the runtime stack.
1) Recursion
About 15 min.
The factorial program that we saw in lecture runs a classic recursive factorial function. Let's review it and take a closer look at the assembly level to understand how it works.
- Run
./factorial 5 6 7 8 9to compute some small factorials. Try some larger values:./factorial 10 11 12 13 14 15 16. Something fishy seems to be going on. The factorial function grows at an alarming rate -- it doesn't take long before overflowing the range of an integer! - Look over the code for the
factorialfunction and trace out what you think will happen if you try to compute the factorial of a negative number. Now try it:./factorial -1. Did you get what you expected? How is an invalid number causing a memory problem? Runfactorialunder gdb on argument -1. Usebacktraceto find out where the program was executing at the time of the crash. Does this shed some light on what has gone wrong? -
Get into
gdband figure out how big eachfactorialstack frame is. Here are two different strategies; try both!- Disassemble
factorialand scan its instructions. You are looking for those instructions that change%rsp, since this is what grows and shrinks the stack. Eachpushqandcallqinstruction copies an 8-byte value to the stack and decrements%rsp.%rspcan also be explicitly moved to open up space for local variables or scratch results. (In this case, however, factorial exhibits no manual adjustment of%rsp. The only data stored in its frame are saved registers placed there by push/call instructions.) Count the push/call instructions infactorialand and multiply by 8 to get the total number of bytes used per stack frame. - Set a breakpoint on
factorial, let a few calls through, and then use the gdb commandsinfo frame 1andinfo frame 2to examine the top two stack frames. Look for the frame addresses labeledStack frame atand subtract the two to compute the size of a single frame. - These two approaches should give you the same answer. Do they?
- Disassemble
-
In the OS on
myth, the stack segment is configured at program start to a particular fixed limit and cannot grow beyond that. The bash shell built-inulimit -a(orlimitfor csh shells) displays various process limits. What does it report is the process limit for stack size? - Divide the stack limit by the size of a
factorialframe (info: 1 Megabyte = 1024 Kilobytes, 1 Kilobyte = 1024 Bytes). This estimates the maximum stack depth for factorial. Try factorial of -1 under gdb again. When it crashes, usebacktrace -10to see the stack top to determine the stack depth. How close is that depth to the maximum you estimated? - Edit the Makefile to change the optimization level used to compile the program. Find the line that reads
factorial: CFLAGS += -Og. The current setting is-Og, a relatively modest level of optimization. Change-Ogto-O2to apply more aggressive optimizations. Make clean and make, then run./factorial -1again. Woah! What happened? Did the compiler "fix" the infinite recursion? Disassemble the optimizedfactorialto see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination". (Note: gdb may not work as well with higher optimization levels, so you may have trouble e.g. setting breakpoints in the optimized version).
2) "Channeling" Information Between Functions
About 15 min.
The channel program was inspired by a bug encountered in a past offering of CS107. The code in question was "working fine" until adding a print statement, which caused a incommensurate amount of grief. Removing the print statement "fixed" the problem - very mysterious!
- Review the code in
channel.c. Theinit_arrayandsum_arrayfunction each declare a stack array. Theinitfunction sets the array values, the sum function adds the array values. Neither init nor sum takes any arguments. There is a warning from the compiler aboutsum_array, what is it? - The program calls
init_array()thensum_array(). Despite no explicit pass/return between the two calls, the array seems to be magically passed as intended. How are these functions communicating? The term for this is "channeling". (Hint: when data is popped off the stack, it isn't cleared out - it just increments%rspto indicate that memory area can be reused). - The program's second attempt to channel the array from
inittosumfails. What is different about this case?
Tell your group a story about a previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code -- does this exercise shed light on a possible explanation?
3) Beyond Pluto
The earlier problem about stack smashing with pluto was a well-crafted example that only worked because of the way pluto was written. Experiment with writing your own functions that exhibit buffer overflow issues and experiment with how you can craft cases that let you jump to arbitrary functions, or even arbitrary parts of functions. What do you find?