Lab 6: Runtime Stack

Lab sessions Tue Oct 27 to Thu Oct 29

Lab written by Julie Zelenski, with modifications by Nick Troccoli

Pre-Lab Exercises

Before attending your lab, please work through the "Pre-lab Exercise" section below. We plan for the exercise to take about 20-30min, and it will be essential to the further problems you'll work through during your lab session. Feel free to post on the discussion forum if you have questions while working!

Learning Goals

This lab is designed to give you a chance to:

Become more familiar with useful GDB commands and tricks when working with assembly
observe and understand the correct operation of the runtime stack
diagnose symptoms of stack mismanagement

Get Started

Once you get set up with your group, share with them anything fun you did over the weekend!

Clone the repo by using the command below to create a lab6 directory containing the project files.

git clone /afs/ir/class/cs107/repos/lab6/shared lab6

Note: there will be compiler warnings when you make this starter project - this is expected for some of the exercises!

Pre-Lab Exercise: GDB Tips and Tricks + Function Pointers

GDB is an indispensible tool when working on the next assignment, particularly for reverse engineering. We can't emphasize enough how familiarity with GDB will make your life easier while working on it! Share your most productive gdb tips with your labmates to get a few new tricks into everyone's repertoire!

Examining Stack Frames

Here is a reminder of some commands introduced in previous labs, assignments and lectures:

Use backtrace to display all of the stack frames from the current point of execution back to main. With an argument N, backtrace N shows only the N innermost frames and backtrace -N shows only the N outermost.
The up, down, and frame n commands allow you to change the selected frame. These commands don't change the state of program execution (the execution remains where it stopped in the topmost frame), but they allow you to examine the runtime state from the perspective of another stack frame. For example, changing to the frame main allows to print the variables/parameters that are visible only in that scope.
The info frame command tells you the story about this stack frame. The info args and info locals provide information about the parameters and local variables, respectively.
You can view data on the stack using the x command on $rsp (note that g means "quadword", or 8 bytes, and x means "in hexadecimal")
```
(gdb) x/4gx $rsp   // 4 quadwords, in hex, read from top of stack
```

Auto-display and Printing Registers

The info reg command displays current values for all integer registers. Use e.g. p $rax to print value from a single register. In gdb, remember that access to a register is via $name instead of the more familiar %name.

A register is treated as an untyped 8-byte value and when you ask gdb to print it, it shows a decimal integer or hex address. You can dictate how to interpret the value by adding a /[format] to the print command or by using a C typecast, e.g.

(gdb) p $rax
$1 = 4196128
(gdb) p/t $rax
$2 = 10000000000011100100000
(gdb) p (char *)$rax
$3 = 0x400720 "Hello, world!\n"

The handy gdb command display sets up an expression to be repeatedly evaluated and printed as you single-step. For example, try these display expressions:

(gdb) display/2gx $rsp     // 2 quadwords, in hex, read from stack top
(gdb) display/3i $rip      // next 3 assembly instructions to execute

Type display with no arguments to list all of the currently-set expressions to display, and undisplay X to stop displaying expression X in the list shown by display. Clever use of the display command can provide similar benefits to the tui GUI-like gdb interface, minus the flakiness and garbled display.

Conditional Breakpoints

You can set breakpoints to only trigger when certain conditions in your code are true. For instance, say you have the following loop in your code:

1    for (int i = 0; i < count; i++) {
2        ...
3    }

If you wanted to step through the code inside the loop just the last time the loop executed, with a normal loop you may have to skip over many program breaks before you get to the part you want to examine. However, gdb lets you add an optional condition (in C code syntax) for when the breakpoint should be stopped at:

(gdb) break 2 if i == count - 1

The format is [BREAKPOINT] if [CONDITION]. Now this breakpoint on line 2 will only be hit the last time around the loop! You can even use local variables in your expression, as shown above with i and count. Experiment to see what other useful conditions you might use.

Breakpoint Commands

You can set up a sequence of commands for gdb to execute each time a particular breakpoint is reached. Any valid gdb command is fair game -- get a backtrace, enable/disable other breakpoints, evaluate a C expression, change the value in a register, and so on. You can even use breakpoint commands to temporarily patch buggy code from within the debugger!

Let's say a buggy line 192 allocates one fewer byte than needed:

 192   s = malloc(strlen(t));         // oops, supposed to be len + 1

First, we set a breakpoint on line 192.

(gdb) break 192
Breakpoint 1 at 0x400a8d: file program.c, line 192.

Next, we add commands to that breakpoint to insert a correct allocation, jump over the buggy line, and continue from there. We can do this with command; if we run this, it will prompt us for what commands to execute for the most recently-added breakpoint. It will continue allowing us to type until we type end:

(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
print s = malloc(strlen(t)+1)
jump 193
continue
end

Notice in particular the print s = .... line - this is technically a print command, but it's really changing the value of s while the program is running! This also works for registers, too; e.g. print $rax = .... to change what's in a register, which is useful for confirming hypotheses about what happens when registers have different values.

Thus, each time line 192 is about to be executed, the breakpoint commands intervene to automatically insert the correct allocation, skip over the buggy statement, and continue execution. Neat! (and perhaps quite useful for certain explosive devices...)

Function Pointers in Assembly

One last assembly-related detail that may be useful on your next assignment (hint hint) is how function pointers work. As mentioned in lecture, you can have a jmp or call instruction with an operand that is either the label you would like to jump to, which is hardcoded in the instruction itself, or the location of where the address is we should jump to. For instance, if the jump location is stored in %rax, then we could say jmp *%rax or call *%rax to jump to or call the function at the address stored in %rax. This is useful, because a function pointer is really just storing the address of where a function's first instruction lives. Take a look at function_pointers.c, which contains the gfind_max generic function from lab4, as well as some code that calls it. Step through the assembly code for the gfindmax function to find where the compare_function parameter is referenced. What assembly instruction calls the function pointer? Print out the value of the compare_function parameter, and then put a breakpoint at the start of the cmp_alpha function and resume until you hit it. Then, print out the value of the %rip register. These two approaches should give you the same value printed out. Do they?

Exercises

1) Stack Mechanics (15 min)

Let's try out some of the new GDB commands from the prelab exercise while examining the mechanics of the runtime stack. Open stack.c in your editor and peruse the C code for the various functions.

Run the stack program under gdb. Set a breakpoint on kermit and when at the breakpoint (note that you won't be able to set these before running the program), set up these auto-display expressions:

    (gdb) display/3i $rip
    (gdb) display/4gx $rsp

The first command says "display the 3 [i]nstructions starting from $rip" - in other words, display the next 3 assembly instructions to execute. It will display content in the following format, highlighting the instruction we are currently on, and the two instructions coming next:

=> 0x400587 <kermit>:   push   %rbp
   0x400588 <kermit+1>: push   %rbx
   0x400589 <kermit+2>: mov    %edi,%ebx

The second command says "display the 4 [g]iant words (quadwords) in he[x] starting at $rsp" - in other words, display the top 4 quadwords (the top 4 8-byte entries) on the stack. It will display content in the following format (note that the specific values may differ from what you see):

0x7fffffffe950: 0x00000000004005f0  0x00000000004005be
0x7fffffffe960: 0x0000000000000000  0x00007ffff7a2d830

The way to read this is from top-to-bottom, left-to-right; in other words, interpret it in the following "stack diagram" where the top of the stack (the lowest stack address) is at the bottom:

Address	Contents
...	...
0x7fffffffe968	0x00007ffff7a2d830
0x7fffffffe960	0x0000000000000000
0x7fffffffe958	0x00000000004005be
0x7fffffffe950	0x00000000004005f0

Now use stepi to single-step by assembly instruction. Walk through the kermit function, including the calls made to dinky/binky. Observe how control is transferred from caller to callee and back. The auto-display instructions show you what is happening on the stack and how $rip and $rsp are changing. Here are a few aspects in particular to talk through:

At the start of kermit function, it pushes the values from the caller-owned registers %rbp and %rbx to the stack and at the end, it pops them back off. This is because these are caller-owned registers; a callee must ensure that these registers are the same as before they were called. So if a function wants to use one, it must save and put back later the existing value of the register. However, if it calls a function itself, it does not need to worry about that register's contents being overwritten. In this case, for instance, main (or another caller) might also be relying on %rbp and %rbx. Thus, kermit must put them back in their original state before returning. The converse is callee-owned registers; a callee has no obligation to save these registers' existing values before using them. However, if it calls a function itself, it should probably save the register's contents somewhere else first, as its callee may use that register as well! For instance, if main uses a callee-owned register and then calls kermit, kermit could use the same register and overwrite main's value, since kermit is not obligated to preserve it.
In kermit, what assembly instruction is responsible for setting up the second parameter to the call to binky? Why is there no instruction to set up the first one? kermit must save the parameter passed to it (stored in %rdi) somewhere else, since it must now use %rdi for binky's a parameter - where does kermit copy its own first parameter p to?
After the call to binky, kermit can access the returned value in %eax. However, later it calls dinky, which will also put a return value into %eax. To where does kermit copy binky's return value to use it later?
In both of these cases, it looks like kermit uses caller-owned registers. Why must it use caller-owned registers? Why would a callee-owned register not work in these cases?

x86 has a mere handful of registers and they are in high demand; the compiler works hard to maximize their use. Parameters and return values are passed/returned using registers, and local variables will be kept in registers whenever possible. The compiler will prefer use of "scratch registers" (e.g. callee-owned registers) where possible so as to avoid having to save/restore the caller's data (as is required if using the caller-owned registers).

2) Stomping Around (15 min)

What happens when a function creates chaos within its stack frame? What are the observed symptoms? How do we debug these kinds of errors? Let's play around with some example programs - understanding these will be critical to identifying vulnerabilities in the next assignment! (hint hint)

The clear_array function in buggy.c has the wrong bounds on its loop. It's a common-enough bug, especially to programmers unaccustomed to C's zero-based array indexing, but it has a surprising consequence in this context.

Before you try executing the program, first discuss with your group: what do you expect to be the consequence?
Run the program. It never seems to complete. Hmmm.
Run it under gdb. Interrupt the program using Ctl-C and see where it is executing. Continue the program, then interrupt again. Where is it executing now?
Interrupt again and then step from here. Why is the loop taking so darn long? Just what is happening with the value of i?

Spoiler: the value of i is changing as a result of code that doesn't directly assign to it. These kind of bugs are called "stomping" or "clobbering" memory. A variable can be "stomped on" due to an under/overrun of a neighboring region, using a pointer into a deallocated stack frame, referring to freed heap memory, or using an uninitialized pointer. These errors are difficult to debug since the variable is not named in the code that affects it and the stomping may go unnoticed until much later, making it difficult to connect cause and effect. Whereas Valgrind is awesome about reporting when you write to an invalid address, it cannot detect that you wrote to a valid address for the wrong reason. We need another strategy/tool to find this kind of error.

Let's learn how to use a gdb watchpoint . The gdb watch command is given an expression or a memory location to watch. gdb sets up a special kind of breakpoint that stops your program whenever there is a change in the value of that expression or a write to that memory location. Here are some examples:

(gdb) watch myvar           // report when myvar changes
(gdb) watch *0x608502       // report if write to memory location

Run under gdb, set a breakpoint on clear_array and once hit, set up a watchpoint to monitor changes to i.
Let's also add a display command to display the assembly instruction. (gdb) display/3i $rip
Continue from here and gdb will stop on each change to i. The watchpoint will alert you to each increment of the counter in the loop: from 0 to 1, 1 to 2 and so on, but it also will report when i gets stomped on.
When i changes from 2 to 3, execute disassemble in GDB to see exactly what assembly instruction is responsible for changing i at that point. As a tip, the => will point to the instruction that is about to be executed, so the one that changed i comes before this. Where does this instruction come from in the C code?
When i changes from 5 to 0, execute disassemble again in GDB to see exactly what assembly instruction is responsible for changing i at that point. It looks like it's a different instruction from the one changing i before. What is that instruction? Where does it come from in the C code? What code is thus responsible for clobbering i?

Watchpoints can be a useful tool for tracking down those bugs that make mysterious changes to memory (and may be a useful tool in reverse engineering a bomb, too. Hmm...)

3) Don't Gets Me Started (20 min)

Here's another example of stack frame chaos. The standard C library function gets has an inherently awful design. It is intended to read a single line of text from stdin, stopping at the first newline character, and writing the read characters into the client's buffer. The fatal flaw of gets is that its only argument is the starting address of the buffer, with no indication of that buffer's length. Without the length, gets cannot tell when it should stop writing characters to avoid overflowing the buffer. There is absolutely no way to use gets safely. Its use has long been deprecated in favor of the properly-constrained fgets function, but for reasons of backward compatibility, gets lives on in the standard library.

First read the BUGS section of man gets for harsh words against the function and hints of the security problems therein. When you compile the smash program, you'll get a variety of warnings from the compiler and linker that try to further dissuade you from using gets. Let's observe the consequences if we proceed past the warnings and use the function anyway.

The smash.c program calls gets to read the user's input into a buffer of size 12. Any input longer than 12 will write past the end of the stack-allocated buffer. The consequence ranges from silent to catastrophic depending on how many excess bytes are written.

Run the smash program and when prompted for your name, enter G. Hopper. This name fits in 12, no problem. Now enter John Hennessy. Just a little long is ok? Now enter Marc Tessier-Lavigne. Definite overrun, but still getting away with it? Push it to John Jacob Jingleheimer Schmidt and we finally get the crash we deserve. Run the program under gdb and enter this same long name. At the point of the crash, use backtrace. Woah, what the heck has happened to the stack? Let's see if we can figure that out.
Disassemble the greet function and sketch a diagram of its stack frame with your group. Be sure to note the location of the buffer and its relationship to neighboring data. You can get this information by looking at the assembly instructions and trace how they manipulate stack memory. This diagram is critical for explaining the program's behavior!
For any input longer then 12, the characters read by gets will overflow the buffer. What is the immediate neighbor of the buffer? Why is this critical data that when overwritten, will cause the program to crash? How long must the input be to reach this critical data? What if the input is 1 byte too long? 2 bytes? 4 bytes? even longer?
The crash does not occur at the time when the data is overwritten; it is later when the corrupted data is used that is the problem. Identify the assembly instruction in the greet function at which the crash occurs and explain what is happening in that instruction.
Run the program again. When prompted, enter this exact input: meghan duchess of sussex. Pluto? How did we end up there? The function pluto isn't even called from anywhere in the program. This outcome is most surprising! Go back to your stack diagram and see if you can work out how this happened. When greet finished, instead of resuming in main, it "resumed" in pluto . What was the resume address of the instruction in main? How did the intended resume address change into an address in pluto?

Fun followup reading: Peter van der Linden's book Expert C Programming: Deep C Secrets summarizes the most famous gets exploit in "the early bird gets() the Internet Worm". The entire Expert C book is available online with your Stanford login and chock full of fascinating information -- we highly recommend it for its illuminating and comprehensive coverage of all things C! read it here (requires authentication, accesses Stanford's subscription to Safari Books Online in the left sidebar)

[Optional] Extra Problems

Finished with lab and itching to further exercise your assembly and runtime stack skills? Check out our extra problems!

Recap

Nice work on the lab! It's okay if you don't completely finish all of the exercises during lab; your sincere participation for the full lab period is sufficient for credit. However, we highly encourage you to finish on your own whatever is need to solidify your knowledge. Also take a chance to reflect on what you got what from this lab and whether you feel ready for what comes next! The takeaways from lab6 should be understanding the gdb commands to dig around in the stack frames and being able to relate the contents of the stack to the runtime program state - both valuable when debugging. Here are some questions to verify your understanding and get you thinking further about these concepts:

Why are the registers used for function parameters completely fixed, yet no conventions apply to where locals are stored?
Explain the rules for caller-owned versus callee-owned register use. What is the advantage of this arrangement relative to a simpler scheme such treating all registers as caller-owned or callee-owned?
What are the signs in gdb that help you diagnose a stack overrun has trashed the return address?