Lab sessions Mon May 09 to Thu May 12
Lab written by Julie Zelenski
This lab is designed to give you a chance to:
Find an open computer and somebody new to sit with. Introduce yourself and celebrate/commiserate how things went for you on last week's midterm.
Get started. Clone the starter project using the command
hg clone /afs/ir/class/cs107/repos/lab6/shared lab6
This creates the lab6 directory which contains source files and a Makefile.
Pull up the online lab checkoff and have it open in a browser so you'll be able to jot down things as you go. At the end of the lab period, you will submit that sheet and have the TA check off your work.
Dissecting the stack. Use gdb to make some observations about the runtime stack. Open stack.c in your editor and peruse the C code for the various *inky functions. Build the stack program and run gdb on it.
Set a breakpoint on dinky and run the program. When the breakpoint is hit, use the gdb command backtrace to see the current frames on the runtime stack.
The up, down, and frame n commands allow you to change the selected frame. These commands don't change the state of program execution (execution remains where it stopped in the topmost frame), but they allow you to examine the runtime state from the perspective of another stack frame. For example, if you change to the frame for winky or main, you will then be able to print the variables/parameters that are visible only in that scope.
The info frame command tells you the story about this stack frame. The info args and info locals commands provide information about the parameters and local variables, respectively. Try changing frames and using the various info commands to see results in different contexts.
You can dump the raw memory from the stack using the x command.
(gdb) x/4gx $rsp examines 4 hex quadwords from stack starting at current rsp
(gdb) display/4gx $rsp will auto-display top 4 quadwords on stack as you step
Disassemble some of the *inky functions to see the coordination between the caller and callee for a function call. Trace through the caller setup (parameter passing, call), callee prolog (push saved registers, make space on stack if needed), callee epilog (restore saved registers, return), and caller resume.
Parameters and local variables will generally be stored in registers whenever possible, but there are some cases where the stack memory has to be used. Let's take a look at some of those situations:
Examine main and binky to see how/where the extra parameters are communicated from caller to callee. In binky, one of the register-based parameters is copied to the stack. Which one? Why?
struct coord. Examine the disassembly for slinky, dinky, and winky to see how a struct is passed as an argument. What about when passing a pointer to the struct?
Examine the init_array function to see.
Examine winky to learn how local variables of struct type are stored. The disassembly for winky shows the stack pointer being adjusted to make space for 16 bytes of locals, yet the function declares two structs, each 16 bytes. What is going on?
The main function prints the address of its stack frame when run. Run the program a few times from the shell and note the addresses that are printed. Is the stack always placed in the same location or not? Now run the program multiple times under gdb. Is the stack always placed in the same location when running under gdb?
Parameter passing by "channeling". The code in the channeling function in stack.c is based on a bug I once helped a novice student with.
The channeling function calls the init_array function to initialize an array, followed by a call to sum_array to sum the values. Run the program and execute channeling. Answer no when it asks if you'd like to print a debug statement. You'll see that the code seems to work just fine despite the fact that neither function takes any parameters and init_array doesn't return anything! Just exactly how are these functions communicating?
Perhaps this exercise will help you make sense of some previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code!
Recursion. The factorial function in stack.c is a classic recursive formulation to compute factorial.
At what value of n does the result of factorial(n) become erroneous? Why does that happen?
Determine how large each factorial stack frame is. Do this two different ways: first examine the disassembly for factorial (see how/where %rsp is adjusted and count the uses of push/call that add values to the stack), then set a breakpoint on factorial and print $rsp in gdb to see how it changes between successive calls. Multiply the size of the frame by the number of frames to get an estimate of the maximum possible size of the runtime stack. What is the largest value whose factorial you can attempt to compute before you seem to run out of stack space?
The ulimit -a command (or limit for csh shells) reports the process limits, including the stack size. Does the stack size reported by the shell match up with the calculation you did above?
Change the process limit on stack size using the command shown below (use limit stacksize 100000 for csh shells). Re-run the program. What changes?
ulimit -s 100000
Edit the Makefile to change the optimization level used to compile the program. Search for the comment "EDIT HERE" to find the line that begins stack.o: CFLAGS += -Og, which specifies the compiler flags for compiling the stack program. The current setting is -Og, a relatively modest level of optimization. Change -Og to -O2 to apply more aggressive optimizations. Re-compile, run again, and enter -1. Whoa! What happened? Did it actually "fix" the infinite recursion? Disassemble the optimized factorial to see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination".
Overrunning and stack smashing. What happens when a function writes outside the bounds of its stack frame? The overrun_array function in stack.c writes past the end of the local array. Run the program and when asked, try overrunning the array by 1 position, then 2, and so on. There is no observed symptom for a small overrun, but eventually there is a catastrophic consequence. What happens and why? Now edit the body of the loop to instead change the array value like this: nums[i] += 97. Recompile and run the program again and test overrunning the array again. How does the observed behavior change? What is happening?
The overrun_array function deliberately and obviously writes off the end of the array. Most stack smashing problems are more subtle. The check_name function in attack.c shows an all-too-common bug: reading into a stack-allocated buffer while naively assuming the buffer will be big enough for what is being read.
Build the attack program. The compiler warns about "gets being dangerous" (we will ignore this for now, but will soon realize why we should heed gcc's advice...). Run the program and when it asks for your name, enter "Pat". Run again and try entering "Leland". So far so good. Now run again and respond that your name is "John Jacob Jingleheimer Schmidt". What happens?
Run attack under gdb, enter the very long name to reproduce the crash, and examine the backtrace. How odd and unhelpful! Set a breakpoint on the check_name function. Run the program and when it stops at this breakpoint, use print $rsp to show the current value of the stack pointer; the result is saved in gdb's value history (as $1 if this is your first print of the session). Examine the return address on top of the stack using x/1gx $1. Use next to walk through the gets call, enter the long name, then examine the return address again with x/1gx $1. What happened? Try x/s $1 to get a different view of it. What has happened to the return address? What will now happen when check_name tries to return?
Edit the Makefile to find the line that begins attack.o: CFLAGS += and change -fno-stack-protector to -fstack-protector. Recompile and re-run. Now what happens when you enter a long name? The stack protector is a special bit of code that halts the program on stack buffer overruns. Have you ever run into this message before?
Avoid gets: read its man page for advice on what to use instead.
Optional extra challenge for those curious about stack-smashing evilware. Poorly-written functions like check_name can be exploited by malicious programs in a buffer overrun attack. The basic idea is to supply just the "right" input when overflowing an unprotected stack-allocated buffer.
Examine main. Where is the code supposed to return after a call to check_name? Where would you rather it return instead? What kind of input could you supply to check_name that could get the program to return to that location instead?
The attack program must be compiled without stack protection (in the Makefile around "EDIT HERE", be sure the line reads attack.o: CFLAGS += -fno-stack-protector and recompile). Feed Sneaky's input to the program like this: ./attack < sneaky_input. How does Sneaky Guy manage to get an A? It may help to use od -t x1 sneaky_input to see a raw dump of the file contents.
Before you leave, be sure to submit your checkoff sheet (in the browser) and have the lab TA come by and confirm so you will be properly credited for lab. If you don't finish everything before lab is over, we strongly encourage you to finish the remainder on your own. Double-check your progress with self check.