Lab 6 Solutions

Written by Nick Troccoli and Lisa Yan, based on walkthroughs by Ryan Noon, Nate Hardison and John Louie

Lecture Review

Here are key concepts relevant to this week's lab:

  • condition codes are special flags that are automatically updated by the result of the most recent arithmetic or logical operation.

  • instructions like conditional jumps, conditional moves, and set can behave based on the state represented by the condition codes.

  • The special %rip register stores the address of the next instruction to execute. If we jump to another instruction, %rip is updated.

  • if statements are usually implemented by testing the inverse of the if statement check, and jumping past the if statement body if the test succeeds.

  • while loops are usually implemented by jumping to the test, and then jumping back to the body if the condition holds.

  • for loops are built on top of while loops, using the same format but adding an initialization and update step.

  • The special %rip register stores the address of the next instruction to execute. If we jump to another instruction or call another function, %rip is updated. In particular, when we call another function, we must save the caller's next instruction to execute so that we can resume there when the callee finishes. The call instruction does this for us automatically by storing it on the stack, and the ret instruction pops the value off into %rip.

  • The special %rsp register stores the address of the current top of the stack. Instructions adjust %rsp either be explicitly incrementing or decrementing it to make more or less space, or by pushing or popping things on/off the stack using the push and pop instructions.

  • If we call a function, we must put parameters we want to pass into the correct special registers to store the 1st - 6th parameters. Parameters after the 6th (if any) are pushed onto the stack.

  • If we call a function, when the function finishes we can find its return value (if any) in the special %rax register.

  • Local variables may sometimes be stored in registers rather than on the stack for optimization purposes, but there are several reasons why a local variable must be on the stack:

    • if we have run out of registers
    • if the program uses the & operator on a local variable, we must put it on the stack so that it has an address (registers do not have addresses)
    • if we have array or struct data, and we need to use address arithmetic (e.g. can't index into a register)
  • Some registers are labeled as callee-owned or caller-owned (see the reference sheet for the full list!).

    • Callee-owned means that a function can feel free to use that register without worrying about what was previously put there by its caller. However, if it calls a function itself, it should probably save the register's contents somewhere else first, as its callee may use that register as well!
    • Caller-owned means that a function must save and put back later the existing value of the register if it wants to use it. However, if it calls a function itself, it does not need to worry about that register's contents being overwritten, as this distinction guarantees that a callee will ensure the register is ultimately the same as before it was called.

Solutions

1) GDB and Stack Mechanics

Q1: What is the significance of this 8 byte value at the top of the stack right when kermit is called? (Hint: what does the call instruction put there?). Try disassemble main to see if this value pops up anywhere.

A1: This is the return address - the address of the caller instruction that the program should resume at when this function is done. Here, it should be the address of the instruction in main immediately following the call kermit instruction.

Q2: Why must kermit push these registers onto the stack before it uses them? (Hint: are they caller owned? Or callee-owned?)

A2: %rbp and %rbx are caller-owned registers, so the caller assumes they will look the same before and after we execute. Therefore, if we wish to use them, it is our responsibility to preserve the existing value and put it back later when we are done.

Q3: To where does kermit copy binky's return value to use it later? Step through to confirm your understanding.

A3: kermit saves the return value from binky into %ebx - the other of the caller-owned registers it uses.

Q4: What pointer type should we cast $rdi to in order to complete this expression? (Hint: what pointer type is arr really?) Try out your idea to see if it prints the array!

A4: It should be an int *, since it is a pointer to the first element of the array. Thus, the expression in GDB should be print ((int *)$rdi)[0]@$rsi.

Q5: How does this instruction that is modifying arr[1] connect back to the C code?

A5: The instruction is mov %edx,(%rcx). This is moving the incremented value (in %edx) into the memory location for the ith array element.

2) Stack Misuses

This problem is a crucial introduction to examining stack behavior and vulnerabilities. It is very useful (hint hint) for part of assign5! Our goal is to help you feel more comfortable drawing stack diagrams and using them to understand program behavior.

The core idea for this problem is that gets will write past the end of buf if the user enters a sufficiently long input. It turns out that num lives above buf on the stack, so if we write far enough we will overwrite num. Then, when the function returns it, it will return the value we wrote instead of 2.

Part 1: Stack Diagram

Here is the annotated assembly for the start of greet:

sub    $0x28,%rsp    # 40 bytes
movl   $0x2,0x1c(%rsp) # num = 2
mov    $0x402008,%rdi # printf string
mov    $0x0,%eax # printf 0 param
callq  0x401050 <printf@plt>
mov    %rsp,%rax # buf start
mov    %rax,%rdi # buf start
callq  0x401060 <gets@plt>

Q1: In the first instruction, how much stack space is initially reserved for greet? This is the initial size of our stack frame. %rsp points to the "top" of the stack (likely the bottom in your diagram).

A1: 40 bytes.

A diagram of the stack, consisting of 1 stack frame labeled to be 40 bytes big.  %rsp points to the top of the stack frame (the bottom in the diagram).

Q2: In the second instruction, we copy 2, presumably into num. Where is num stored, relative to %rsp? Add that to your diagram.

A2: it is at $rsp + 0x1c, or $rsp + 28.

The same diagram as diagram 1, but now 28 bytes from the bottom of the diagram (aka 28 bytes below the top of the stack), we label the start of the 4 byte space for num.  %rsp still points to the top of the stack frame (the bottom in the diagram).

Q3: In the last 3 instructions, we set up the call to gets. In the C code, we pass in buf (a 16-byte buffer) as the parameter to gets. Looking at the assembly, what does that tell us about where buf is stored relative to %rsp? Add that to your diagram.

A3: it is at $rsp.

The same diagram as diagram 2, but now the first 16 bytes at the bottom of the diagram (aka the first 16 bytes on the top of the stack) we label as storing buf.  The 12 bytes above that in the diagram are unused, followed by the 4 bytes for num from before, and the 8 bytes above that in the diagram are unused, for a total of 40 bytes. %rsp still points to the top of the stack frame (the bottom in the diagram).

Part 2: Stack Smash

Q4: What is the shortest input (remember the null terminator!) that we can specify that will overwrite part of num? What is it overwritten to?

A4: If we enter 28 characters, the null terminator (29th character) will overwrite the first byte of num and make num 0.

Q5: The ASCII character with numeric value 107 is k. How can we craft an input that overwrites the first byte of num with the letter k, and the second byte with the null terminator?

A5: If we enter 29 characters, and the last one is k, we will overwrite the first byte of num with k, and the second byte of num with a null terminator. Thus, num will become 107!

Checkoff Answers

  • Kermit. What is the significance of the 8 byte value at the top of the stack right when kermit (or any other function) is called? This is the return address - the address of the caller instruction that the program should resume at when this function is done. Here, it should be the address of the instruction in main immediately following the call kermit instruction.
  • Printing in GDB. Assume we are working with another program and we know that %rbx represents an array of strings (in other words, it is a pointer to the first element, which is a char *). What GDB expression could we print to see the element at index 2 in this array? print ((char **)$rbx)[2] or print *((char **)$rbx + 2).
  • Watchpoints. In the clyde function, what assembly instruction is responsible for modifying arr[1]? The instruction is mov %edx,(%rcx). This is moving the incremented value (in %edx) into the memory location for the ith array element.
  • Stack Diagrams. In the greet function, how much space is there on the stack between the end of buf and the start of num? 12 bytes.
  • Doing the impossible. How is it possible that greet can return 107? buf is a fixed size, but the user could enter a string of any length. Therefore, the string could be written beyond the bounds of buf and into memory next to it on the stack. num is stored 12 bytes after buf on the stack, so if we write a sufficiently long string, it will overwrite the memory for num. Specifically, if we enter 29 characters, and the last one is k, we will overwrite the first byte of num with k, and the second byte of num with a null terminator. Thus, num will become 107!

Additional Thought Questions

  • Why are the registers used for function parameters completely fixed, yet no conventions apply to where locals are stored? Even though a local variable can be stored at any address on the stack, we can specify its location via its address. However, without assigned registers for parameters, we would not be able to tell another function which register stores its parameters since registers do not have addresses.
  • Explain the rules for caller-owned versus callee-owned register use. What is the advantage of this arrangement relative to a simpler scheme such treating all registers as caller-owned or callee-owned? Caller owned registers must be restored by the callee before returning. This means callees must preserve their existing values, but callers may assume that their values will be preserved across function calls. Callee owned registers do not have to be restored by the callee before returning. This means callees do not have to preserve their existing values, but callers cannot assume that their values will be preserved across function calls. Having both provides flexibility to use what's most appropriate in each situation (e.g. if all were callee-owned, we'd have to store values on the stack to preserve them; if all were caller-owned, we'd always have to save existing register values, even if that function isn't calling any other functions)
  • Explain how you might go about sketching a stack diagram for a function given its assembly. Look at assembly instructions that modify the stack pointer (e.g. push, pop, sub, etc.) and track changes over time. Match up the assembly to the C code to see where different variables live.

[Optional] Extra Problems

1) Recursion

Q1: How is an invalid number causing a memory problem? Run factorial under gdb on argument -1. Use backtrace to find out where the program was executing at the time of the crash. Does this shed some light on what has gone wrong?

A1: The program should crash with -1 because of a stack overflow - there were too many stack frames, and so the program ran out of stack memory. This is because the parameter is an unsigned int, so a negative number is cast to a positive one, and bypasses the initial check!

  • Get into gdb and figure out how big each factorial stack frame is. Here are two different strategies; try both! Both approaches should give a stack frame size of 16 bytes. There is one push and one call instruction, for example. And the difference between two frame addresses should be 16 bytes (0x10).

Q2: In the OS on myth, the stack segment is configured at program start to a particular fixed limit and cannot grow beyond that. The bash shell built-in ulimit -a (or limit for csh shells) displays various process limits. What does it report is the process limit for stack size?

A2: 8192KB

Q3: How close is that depth to the maximum you estimated?

A3: The math should say 8192KB / 16B = 8192 * 1024 / 16 = 524,288 frames. This is pretty close to the topmost stack number, 524190!

Q4: Woah! What happened? Did the compiler "fix" the infinite recursion? Disassemble the optimized factorial to see what the compiler did with the code.

A4: It's subtle, but the recursion was replaced by a loop! It removes any need for stack space at all (no more push or call) so the function can run as long as needed to compute an answer - in this case, for a negative number it will be cast to a large positive number and continue executing - it overflows from the multiplication and thus eventually evaluates to 0.

2) "Channeling" Information Between Functions

This problem shows an example of the consequences of stack memory not actually being cleared out when it is no longer used - instead, the stack "pops" elements off simply by adjusting %rsp. This means previous memory is still there until it's overwritten in the future by another function's stack frame that is put on top of it!

Q1: Review the code in channel.c. The init_array and sum_array function each declare a stack array. The init function sets the array values, the sum function adds the array values. Neither init nor sum takes any arguments. There is a warning from the compiler about sum_array, what is it?

A1:

channel.c: In function ‘sum_array’:
channel.c:38:20: warning: ‘nums[i]’ may be used uninitialized in this function
         sum += nums[i];
                    ^

Q2: The program calls init_array() then sum_array(). Despite no explicit pass/return between the two calls, the array seems to be magically passed as intended. How are these functions communicating? The term for this is "channeling". (Hint: when data is popped off the stack, it isn't cleared out - it just increments %rsp to indicate that memory area can be reused).

A2: sum_array is able to read the contents of the stack frame that init_array left behind because sum_array is called immediately afterwards by the same caller, so both functions' stack frames are put at the same place in memory. Additionally, they have the exact same local variable nums, that overlays right on top of nums from init_array, so even though sum_array doesn't have any parameters, it "sees" the array that init_array just left behind and is able to read it as though it were a parameter.

Q3: The program's second attempt to channel the array from init to sum fails. What is different about this case?

A3: printf is called in between init_array and sum_array. This means that printf uses that stack space before sum_array, and so it may overwrite or change the memory contents there to no longer be what init_array left behind.

3) Clobbering Memory

Q1: When i changes from 2 to 3, execute disassemble in GDB to see exactly what assembly instruction is responsible for changing i at that point. As a tip, the => will point to the instruction that is about to be executed, so the one that changed i comes before this. Where does this instruction come from in the C code?

A1: The instruction is addq $0x1,0x28(%rsp), which comes from the i++ in the code.

Q2: When i changes from 5 to 0, execute disassemble again in GDB to see exactly what assembly instruction is responsible for changing i at that point. It looks like it's a different instruction from the one changing i before. What is that instruction? Where does it come from in the C code? What code is thus responsible for clobbering i?

A2: The instruction is movq $0x0,(%rsp,%rax,8), which comes from the arr[i] = 0 in the code. Thus, the code responsible for clobbering i is the assignment to arr[i] = 0, since i gives an address too far when we do the pointer arithmetic and then we assign zero to the value of i and start the loop all over again as a result!