Assignment 5: Some Assembly Required

Due: Mon Nov 9 11:59 pm
Late submissions accepted until Wed Nov 11 11:59 pm

Assignment by Michael Chang & Julie Zelenski
Idea originated by Randal Bryant & David O'Hallaron (CMU). Modifications by Nick Troccoli.

Learning Goals

This assignment focuses on understanding assembly code representations of programs. You will be building your skills with:

  • reading and tracing assembly code
  • understanding how data access, control structures, and function calls translate between C and assembly
  • reverse-engineering
  • understanding the challenges of writing secure and robust systems
  • mastering the gdb debugger!

Overview

There are two parts to this assignment. The first part is about an ATM withdrawal program containing some vulnerabilities - you'll need to use your C and assembly skills to find and demonstrate how to exploit these vulnerabilities. The second part is the binary bomb program, where you're given an executable "bomb" program (no C code provided!) to "deactivate" using your assembly and reverse-engineering skills. These problems are like C/assembly "puzzles" to solve, and we hope you enjoy solving them and exploring this material as much as we enjoyed creating them!

To get started on this assignment, clone the starter project using the command

    git clone /afs/ir/class/cs107/repos/assign5/$USER assign5

The starter project contains the following:

  • bomb: your binary bomb executable program, custom-generated for each student
  • custom_tests: the file where you will add custom tests to exploit vulnerabilities in the provided ATM withdrawal program
  • input.txt: a blank text file where you should add the passwords for each binary bomb level, one per line. You can run bomb with this file as a command-line argument and it will first read from this file before prompting you for further input, allowing you to avoid re-typing passwords for deactivated levels each time.
  • readme.txt: a file where you should add answers to some short written questions about the ATM and binary bomb programs.
  • .gdbinit: a gdb configuration file you can optionally use to run certain gdb commands each time gdb launches. See the section on using GDB in binary bomb for more information.
  • samples: a symbolic link to the shared directory for this assignment. It contains:
    • atm: the executable ATM program, which you will explore for vulnerabilities.
    • atm.c: the C source code for the ATM program, which you will explore for vulnerabilities. Note that you're not able to edit or recompile this code/executable.
    • bank: a folder containing customers.db, a file with the list of all users and balances for the ATM program
    • minibomb and minibomb.c: a sample "practice" executable you can work on with others if you want practice material similar to what binary bomb is like. You can check your answers with the .c file.
    • SANITY.INI and sanity.py: files to configure and run sanity check. You can ignore these files.
    • wordlist: a list of dictionary words used for bombs. You can ignore this file.
  • tools: contains symbolic links to the sanitycheck and submit programs for testing and submitting your work.

Do not start by running the bomb to "see what it will do". You will quickly learn that what it does is explode :-) When started, it immediately goes into waiting for input and when you enter the wrong response, it will explode and deduct points. Thoroughly read the binary bomb information in this spec before attempting to deactivate it!

You will be using gdb frequently on this assignment. Make sure you have downloaded the CS107 GDB config file. You can find how to do this at the top of the CS107 GDB Guide.

Please make sure to adhere to the honor code and collaboration policy for this assignment. Even without any code being submitted, you should not be doing any joint debugging/development, sharing or copying written answers, sharing specific details about bomb behavior, etc.


1. Code Study: Security and Robustness

The samples/atm program simulates the operation of a simplified automated teller machine. The ATM program is invoked with an amount and the credentials for a particular account. If the credential is authorized and the account has sufficient funds, the amount is withdrawn and dispensed in cash. The ATM is supposed to maintain bank security by rejecting unauthorized access and denying excessive withdrawals.

Run samples/atm 40 myname (replacing myname with your myth login name) to make a $40 withdrawal from the account associated with your login name. Success! Every time you run the program, it will print out information to the terminal about the transaction that took place, or the error that occurred, if any. For example, if you ask to withdraw $100 from your account, it should be denied with an error message because that would bring your current $107 balance below the required minimum. If you try to sneak cash from another account instead of yours (e.g. samples/atm 40 troccoli) or use a fake name (e.g. samples/atm 40 not_a_user), your credential should get rejected as unauthorized. So far, so good; the ATM seems to be doing its job. (Note: Each time you run the program anew, all balances return to their original starting levels. No money actually changes hands in this ATM, which is a blessing given its security flaws.)

The bank recently updated the ATM software to a version with some additional features. The IT team reviewed the new code and thought it all looked good, but having now installed it in production, they are observing some suspicious activity. The bank has called you because your superior C and assembly skills are just what's needed to investigate and resolve these problems!

Your first task is to review the (read-only) source code for the program in samples/atm.c. The program is roughly 150 lines of C code of similar complexity to what you have been writing this quarter, and is decomposed and fairly readable, though sorely lacking in comments. You should find that the program's approach seems reasonable and the code is sincere in its attempt to operate correctly. Once you're done reading, take a minute to reflect on how far your awesome C skills have come to let you read through this provided program!

By following program output and balances, the bank has noticed three operational anomalies that they need your help investigating.

Deliverables

For each of the vulnerabilities below, construct a test case to showcase how it can be exploited and add it to your custom_tests file. Note that there may be more than one way to trigger a vulnerability. In your readme.txt, you should also provide for each a concise description of the underlying defect in the code, an explanation of exactly how you constructed your test case to exploit it, AND your recommendation for fixing it. The bank is not looking for a major rewrite/redesign, so in your proposed changes you should directly address the vulnerability with minimal other disruption. Note that there may be more than one possible remedy for fixing each issue. Also make sure you do not remove intended functionality of the bank program, and account for any potential additional security issues introduced by your proposed fix. Here is a list of the attacks you must provide:

  • case a: make a withdrawal as yourself that withdraws more money than is present in your account
  • case b: withdraw $40 from one of the CS107 staff members' accounts
  • case c: withdraw $300 from the bank vault despite its disabled passcode

Make sure your custom_tests file is formatted correctly. In particular, comments must be on their own line, and comment lines must start with #. Each test case line should start with samples/atm. Make sure to run your custom tests before submitting to confirm they execute properly.

Running ATM in GDB: to run the ATM program in GDB, make sure you go into the samples folder first and then run the atm program from there.

a) Negative Balances

The old version of the ATM program restricted a withdrawal to be at most the full account balance, allowing the customer to drain their account to $0, but no further. The new program has changed the withdraw function to require a non-zero minimum balance. The expected behavior should be that all account balances stay above this minimum. However, the bank saw an otherwise ordinary withdrawal transaction that not only caused an account to go below the minimum, it overdrew so far as to end up with a negative balance. Oops, that's definitely not supposed to happen! Review the C code for the withdraw function, specifically the changes from the old version. It seems to work in many cases, but apparently not all. Read carefully through this function to try and discover the flaw - your understanding of signed and unsigned integers will be useful here! Once you have found the vulnerability, determine a command to make a withdrawal as yourself that withdraws more money than is present in your account.

b) Unauthorized Account Access

The bank has also received a customer complaint about an unauthorized withdrawal from their account. It seems that another user with different credentials was able to successfully withdraw money from the aggrieved customer's account. Moreover, the credential used appears to be entirely fake - no such user exists in the database! A user should not be able to access a different customer's account and especially not by supplying a bogus credential! Review the C code for the find_account function that is responsible for matching the provided username to their account number. It seems to work properly for valid accounts, but not for invalid usernames. Can you spot what this function does in this case? Once you do, it may seem that this function will behave unpredictably in this case. Your next task is to examine the generated assembly to determine precisely how the function will behave. Think about registers with special responsibilities and where it's assumed certain values will live. Once you have found the vulnerability, determine a command with a designed bogus name credential to withdraw $40 from one of the CS107 staff members' accounts. (The samples/bank/customers.db file contains information about all valid users and their balances, and the first 16 users in the database are staff accounts.)

c) Accessing The Master Vault

The most worrisome issue is repeated illicit withdrawals from the master vault account, account number 0. The name on the master account is not an actual user, so this account cannot be accessed using the simple username-based credential. Instead, the user must specify two arguments, the account's number and its secret passcode, as a form of heightened security. At first the bank thought the vault passcode had been leaked, but changing the passcode did nothing to thwart the attack. In a fit of desperation, the bank removed the vault passcode file altogether, figuring this would disable all access to the vault, yet the rogue user continues to make withdrawals from it! It seems that the high-security passcode authentication may have its own security flaw! The code that handles this authentication is in the lookup_by_number and read_secret_passcode functions. These functions work correctly in many situations, but fail in certain edge cases. Remember that it seems that in certain cases supplied credentials are accepted despite the lack of a saved passcode file. The vulnerability is subtle in the C code, so you should also use GDB to examine the code at the assembly level and diagram out the memory on the stack for these functions, as it is the arrangement of the various data and lack of care in accessing the stack-based variables that leads to the security vulnerability in this case. Your exploit should not involve reading from any file. Once you have found the vulnerability, determine a command to withdraw $300 from the bank vault despite its disabled passcode.

Optional Further Exploration

During the course of your investigation, you may find additional problems beyond the ones listed above. You are only required to address these three issues, but you are welcome to explore further to find additional problems!

NOTE: If you liked this exercise, you'll love CS155, a class on computer security that challenges you to exploit various vulnerabilities in programs. See the CS155 website for more information!


2. Binary Bomb

NOTE: Do not start by running the bomb to "see what it will do". You will quickly learn that what it does is explode :-) When started, it immediately goes into waiting for input and when you enter the wrong response, it will explode and deduct points. Thoroughly read the binary bomb information below before attempting to deactivate it!

Those nefarious Cal students have broken into our myth machines and planted some mysterious executables we are calling "binary bombs." Without the original source, we don't have much to go on, but we have observed that the programs seem to operate in a sequence of levels. There are 4 levels in total. Each level asks the user to enter a string. If the user enters the correct string, it defuses the level and the program proceeds on. But given the wrong input, the bomb explodes by printing a message and terminating. To deactivate the entire bomb, one needs to successfully defuse each of its levels.

This is where we need your help. Each of you is given a bomb unique to you; your mission is to apply your best assembly detective skills to work out the input required to pass each level and deactivate the bomb.

Your bomb is given to you as an executable, i.e. as compiled object code. From the assembly, you will work backwards to construct a picture of the original C source in a process known as reverse-engineering. Note that you don't necessarily need to recreate the entire C source; your goal is to work out a correct input to pass the level, which requires a fairly complete exploration of the code path you follow to deactivate the level, but any code outside that path can be investigated on a need-to-know basis. Once you understand what makes your bomb "tick", you can supply each level with the input it requires and defuse it. The levels get progressively more complex, but the expertise you gain as you move up from each level increases as well. One confounding factor is that the bomb explodes whenever it is given invalid input. Each time your bomb explodes, it notifies the staff, which deducts from your score. Thus, there are consequences to exploding the bomb, so you must be careful!

Reverse-engineering requires a mix of different approaches and techniques and will give you an opportunity to practice with a variety of tools, most importantly GDB. Building a well-developed gdb repertoire can pay big dividends the rest of your career!

"Minibomb"

Want to get a feel for what binary bomb is like, but in a more practice-friendly setting? The "minibomb" in the samples/ folder is a practice executable we created that is similar in spirit to your binary bomb (it doesn't share any code with your real bomb, it's just made with a similar reverse-engineering goal in mind). You can practice working on that, work together with other students, and check your answers with the included source code for the bomb. For the minibomb, you must get past two stages, stage 1 and stage 2. Stage 1 is a function called stage1 that is passed 1 parameter, which is the first command line argument. Stage 2 is a function called stage2 that is passed 1 parameter, which is the second command line argument. E.g. you run samples/minibomb [stage1password] [stage2password]. Your goal is to get both functions to return 1, and not 0. The minibomb is completely optional, but we encourage you to use it as a practice tool if you'd like!

Logistics

Our investigative efforts have been able to confirm a few things about how the bombs operate:

  • If you start the bomb with no command-line argument, it reads input typed at the console.
  • If you give an argument to the bomb, such as input.txt:

    ./bomb input.txt
    

    the bomb will read lines from that file until it reaches EOF (end of file), and then switch over to reading from the console. This feature allows you to store inputs for solved levels in input.txt and avoid retyping them each time.

  • Explosions can be triggered when executing at the shell or within gdb. However, gdb offers you tools you can use to intercept explosions, so your safest choice is to work under gdb and employ preventive measures.

  • The bomb in your repository was lovingly created just for you and is unique to your id. It is said that the bomb can detect if an impostor attempts to execute your bomb and won't play along.
  • The bombs are designed for the myth computers (running on the console or logged in remotely). There is a rumor that the bomb will refuse to run anywhere else.
  • The bombs were compiled from C code using gcc. It seems the bombs were created without changing the compile flags to achieve much obfuscation of the object code.
  • It seems as though the function names were left visible in the object code, with no effort to disguise them. Thus, a function name of initialize_bomb or read_five_numbers can be a clue. Similarly, it seems to use the standard C library functions, so if you encounter a call to qsort or sscanf, it is the real deal.
  • Direct modification of the binary bomb executable can change its behavior, but be forewarned that we will test your submission against your original unmodified binary, so while hacking the executable is great fun, it won't be of much use as a strategy for solving the levels.
  • There is one important restriction: Do not use brute force! You could write a program to try every possible input to find a solution. But this is trouble for several reasons:

    • You lose points on every incorrect guess which explodes the bomb.
    • A notification is sent on each bomb explosion. Wild guessing will saturate the network, creating ill will among other users and attracting the ire of the system administrators who have the authority to revoke your privileges because you are abusing shared resources.
    • We haven't told you how long the strings are, nor have we told you what characters they can contain. Even if you made the (wrong) assumptions that they all are less than 80 characters long and only contain lowercase letters, you would have 26^80 guesses for each level. Trying them all will take an eternity, and you will not have an answer before you graduate.
    • Part of your submission requires answering questions that show your understanding of the assembly code, which guessing will not provide. :-)

Getting Started

Here are some steps you should take to get started on this part of the assignment.

  1. Use the nm utility on the executable (nm bomb) to print what's called the "symbol table" of the executable. The symbol table contains the names of functions and global variables and their addresses. The names may give you a sense of the structure of the bomb.
  2. Use the strings utility on the executable (strings bomb) to print all the printable strings contained in the executable, including string constants. See if any of these strings seem relevant in defusing the bomb.
  3. gdb and objdump will be most helpful after this. objdump -d bomb outputs the assembly for the bomb executable. Reading and tracing the disassembled code is where the bulk of your information will come from. Scrutinizing the lifeless object code without executing is a technique known as deadlisting. Once you sort out what the object code does, you can, in effect, translate it back to C and then see what input is expected. This works reasonably well on simple passages of code, but can become unwieldy when the code is more complex. That is where gdb comes in.
  4. gdb lets you single-step by assembly instruction, examine (and change!) memory and registers, view the runtime stack, disassemble the object code, set breakpoints, and more. Live experimentation on the executing bomb is the most direct way to become familiar with what's happening at the assembly level.
  5. Pull up tools like the Compiler Explorer interactive website from lab, or gcc on myth, to compile and explore the assembly translation of any code you'd like. For example, if you're unsure how a particular C construct translates to assembly, how to access a certain kind of data, how break works in assembly, or how a function pointer is invoked by qsort, write a C program with the code in question and trace through its disassembly. Since you yourself wrote the test program, you also don't have to fear its explosive nature :-) You can compile directly on myth using a copy of a Makefile from any CS107 assignment/lab as a starting point, and then use gdb or objdump to poke around.

Before attempting to deactivate the bomb, you should use the above tools, and gdb tricks below, to figure out how to reliably prevent explosions. There are simple manual blocks that give some measure of protection, but it is best to go further to develop an invincible guard. Feel free to use any technique at your disposal, such as leveraging gdb features, tweaking the global program state, modifying your setup, tricking the bomb into running in a safe manner, or hacking the bomb executable. Avoiding the entire explosion is one straightforward approach to ensure that we won't hear about it, but there are ways to selectively disable just the transmission portion to the course staff. Once you figure out how to set up appropriate protection against explosions, you will then be free to experiment with the levels without worry. Note that the bomb can only explode when it is "live", i.e., executing in shell or running with gdb. Using tools such as nm, strings, and objdump to examine the executable cannot explode the bomb.

Using gdb

The debugger is absolutely invaluable on this assignment. Here are some suggestions on how to maximize your use of gdb. You'll also get more practice with gdb tricks in the assembly labs.

  • Expand your gdb repertoire. The labs have introduced you to handy commands such as break, x, print, info, disassemble, and stepi/nexti. Here are some additional commands that you might find similarly useful: display, set variable, watch, jump, kill, and return. Within gdb, you can use help name-of-command to get more details about any gdb command. See the quick gdb reference card for a summary of many other neat gdb features.
  • Get fancy with your breakpoints. You can set breakpoints by function name, source line, or address of a specific instruction. Use commands to specify a list of commands to be automatically executed whenever a given breakpoint is hit. These commands might print a variable, dump the stack, jump to a different instruction, change values in memory, return early from a function, and so on. Breakpoint commands are particularly useful for installing actions you intend to be automatically and infallibly completed when arriving at a certain place in the code. (hint!)

    gdb kill workaround: gdb 7.7 (current version on myth as of 11/2017) has a bug when attempting to use kill in the commands sequence for a breakpoint. The bug creates a cascade of problems and can cause gdb itself to crash or hang. The gdb command signal SIGKILL can be used as an alternate means to kill a program from a commands sequence that doesn't trip this bug.

  • Use a .gdbinit file. The provided file named .gdbinit in the assignment folder can be used to set a startup sequence for gdb. In this text file, you enter a sequence of commands exactly as you would type them to the gdb command prompt. Upon starting, gdb will automatically execute the commands from it. This will be a convenient place to put gdb commands to execute every time you start the debugger. Hint: wouldn't this be useful for creating breakpoints with commands that you want to be sure are always in place when running the bomb? The .gdbinit file we give you in the starter repo has only one command to echo Successfully executing commands from .gdbinit in current directory. If you see this message when you start gdb, it confirms the .gdbinit file has been loaded.

  • Custom gdb commands. Use define to add your own gdb "macros" for often-repeated command sequences. You can add defines to your .gdbinit file so you have access to them in subsequent gdb sessions as well.
  • Fire up tui mode (maybe...). The command layout asm followed by layout reg will give you a split window showing disassembly and register values. This layout will display current values for all registers in the upper pane, the sequence of assembly instructions in the middle pane, and your gdb command line at the bottom. As you single-step with si, the register values will update automatically (those values that changed are highlighted) and the middle pane will follow instruction control flow. This is a super-convenient view of what is happening at the machine level, but sadly, you have to endure a number of quirks and bugs to use it. The tui mode can occasionally crash gdb itself, killing off gdb and possibly the bomb while it's at it. Even when tui is seemingly working, the display has a habit of turning wonky, often fixable by the refresh command (use this early and often!) but not always. A garbled display could cause you to misunderstand the program state, misidentify where your bomb is currently executing, or accidentally execute a gdb command you didn't intend. Any explosion suppression mechanism that requires you, the fallible human, to take the right action at a critical time could easily be waylaid by interference, so don't attempt tui before you have invincible automatic protection against explosions. Selective use of auto-display expressions (introduced in lab6) is a great alternative with less disruption. You can exit tui using ctrl-x a and re-enter it again (this doesn't require leaving gdb and losing all your state).

Bomb Deliverables

You should add the passwords to defuse each level in your input.txt file. We will test by running ./bomb input.txt on your submission. The input.txt file in your submission should contain one line for each level you have solved, starting from level 1. Malformed entries in your input.txt or wrong line-endings (see FAQ below) will cause grading failures. To avoid surprises, be sure that you have verified your input.txt in the same way we will in grading (i.e. ./bomb input.txt). We also have a few follow-up questions that you should answer in your readme.txt file:

  1. What tactics did you use to suppress/avoid/disable explosions?
  2. level_1 contains an instruction near the start of the form mov $<multi-digit-hex-value>,%edi. Explain how this instruction fits into the operation of level_1. What is this hex value and for what purpose is it being moved? Why can this instruction reference %edi instead of the full %rdi register?
  3. level_2 contains a jle that is not immediately preceded by a cmp or test instruction. Explain how a branch instruction operates when not immediately preceded by a cmp or test. Under what conditions is this particular jle branch taken?
  4. Explain how the loop in the winky function of level_3 is exited.
  5. The read_array function used in level_4 declares a local variable that is stored on the stack at 0x8(%rsp). What is the type/size of this variable? Explain how you can discern its type from following along in the assembly, even though there is no explicit type information in the assembly instructions. Within read_array there is no instruction that writes to this variable. Explain how the variable is initialized (what value it is set to and when/where does that happen?).
  6. Explain how the mycmp function is used in level_4. What type of data is being compared and what ordering does it apply?

Sanity Check

The default sanitycheck test cases are ATM inputs and one test case that reports the line count of your input.txt file. This sanitycheck is configured to only allow test cases for ATM in your custom_tests file. The bomb executable is not run by sanitycheck.

NOTE: when running your own custom tests, make sure to inspect the output to ensure your tests are causing the behavior you expect! The sanitycheck tool itself does not verify that the tests cause the specified exploits.

Submitting

Once you are finished working and have saved all your changes, check out the guide to working on assignments for how to submit your work. We recommend you do a trial submit in advance of the deadline to allow time to work through any snags. You may submit as many times as you would like; we will grade the latest submission. Submitting a stable but unpolished/unfinished version is like an insurance policy. If the unexpected happens and you miss the deadline to submit your final version, this previous submit will earn points. Without a submission, we cannot grade your work.

We would also appreciate if you filled out this homework survey to tell us what you think once you submit. We appreciate your feedback!

Grading

For this assignment, here is a tentative point breakdown (out of 82):

  • custom_tests (15 points) Each successful attack test case earns 5 points. We will test by running tools/sanitycheck custom_tests on your submission. Your custom_tests should contain 3 test cases, one for each ATM attack.
  • readme.txt (35 points) The ATM and bomb questions will be graded on the understanding of the issues demonstrated by your answers and the thoroughness and correctness of your conclusions.
  • input.txt (32 points) Each bomb level you have solved earns 8 points. We will test by running ./bomb input.txt on your submission. The input.txt file in your submission should contain one line for each level you have solved, starting from level 1. Malformed entries in your input.txt or wrong line-endings (see FAQ below) will cause grading failures. To avoid surprises, be sure that you have verified your input.txt in the same way we will in grading (i.e. ./bomb input.txt).
  • Bomb explosions (up to 6 points deducted) Each bomb explosion notification that reaches the staff results in a 1 point deduction, capped at 6 points total.

Post-Assignment Check-in

How did the assignment go for you? We encourage you to take a moment to reflect on how far you've come and what new knowledge and skills you have to take forward. Once you finish this assignment, your assembly skills will be unstoppable! You successfully found vulnerabilities in a program using its source and assembly, and reverse engineered a complex program without having access to its source at all. Rock on!

To help you gauge your progress, for each assignment/lab, we identify some of its takeaways and offer a few thought questions you can use as a self-check on your post-task understanding. If you find the responses don't come easily, it may be a sign a little extra review is warranted. These questions are not to be handed in or graded. You're encouraged to freely discuss these with your peers and course staff to solidify any gaps in your understanding before moving on from a task.

  • What are some of the gdb commands that allow re-routing control in an executing program?
  • What is the main indication that an assembly passage contains a loop?
  • Explain the difference between a function's return value and its return address.
  • Consider the mechanics of how function pointers work at the assembly level. How is a call through a function pointer the same/different when compared to an ordinary function call?
  • For performance reasons, the compiler prefers storing local variables in registers whenever possible. What are some reasons that force the compiler to store a local variable on the stack instead?
  • For the instruction sequence below, what must be true about values of op1 and op2 for the branch to be taken? What changes if ja is substituted for jg?
    cmp op1,op2 
    jg target
    

Frequently Asked Questions

I get an error message about auto-loading .gdbinit being declined when starting gdb. What does this mean?

There is a provided .gdbinit file in the assignment starter code that is helpful for auto-executing gdb commands on launch when working on your binary bomb. GDB loads the file automatically if it's in that directory. If you are seeing an error message, this means that you haven't installed the CS107 GDB configuration file to permit GDB to load this assignment file - you can find instructions for how to do so on the CS107 GDB Guide page.

My input passes the level when typed manually, but when I added the same input to input.txt, it explodes. What gives?

When testing on input.txt, we advise you do so with your explosion defense in place against possible editing glitches. The contents of input.txt should consist of the input for each level on its own line and each line should end with a standard Unix newline. Stop in gdb and examine the line read from your file to spot the discrepancy between what you need and what you have. Look carefully for extraneous leading/trailing spaces or mismatched line endings. Emacs uses the correct line endings (\n) by default. Editors on other platforms that are using the line-ending conventions for Mac (\r) or Windows (\r\n) will cause you grief. The easiest approach to avoid problems is to edit the input.txt file using Emacs on myth.

I found some other assembly reference material that seems syntactically/logically inconsistent with the assembly from our textbook/lecture/tools. What's up?

The gnu tool chain defaults to the att (AT&T) syntax and all of our materials (text, lecture, lab) are consistent with this syntax. If you hunt down other resources in the wild, you may encounter Intel syntax where the order of operands is reversed, register names are not prefixed with %, immediate values are not prefixed with $, indirection is expressed with brackets instead of parentheses, and so on. For example, the att instruction push %rbp is written as push RBP in Intel and att movl $1, (%rsp) becomes mov DWORD PTR [RSP], 1. Translating between them can be confusing, so it's recommended that you stick to resources that use the same syntax as our tools/text.

How do I print register values in gdb?

The gdb command info reg will show the current value for all registers. You can also access individual register values for use in gdb commands such as print, examine, or display. The register names are prefixed by dollar sign in gdb. gdb treats most general-purpose registers as plain 64-bit integers (and $rsp and $rbp as void*); you can apply a typecast to change the interpretation. Some examples:

(gdb) p/t $rax           # print %rax, binary
(gdb) p (char *)$rax     # print %rax, interpret as char*
(gdb) x/2wd $rax         # examine memory (deref %rax), show 2 ints
(gdb) display/2gx $rsp   # auto-print 2 quadwords from stack top in hex

To use the register value in a larger expression, be sure to use C syntax, not assembly. For example, if you need to dereference a register, apply *, not wrap in parentheses. If you ask gdb to evaluate an expression in assembly syntax, it handles it fairly oddly:

(gdb) p ($rax)               # parens ignored, ($rax) same as $rax
(gdb) p 0x8($rsp)            # gdb will segfault on this

Instead use C syntax, including typecast where necessary.

(gdb) p *(long *)$rax  
(gdb) p *(long *)((char *)$rsp + 8)

The disassembly shows %eax being set to 0 before certain function calls. What's with that?

Variable argument functions (e.g. printf and scanf variants) require a little extra setup relative to normal calls. The x86-64 calling convention requires the caller of a variable argument function to indicate the presence of any float/double arguments by setting %al (the low byte of %rax) to the number of vector registers used. If none are used (i.e. no arguments of float/double type), the caller sets it to zero, which is what the instruction zeroing %eax accomplishes.