Written by Julie Zelenski
Warning!
Advice from the writeup that bears repeating: Do not start by running the bomb to "see what it will do..." You will quickly learn that "what it does" is explode :-) When started, it immediately waits for input, and when you enter the wrong response, it will explode and deduct points. Your first task should be to put your kid gloves on and carefully poke around. Once you figure out how to set up appropriate protection against explosions, you will then be free to experiment with the levels without any nail-biting anxiety about setting off the bomb.
Tools for bomb
Here are some possible points of attack for your great reverse-engineering adventure:
- The nm utility dumps the symbol table from an executable. The symbols include the names of functions and global variables and their addresses. The symbol table by itself is not a lot to go on, but just reading the names might give you a little sense of the lay of the land.
- The strings utility will display all the printable strings in an executable, including all string constants. What strings do you find in your bomb? Do any of them seem of relevance to the task at hand?
- The objdump disassembler can dump the object code into its disassembled equivalent. Reading and tracing the disassembled code is where the bulk of your information will come from. Scrutinizing the lifeless object code without executing it is a technique known as deadlisting. Once you sort out what the object code does, you can, in effect, translate it back to C and then see what input is expected. This works reasonably well on simple passages of code, but can become unwieldy when the code is more complex.
- The gcc compiler. If you're unsure how a particular C construct translates to assembly or how to access a certain kind of data, another technique is to try starting from the other side. Write a little C program with the code in question, compile it, and then trace its disassembly, either deadlisted or in gdb. For example, if you're not sure how a break statement works or how a function pointer is invoked by qsort, this would be a good way to find out. Since you yourself wrote the test program, you also don't have to fear its explosive nature :-) You can compile by directly invoking gcc or setting up a simple makefile (use the Makefile from any CS107 assignment/lab as a starting point).
- The Compiler Explorer interactive compiler website. Less heavyweight than firing up gcc yourself is the handy gcc-in-a-web-site we previewed in lab6. Type in a code snippet and get its immediate translation to assembly -- easy-peasy! The tool is doing the same translation you could do via gcc, but in a convenient way that encourages interactive exploration.
Gdb strategies for bomb
The gdb debugger is absolutely invaluable here. You can use gdb to single-step by assembly instruction, examine (and change!) memory and registers, view the runtime stack, disassemble the object code, set breakpoints and watchpoints, re-route control flow, write your own custom commands, and more. Live experimentation on the executing bomb is the most direct way to become familiar with what's happening at the assembly level. Here are some suggestions on how to maximize your use of gdb on the bomb:
- Expand your gdb repertoire. The labs have introduced you to handy commands such as break, x, print, display, info, disassemble, and stepi/nexti. Here are some additional commands that you might find similarly useful: set variable, watch, jump, kill, and return. Within gdb, you can use help name-of-command to get more details about any gdb command. See the quick gdb reference card for a summary of many other neat gdb features.
- Get fancy with your breakpoints. You can set breakpoints by function name, source line, or address of a specific instruction. Use commands to specify a list of commands to be automatically executed whenever a given breakpoint is hit. These commands might print a variable, dump the stack, jump to a different instruction, change values in memory, return early from a function, and so on. Breakpoint commands are particularly useful for installing actions you intend to be automatically and infallibly completed when arriving at a certain place in the code. (hint!)
  gdb kill workaround: gdb 7.7 (current version on myth as of 11/2017) has a bug when attempting to use kill in the commands sequence for a breakpoint. The bug creates a cascade of problems and can cause gdb itself to crash or hang. The gdb command signal SIGKILL can be used as an alternate means to kill a program from a commands sequence that doesn't trip this bug.
- Using a .gdbinit file. The file named .gdbinit in the current directory can be used to set a startup sequence for gdb. In this text file, you enter a sequence of commands exactly as you would type them to the gdb command prompt. If your personal gdb configuration allows loading an init file, upon starting, gdb will automatically execute the commands from it. This will be a convenient place to put gdb commands to execute every time you start the debugger. Hint: wouldn't this be useful for creating breakpoints with commands that you want to be sure are always in place when running the bomb?
  Enable .gdbinit: To enable use of .gdbinit, you must create/edit your personal gdb configuration to allow loading it. See discussion of auto-loading at the end of our gdb FAQ. If your auto-loading is declined, copy the command below and execute it in your shell to update your configuration file. You will need to make this configuration change only once.
      bash -c 'echo set auto-load safe-path / >> ~/.gdbinit'
  The .gdbinit file we give you in the starter repo has only one command, which echoes "Successfully executing commands from .gdbinit in current directory". If you see this message when you start gdb, it confirms the .gdbinit file has been loaded.
- Custom gdb commands. Use define to add your own gdb "macros" for often-repeated command sequences. You can add defines to your .gdbinit file so you have access to them in subsequent gdb sessions as well.
- Fire up tui mode (maybe...). The command layout asm followed by layout reg will give you a split window showing disassembly and register values. This layout will display current values for all registers in the upper pane, the sequence of assembly instructions in the middle pane, and your gdb command line at the bottom. As you single-step with si, the register values will update automatically (those values that changed are highlighted) and the middle pane will follow instruction control flow. This is a super-convenient view of what is happening at the machine level, but sadly, you have to endure a number of quirks and bugs to use it. The tui mode can occasionally crash gdb itself, killing off gdb and possibly the bomb while it's at it. Even when tui is seemingly working, the display has a habit of turning wonky, often fixable by refresh but not always. A garbled display could cause you to misunderstand the program state, misidentify where your bomb is currently executing, or accidentally execute a gdb command you didn't intend. Any explosion suppression mechanism that requires you, the fallible human, to take the right action at a critical time could easily be waylaid by interference, so don't attempt tui before you have invincible automatic protection against explosions. Selective use of auto-display expressions (as introduced in lab7) is a great alternative with less disruption.
Common questions
What does the sanitycheck verify for this assignment? Does it run my bomb?
The default sanitycheck test cases are ATM inputs and one test case that reports the line count of your input.txt file. This sanitycheck is configured to only allow test cases for ATM in your custom_tests file. The bomb executable is not run by sanitycheck.
The ATM exercise refers to a "transaction log". What does this mean? Is there a log file accessible somewhere?
Review the ATM code to see what/when the program writes as logging output. Imagine that output captured and appended onto a "transaction log" file. You don't have access to the previous log file, but you can run the program to see what is logged for various operations.
For (1b) what is meant by "impersonating" the TA?
Impersonation means a nefarious entity approaches the ATM pretending to be User B and asks to make a withdrawal from User B's account. If this exploit succeeds, it would suggest there is a vulnerability in how identities are verified. Note that the logged transaction would show that User B was given money from User B's account (and alarmingly, this bogus transaction would not be distinguishable from a valid one). The ATM may be vulnerable to such an exploit, but this possibility isn't even on the bank's radar (hmm.... bonus issue!).
For (1b), the log records a withdrawal made from User B's account by a user who is not User B. This indicates there is an exploit that wrongly allows access to a different user's account.
For (1b) does it matter which TA we target as victim of our exploit?
Any should work. As a twisted homage to your favorite TA, choose them as the victim. You could pilfer from one of your fellow students instead, but that seems impolite :-)
How do I know if the bomb has exploded?
When an unadulterated bomb explodes, it prints "KABOOM", notifies the authorities, and terminates. The bomb can only explode when it is "live", i.e., executing in the shell or running under gdb. Using tools such as nm, strings, and objdump to examine the executable will not explode the bomb.
How can I tell if the staff heard my explosion?
The bomb has no secrets -- all the code is right there. If you dig into the code that processes explosions, you can determine for yourself how/when/whether the word gets out. Avoiding the entire explosion is one straightforward approach to ensure that we won't hear about it, but there are ways to selectively disable just the transmission portion.
I have an idea about stopping explosions by <insert-cool-idea-here>. Is this allowed?
For suppressing explosions anything goes! There are simple manual blocks that give some measure of protection, but it is best to go further to develop an invincible guard. Whether you leverage gdb features, tweak the global program state, modify your setup, trick the bomb into running in a safe manner, or hack the bomb executable, we're good with any technique that keeps the explosions quiet.
I just exploded my bomb, but I assure you it was an accident/misunderstanding/not my fault! I hadn't even read the assignment writeup before I started. Can I undo that explosion?
We count all explosions that reach us. Consider it a fun challenge to develop a protection so secure and/or to tread so carefully that you never detonate the bomb, but should an explosion slip through, be assured that no cute baby animals have lost their lives and that an uncaught explosion costs a mere single point.
Do we need to reverse every single line of C source within the level to solve it?
Your goal is to work out a correct input to pass the level. This will require a fairly complete exploration of the code path you follow to defuse, but any code outside that path can be investigated on a need-to-know basis.
My input defuses the level when typed manually, but when I added the same input to input.txt, it explodes. What gives?
When testing on input.txt, we advise you do so with your explosion defense in place against possible editing glitches. The contents of input.txt should consist of the input for each level on its own line and each line should end with a standard Unix newline. Stop in gdb and examine the line read from your file to spot the discrepancy between what you need and what you have. Look carefully for extraneous leading/trailing spaces or mismatched line endings. The unix editors available on myth (emacs, vim, gedit, etc.) use the correct line endings (\n) by default. Editors on other platforms that are using the line-ending conventions for Mac (\r) or Windows (\r\n) will cause you grief. The easiest approach to avoid problems is to edit the input.txt file using a unix editor on a unix system.
I found some other assembly reference material that seems syntactically/logically inconsistent with the assembly from our textbook/lecture/tools. What's up?
The gnu tool chain defaults to the att (AT&T) syntax and all of our materials (text, lecture, lab) are consistent with this syntax. If you hunt down other resources in the wild, you may encounter Intel syntax where the order of operands is reversed, register names are not prefixed with %, immediate values are not prefixed with $, indirection is expressed with brackets instead of parentheses, operand size is given by a keyword such as DWORD PTR instead of an instruction suffix, and so on. For example, the att instruction push %rbp is written as push rbp in Intel and att movl $1, (%rsp) becomes mov DWORD PTR [rsp], 1. Translating between them can be confusing, so it's recommended that you stick to resources that use the same syntax as our tools/text.
I hate tui! I just managed to si through an explosion because I was confused about the next instruction to be executed. How can I make tui behave?
Grr, I have a love-hate relationship with tui myself. Whoever is responsible for it was obviously not a CS107 alum who has learned to thoroughly test their code, no? Remedies to try in order of increasing desperation:
- refresh early and often
- exit tui using ctrl-x a and re-enter (this doesn't require leaving gdb and losing all your state)
- quit gdb and start all over (this can be made less annoying by use of .gdbinit to re-set the state for you)
Keep a list of what actions seem to trigger problems for you and avoid doing those things (for example, on my Mac, resizing my terminal window while in active tui mode creates unresolvable havoc, so I don't do that). The split reg/asm window is such a great way to follow along while single-stepping that it's worth a little pain to baby tui along. However, if your anti-explosion strategy relies on you choosing the appropriate next action, you are susceptible to being misled by tui at a critical time, so don't even attempt tui until you have a rock-solid automated defense.
How do I print register values in gdb?
The gdb command info reg will show the current value for all registers. You can also access individual register values for use in gdb commands such as print, examine, or display. The register names are prefixed by dollar sign in gdb. gdb gives each register a default type (for example, $rsp is treated as void* and a general-purpose register such as $rax as an integer); you can apply a typecast to change the interpretation. Some examples:
(gdb) p/t $rax # print %rax, binary
(gdb) p (char *)$rax # print %rax, interpret as char*
(gdb) x/2wd $rax # examine memory (deref %rax), show 2 ints
(gdb) display/2gx $rsp # auto-print 2 quadwords from stack top in hex
To use the register value in a larger expression, be sure to use C syntax, not assembly. For example, if you need to dereference a register, apply *, not wrap in parentheses. If you ask gdb to evaluate an expression in assembly syntax, it handles it fairly oddly:
(gdb) p ($rax) # parens ignored, ($rax) same as $rax
(gdb) p 0x8($rsp) # gdb will segfault on this
Instead use C syntax, including typecast where necessary.
(gdb) p *(long *)$rax
(gdb) p *(long *)((char *)$rsp + 8)
The disassembly shows %eax being set to 0 before certain function calls. What's with that?
Variable argument functions (e.g., printf and scanf variants) require a little extra setup relative to normal calls. Under the x86-64 calling conventions, the caller of a variable argument function must indicate the presence of any float/double arguments by setting %al (the low byte of %rax) to the count of vector registers used. If none are used (i.e., no arguments of float/double type), the caller sets it to zero, which is why you see %eax being zeroed just before the call.