Due: Sat Mar 4 11:59 pm
Ontime bonus 5%. Grace period for late submissions until Mon Mar 6 11:59 pm
Assignment by Julie Zelenski & Michael Chang, ideas taken from the original by Randal Bryant & David O'Hallaron (CMU)
Learning goals
Completing this assignment will give you:
- solid practice in reading and tracing assembly code
- a clear understanding of how data access, control structures, and function calls translate between C and asm
- experience reverse-engineering object code
- a compelling reason to invest in mastering the gdb debugger!
Walthrough video
This is a walkthrough video to help you get started. (Suggestion: watch first before reading the spec)
The problem
Those nefarious Cal students have broken into our myth machines and planted some mysterious executables we are calling "binary bombs." These programs are believed to be armed and dangerous. Without the original source, we don't have much to go on, but we have observed that the programs seem to operate in a sequence of levels. Each level challenges the user to enter a string. If the user enter the correct string, it defuses the level and the program proceeds on. But given the wrong input, the bomb explodes by printing an earth-shattering KABOOM! and terminating. To deactivate the entire bomb, one needs to successfully defuse each of its levels.
The Cal students have littered our systems with these landmines and we need your help. Each of you is given a bomb to disable. Your mission is to apply your best asm detective skills to work out the input required to pass each level and render the entire bomb harmless.
Your bomb is given to you as an executable, i.e. as compiled object code. Someone has salvaged a scrap of paper containing the C source for main, but no further source code is to be found. From the assembly for the rest of the executable, you will work backwards to construct an picture of the original C source in a process known as reverse-engineering. Once you understand what makes your bomb "tick", you can supply each level with the input it requires and defuse it. The levels get progressively more complex, but the expertise you gain as you move up from each level should offset this difficulty. One confounding factor is that the bomb has a hair trigger, prone to exploding at the least provocation. Each time your bomb explodes, it notifies the staff, which deducts from your score. Thus, there are consequences to detonating the bomb-- you must tread carefully!
Reverse-engineering requires a mix of different approaches and techniques and will give you an opportunity to practice with a variety of tools (nm, objdump, strings). The most powerful weapon in your arsenal will be the debugger and an important learning goal of the assignment is to expand your gdb prowess. Building a well-developed gdb repertoire can pay big dividends the rest of your career!
Logistics
Our counter-intelligence efforts been able to confirm a few things about how the bombs operate:
- If you start the bomb with no command-line argument, it reads input typed at the console.
-
If you give an argument to the bomb:
./bomb input.txtthe bomb will read lines from
input.txtuntil it reaches EOF (end of file), and then switch over to reading from the console. This feature allows you to store inputs for solved levels ininput.txtand avoid retyping them each time. -
Explosions can be triggered when executing at the shell or within gdb. However, gdb offers you tools you can use to intercept explosions, so your safest choice is to work under gdb and employ protective measures.
- The bomb in your repository was lovingly created just for you and is unique to your id. It is said that the bomb can detect if an impostor attempts to execute your bomb and won't play along.
- The bombs are designed for the myth computers (running on the console or logged in remotely). There is a rumor that the bomb will refuse to run anywhere else.
- The bombs were compiled from C code using gcc. Apparently Cal students don't know how to edit a Makefile to change the flags to achieve much obfuscation of the object code.
- The Cal students also weren't aware the function names would be visible in the object code, so they didn't take pains to disguise them. Thus, a function name of
initialize_bomborread_five_numberscan be a clue. Similarly, they played it straight with use of the standard C library functions, so if you encounter a call toqsortorsscanf, it is the real deal. - Direct modification of the binary bomb executable can change its behavior, but be forewarned that we will test your submission against your original unmodified binary, so while hacking the executable is great fun, it won't be of much use as a strategy for solving the levels. (Although it can be an entertaining and educational exercise in suppressing explosions...)
-
There is one important restriction: Do not use brute force! You could write a program to try every possible input to find a solution. But this is trouble for several reasons:
- You lose points on every incorrect guess which explodes the bomb.
- A notification is sent on each bomb explosion. Wild guessing will saturate the network, creating ill will among other users and attracting the ire of the system administrators who have the authority to revoke your privileges because you are abusing shared resources.
- We haven't told you how long the strings are, nor have we told you what characters they can contain. Even if you made the (wrong) assumptions that they all are less than 80 characters long and only contain lowercase letters, you will have 2680 guesses for each level. Trying them all will take an eternity, and you will not have an answer before you graduate.
- Part of your submission requires answering questions that show your understanding of the assembly code, which guessing will not provide. :-)
-
Don't miss out on the companion advice page: Go to advice/FAQ page
Readme file
The readme.txt file contains a set of questions for you to answer, one about explosions and a level-specific question for each level. The level questions are a mix of reversing tasks and short answer questions. You are to edit the readme to include your answers to these questions and submit it with your defused bomb.
For one of the levels, we ask you to reverse a assembly sequence back into C code. For this reversing task, your version need not be line-for-line perfectly faithful reproduction of the original C source, but it should have accurate "shape" and structure to match the constructs being used (if/else, loops) and the control flow. Given an if statement, does it include an else, is there a sequence of if statements that are nested or cascading? Given a loop, where does it start and stop? How does increment? Can you tell if the loop is a for, while, or do-while? Is there any use of break or continue in the loop body? For function calls, what are the parameters being passed? How/what does it return?
The requirement for the readme file is not intended to be onerous. Strunk and White famously say "vigorous writing is concise" and we couldn't agree more. The short answer questions are intended be answered in just a few sentences. Excess verbiage can weaken an otherwise correct explanation, so you're best off sticking the facts!
Grading
For this assignment, the total number of points is expected to be 67, distributed as follows:
- Input.txt. (40 points) Each level you have solved earns 8 points. We will test by running
bomb input.txton your submission. Theinput.txtfile in your submission should contain one line for each level you have solved, starting from level 1, with no extraneous text. - Bomb explosions. (up to -6 points deducted) Each bomb explosion notification that reaches the staff results in a 1 point deduction, capped at -6 points total.
- Readme (27 points) 2 points for explosion suppression, 5 points for each level question. A clear, concise, correct answer will earn full credit. Answers that are vague, inaccurate, or overly verbose may earn partial credit. For the reversing task, we award full credit to a reasonably faithful reproduction of the C source.
Submissions received by the assignment deadline will receive an on-time bonus equal to 5% of the points earned.
Getting started
The starter project contains your compiled binary bomb, the C source for bomb.c including the main function, two text files, and a .gdbinit file. Check out a copy of your cs107 repository with the command:
hg clone /afs/ir/class/cs107/repos/assign5/$USER assign5
Do not start by running the bomb to "see what it will do..." You will quickly learn that "what it does" is explode :-) When started, it immediately goes into waiting for input and when you enter the wrong response, it will explode and deduct points. Your first task should be to put your kid gloves on and carefully poke around. Once you figure how to set up appropriate protection against explosions, you will then be free to experiment with the levels without any nail-biting anxiety about setting off the bomb.
As you solve successive levels, add the necessary input to the input.txt file and answer the level-specific question in the readme.txt file. Be sure to regularly commit your changes to these files.
Finishing
The committed version of input.txt should contain the strings that correctly defuse the levels that you solved and readme.txt should contain the answers to the questions. When grading, we will run your bomb on your input.txt to verify which levels you have successfully defused. Malformed entries in your input.txt or wrong line-endings (see FAQ at end of advice page) will cause grading failures. To avoid surprises, be sure that you have verified your input.txt in the same way we will in grading (i.e. bomb input.txt). When finished, you must submit to send your files for grading. If needed, review our late policy.