Project 1: The DEET Debugger

In this project, you’ll implement the DEET debugger (Dodgy Eliminator of Errors and Tragedies) to get the deets on those pesky bugs in your code.


This project will give you practice with multiprocessing in Rust, and will give you a better sense of how processes are managed by the operating system as well as how ptrace can be used to circumvent process boundaries. While DEET is simpler and less powerful than GDB, you’ll experience the mechanics that all debuggers are based on. We welcome you to add your own features to build the debugger that you would want to use!

This is a big and complex project that will synthesize everything you’ve learned so far. Please ask questions on Slack if anything is unclear or if you’re feeling stuck/confused!

Logistics

This project is due on Sunday, February 20th at 11:59 PM Pacific time.

You may work individually or in a group of 2-3.

If you would be interested in working on a different project and have a proposal in mind, let me know! This is a small class, and I would love to support your individual interests.

You can find a copy of my in-class project 1 walkthrough here.

Finally, please ask questions in the #proj1-discussion channel (or to me directly)! It’s still very easy to get tripped up by Rust syntax and mechanics, and there are some nontrivial concepts at play here as well. I want this project to be an opportunity to build experience with Rust and learn a bit about computer systems!

Working with a partner

If you work in a group, you have two options:

Only one person submits:
- Add a comment to the top of main.rs including all partners' names and SUNet IDs (Stanford email usernames).
- Message me on Slack to let me know you’re working together and which repository you’ll be working in. I’ll add the rest of the project group to that repository.
Work together but submit separately: If you’d like to collaborate closely, as if you’re in a group, but still write your own code, you’re welcome to do that.
- In this case, add a comment to the top of main.rs indicating whom you worked with.
- NOTE: As always, you’re welcome to discuss the projects with anyone in the class, but you should keep your discussions high-level with anyone outside of your group. Outside of your group, the CS110 honor code policies apply.

If you are submitting together, I strongly recommend that you do not simply split up the milestones below and then pass off the work. If at all possible, work together synchronously over an audio or video call. If you do split up work, make sure that you check in regularly and document what you do (e.g., good comments). This project is sufficiently complex that each of you needs to understand all of the parts involved, and I think you’ll benefit the most if you work closely with your partner to figure out how to solve problems and structure your code instead of working separately.

If you are working in a team, check out the tools tips here for live collaboration and working with git.

Getting set up

You should have received an invite to join this project’s Github repository. If you didn’t get an email invite, try going to this link:

https://github.com/cs110l/proj1-YOURSUNETID

You can download the code using git as usual:

git clone https://github.com/cs110l/proj1-YOURSUNETID.git proj1

We recommend developing this project on a Linux system with an x86 architecture. Unfortunately, the ptrace interface differs between Linux and BSD-derived systems (e.g., macOS) and is not available on Windows, and different systems store debugging symbols in different ways. Additionally, we rely on registers that only exist on x86, and “emulated” x86 systems (e.g., pretending to be x86 on ARM) don’t seem to support ptrace.

While it is certainly possible to extend your debugger to work on multiple platforms, we will only target Linux and x86 here for simplicity.

If you’re not on Linux and x86 already, you can use myth or rice. If you’re not on an M1 Mac, you can, alternatively, get a Linux setup locally using any of the options described on this handout. There’s also some more background about why local options for this project won’t work on an M1 Mac, in case it’s of interest.

Milestone 0: Read the starter code

This is the first large project in CS 110L, and it may be one of your first times working with a more substantial codebase. Take some time to orient yourself with the starter code, writing/drawing things out as necessary.

There are a few files you should be aware of:

main.rs is a short file that serves as the entrypoint for the program. You won’t need to make any changes here, apart from adding two mod declarations in Milestone 3.
debugger.rs contains the code that implements the command-line interface for DEET. You’ll be making a lot of changes here.
debugger_command.rs contains some code for parsing commands that are typed into DEET. Any time you add a new command, you’ll need to add code here.
inferior.rs contains code to manage child processes being run by the debugger. As you add features that involve controlling the program being debugged, you will need to add code here.
dwarf_data.rs contains a series of helper functions for extracting debugging symbols (e.g. line numbers, variable names, function names) from the executable being debugged. You won’t need to make any changes here, but you will need to use these functions in Milestone 3.
gimli_wrapper.rs contains functions that are used to read debugging symbols from a binary file. It is messy code patched together from several Gimli examples; please don’t read it :) (unless you plan to do an extension and need to collect more information from the dwarf file)

In addition, we have provided a series of sample programs that you can use to test your debugger. These programs are written in C and are in the samples/ directory, although we’d like to note that you could use DEET to debug Rust programs as well!

You should run make to compile the sample programs before proceeding.

⚠️ If you’re using Vagrant or Docker, be careful to run make inside the VM/container instead of in your regular terminal, or else you might compile Windows or MacOS executables instead of Linux ones, causing confusion later on :) ⚠️

Milestone 1: Run the inferior

In this milestone, you will modify the debugger to start an inferior. An inferior is a process that is being traced by the debugger.

Currently, code in debugger_command and debugger extracts arguments from the r command and passes them to Inferior::new:

🍌 cargo run samples/sleepy_print
   Compiling deet v0.1.0 (/deet)
    Finished dev [unoptimized + debuginfo] target(s) in 13.41s
     Running `target/debug/deet samples/sleepy_print`
(deet) r 3
Inferior::new not implemented! target=samples/sleepy_print, args=["3"]
Error starting subprocess
(deet)

Your first job is to implement Inferior::new to spawn a child process running our target program. This child process should have debugging enabled, which you can accomplish using the ptrace syscall: the child process can call ptrace with PTRACE_TRACEME after fork() but before exec, telling the operating system “hey! please allow my parent process to trace my execution.”

Note: before modifying Inferior::new, you’ll need to add use declarations at the top of inferior.rs for the items you rely on. As a hint, Command is std::process::Command, and pre_exec comes from the std::os::unix::process::CommandExt trait.

In Inferior::new, you should do the following things:

Create a Command for the target program with the provided arguments.
Use pre_exec to call child_traceme in the child process. See the lecture 10 slides for an example of how to call pre_exec.
spawn the process. If spawn fails, you should return None.
- (The ok()? syntax from the Week 3 exercises might be helpful!)
Wait, then verify that the child has started and is set up to be traced:
- When a process that has PTRACE_TRACEME enabled calls exec, the OS will load the specified program into the process, and then, before the program starts running, it will pause the process with SIGTRAP.
- You should call waitpid to wait until the child process has started and paused, and confirm that it paused with the signal SIGTRAP.
- You can call waitpid directly, or you can construct the Inferior you’ll ultimately return and use the Inferior::wait method provided. If you need a pid to pass to waitpid, check out the sample code in Inferior::pid.
- Note: if this check fails – waitpid returns, but the child process isn’t stopped with SIGTRAP – return None.
Finally, assuming everything went well, return a Some option containing an Inferior with the Child you got from spawn-ing!
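
The first three steps can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the starter code's implementation: spawn_target is a hypothetical helper name, and the pre_exec closure is only a placeholder for the call to child_traceme, which is not reproduced here. The waitpid/SIGTRAP check is also left out.

```rust
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

/// Hypothetical helper: spawn `target` with `args`, returning None on failure.
fn spawn_target(target: &str, args: &[String]) -> Option<Child> {
    let mut cmd = Command::new(target);
    cmd.args(args);
    // pre_exec is unsafe because the closure runs in the child between
    // fork and exec, where only a limited set of operations is safe.
    unsafe {
        cmd.pre_exec(|| {
            // Placeholder: in DEET, call child_traceme here so the child
            // asks to be traced (PTRACE_TRACEME) before exec.
            Ok(())
        });
    }
    // ok() turns the Result into an Option, so a failed spawn becomes None.
    cmd.spawn().ok()
}
```

Inside a function that itself returns Option, writing cmd.spawn().ok()? would give you the Child directly and return None early on failure.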

At this point, you’ve created an Inferior! I’d recommend running cargo build to make sure everything compiles.

As mentioned, PTRACE_TRACEME causes programs to start in a stopped state. In order to test this milestone, you’ll need to implement a way to get the program to execute.

To do this, implement a cont method on Inferior. This method should wake up the inferior and run it until it stops or terminates.

Add a pub fn to the impl Inferior block. (Note: you won’t be able to call it continue, since that’s a reserved keyword.)
To wake up the inferior, you can use ptrace::cont (pass None for sig). To wait, you can use self.wait(None).
Our continue method returns Result<Status, nix::Error> in order to pass on the resulting program status or any errors to the caller. We use the ? operator to simplify error handling.
Update Debugger::run to call this continue method after it constructs an Inferior.
Use the status returned from your continue method to print a message about the status of the inferior. (You’re welcome to panic if continue results in an Error.)
- Note: for our continue method, because we’re using the Status type defined in inferior.rs, we changed our use declaration at the top of debugger.rs to use crate::inferior::{Inferior, Status}.
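
Printing the status is a good candidate for a small helper that both run and, later, continue can share. The sketch below is only an illustration: the Status enum here is a simplified, hypothetical stand-in (it stores signal names as strings, and its Exited and Signaled variants are assumptions), and the wording of the Signaled message is made up. Match on the real Status type from inferior.rs in your own code.

```rust
/// Hypothetical stand-in for the Status type in inferior.rs.
enum Status {
    /// Signal name and the value of %rip when the child stopped.
    Stopped(String, u64),
    /// Exit code of a child that exited normally.
    Exited(i32),
    /// Name of the signal that terminated the child.
    Signaled(String),
}

/// Build the message the debugger prints when the inferior stops or ends.
fn describe_status(status: &Status) -> String {
    match status {
        Status::Stopped(sig, _rip) => format!("Child stopped (signal {})", sig),
        Status::Exited(code) => format!("Child exited (status {})", code),
        // Illustrative wording only; pick whatever message you prefer.
        Status::Signaled(sig) => format!("Child exited due to signal {}", sig),
    }
}
```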

Expected outcomes:

You can start inferiors and pass arguments using the run command
When an inferior stops or terminates, the debugger should print a message (e.g. Child exited (status 0))
You can run a program multiple times within a debugging session

Example output:

🍌 cargo run samples/sleepy_print
    Finished dev [unoptimized + debuginfo] target(s) in 1.94s
     Running `target/debug/deet samples/sleepy_print`
(deet) r 3
0
1
2
Child exited (status 0)
(deet) r 3
0
1
2
Child exited (status 0)
(deet)

Milestone 2: Stopping, resuming, and restarting the inferior

Sometimes, when a process deadlocks, it is helpful to temporarily stop it, poke around (e.g. print a backtrace to see where it is deadlocked), then resume it. In this milestone, we will add the ability to pause and resume an inferior.

As it happens, our debugger already has the ability to pause an inferior. Normally, SIGINT (triggered by Ctrl-C) will terminate a process, but if a process is being traced under ptrace, SIGINT will cause it to temporarily stop instead, as if it were sent SIGSTOP. (The same is true for all signals that typically terminate a process. This is useful for debugging: if a program segfaults but is being traced under ptrace, the program will stop instead of terminating so that you can get a backtrace and inspect its memory.) You can try this out: run samples/sleepy_print under your debugger with the argument 5. Press ctrl+c, and the program will stop.

Now, we need a way to resume a stopped process. Let’s add a continue command, similar to the one GDB has:

To add a command, you’ll need to add an enum variant to the DebuggerCommand enum in debugger_command.rs, and you’ll need to update DebuggerCommand::from_tokens to return your new variant when c, cont, or continue are typed in DEET.
Then, update Debugger::run to continue the inferior when the continue command is typed. You can use your continue method from the previous milestone.
Print the status of the inferior when it stops or terminates next, similar to the run command. (Might be a good place for decomposition?)

🍌 cargo run samples/sleepy_print
    Finished dev [unoptimized + debuginfo] target(s) in 2.56s
     Running `target/debug/deet samples/sleepy_print`
(deet) run 5
0
1
^CChild stopped (signal SIGINT)
(deet) cont
2
3
^CChild stopped (signal SIGINT)
(deet) cont
4
Child exited (status 0)
(deet)
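
Parsing the continue command described above could look roughly like this. It is a sketch, assuming a from_tokens that takes the whitespace-split tokens of the input line; the existing variants, the accepted aliases for quit and run, and the exact signature in the starter code may differ.

```rust
/// Sketch of the command enum; the starter code's variants may differ.
#[derive(Debug, PartialEq)]
enum DebuggerCommand {
    Quit,
    Run(Vec<String>),
    Continue,
}

impl DebuggerCommand {
    fn from_tokens(tokens: &[&str]) -> Option<DebuggerCommand> {
        // An empty line has no command word, so there is nothing to parse.
        let (first, rest) = tokens.split_first()?;
        match *first {
            "q" | "quit" => Some(DebuggerCommand::Quit),
            "r" | "run" => Some(DebuggerCommand::Run(
                rest.iter().map(|s| s.to_string()).collect(),
            )),
            "c" | "cont" | "continue" => Some(DebuggerCommand::Continue),
            // Anything else is not a recognized command.
            _ => None,
        }
    }
}
```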

To close out this milestone, take care of a few edge cases:

What happens if you type continue before you type run? Your implementation should check whether an inferior is running, and print an error message if there is not one running.
Also, what happens when you pause an inferior using ctrl+c, then type run? You should take care to kill any existing inferiors before starting new ones, so that there is only one inferior at a time.
- You can use Child::kill to kill a process, and then you’ll need to reap the killed process.
- (We added an Inferior::kill method and called this from Debugger::run, although you are not required to do so.)
Similarly, what happens if you exit DEET while a process is paused? You should update the handling of DebuggerCommand::Quit to terminate (and reap!) the inferior if one is running.

(deet) run 5
0
1
^CChild stopped (signal SIGINT)
(deet) quit
Killing running inferior (pid 216)
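
Killing and reaping can be sketched with std::process::Child alone. The helper name kill_and_reap is hypothetical; in DEET this logic would more naturally live in something like the Inferior::kill method mentioned above.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Hypothetical helper: kill the child, then wait on it so it is reaped.
fn kill_and_reap(child: &mut Child) -> io::Result<ExitStatus> {
    println!("Killing running inferior (pid {})", child.id());
    // On Unix, Child::kill sends SIGKILL to the process.
    child.kill()?;
    // Waiting collects the exit status, so no zombie entry is left behind.
    child.wait()
}
```

Skipping the wait step is what leaves a <defunct> entry in the process list.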

If you want to test your management of child processes, use DEET to start a sleepy_print inferior, pause it, start a new inferior, and pause that second inferior. In a separate terminal, run ps aux | grep sleepy_print (or docker exec deet ps aux | grep sleepy_print if you are using docker). This will search for all processes you’ve started running sleepy_print. There should only be one samples/sleepy_print process. If you see multiple, or you see a <defunct> entry, then you are not killing or reaping child processes properly.

🍌 cargo run samples/sleepy_print
   Compiling deet v0.1.0 (/deet)
    Finished dev [unoptimized + debuginfo] target(s) in 29.80s
     Running `target/debug/deet samples/sleepy_print`
(deet) cont 
Nothing is being debugged!
(deet) run 5
0
1
^CChild stopped (signal SIGINT)
(deet) run 5
Killing running inferior (pid 204)
0
1
^CChild stopped (signal SIGINT)
(deet)

# Note: Run "ps aux | grep sleepy_print" if you aren't using Docker
🍌 docker exec deet ps aux | grep sleepy_print
501          1  0.6  0.2  16292  4448 pts/0    Ss+  10:29   0:00 target/debug/deet samples/sleepy_print
501        210  0.0  0.0   4504   704 pts/0    t+   10:29   0:00 samples/sleepy_print 5

Expected outcomes:

You can pause an inferior using ctrl+c.
You can resume an inferior using continue.
Inferiors can be paused/resumed several times.
The status of the inferior is printed whenever it stops/terminates.
At most one inferior process exists at any time. No zombie processes!
Any running inferior is terminated when the debugger quits.

Milestone 3: Printing a backtrace

In this milestone, you’ll implement code to print a stack trace for a paused program.

To start with, let’s get the backtrace command set up:

Define a new DebuggerCommand that is returned when the user types bt, back, or backtrace.
Define a method print_backtrace(&self) -> Result<(), nix::Error> in Inferior that prints “Hello world!” and returns an empty Ok (written as Ok(())).
Call this method when the user types a backtrace command.

Test this out to ensure that your debugger is able to read and process the backtrace command:

🍌 cargo run samples/sleepy_print
   Compiling deet v0.1.0 (/deet)
    Finished dev [unoptimized + debuginfo] target(s) in 29.80s
     Running `target/debug/deet samples/sleepy_print`
(deet) r 5 
^CChild stopped (signal SIGINT)
(deet) back
Hello world!
(deet)

Once you have done this, let’s move onto implementing a real print_backtrace.

As a first step, let’s print out the value of the %rip register. %rip is the instruction pointer, so printing its contents will give us the address of the instruction that the target process is executing.

You can use ptrace::getregs to get the inferior’s register values (propagating the error if this fails). Use println!("{:#x}", ...) to print the register value in hexadecimal. Note that you may see a different value than us depending on the machine you are compiling on.

🍌 cargo run samples/segfault
    Finished dev [unoptimized + debuginfo] target(s) in 2.61s
     Running `target/debug/deet samples/segfault`
(deet) run
Calling func2
About to segfault... a=2
Child stopped (signal SIGSEGV)
(deet) back
%rip register: 0x400b95
(deet)

Great, we’re printing something! Specifically, for the program above, we’re printing the address of the instruction being executed when the segfault happened. In other words, we’re printing a representation of the instruction that caused the segfault. However, we’re not printing a very useful representation.

In order to be useful, a backtrace should show function names and line numbers so that a programmer can identify which part of their program is running. (This is called “source-level debugging”.) However, a running executable consists only of machine instructions and has no awareness of function names or line numbers. In order to print such information, we need to read extra debugging symbols that are stored within an executable compiled specifically for debugging. This debugging information stores mappings between addresses and line numbers, functions, variables, and more. With this information, we can find where variables are stored in memory or figure out what line is being executed based on the value of the processor’s instruction pointer.

On many platforms, debugging symbols are stored in a format called DWARF and embedded inside the executable file. In developing this assignment, we discovered that DWARF is extremely complicated, and there are not yet any good high-level DWARF parsers in Rust. In order to avoid subjecting you to the same pain we went through, we have provided you with some functions in dwarf_data.rs that you can use in your debugger implementation.

To use these functions, you should first add these two lines to main.rs:

mod dwarf_data;
mod gimli_wrapper;

This defines the public functionality in the dwarf_data and gimli_wrapper files as “modules” for our project.

Then, in debugger.rs, add use crate::dwarf_data::{DwarfData, Error as DwarfError} to the top of the file. This will allow you to directly refer to DwarfData and DwarfError in your code. Then, at the beginning of Debugger::new, use the following code to load the target executable file into our “dwarf data” format:

let debug_data = match DwarfData::from_file(target) {
    Ok(val) => val,
    Err(DwarfError::ErrorOpeningFile) => {
        println!("Could not open file {}", target);
        std::process::exit(1);
    }
    Err(DwarfError::DwarfFormatError(err)) => {
        println!("Could not load debugging symbols from {}: {:?}", target, err);
        std::process::exit(1);
    }
};

You should store debug_data inside the Debugger struct (i.e., add a new member to the struct.)

Let’s update print_backtrace to be more helpful.

Add a DwarfData parameter. In Debugger::run, pass in the debug_data from your debugger struct. As a reminder, this provides you with a mapping from the memory addresses of instructions when they are executed to files/lines/functions/etc. in your source code.
Armed with your %rip value and the DwarfData struct associated with your target executable, use DwarfData::get_line_from_addr to get the file name and line number corresponding to the current instruction, and use DwarfData::get_function_from_addr to get the function name.
- (Note: as usual, you’ll have to use crate::dwarf_data::DwarfData at the top of the file.)

Print out this information, and you will have the start of something useful:

👾 cargo run samples/segfault
    Finished dev [unoptimized + debuginfo] target(s) in 2.43s
     Running `target/debug/deet samples/segfault`
(deet) r
Calling func2
About to segfault... a=2
Child stopped (signal SIGSEGV)
(deet) back
func2 (/deet/samples/segfault.c:5)
(deet)

Amazing!

We can now see where the program is stopped, but we want to show a full stack trace: what function called func2, and what functions came before that? To figure this out, we need to understand a bit more about how the stack is laid out.

The stack consists of stack frames, where each function’s local variables are placed in its own stack frame. At the top of each stack frame is a return address, which stores the address in the text segment where we should go to after returning from this function.

When printing a backtrace, we do so using the return addresses. First, we print the line number corresponding to %rip (where we are currently executing). Then, we print the line number corresponding to the return address of our current stack frame. Then, we print the line number for the return address of the previous stack frame, and so on, until we reach the main function.

This may sound simple, but we have a problem: How do we actually find the top of the current stack frame? There are no registers that point to the top of the stack frame, nor is there any information in the executable telling us how large the stack frame is. (Those familiar with assembly may be familiar with %rbp, the base pointer register, which used to serve this purpose but is no longer consistently available, for performance reasons.)

To solve this problem, we can once again rely on DWARF debugging information to figure out how big the stack frame is given which function we are currently executing. The concrete mechanics for this are pretty complicated, but we have provided you with a DebugInfo::get_frame_start_address method that does this for you.

debug_info.get_frame_start_address() will return the address of the top of the stack frame, so you can find the address of the return address by subtracting 8. Then, you can read the return address using ptrace::read. This value becomes the instruction pointer (rip) for the previous stack frame, and the top of the frame returned by get_frame_start_address becomes the stack pointer (rsp) for the previous stack frame. With these new rip and rsp values, we can call get_frame_start_address again to get the top of the previous stack frame, use that to get the return address, and repeat the process, working our way up the stack!

Codifying this process, we can implement a backtrace like this (pseudocode):

instruction_ptr = %rip
frame_bottom = %rsp
while true:
    print function name, file, and line number for instruction_ptr
    if function name == "main":
        break
    frame_top = get_frame_start_address(instruction_ptr, frame_bottom)
     -> if you can't find the top of the frame, print an error, e.g.
        "Warning: unknown stack frame layout, can't unwind further",
        and break out of this loop. (This can happen if, e.g., the stack
        has been corrupted by a buffer overflow.)
    instruction_ptr = read memory at frame_top - 8
    frame_bottom = frame_top

(As a reminder, you can create the equivalent of a while true loop using the loop keyword in Rust.)

To read memory, you can use ptrace::read:

let new_instruction_ptr = ptrace::read(self.pid(), addr_to_read as ptrace::AddressType)? as u64;

(The as keyword implements casting.)
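
To make the shape of the loop concrete, here is a sketch of the pseudocode in Rust that runs without a live process. The three closures are hypothetical stand-ins: function_at for a DWARF function lookup, frame_top_for for get_frame_start_address, and read_word for ptrace::read. In DEET you would call the real functions and propagate their errors instead.

```rust
/// Walk the stack as in the pseudocode, returning one entry per frame.
/// All three closures are hypothetical stand-ins for DWARF and ptrace calls.
fn backtrace<F, T, R>(
    mut instruction_ptr: u64,
    mut frame_bottom: u64,
    function_at: F,
    frame_top_for: T,
    read_word: R,
) -> Vec<String>
where
    F: Fn(u64) -> Option<String>,
    T: Fn(u64, u64) -> Option<u64>,
    R: Fn(u64) -> Option<u64>,
{
    let mut frames = Vec::new();
    loop {
        let name = function_at(instruction_ptr).unwrap_or_else(|| "<unknown>".to_string());
        frames.push(format!("{} ({:#x})", name, instruction_ptr));
        if name == "main" {
            break;
        }
        // Find the top of the current frame, then read the return address
        // stored 8 bytes below it.
        let next = frame_top_for(instruction_ptr, frame_bottom).and_then(|top| {
            let ret = read_word(top.checked_sub(8)?)?;
            Some((ret, top))
        });
        match next {
            Some((ret, top)) => {
                instruction_ptr = ret;
                frame_bottom = top;
            }
            None => {
                frames.push(
                    "Warning: unknown stack frame layout, can't unwind further".to_string(),
                );
                break;
            }
        }
    }
    frames
}
```

Passing the lookups in as closures is only a convenience for trying the loop on fake data; it is not how the starter code is organized.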

When you’ve implemented the above in print_backtrace, you should be able to print a full backtrace:

👾 cargo run samples/segfault
    Finished dev [unoptimized + debuginfo] target(s) in 2.43s
     Running `target/debug/deet samples/segfault`
(deet) r
Calling func2
About to segfault... a=2
Child stopped (signal SIGSEGV)
Stopped at /deet/samples/segfault.c:5
(deet) back
func2 (/deet/samples/segfault.c:5)
func1 (/deet/samples/segfault.c:12)
main (/deet/samples/segfault.c:15)
(deet)

Milestone 4: Print stopped location

When an inferior stops, GDB prints the file/line number that it stopped at. This is extremely helpful when dealing with breakpoints and step debugging, which we will tackle in the next few milestones.

You may have noticed that Status::Stopped includes a u64 containing the value of %rip for the stopped process. Modify your Debugger implementation such that when the inferior stops, if line number information is available from DwarfData::get_line_from_addr (i.e., if this method returns Some), DEET prints the line number where the program stopped. If you’re up for it, you can print the function name as well!

🍌 cargo run samples/segfault
    Finished dev [unoptimized + debuginfo] target(s) in 2.07s
     Running `target/debug/deet samples/segfault`
(deet) r
Calling func2
About to segfault... a=2
Child stopped (signal SIGSEGV)
Stopped at /deet/samples/segfault.c:5
(deet)

Milestone 5: Setting breakpoints

In this milestone, we’ll allow a user to set a breakpoint at a specific memory address using a command like break *0x123456 (or b *0x123456 for short).

First, update DebuggerCommand and Debugger to parse a break command. We recommend storing a simple String target in the DebuggerCommand enum variant, and then doing more sophisticated parsing (e.g. ensuring the target string starts with *, and extracting the address as a u64 from the string) in Debugger. This is because, in the next milestones, you will be updating this code to take different kinds of breakpoints, e.g. breakpoints on function names or line numbers.

You may use this code to parse a u64 from a hexadecimal string:

fn parse_address(addr: &str) -> Option<u64> {
    // Accept an optional "0x"/"0X" prefix before the hexadecimal digits.
    let addr_without_0x = if addr.to_lowercase().starts_with("0x") {
        &addr[2..]
    } else {
        addr
    };
    u64::from_str_radix(addr_without_0x, 16).ok()
}

Note that users should be able to set breakpoints before any inferior is running. (If you make them run the inferior first, it will likely exit before they are able to set breakpoints.) As such, you should store set breakpoints in a Vec<u64> in the Debugger struct. When a user types break *0x123456, you should add 0x123456 to the list of set breakpoints.

(deet) b *0x123456
Set breakpoint 0 at 0x123456

Our implementation prints out a confirmation message along with a breakpoint number, but this is not required.
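
As an illustration, the parsing and bookkeeping could be sketched as follows. parse_breakpoint_target and add_breakpoint are hypothetical helper names, and parse_address is restated here (using strip_prefix) only so that the block stands on its own; it is meant to behave like the helper given above.

```rust
/// Parse a hexadecimal address with an optional 0x prefix.
fn parse_address(addr: &str) -> Option<u64> {
    let digits = addr
        .strip_prefix("0x")
        .or_else(|| addr.strip_prefix("0X"))
        .unwrap_or(addr);
    u64::from_str_radix(digits, 16).ok()
}

/// Hypothetical helper: accept targets of the form `*ADDRESS` only.
fn parse_breakpoint_target(target: &str) -> Option<u64> {
    // strip_prefix returns None when the target does not start with '*'.
    parse_address(target.strip_prefix('*')?)
}

/// Hypothetical helper: record a breakpoint and build the confirmation.
fn add_breakpoint(breakpoints: &mut Vec<u64>, addr: u64) -> String {
    breakpoints.push(addr);
    // The breakpoint number is its index in the list.
    format!("Set breakpoint {} at {:#x}", breakpoints.len() - 1, addr)
}
```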

When creating an Inferior, you should pass Inferior::new a list of breakpoints. In Inferior::new, after you wait for SIGTRAP (indicating that the inferior has fully loaded) but before returning, you should install these breakpoints in the child process.

How does one set a breakpoint on a process? The answer is more hacky than you might expect, yet this is exactly how GDB works. To set a breakpoint on the instruction at 0x123456, simply use ptrace to write to the child process’s memory, replacing the byte at 0x123456 with the value 0xcc. This corresponds to the INT3 (breakpoint interrupt) instruction; a traced process that executes it is stopped with SIGTRAP.

This is simple in concept but slightly challenging in practice because ptrace does not support writing single bytes to a child’s memory. In order to write a byte, you must read a full 8 bytes into a long, use bitwise arithmetic to substitute the desired byte into that long, and then write the full long back to the child’s memory. Additionally, despite the nix crate’s ptrace having a much nicer interface than the ptrace syscall, it’s still a bit funky to use (it requires some bizarre type conversions). As such, we would rather you not spend time on trying to figure out how to do this. You may use the following code:

use std::mem::size_of;

fn align_addr_to_word(addr: u64) -> u64 {
    addr & (-(size_of::<u64>() as i64) as u64)
}

impl Inferior {
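    /// Writes `val` to the byte at `addr` in the inferior's memory and
    /// returns the byte that was previously stored there.
    fn write_byte(&mut self, addr: u64, val: u8) -> Result<u8, nix::Error> {
        // ptrace reads and writes whole 8-byte words, so work on the
        // aligned word that contains `addr`.
        let aligned_addr = align_addr_to_word(addr);
        let byte_offset = addr - aligned_addr;
        let word = ptrace::read(self.pid(), aligned_addr as ptrace::AddressType)? as u64;
        // Save the original byte, clear it in the word, then splice in `val`.
        let orig_byte = (word >> 8 * byte_offset) & 0xff;
        let masked_word = word & !(0xff << 8 * byte_offset);
        let updated_word = masked_word | ((val as u64) << 8 * byte_offset);
        ptrace::write(
            self.pid(),
            aligned_addr as ptrace::AddressType,
            updated_word as *mut std::ffi::c_void,
        )?;
        Ok(orig_byte as u8)
    }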
}
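
If the bit arithmetic is unfamiliar, it may help to try it on plain integers first. The sketch below repeats the same masking steps without ptrace; replace_byte is a hypothetical name, and align_addr_to_word is written as addr & !7, which should be equivalent to the version above.

```rust
/// Round an address down to the start of its 8-byte word.
fn align_addr_to_word(addr: u64) -> u64 {
    addr & !7
}

/// Replace the byte at `byte_offset` (0 = least significant) in `word`.
/// Returns the updated word and the byte that was there before.
fn replace_byte(word: u64, byte_offset: u64, val: u8) -> (u64, u8) {
    let shift = 8 * byte_offset;
    // Shift the target byte down to the low 8 bits and keep only those bits.
    let orig_byte = ((word >> shift) & 0xff) as u8;
    // Clear the target byte, then OR in the new value at the same position.
    let masked_word = word & !(0xffu64 << shift);
    (masked_word | ((val as u64) << shift), orig_byte)
}
```

For a breakpoint at 0x400b6d, the aligned word starts at 0x400b68 and the byte to replace sits at offset 5. Because x86 is little-endian, offset 0 is the least significant byte of the word that ptrace returns.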

You can test this by modifying Debugger::new to call debug_data.print(). This will print out a list of locations in the loaded binary. You can set a breakpoint on one of these locations, and the program should stop there with a SIGTRAP. For example, below, I set a breakpoint at the beginning of func2 (where the segfault is triggered), which happens to be at 0x400b6d for my particular compiler. When I run the program, it does not segfault (since the breakpoint was before the line that causes the segfault), and DEET prints that it stopped on line 3.

👾 cargo run samples/segfault
   Compiling deet v0.1.0 (/deet)
    Finished dev [unoptimized + debuginfo] target(s) in 30.75s
     Running `target/debug/deet samples/segfault`
------
samples/segfault.c
------
Global variables:
Functions:
  * main (declared on line 14, located at 0x400bed, 24 bytes long)
  * func1 (declared on line 9, located at 0x400ba9, 68 bytes long)
    * Variable: a (int, located at FramePointerOffset(-20), declared at line 9)
  * func2 (declared on line 3, located at 0x400b6d, 60 bytes long)
    * Variable: a (int, located at FramePointerOffset(-20), declared at line 3)
Line numbers:
  * 3 (at 0x400b6d)
  * 4 (at 0x400b75)
  * 5 (at 0x400b8c)
  * 6 (at 0x400b97)
  * 7 (at 0x400ba3)
  * 9 (at 0x400ba9)
  * 10 (at 0x400bb1)
  * 11 (at 0x400bbd)
  * 12 (at 0x400be7)
  * 14 (at 0x400bed)
  * 15 (at 0x400bf1)
  * 16 (at 0x400c00)
(deet) break *0x400b6d
Set breakpoint 0 at 0x400b6d
(deet) r
Calling func2
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:3
(deet)

Expected outcomes:

Users should be able to use break *addr to set breakpoints before an inferior starts running
When the inferior starts running, 0xcc should be written to the address of each breakpoint
Users should be able to use break *addr even after an inferior has started running (e.g. you should be able to ctrl+c on a sleeping program and set a breakpoint).

Debugging note: If you get an ESRCH error (no matching process found) while trying to set a breakpoint, make sure you are waiting for the child process to stop (and ensuring it stopped, rather than exiting or being signalled). If you don’t call wait() before setting breakpoints in Inferior::new, then it’s possible you’ll try to set breakpoints before the child process has started running.

Milestone 6: Continuing from breakpoints

Continuing from a breakpoint is as simple and as hacky as setting a breakpoint was.

When we have “hit a breakpoint,” the inferior has executed the 0xcc (INT3) instruction, causing the inferior to pause (due to SIGTRAP). However, the 0xcc instruction overwrote the first byte of a valid instruction in the program. If we continue execution from after 0xcc, we will have skipped a legitimate instruction. Worse, many instructions are multiple bytes long. If we set a breakpoint on a multi-byte instruction and continue execution as is, the CPU will attempt to interpret the second byte of the instruction as a new, separate instruction. It’s likely the program will crash due to a segfault or illegal instruction error.

In order to continue from a breakpoint, we need to replace 0xcc with the original instruction’s value. Then, we need to rewind the instruction pointer (%rip) so that it points at the beginning of the original instruction (instead of pointing one byte in).

After doing this, we can resume execution. However, our breakpoint is no longer in the code, since we have swapped 0xcc for the real instruction. If we had set a breakpoint in a loop or in a function that is called multiple times, this is not ideal!

This problem is addressed with yet another hack. After replacing 0xcc with the original instruction’s first byte, we tell ptrace to continue by just one instruction, instead of completely resuming execution. Then, once the inferior has executed the full instruction, we replace it with 0xcc again to restore the breakpoint. Finally, we call ptrace::cont as usual to resume execution.

Here is pseudocode to implement these strategies in a “continue” method. I have reordered the above to make it slightly easier to implement, but the substance is the same:

if inferior stopped at a breakpoint (i.e. (%rip - 1) matches a breakpoint address):
    restore the first byte of the instruction we replaced
    set %rip = %rip - 1 to rewind the instruction pointer
     -> Be sure to call ptrace::setregs to update the actual register in the child process
    ptrace::step to go to next instruction
    wait for inferior to stop due to SIGTRAP
     -> (if the inferior terminates here, then you should return that status and
        not go any further in this pseudocode)
    restore 0xcc in the breakpoint location

ptrace::cont to resume normal execution
wait for inferior to stop or terminate

Evidently, to do this, you’ll need to keep track of the breakpoints that are installed, as well as the instructions they replaced. You can do this however you like. We maintain a HashMap<u64, u8> mapping breakpoint addresses (u64) to original instruction values (u8).
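
To make the bookkeeping concrete, here is a minimal sketch that runs against a fake, in-memory “inferior” rather than a real process. FakeInferior, install_breakpoint, rewind_if_at_breakpoint, and rearm are hypothetical names invented for this illustration and are not part of the starter code. In DEET, the reads and writes would go through ptrace instead, and the ptrace::step and wait calls from the pseudocode would sit between the rewind and the re-arm.

```rust
use std::collections::HashMap;

/// The one-byte x86 breakpoint instruction.
const INT3: u8 = 0xcc;

/// Hypothetical stand-in for a traced process: a flat byte array plus an
/// instruction pointer. Nothing here touches a real process.
struct FakeInferior {
    base: u64,       // address that memory[0] corresponds to
    memory: Vec<u8>, // pretend code bytes
    rip: u64,        // pretend instruction pointer
}

impl FakeInferior {
    fn read_byte(&self, addr: u64) -> u8 {
        self.memory[(addr - self.base) as usize]
    }

    /// Writes `val` at `addr` and returns the byte that was there before.
    fn write_byte(&mut self, addr: u64, val: u8) -> u8 {
        let idx = (addr - self.base) as usize;
        std::mem::replace(&mut self.memory[idx], val)
    }
}

/// Installs a breakpoint and remembers the byte it replaced.
fn install_breakpoint(inf: &mut FakeInferior, breakpoints: &mut HashMap<u64, u8>, addr: u64) {
    // If a breakpoint is already installed here, the byte in memory is 0xcc;
    // recording that as the "original" would lose the real instruction byte.
    if breakpoints.contains_key(&addr) {
        return;
    }
    let orig = inf.write_byte(addr, INT3);
    breakpoints.insert(addr, orig);
}

/// If (rip - 1) is a known breakpoint, restores the original byte and rewinds
/// rip by one. Returns the breakpoint address so the caller can re-arm it
/// after single-stepping.
fn rewind_if_at_breakpoint(inf: &mut FakeInferior, breakpoints: &HashMap<u64, u8>) -> Option<u64> {
    let addr = inf.rip.checked_sub(1)?;
    let orig = *breakpoints.get(&addr)?;
    inf.write_byte(addr, orig);
    inf.rip = addr;
    Some(addr)
}

/// Puts 0xcc back once the original instruction has been stepped over.
fn rearm(inf: &mut FakeInferior, addr: u64) {
    inf.write_byte(addr, INT3);
}
```

The map is only ever read when continuing, so rewind_if_at_breakpoint takes it by shared reference. Returning the breakpoint address (rather than a bool) is one way to carry “which breakpoint do I need to restore?” across the single-step; you may prefer a different structure.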

Expected outcomes:

Users should be able to set breakpoints at instructions and continue onwards from them.

Debugging tip:

We recommend printing a disassembly of the function where you’re setting a breakpoint. For example, if you’re setting a breakpoint in func2, you should run:

gdb -batch -ex "disassemble/rs func2" samples/segfault

This will allow you to see addresses where valid assembly instructions are. For example, based on my output, I see instructions at 0x400b6d, 0x400b71, 0x400b75, and so on. (Your output will vary based on your compiler/version.)

Every time you call write_byte, print the address you’re writing to, and every time your inferior is stopped (e.g. after a self.wait() call), print out ptrace::getregs(self.pid())?.rip to see where the inferior is executing. If you accidentally set the instruction pointer incorrectly, you might end up setting rip to point in between instructions, or maybe to the wrong place entirely. If you disassemble the binary to see where the valid instructions are, and frequently print out rip to see what is being executed, you’ll have an easier time pinpointing these kinds of problems.

Example output:

As an example, here I run samples/segfault, setting initial breakpoints on lines 15 and 10, then (after running the inferior and hitting the first breakpoint) adding another breakpoint at line 5. You can see that I hit each of the three breakpoints before the program eventually segfaults.

🍌 cargo run samples/segfault
    Finished dev [unoptimized + debuginfo] target(s) in 2.04s
     Running `target/debug/deet samples/segfault`
------
samples/segfault.c
------
Global variables:
Functions:
  * main (declared on line 14, located at 0x400bed, 24 bytes long)
  * func1 (declared on line 9, located at 0x400ba9, 68 bytes long)
    * Variable: a (int, located at FramePointerOffset(-20), declared at line 9)
  * func2 (declared on line 3, located at 0x400b6d, 60 bytes long)
    * Variable: a (int, located at FramePointerOffset(-20), declared at line 3)
Line numbers:
  * 3 (at 0x400b6d)
  * 4 (at 0x400b75)
  * 5 (at 0x400b8c)
  * 6 (at 0x400b97)
  * 7 (at 0x400ba3)
  * 9 (at 0x400ba9)
  * 10 (at 0x400bb1)
  * 11 (at 0x400bbd)
  * 12 (at 0x400be7)
  * 14 (at 0x400bed)
  * 15 (at 0x400bf1)
  * 16 (at 0x400c00)
(deet) break *0x400bf1
Set breakpoint 0 at 0x400bf1
(deet) break *0x400bb1
Set breakpoint 1 at 0x400bb1
(deet) r
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:15
(deet) break *0x400b8c
Set breakpoint 2 at 0x400b8c
(deet) cont
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:10
(deet) cont
Calling func2
About to segfault... a=2
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:5
(deet) cont
Child stopped (signal SIGSEGV)
Stopped at /deet/samples/segfault.c:5
(deet)

Milestone 7: Setting breakpoints on symbols

As a finishing touch, modify your implementation of Debugger to allow setting breakpoints on line numbers and functions in addition to raw addresses.

If the specified breakpoint target starts with *, set a breakpoint on a raw address as you did in the previous two milestones. If the target parses as a u64 without error, treat it as a line number. Finally, if a function exists whose name matches the specified target, set a breakpoint at that function. You should print an error message if none of these cases succeed.

You can use DwarfData::get_addr_for_line and DwarfData::get_addr_for_function to translate a line number or function name into an address. (You can pass None as the first argument to each function, unless you feel like supporting GDB’s syntax that allows for setting a breakpoint on a line in a specific file.) Then, you can simply use your code from the previous milestones to set a breakpoint at an address.
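
As a rough sketch of the dispatch described above, the parsing step can be separated from the address lookup. BreakTarget, parse_address, and parse_break_target are hypothetical names chosen for this example. The sketch only classifies the user’s input; checking whether a function with that name actually exists is left to the DwarfData lookup, which is not shown.

```rust
/// What the user asked to break on (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum BreakTarget {
    Address(u64),
    Line(u64),
    Function(String),
}

/// Parses a hexadecimal address with or without a leading "0x".
fn parse_address(addr: &str) -> Option<u64> {
    let hex = addr
        .strip_prefix("0x")
        .or_else(|| addr.strip_prefix("0X"))
        .unwrap_or(addr);
    u64::from_str_radix(hex, 16).ok()
}

/// Classifies a breakpoint target in the order described in the text:
/// raw address first, then line number, then function name.
fn parse_break_target(target: &str) -> Option<BreakTarget> {
    if let Some(raw) = target.strip_prefix('*') {
        // A leading '*' commits us to the raw-address case; a malformed
        // address is an error rather than a fallthrough to the other cases.
        return parse_address(raw).map(BreakTarget::Address);
    }
    if let Ok(line) = target.parse::<u64>() {
        return Some(BreakTarget::Line(line));
    }
    if target.is_empty() {
        return None;
    }
    // Anything else is treated as a candidate function name; the caller
    // should still report an error if no such function is found.
    Some(BreakTarget::Function(target.to_string()))
}
```

With this split, the Debugger can match on the result, translate Line and Function into an address, and then reuse the existing “set a breakpoint at an address” code for all three cases.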

Once you have this working, you may also want to delete the debug_data.print() call that you added to Debugger::new in Milestone 5, since it isn’t necessary anymore.

🍌 cargo run samples/segfault
   Compiling deet v0.1.0 (/deet)
    Finished dev [unoptimized + debuginfo] target(s) in 26.91s
     Running `target/debug/deet samples/segfault`
(deet) break 15
Set breakpoint 0 at 0x400bf1
(deet) break func1
Set breakpoint 1 at 0x400bad
(deet) break func2
Set breakpoint 2 at 0x400b71
(deet) r
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:15
(deet) c
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:9
(deet) c
Calling func2
Child stopped (signal SIGTRAP)
Stopped at /deet/samples/segfault.c:3
(deet) c
About to segfault... a=2
Child stopped (signal SIGSEGV)
Stopped at /deet/samples/segfault.c:5
(deet)

Voilà! You have a functional debugger ready to knock the socks off of any GDB user!

We hope you enjoyed the process of working through this and are proud of what you’ve built! It may not be the fanciest debugger in town, but you’ve implemented the foundation that all debuggers are built on. Hopefully this also gives you some respect for systems tooling – this was a lot of work, and there’s a lot going on here!

Optional extensions (extra credit)

If you implement any extensions, add a comment to the top of main.rs so that I know you’ve done so! Please indicate what you implemented, how far you got (are there any remaining bugs that you know of?), and how I can test it.

Next line

To implement something like GDB’s “next” command, you can add a single-step method to Inferior that steps forward by one instruction (being careful to manage breakpoints properly). Then, you can call this method in a loop until you end up on a different line, or until the inferior terminates.
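
The loop itself can be sketched independently of ptrace. In the hypothetical helper below, step_instruction stands in for your single-step method (returning the new %rip, or None if the inferior terminated) and line_for_addr stands in for whatever address-to-line lookup you use. Both are assumptions made for this example, passed in as closures so the logic can be exercised without a real inferior.

```rust
/// Outcome of stepping until the source line changes (hypothetical type).
#[derive(Debug, PartialEq)]
enum NextResult {
    NewLine(usize),
    Terminated,
}

/// Single-steps until the inferior reaches an address that maps to a line
/// other than `start_line`, or until it terminates.
fn step_to_next_line(
    start_line: usize,
    mut step_instruction: impl FnMut() -> Option<u64>,
    line_for_addr: impl Fn(u64) -> Option<usize>,
) -> NextResult {
    loop {
        // None means the inferior exited or was killed while stepping.
        let rip = match step_instruction() {
            Some(rip) => rip,
            None => return NextResult::Terminated,
        };
        match line_for_addr(rip) {
            Some(line) if line != start_line => return NextResult::NewLine(line),
            // Same line, or an address with no line info: keep stepping.
            _ => continue,
        }
    }
}
```

Addresses without line information are skipped rather than treated as a new line, so the loop keeps going until it reaches code that the debug info knows about.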

Print source code on stop

Each time the inferior stops, in addition to showing a line number, GDB prints the line of source code that the inferior stopped on. This is extremely helpful when step debugging. It’s not too difficult to implement: since you know the file path and line number, you can read the file and print the appropriate text from it.
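
One possible shape for this, using only the standard library, is shown below. read_source_line is a hypothetical helper name, and re-reading the whole file on every stop is a simplification; caching the file contents would be a reasonable refinement.

```rust
use std::fs;
use std::io;

/// Returns the text of the given 1-based line, or Ok(None) if the file has
/// no such line. I/O failures (e.g. the source file is missing) are returned
/// as errors so the caller can decide whether to print anything.
fn read_source_line(path: &str, line_number: usize) -> io::Result<Option<String>> {
    let contents = fs::read_to_string(path)?;
    Ok(line_number
        .checked_sub(1) // line numbers start at 1; line 0 does not exist
        .and_then(|index| contents.lines().nth(index))
        .map(String::from))
}
```

Since a missing source file shouldn’t stop the debugger from working, a caller would probably ignore the error case and simply print the usual “Stopped at” message on its own.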

Print variables

You may have noticed that we populated DwarfData with a list of variables in each function. Using this information, you can implement something like GDB’s print command to inspect the contents of variables.
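
As a starting point, here is a small sketch of the arithmetic involved for a local int variable. It assumes that a FramePointerOffset is a signed byte offset from the value in %rbp, which you should verify against the starter code and a disassembly before relying on it. variable_address and decode_int are hypothetical helpers; reading the four bytes out of the inferior is not shown.

```rust
/// Computes the address of a stack variable.
/// Assumption: the offset is relative to the inferior's %rbp.
fn variable_address(rbp: u64, frame_offset: i64) -> u64 {
    // Casting a negative i64 to u64 and using wrapping_add performs
    // two's-complement subtraction, so negative offsets work as expected.
    rbp.wrapping_add(frame_offset as u64)
}

/// Decodes a 4-byte C int as read from an x86-64 (little-endian) inferior.
fn decode_int(bytes: [u8; 4]) -> i32 {
    i32::from_le_bytes(bytes)
}
```

For example, with %rbp at 0x7ffe00001000 and an offset of -20, the variable would live at 0x7ffe00000fec under this assumption.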