Lab sessions Tue Feb 02 to Thu Feb 04
Solutions by Nick Troccoli and Lisa Yan
Go back to this week's lab writeup (without solutions) here.
Learning Goals
During this lab, you will:
- investigate how arrays and pointers work in C
- get further practice with gdb and valgrind
- experiment with code that dynamically allocates memory on the heap
Get Started
Clone the lab starter code by using the command below. This command creates a lab3 directory containing the project files.
git clone /afs/ir/class/cs107/repos/lab3/shared lab3
Tip: the GDB and Valgrind guides on the Resources are great references for the various commands and strategies you can use to debug. In particular, the GDB guide has a section on good debugging strategies, including ways to use GDB to fix any issues you encounter. We'd highly recommend reviewing them if you haven't already!
Pre-lab Exercise: Debugging
It's vital to use good debugging strategies when working on your programs. Developing good debugging strategies consists of both understanding the mechanics of how to use debugging tools like GDB and Valgrind, as well as how to apply them effectively as part of the larger process of finding, narrowing in on, and fixing bugs. Now is the time to invest in mastering the powerful tools provided by a debugger like gdb, and using a careful and systematic debugging approach! This process will be particularly critical on upcoming assignments, so we want to provide lots of practice. Here is a core checklist you should use when approaching debugging (taken from our Guide to GDB and debugging).
To provide practice with this debugging checklist, we have provided a buggy program, pig_latin.c, with a lurking issue. It is based on the code shown in the previous lecture for translating words and phrases to pig latin. However, it seems to crash on certain inputs! In particular, the most recent change made to the code was to add functionality to return NULL in the pig_latin function if the word cannot be translated to Pig Latin (if it starts with a character that is not a lowercase letter). But when running some custom tests, it seems to fail. Let's follow the debugging process above to fix it!
- Observe the bug. This part is already done for you - there is a custom test case in
custom_teststhat exhibits the crash. Try running it using sanity check to see! - Create a reproducible input. There is already a provided test case that causes a crash, but it's a complex input that may be unwieldy to use for debugging. It's important to try and reproduce a bug with as small an input as possible to better understand it and make tracing easier later. Try creating such an input in custom tests using the information provided so far. Try doing this without looking at the source code! This approach can sometimes be helpful to not get sidetracked by what you think your code does vs. what it actually does.
There are many different possible answers; ideally the test case will be two words, the second of which starts with e.g. a number.
- Narrow the search space. Now our task is to narrow in on what portion of code we think is causing the issue. This is important to help avoid time-consuming processes like stepping through the entire program execution if the bug is lurking in just one place. Luckily, for a crash, GDB can do part of this for us! If you run a crashing program in GDB, it will pause at the source of the crash, and let you poke around the program state. Try doing this now:
- run the program in GDB using the test case you found earlier
- when it crashes, you should see GDB alert you of a segfault (a crash), as well as what line caused it - helpful!
- try using the
backtracecommand to see the call stack for the crash - this shows exactly where in the execution the crash happened. You can also print out variable values and use other GDB commands to inspect the state of the program at the crash. - Great! Now we've narrowed in on the line and function causing the issue.
- Analyze. Now our job is to use GDB to poke around in this state to get more information about why the program crashed. Try printing out values to discover why the crash occurred on that line. Then try using GDB to run the program again, but try breakpointing at the start of the function containing the crash and step through it to get a better understanding of what's going on.
The issue is that
pig_wordisNULL, not the empty string. Stepping through the function from the start shows thatpig_latinreturnsNULLfor invalid strings, and the first time on line 83 we check correctly, but the second time we do not. - Devise and run experiments. Once you have a hypothesis of what is causing the issue, try coming up with another test input to validate your hypothesis, and use GDB to confirm your suspicion.
- Modify code to squash the bug. Now brainstorm and implement a fix for the issue. Make sure you understand exactly what changes you are making and why you are making them. Confirm the changes fix the issue by running your tests from earlier. Woohoo! We fixed the bug - nice work!
The fix is to change the statement to instead check if
pig_wordisNULL.
Hopefully this process gave you a better idea of how to follow an effective debugging strategy to make every minute of debugging count. Here are a few other key takeaways from this process:
- Good testing helps find bugs. As step 1 indicates, you won't be able to apply your honed debugging skills if you don't unearth potential bugs in the first place! Make sure to test thoroughly to uncover any lurking issues.
- Make less work for yourself. the small reproducible input is key to making debugging and tracing easier. If you find the bug with a very large input, it can become unwieldy to trace through it! Similarly, narrowing down what code is causing the issue can let you focus on just that code block rather than the entire program.
- Good style leads to easier debugging. This program was almost certainly easier to debug because of the decomposition, commenting, and overall organization. Imagine if the code was poorly commented, or all contained within one function. It would be much more time-consuming!
- A good debugger is systematic. Every computer scientist at every level spends time debugging. The key to advancing your skills is to know how to spend your time as effectively as possible to narrow in and squash those bugs!
Lab Exercises
1) GDB: Pointers and Arrays (15 minutes)
First, we'll play around with pointer and array syntax, and how the two are similar and different. To do this, we'll use the provided code.c program, which is a not-terribly-useful program that exhibits various behaviors of arrays and pointers.
1a) Arrays vs Pointers in main()
- Build the program, start it under
gdb, and set a breakpoint atmain. When you hit the breakpoint, useinfo localsto see the state of the uninitialized stack variables. This command lists the values of all local variables in the current function. Step through the initialization statements and useinfo localsagain. Remember that whengdbreports that execution is at line N, this is before line N has executed. -
The expressions below all refer to the local variable
arr. Q: For each expression, first try to figure out what the result of the expression should be, and then evaluate the expression ingdbto confirm that your understanding is correct.(gdb) p *arr (gdb) p arr[1] (gdb) p &arr[1] (gdb) p *(arr + 2) (gdb) p &arr[3] - &arr[1] (gdb) p sizeof(arr) (gdb) p arr = arr + 1The first 5 evaluate to the same result, but the last two evaluate instead to
8and the address ofptr+ 4 bytes, respectively. -
The
mainfunction initializesptrtoarr. The name of a stack array and a pointer to that array are almost interchangeable, but not entirely. Try re-evaluating the above expressions withptrsubstituted forarr. The first five evaluate identically, but the last two produce different results forptrthanarr. Q: What is reported for the size of an array? the size of a pointer? The last one is the trickiest to understand. Why is it allowable to assign toptrbut notarr?Everything is the same between arrays and pointers except
sizeofand the assignment (=) and&operators.sizeof(arr)gives the size of the locally declared stack array, in Bytes, whereassizeof(ptr)gives the size of the pointer variable; since pointers store addresses, this is always 8B on a 64-bit machine. Reassigning pointers (e.g.,ptr) is simply updating the address stored inptr, whereas reassigning arrays (e.g,arr) will throw an error because array variables reference a specific area in memory. -
Execute
p ptr = ptr - 1to resetptrto its original value. &arr[0]stores the same address as&ptr[0], but&arrisn't the same address as&ptr. Q: Why not? NOTE: this discrepancy is key - it will almost certainly come up on future assignments!When used in pointer arithmetic,
arris often resolved to the address of the first element inarr. As a result,&ptr[0]and&arr[0]are both resolved as the address of the first element (specifically,&ptr[0]and&arr[0]are "syntactic sugar" for&(*(ptr + 0))and&(*(arr + 0)), respectively). However, arrays aren't pointers! While&on a pointer works as expected,&arrevaluates to the base address of the array. Reinforce your thoughts ingdb: you'll see that printing&arrwill reveal a different type from&ptr.
1b) Arrays and pointers as parameters in binky()
- Use the gdb
stepcommand to advance into the callbinky(arr, ptr).stepis likenext, but instead of executing the entire line and moving to the next line, it steps into the execution of the line. - Once inside
binky, useinfo argsto see the values of the two parameters. Print any expression you can think of onaandband they will evaluate to the same result. This includes the last two expressions from above:sizeofreports the same size foraandband assignment is permissible for either. Q: What happens in parameter passing to make this so? Try drawing a picture of the state of memory to shed light on the matter.When passed in as parameters, arrays are passed as pointers to the first element of the array. Furthermore, declaring a parameter as
int param[]vsint *paramhas zero functional effect. Thereforeint a[]andint *bare both pointers toints and store the same address: the address of the first element inarr.
1c) Pointers, double pointers, and gdb stack frames in winky()
- Set a breakpoint on
change_charand continue until this breakpoint is hit by executing the GDBccommand (for "continue"). When stopped there in gdb, useinfo args. The arguments shown are from the function call ("frame") forchange_char. The default frame of reference is the currently executing function. - Use
backtraceto show the sequence of function calls that led to where the code is currently executing. You can select a different frame of reference with the gdbframecommand. Frames are numbered starting from 0 for the innermost frame, and the numbers are displayed in the output of thebacktracecommand. Try the commandframe 1to select the frame outsidechange_charand and then useinfo localsto see the state fromwinky. The gdb commandupis shorthand for selecting the frame that is one higher than the current one, anddownis shorthand for selecting the frame that is one lower than the current one. Typeframe 0ordownto return to the line where the debugger is paused. - Step through
change_charand examine the state before and after each line. Useinfo argsto show the inner frame andupandinfo localsto show what's happening in the outer frame. Carefully observe the effect of each assignment statement.*s = 'j';does change the string fromwinky, whiles = "Leland";does not because it is changing the function's own copy ofs. Note that we could passwordinstead ofpwbecause both would be passed as pointers to the first element. - Step through the call to
change_ptrand make the same observations. Which of the assignment statements had a persistent effect inwinky, and which did not? Q: Can you explain why?The first two lines both change the data from
winky, but the last does not because it is changingwinky's own copy ofp_str. Note that we could not pass&wordinstead of&pwbecausewordis an array, butpwis a pointer.&worddeceptively still gives the address of the first array element, and is therefore not a pointer to the pointer to the first array element.
If you don't understand or can't explain the results you observe, stop here and discuss them with your labmates and lab leader. Having a solid model of what is going on under the hood is an important step toward understanding the commonalities and subtle differences between arrays and pointers.
2) Code Study: Heap Use (15 minutes)
Next, we'll take a look at code that dynamically allocates memory on the heap. Specifically, the function join(strings, strings_count) in the provided heap.c program returns a string which is the concatenation of all strings_count strings in strings. The returned string is heap-allocated and becomes the client's responsibility to free when done. A core function that makes this implementation possible is the realloc function.
1 char *join(char *strings[], int strings_count) {
2 char *result = strdup(strings[0]);
3 assert(result != NULL);
4
5 for (int i = 1; i < strings_count; i++) {
6 result = realloc(result, strlen(result) + strlen(strings[i]) + 1);
7 assert(result != NULL);
8 strcat(result, strings[i]);
9 }
10 return result;
11 }
At the outset of the loop, the total size needed is not known, so it cannot do the allocation upfront. Instead it makes an initial allocation and enlarges it for each additional string. This resize-as-you-go approach is an idiomatic use case for realloc.
- Q: Is the size argument to
reallocthe number of additional bytes to add to the memory block, or the total number of bytes including any previously allocated part?It's the total number of bytes, including any previously-allocated part.
- The variable
resultis initialized tostrdup(strings[0]). Q: Why does it make a copy instead of just assigningresult = strings[0]?We are building up a new heap-allocated string, so we can't build on an existing string pointer - it may not be heap-allocated, for instance, and we need our own copy in memory.
- The
reallocmanual page says the return value fromrealloc(ptr, newsize)may be the same asptr, but also may be different. In the case where it is different,reallochas copied the contents from the old location to the new. The old location is also freed. Change the code to callreallocwithout using its return value (i.e. don't re-assignresult, just drop the result). Re-compile and you should get a warning from the compiler. Q: What error is that warning trying to caution you against? Arealloccall that doesn't catch its return value is never the right call.The compiler warns against an unused return value - this is bad for realloc because it may cause an immediate memory leak if the memory has been moved to a new location!
- Try running the code; if you run
./heapwith one or more arguments, it will join all of the arguments together using thejoinfunction.
3) Valgrind: Heap Errors and Memory Leaks (20 minutes)
Last week's lab introduced you to Valgrind. This tool will be increasingly essential as we advance to writing C code with heavy use of pointers and dynamic memory. In particular, Valgrind can help detect misuse of heap memory (e.g. writing past what you've allocated, freeing twice, etc.) and can detect when you forget to free heap memory that you've allocated. Let's take a look at how Valgrind detects each of these kinds of issues.
3a) Heap Errors (10 minutes)
First, Valgrind is great at detecting memory issues relating to dynamic memory, such as writing past what you've allocated, accessing freed memory, freeing memory twice, etc. The buggy.c program has some planted errors that misuse heap memory to let you see how Valgrind handles and reports these errors.
Start by taking a few minutes to review the example error. Keep the CS107 Valgrind resource handy.
Error 1: Valgrind tutorial
Review the program in buggy.c and consider error #1. When the program is invoked as ./buggy 1 argument, it will copy argument into a heap-allocated space of 8 characters. If the argument is too long, this code will write past the end of the allocated space - a "buffer overflow" (because the write overflows past the end of a buffer). This is similar to errors we've seen in the past about writing past stack memory used to store a string, but now the overflow is happening on the heap. What are the consequences of this error? Let's observe.
- Run
./buggy 1 lelandto see that the program runs correctly when the name fits. Now try longer names./buggy 1 stanfordand./buggy 1 lelandstanford. Surprisingly, these also seem to "work", apparently getting lucky when the overrun is still relatively small. Pushing further to./buggy 1 lelandstanfordjunioruniversity, you'll eventually get the crash you expect. - Many crashes have nothing more to say that just "segmentation fault", but this particular crash gives a dump of the program state. This error-reporting is coming from inside the
freefunction of the standard library. The dump is not presented in a particularly user-friendly form, and rather than trying to decipher it, just take it as an indication of a memory problem, for which we will turn to Valgrind for further help. - Try
valgrind ./buggy 1 stanfordand review Valgrind's report to see what help it can offer. Whether the write goes too far by 1 byte or 1000 bytes, Valgrind will helpfully report the error at the moment of transgression. Hooray!
Errors 2 and 3: Independent investigation
Now it's time for you to try investigating some heap misuses. Work on investigating errors 2 and 3 as a group, for a few minutes each, answering the following questions:
Error 2: ./buggy 2
- Review the code to see what the error is. Q: What is the error?
This function allocates memory for a string, frees it, and then tries to print it out the already-freed string.
- Run it (e.g.
./buggy 2). Q: Is there an observable result of the error?
Depending on your system, you could see the original string contents; other systems may see an empty string.
- Run it under Valgrind. Q: What is the terminology that the Valgrind report uses and how does it relate back to the root cause?
Valgrind reports (1) "invalid read of size ..." and the line where the read is happening; (2) "Address ... is 2 bytes inside a block of size 14 free'd" and the
freeline; and (3) "Block was alloc'd at" and the originalstrdupline.
Error 3: ./buggy 3
- Review the code to see what the error is. Q: What is the error?
This function allocates memory on the heap for a string, creates another pointer to the same memory, and then frees both pointers. This is a common point of confusion: even though you may have multiple pointers to heap memory, you should only free it once, because there is still only one block of memory allocated.
- Run it (e.g.
./buggy 3). Q: Is there an observable result of the error?
The program crashes and prints a cryptic error message: "double free or corruption (fasttop)", a backtrace, a memory map, and "Aborted (core dumped)." Not the most useful!
- Run it under Valgrind. Q: What is the terminology that the Valgrind report uses and how does it relate back to the root cause?
Valgrind reports (1) "Invalid free() / delete / delete[] / realloc()" and the line with the second
free; (2) "Address ... is 0 bytes inside a block of size 19 free'd" and the firstfree; and (3) " Block was alloc'd at" and the originalstrdupline.
Key Takeaways: Be forewarned that memory errors can be very elusive. A program might crash immediately when the error is made, but the more insidious errors silently destroy something that only shows up much later, making it hard to connect the observed problem with the original cause. The most frustrating errors are those that "get lucky" and cause no observable trouble at all, but lie in wait to surprise you at the most inopportune time. Make a habit of using Valgrind early and often in your development to detect and eradicate memory errors!
3b) Memory Leaks (10 minutes)
While not memory errors, memory leaks are another type of issue that Valgrind can help detect. A memory leak is when you allocate memory on the heap, but do not free it. These rarely (if ever) cause crashes, and for this reason we recommend that you do not worry about freeing memory until your program is completely written. Then, you can go back and deallocate your memory as appropriate, ensuring correctness at each step. Memory leaks are an issue, however, because your program should be responsible for cleaning up any memory it allocates but no longer needs. In particular, for larger programs, if you never free any memory and allocate an extremely large amount, you may run out of memory in the heap!
The leaks.c program has some planted memory leaks to let you see how Valgrind handles and reports these leaks. Start by taking a few minutes to review the example leak:
Leak 1: Valgrind tutorial
- Review the program in
leaks.cand consider leak #1. When the program is invoked as./leaks 1, it will allocate 8 bytes on the heap, but then immediately return, causing the program to lose the address of this heap memory. A memory leak! The program terminates fine, however. Let's see how Valgrind can help us detect it. - Try
valgrind ./leaks 1and review Valgrind's report to see what help it can offer. You'll notice that Valgrind reports a "LEAK SUMMARY" at the bottom, as well as a total heap usage summary, which shows what memory was still in use at exit (that should have been freed). You can also see the number of reported allocations and frees to not line up, indicating a leak. Helpful! - For even more memory leak information, make sure to use the
--leak-check=fulland--show-leak-kinds=allflags when you run valgrind. Run it again like this:valgrind --leak-check=full --show-leak-kinds=all ./leaks 1. This shows even more information, including where the leaked memory was originally allocated! We recommend always running with these flags to get the most information. Make sure to put these flags before the command to run your actual program. Specifically, the following is not equivalent:valgrind ./leaks 1 --leak-check=full --show-leak-kinds=all, as this passes these flags to your program, instead of valgrind! - Q: What is the difference between "definitely lost", "indirectly lost", etc.? Check out our Valgrind guide for an overview.
We'll let you read the resource guide. :-)
- The total heap usage includes more than just your explicit allocations. For example, other functions, like
strdup, may allocate/free memory internally as well!
Leaks 2 and 3: Independent investigation
Now it's time for you to try investigating some memory leaks. Work on investigating leaks 2 and 3 as a group, for a few minutes each, answering the following questions:
Leak 2: ./leaks 2
- Review the code, and then run it (e.g.
./leaks 2). Notice how the program always runs fine. Q: What is the issue with the code?This function itself does not leak memory, but its caller fails to free it, and thus the memory is still leaked.
- Now run it under Valgrind. Q: What is the terminology that the Valgrind report uses and how does it relate back to the root cause?
"14 bytes in 1 blocks are definitely lost in loss record 1 of 1" .
mainhad a local copy of the pointer to the allocated memory and whenmainexited, the local copy was removed. Valgrind tells us what this pointer inmainwas:return_value.
Leak 3: ./leaks 3
- Review the code, and then run it (e.g.
./leaks 3). Notice how the program always runs fine. Q: What is the issue with the code?This function leaks memory by first allocating memory and then reassigning the pointer, thus losing access to the allocated memory.
- Now run it under Valgrind. Q: What is the terminology that the Valgrind report uses and how does it relate back to the root cause?
"14 bytes in 1 blocks are definitely lost in loss record 1 of 1". Valgrind tells us which line the original allocation happened (with the
strdup) but cannot report the subsequent reassignment line; that's up to us as programmers to identify!
Key Takeaways: A memory leak is when you allocate memory on the heap, but do not free it. We recommend not worrying about freeing memory until your program is functionally complete. Leaks could happen if you forget to free memory within a function, return allocated memory and the caller does not free it, etc. Note that leaks do not only have to come from memory you allocate using malloc. For example, functions like strdup allocate memory that it is the caller's responsibility to free!
As a final note, often the backtrace for a Valgrind-reported error or leak will show a trace to somewhere within a library function such as strcpy or strdup. It may be tempting to conclude that the problem is in the library, and is not ours to fix, but this reasoning is almost certainly not true - instead, it's likely an issue with how we are using that library function in our code (the parameters we pass in, not freeing values it allocated, etc.).
[Optional] Extra Problems
Finished with lab and itching to further exercise your pointer and heap skills? Check out our extra problems! (Extra problems solution)
Recap
Nice work on the lab! It's okay if you don't completely finish all of the exercises during lab; your sincere participation for the full lab period is sufficient for credit. However, we highly encourage you to finish on your own whatever is need to solidify your knowledge. Also take a chance to reflect on what you got what from this lab and whether you feel ready for what comes next! The takeaways from lab3 should be continued development of your gdb and Valgrind skills, as well as your skills reading and writing code with heavy use of pointers. Arrays and pointers are ubiquitous in C and a good understanding of them is essential. Here are some questions to verify your understanding and get you thinking further about these concepts:
- If
ptris declared as anchar *, what is the effect ofptr++? What ifptris declared as aint *?Incrementing increments by the size of the pointee - so for a
char *, it increments by 1 byte, but for anint *, 4 bytes. - Although
&applied to the name of a stack-allocated array (e.g.&buffer) is a legal expression and has a defined meaning, it isn't really sensible. Explain why such use may indicate an error/misunderstanding on the part of the programmer.&on arrays in most cases is a no-op: it evaluates to the address the array, which is equivalent to what the array evaluates to already by default in pointer expressions (the address of the first element in the array). Note that&on an array (e.g.,&arrforchar arr[5]) does not match the type of&arr[0](the first ischar (*)[5]) and the second is (char *), so despite having the same runtime behavior, you may see compiler warnings when you use the former. - The argument to malloc is
size_t(unsigned). Consider the erroneous callmalloc(-1), which seems to make no sense at all. The call compiles with nary a complaint -- why is it "ok"? What happens when it executes?The value is casted to a size_t and treated as a very large number! Running this shows that malloc returns NULL.
- What is the purpose of the
reallocfunction? What happens if you attempt to realloc a non-malloc-ed pointer, such as a string constant?realloc lets us resize memory allocations. It is undefined behavior if we realloc an invalid value.
- What is the difference between a memory error and a memory leak?
The core cause of a memory error is accessing (reading or writing) memory that does not belong to you. For instance, if you create a string that is not null-terminated and call strlen() on it, this causes a memory error because it will cause strlen to search through memory that doesn’t belong to that string in search of a null terminator. The core cause of a memory leak is not freeing memory you have allocated on the heap. For instance, if you call malloc() or strdup(), you must free the return value of that function when you are done with it.
-
Your coworker suggests you use the function below as a "safer" free function that prevents accidentally using a freed pointer.
void free_and_null(void *ptr) { free(ptr); ptr = NULL; }Will this function correctly free the client's pointer? Will it correctly set it to NULL? Explain.
It will free it, but it will not set it to NULL, as it only sets
free_and_null's own copy of the pointer to NULL. You would need to pass a pointer to the pointer to NULL it out as well.
