Software Testing Strategies

Written by Julie Zelenski, with modifications by Nick Troccoli

Succeeding on an assignment's functionality tests is a measure of your achievement in implementing the program requirements, but it is also a reflection of your testing efforts in handling all variations, edge cases, and invalid inputs. To achieve a polished and robust submission, testing should be an integral part of your development, something you pay attention to early and often, not just at the end before submitting. The sections below describe several strategies for testing well.

Black-Box Testing

Black-box testing treats the program as a "black box", that is, without considering the code paths or internal structures. Working from a specification that dictates the expected behavior, you run experiments on the program to observe whether it behaves correctly. The idea is to brainstorm a broad set of inputs/interactions (variations for invoking the program, different files to feed as inputs, ways of interacting with the program as the user to achieve different outcomes) and construct small test cases for each. Achieving comprehensive black-box coverage requires creativity and sometimes even a bit of deviousness to expose any bugs.

For example, consider testing the command-line usage of a program. The "usage" here means the range of ways the user can invoke the program with various command-line arguments. What happens if the user invokes the program with a missing argument? What if an argument is of the wrong type? What if the arguments are ordered incorrectly? What if a value for the argument is nonsensical, such as a negative size or a nonexistent file? Each of these can be tested with an individual case to verify that the program gracefully handles the full range of possible invocations.
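
As a rough illustration, the invocation cases above could be collected into a table of small tests. This is only a sketch: the usage rule (prog SIZE FILENAME, with SIZE a positive integer) and the names check_usage and run_usage_tests are hypothetical, invented for this example, and checking that the file actually exists is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical usage rule, invented for illustration: prog SIZE FILENAME,
 * where SIZE must be a positive integer. */
enum usage_error { USAGE_OK = 0, ERR_ARG_COUNT, ERR_NOT_A_NUMBER, ERR_BAD_SIZE };

/* Returns USAGE_OK for an acceptable invocation, else the first problem found. */
enum usage_error check_usage(int argc, char *argv[]) {
    if (argc != 3) return ERR_ARG_COUNT;

    char *end;
    errno = 0;
    long size = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0') return ERR_NOT_A_NUMBER;
    if (errno == ERANGE || size <= 0) return ERR_BAD_SIZE;
    return USAGE_OK;
}

/* One small case per kind of invocation. Returns the number of failed cases. */
int run_usage_tests(void) {
    struct {
        int argc;
        char *argv[4];
        enum usage_error expected;
    } cases[] = {
        { 3, { "prog", "10", "in.txt", NULL },  USAGE_OK },          /* valid */
        { 2, { "prog", "10", NULL },            ERR_ARG_COUNT },     /* missing argument */
        { 3, { "prog", "ten", "in.txt", NULL }, ERR_NOT_A_NUMBER },  /* wrong type */
        { 3, { "prog", "in.txt", "10", NULL },  ERR_NOT_A_NUMBER },  /* wrong order */
        { 3, { "prog", "-5", "in.txt", NULL },  ERR_BAD_SIZE },      /* nonsensical value */
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        if (check_usage(cases[i].argc, cases[i].argv) != cases[i].expected) failures++;
    }
    return failures;
}
```

Calling assert(run_usage_tests() == 0) from a test main would run every case at once. Because each row isolates one kind of mistake, a failure points directly at the invocation that is mishandled.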

For file-based inputs, you can construct cases that isolate certain behaviors. Consider a program which reads a file and finds the longest word in it. Try creating one test input file where the longest word appears first, another where it appears in the middle, and another where it is the last word. And why not also try an empty file and a file with only a single word? How about a file where there is a tie for longest? Does the spec say there is a limit on the maximum length? Try files with a longest word of length max - 1, max, and max + 1 to observe the behavior at that fringe. When building a test case focused on a particular issue, you should come up with the most minimal case that reproduces the desired behavior and avoid unnecessary interference; this way, you can ensure you are testing the intended behavior and easily debug any issues found in testing.
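
The cases above might be written down as follows. This is a sketch under several assumptions: longest_word is a hypothetical stand-in that works on a string rather than a file, the limit MAX_LEN is invented, and the behaviors chosen for ties (earliest word wins) and for overlong words (truncation) are assumptions that a real spec would need to confirm.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 8   /* hypothetical limit on word length, for illustration */

/* Hypothetical stand-in for the program under test. Copies the longest
 * whitespace-separated word of text into out, truncated to outsize - 1
 * characters, and returns the word's full length. On a tie, the earliest
 * word wins. Assumes outsize >= 1. */
size_t longest_word(const char *text, char *out, size_t outsize) {
    const char *best = text;
    size_t best_len = 0;
    const char *p = text;
    while (*p != '\0') {
        while (isspace((unsigned char)*p)) p++;               /* skip separators */
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) p++; /* scan one word */
        size_t len = (size_t)(p - start);
        if (len > best_len) { best = start; best_len = len; }
    }
    size_t n = best_len < outsize - 1 ? best_len : outsize - 1;
    memcpy(out, best, n);
    out[n] = '\0';
    return best_len;
}

/* Each row is a minimal input isolating one behavior. Returns failure count. */
int run_longest_word_tests(void) {
    struct { const char *input; const char *expected; } cases[] = {
        { "elephant cat dog", "elephant" },  /* longest word first */
        { "cat elephant dog", "elephant" },  /* longest word in the middle */
        { "cat dog elephant", "elephant" },  /* longest word last */
        { "",                 ""         },  /* empty input */
        { "cat",              "cat"      },  /* single word */
        { "cat dog",          "cat"      },  /* tie: earliest wins (assumed) */
        { "abcdefg x",        "abcdefg"  },  /* length max - 1 */
        { "abcdefgh x",       "abcdefgh" },  /* length max */
        { "abcdefghi x",      "abcdefgh" },  /* length max + 1: truncated (assumed) */
    };
    int failures = 0;
    char buf[MAX_LEN + 1];
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        longest_word(cases[i].input, buf, sizeof buf);
        if (strcmp(buf, cases[i].expected) != 0) failures++;
    }
    return failures;
}
```

Each input contains only what is needed to trigger one behavior, so a failing row identifies the problem without further narrowing.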

One of the limitations of black-box testing is that it is difficult to be confident you have covered all the code paths without knowledge of the code internals. For example, maybe the program above handles files longer than a megabyte with a completely distinct code path from smaller files, but a true outsider wouldn't have reason to even suspect this. When you are acting as both the tester and the author of the code, your insider information allows you to improve your test suite by adding white-box testing.

White-Box Testing

White-box testing relies on knowledge of the design internals and code paths. As you write the code, you can plan the test cases needed to exercise all the code paths. One helpful way to think about coverage is by mapping to control flow. If a function has an if/else, this suggests there will be two paths to verify, one through the if and another through the else. You can similarly use your knowledge of the essential special cases to identify other non-overlapping paths. For example, deleting from a linked list might be broken down into the cases of deleting the first node, the last node, and a middle node.
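
To make the linked list example concrete, here is one possible sketch. The node type and the functions delete_value, make_123, and run_delete_tests are hypothetical names chosen for illustration; the point is that each branch in delete_value has at least one test case that travels through it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct node { int value; struct node *next; } node;

node *push_front(node *head, int value) {
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Removes the first node holding value. Returns the (possibly new) head. */
node *delete_value(node *head, int value) {
    if (head == NULL) return NULL;             /* path 1: empty list */
    if (head->value == value) {                /* path 2: first node */
        node *rest = head->next;
        free(head);
        return rest;
    }
    for (node *prev = head; prev->next != NULL; prev = prev->next) {
        if (prev->next->value == value) {      /* path 3: middle or last node */
            node *victim = prev->next;
            prev->next = victim->next;
            free(victim);
            break;
        }
    }
    return head;                               /* also reached if value is absent */
}

bool list_equals(const node *head, const int *expected, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (head == NULL || head->value != expected[i]) return false;
        head = head->next;
    }
    return head == NULL;
}

void free_list(node *head) {
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds the list 1 -> 2 -> 3. */
node *make_123(void) {
    return push_front(push_front(push_front(NULL, 3), 2), 1);
}

/* One case per code path. Returns the number of failed cases. */
int run_delete_tests(void) {
    struct { int target; int expected[3]; size_t n; } cases[] = {
        { 1, { 2, 3 },    2 },   /* first node */
        { 2, { 1, 3 },    2 },   /* middle node */
        { 3, { 1, 2 },    2 },   /* last node */
        { 9, { 1, 2, 3 }, 3 },   /* value not present */
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        node *list = delete_value(make_123(), cases[i].target);
        if (!list_equals(list, cases[i].expected, cases[i].n)) failures++;
        free_list(list);
    }
    if (delete_value(NULL, 1) != NULL) failures++;   /* empty list */
    return failures;
}
```

Running these cases under Valgrind would additionally check that each path frees exactly the node it removes.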

White-box testing may be done by writing testing code and/or using the debugger to make directed calls to the function being tested. Sometimes you can make temporary changes that force the code to exercise certain functionality thoroughly. For example, consider testing a hashtable. Configuring the hashtable to start with a very small number of buckets, then adding a lot of entries, forces it to exercise the internal rehashing mechanism repeatedly. Alternatively, changing the hash function to map every key to the same code (say, zero) forces all entries into a single bucket, so all operations exercise one long list rather than many singleton lists.
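
Both hashtable tricks can be sketched with a toy chained hash set of ints. Everything here is an assumption made for illustration, not the design of any particular assignment: the hashset type, the load factor limit of 2, the pluggable hash function, and the rehash_count field, which exists only so a test can observe that rehashing happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct entry { int key; struct entry *next; } entry;
typedef size_t (*hash_fn)(int key);

typedef struct {
    entry **buckets;
    size_t nbuckets, count;
    hash_fn hash;
    size_t rehash_count;   /* test-only counter of rehash operations */
} hashset;

size_t normal_hash(int key) { return (size_t)key * 2654435761u; }
size_t zero_hash(int key) { (void)key; return 0; }   /* every key collides */

hashset *hashset_create(size_t nbuckets, hash_fn hash) {
    hashset *hs = malloc(sizeof *hs);
    assert(hs != NULL && nbuckets > 0);
    hs->buckets = calloc(nbuckets, sizeof *hs->buckets);
    assert(hs->buckets != NULL);
    hs->nbuckets = nbuckets;
    hs->count = 0;
    hs->hash = hash;
    hs->rehash_count = 0;
    return hs;
}

bool hashset_contains(const hashset *hs, int key) {
    for (entry *e = hs->buckets[hs->hash(key) % hs->nbuckets]; e != NULL; e = e->next)
        if (e->key == key) return true;
    return false;
}

/* Doubles the bucket array and relinks every entry into its new bucket. */
static void rehash(hashset *hs) {
    size_t new_n = hs->nbuckets * 2;
    entry **fresh = calloc(new_n, sizeof *fresh);
    assert(fresh != NULL);
    for (size_t i = 0; i < hs->nbuckets; i++) {
        for (entry *e = hs->buckets[i], *next; e != NULL; e = next) {
            next = e->next;
            size_t b = hs->hash(e->key) % new_n;
            e->next = fresh[b];
            fresh[b] = e;
        }
    }
    free(hs->buckets);
    hs->buckets = fresh;
    hs->nbuckets = new_n;
    hs->rehash_count++;
}

void hashset_add(hashset *hs, int key) {
    if (hashset_contains(hs, key)) return;
    if (hs->count >= 2 * hs->nbuckets) rehash(hs);   /* assumed load limit of 2 */
    entry *e = malloc(sizeof *e);
    assert(e != NULL);
    size_t b = hs->hash(key) % hs->nbuckets;
    e->key = key;
    e->next = hs->buckets[b];
    hs->buckets[b] = e;
    hs->count++;
}

void hashset_free(hashset *hs) {
    for (size_t i = 0; i < hs->nbuckets; i++) {
        for (entry *e = hs->buckets[i], *next; e != NULL; e = next) {
            next = e->next;
            free(e);
        }
    }
    free(hs->buckets);
    free(hs);
}

/* Adds keys 0..n-1, then checks that all are found and that n is not.
 * Stores the number of rehashes that occurred in *rehashes. */
bool stress_hashset(size_t start_buckets, hash_fn hash, int n, size_t *rehashes) {
    hashset *hs = hashset_create(start_buckets, hash);
    for (int k = 0; k < n; k++) hashset_add(hs, k);
    bool ok = hs->count == (size_t)n && !hashset_contains(hs, n);
    for (int k = 0; k < n; k++) ok = ok && hashset_contains(hs, k);
    *rehashes = hs->rehash_count;
    hashset_free(hs);
    return ok;
}
```

Calling stress_hashset(1, normal_hash, 1000, &r) starts from a single bucket so the table must rehash many times, while stress_hashset(1, zero_hash, 1000, &r) pushes every key into one chain. Both calls should return true if the table is correct.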

One disadvantage of white-box testing is that the same oversight that allowed you to introduce an error into the code is also likely to cause you to overlook testing for it. For example, if you didn't consider that one of the arguments might be an empty string, your code may not be written to handle it correctly, nor would you be likely to devise a test case with such an input.

Stress Testing

Early in development, you will use small, focused tests to verify the basics are working in isolation, but later on, you need to mix in larger, unfocused inputs that scale up the size and bring in more complex interactions. Those larger inputs might be created by hand or generated using an automated or randomized program (the idea of randomly generating inputs is known as fuzz testing). However, the nature of the larger stress tests often makes them unwieldy when debugging. If one of your stress tests uncovers a new bug, for debugging purposes you may want to first try to narrow the case down to a smaller test that still exhibits the same behavior.
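
One common way to generate randomized inputs is to compare the code under test against a trusted reference on many random cases. The sketch below assumes a hypothetical hand-written insertion_sort as the code under test and uses the standard library's qsort as the reference; fuzz_sort and its parameters are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical code under test: a hand-written insertion sort. */
void insertion_sort(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Sorts `trials` random arrays of length 0..max_len with both the code under
 * test and qsort, and compares the results. Returns the index of the first
 * failing trial, or -1 if every trial matches. A fixed seed makes any
 * failure reproducible. */
int fuzz_sort(unsigned seed, int trials, size_t max_len) {
    srand(seed);
    int *input = malloc((max_len + 1) * sizeof *input);
    int *expected = malloc((max_len + 1) * sizeof *expected);
    assert(input != NULL && expected != NULL);

    int first_failure = -1;
    for (int t = 0; t < trials && first_failure == -1; t++) {
        size_t n = (size_t)rand() % (max_len + 1);
        for (size_t i = 0; i < n; i++) {
            input[i] = expected[i] = rand() % 100 - 50;   /* small range forces duplicates */
        }
        insertion_sort(input, n);
        qsort(expected, n, sizeof *expected, cmp_int);
        if (memcmp(input, expected, n * sizeof *input) != 0) first_failure = t;
    }
    free(input);
    free(expected);
    return first_failure;
}
```

If fuzz_sort(seed, trials, max_len) reports a failing trial, rerunning with the same seed and a smaller max_len is one way to hunt for a smaller input that still exhibits the bug.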

Regression Testing

One reality of software is that fixing a bug can sometimes lead to breaking something else that was previously working. For this reason, you should preserve test inputs and testing code rather than discarding them when you are done using them. Keeping them around means you can easily repeat those tests as you continue in development and immediately spot when you've accidentally taken a step backward.

Test-Driven Development

A great testing strategy when working on assignments is using test-driven development, which goes like this:

identify a small, concrete task (bug to fix, feature to add, desired change in behavior)
construct tests for the desired outcome and verify the current code fails these tests
modify the code to complete this task
re-run your tests and verify they now succeed
test the rest of the system to verify you didn't inadvertently break something else

You change only a small amount of code at once and validate your results with carefully constructed tests before and after. This keeps your development process moving forward while ensuring you have a functional program at each step.

Testing Tools

There are a variety of tools you can leverage to streamline and automate testing:

The gdb debugger is an excellent investment. For example, evaluating calls to functions within the debugger to do quick manual tests can be a big help.
Be sure to use Valgrind early and often. It is invaluable in spotting memory errors and reporting leaks that would be near impossible to find manually.
Our sanity check tool allows for simple comparison of the output of your program with that of the sample solution, using our provided test inputs as well as your own custom inputs. See the working on assignments page for more information.
The Unix environment has a variety of small, useful tools, such as those for file and text processing (diff, sort, grep, sed, wc, and so on), that you can combine.

The Testing Mindset

Program testing can be used to show the presence of bugs, but never to show their absence.
--Edsger Dijkstra

Sometimes the biggest testing hurdle comes in the reluctance to even undertake the hunt. Let's be honest, the point of testing is to find flaws, and once found you will feel compelled to fix them! But finding and fixing bugs ensures you have completed a solid program you can be proud of.