Analyzing Code Coverage with gcov
Before releasing any amount of code, developers usually test their work to tune performance and prove that the software works as intended. But often, validation is quite difficult, even if the application is simple.
For example, the venerable Unix/Linux ls utility is conceptually quite simple, yet its many options and the myriad vagaries of the underlying file system make validating ls quite a challenge.
To help validate the operation of their code, developers often rely on test suites to either simulate or recreate operational scenarios. If the test suite is thorough, all of the features of the code can be exercised and be shown to work.
But how thorough is thorough? In theory, a completely thorough test suite would test all circumstances, validate all of the results, and exercise every single line of code, demonstrating that no code is "dead." (As Stephen Friedl pointed out in last month's column, dead code is a favorite hiding place for pesky bugs.) Validating results can be done in any number of ways since output is typically tangible in one form or another, but how do you make sure that all of your code was executed? Use GNU's gcov.
Like an X-ray machine, gcov peers into your code and reports on its inner workings. And gcov is easy to use: simply compile your code with gcc and two extra options, and your code will automatically generate data that highlights statement-by-statement, run-time coverage. Best of all, gcov is readily available: if you have gcc installed, you also have gcov, since it is a standard part of the GNU development tools.
This month, let's look at code coverage analysis and how to use gcov to help improve the quality of your code and the quality and thoroughness of your test suites.
What is Code Coverage Analysis?
As mentioned above, it's ideal to find dead code and get rid of it. In some cases, it may be appropriate to remove dead code because it's unneeded or obsolete. In other cases, the test suite itself may have to be expanded to be more thorough. Code coverage analysis is the (often iterative) process of finding and targeting "dead" or unexercised code, and is characterized by the following steps:
1. Find the areas of a program not exercised by the test suite.
2. Create additional test cases to exercise the dead code, thereby increasing code coverage.
3. Determine a quantitative measure of code coverage, which is an indirect measure of quality.
Code coverage analysis is also useful to identify which test cases are appropriate to run when changes are made to a program and to identify which test cases do not increase coverage.
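To illustrate that last point, here is a minimal sketch, assuming each test case's coverage has already been reduced to an array of per-line flags (1 if the line ran, 0 if not). The function name adds_coverage() and the array layout are hypothetical, chosen only for illustration; gcov does not emit data in this form directly.

```c
#include <stddef.h>

/* Hypothetical helper, not part of gcov.
 * Returns 1 if the test whose per-line flags are in `test` executed at
 * least one line not yet marked in `covered`, and merges the test's
 * lines into `covered`. Returns 0 if the test adds nothing new.
 * Both arrays hold `nlines` entries, each 0 or 1. */
int adds_coverage(unsigned char *covered, const unsigned char *test,
                  size_t nlines)
{
    int added = 0;
    size_t i;

    for (i = 0; i < nlines; i++) {
        if (test[i] && !covered[i]) {
            covered[i] = 1;   /* first test to reach this line */
            added = 1;
        }
    }
    return added;
}
```

Feeding the tests through in order, any test for which the function returns 0 exercised only lines that earlier tests had already reached, which makes it a candidate for dropping from a quick regression run.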
Types of Code Coverage
There are many different types of code coverage that can be measured by gcov. To be brief, let's discuss just two of them: branch coverage and loop coverage.
Branch coverage verifies that every branch has been taken in all directions. Similarly, loop coverage tries to verify that all paths through a loop have been tried. Loop coverage sounds complex, but actually can be verified by satisfying just three conditions:
1. The loop condition yields false so the body is not executed.
2. The loop condition is true the first time, then false, so execution of the body happens only once.
3. The loop condition is true at least twice, so the body executes at least twice.
For example, in the following code snippet...
...the if statement must be tested with both an odd and an even number. The for statement must be tested with two numbers, such that the condition (number < 9) is true and false, respectively. Therefore, the following three tests would achieve complete test coverage for the routine listed above:
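As one illustration of these rules, here is a minimal sketch built around a hypothetical routine, count_up(), which is not the article's original snippet. It assumes one if/else on odd versus even and one loop governed by (number < 9); the three inputs in the comment are likewise illustrative.

```c
/* Hypothetical routine for illustration only.
 * Records whether `number` is even, then counts how many times the
 * loop body runs while (number < 9) holds. */
int count_up(int number, int *was_even)
{
    int iterations = 0;

    if (number % 2 == 0)
        *was_even = 1;        /* branch taken for even input */
    else
        *was_even = 0;        /* branch taken for odd input */

    for (; number < 9; number++)
        iterations++;

    return iterations;
}

/* Three inputs that together satisfy branch and loop coverage:
 *   10 -> even branch, loop condition false at once (body never runs)
 *    8 -> even branch, loop body runs exactly once
 *    7 -> odd branch,  loop body runs twice
 */
```

Between them, the three calls take the if in both directions and drive the loop zero times, once, and twice.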
gcc Options Needed for gcov
Before programs can use gcov, they must first be compiled with two gcc options: -fprofile-arcs and -ftest-coverage. These options cause the compiler to insert additional code into the object files. Compiling with -ftest-coverage also produces two files, sourcename.bb and sourcename.bbg, where sourcename is the name of your source code file.
The .bb file has a list of source files, functions within the file, and line numbers corresponding to each block in the source file. The .bbg file contains a list of the program flow arcs for all of the functions. Executing a gcov-enabled program dumps counter information into a sourcename.da file when the program exits.
gcov uses the *.bbg, *.bb, and *.da files to reconstruct program flow and create a listing of the code that highlights the number of times each line was executed. Let's try using gcov.
Compile the file sample.c shown in Listing One with the options -fprofile-arcs, -ftest-coverage, and -g.
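Listing One itself comes with the article's download. As a rough stand-in, here is a minimal sketch of a program with the same general shape: it takes an array size, rejects missing or non-positive input, and has two malloc() checks. The function name run_sample() is hypothetical, the messages are modeled on those quoted in this article, and the line numbers and percentages cited in the text refer to the original listing, not to this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a program like sample.c; NOT Listing One.
 * With a main() that simply returns run_sample(argc, argv), it could be
 * built with the options the article names:
 *   gcc -fprofile-arcs -ftest-coverage -g sample.c -o sample
 */
int run_sample(int argc, char *argv[])
{
    int size, x;
    int **array;

    if (argc != 2) {
        printf("Usage: %s Enter arraysize value\n", argv[0]);
        return 1;
    }

    size = atoi(argv[1]);
    if (size <= 0) {
        printf("Array size must be larger than 0\n");
        return 1;
    }

    printf("Creating an %d by %d array\n", size, size);

    array = malloc(size * sizeof *array);
    if (array == NULL) {              /* first malloc() error path */
        printf("Malloc failed for array size %d\n", size);
        return 1;
    }

    for (x = 0; x < size; x++) {
        array[x] = malloc(size * sizeof **array);
        if (array[x] == NULL) {       /* second malloc() error path */
            printf("Malloc failed for array size %d\n", size);
            while (x > 0)             /* release rows already allocated */
                free(array[--x]);
            free(array);
            return 1;
        }
    }

    for (x = 0; x < size; x++)
        free(array[x]);
    free(array);
    return 0;
}
```

Ordinary inputs reach the usage, size-check, and success paths; the two malloc() failure branches are the ones that normal test input is unlikely to reach.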
Now we're ready to see how much coverage each test case provides. Run the sample application with input of 1000.
The application displays "Creating an 1000 by 1000 array," and creates a new file called sample.da. Next, run gcov on the source code (if your application has more than one source file, run gcov on all of the source files)...
gcov emits "69.23 % of 26 source lines executed in file sample.c." This gcov command also creates the file sample.c.gcov, shown in Listing Two. In the listing, a ###### marker indicates that the associated line of source code hasn't been executed.
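The percentage gcov reports is the share of executable source lines that ran at least once. Here is a minimal sketch of that arithmetic, assuming a hypothetical helper named coverage_percent(). The count of 18 executed lines used below is inferred from 69.23% of 26; it is not printed by gcov in this summary.

```c
/* Hypothetical helper: percentage of executable lines run at least once.
 * Returns 0.0 when there are no executable lines, to avoid dividing by zero. */
double coverage_percent(unsigned executed, unsigned total)
{
    if (total == 0)
        return 0.0;
    return 100.0 * executed / total;
}
```

With 26 executable lines, coverage_percent(18, 26) comes to about 69.23, so each additional line a new test reaches raises the figure by a little under four points.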
Next, run the sample program with no input.
The application displays "Usage: ./sample Enter arraysize value." Next, run gcov again.
76.92 % of 26 source lines executed
Now run the sample program with the parameter 0.
The application should display the message "Array size must be larger than 0." Again, run gcov.
84.62 % of 26 source lines executed in sample.c
Now comes the interesting part of testing this program. There are two malloc() error conditions; both must be tested to get 100% coverage of this code. Let's use the gdb debugger to simulate the malloc() failures by setting a break point and then jumping to the error condition.
The list command displays the line numbers for the source.
Use the break command to set a break point on line number 13.
Then start the program with run.
36 if (array[x] == NULL)
(gdb) jump 31
Malloc failed for array size
Once again, run gcov.
One more test to run. Follow the steps shown above to set a break point on line 13. Run the program with run and then jump to line 37.
(gdb) list 30
36 if (array[x] == NULL)
(gdb) jump 37
Finally, run gcov one last time.
100 % of 26 source lines executed in file sample.c
Listing Three shows no lines flagged with ######, so every line of this program has been executed. The number before each line of code tells how many times it was executed.
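Jumping in gdb reaches the error-handling lines without malloc() actually failing. A different technique, not the one used in this article, is to route allocations through a wrapper that a test can instruct to fail. The following is a minimal sketch under that assumption; the names test_malloc() and test_malloc_fail_after() are hypothetical, and the program under test would have to call the wrapper instead of malloc().

```c
#include <stdlib.h>

/* Hypothetical fault-injection wrapper; not part of sample.c, gcc, or gcov.
 * When fail_after is >= 0, the call numbered fail_after (counting from 0)
 * and every later call return NULL. A negative value disables injection. */
static int fail_after = -1;
static int calls = 0;

/* Arm (n >= 0) or disarm (n < 0) the simulated failure and reset the count. */
void test_malloc_fail_after(int n)
{
    fail_after = n;
    calls = 0;
}

/* Drop-in replacement for malloc() in code under test. */
void *test_malloc(size_t size)
{
    if (fail_after >= 0 && calls++ >= fail_after)
        return NULL;          /* simulated out-of-memory */
    return malloc(size);
}
```

In a program with two allocation checks, failing call 0 would be expected to reach the first check and failing call 1 the second, so both error paths can be covered by an automated test run rather than an interactive debugger session.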
Coverage is Just One Measure
gcov determines how well your test suites exercise your code. One indirect benefit of gcov is that its output can be used to identify which test case provides coverage for each source file. With that information, you can run a subset of the test suite to validate changes in the program. Thorough code coverage during testing is one measurement of software quality.
Steve Best works in the Linux Technology Center of IBM in Austin, Texas. He is currently working on the Journaled File System (JFS) for Linux project. Steve has done extensive work in operating system development with a focus on file systems, internationalization, and security. He can be reached at firstname.lastname@example.org. You can download the code used in this article from http://www.linux-mag.com/downloads/2003-07/compile.