Using Code Coverage to Improve Orcas Quality
Hi, my name is Ann Zhou and I’m an SDET (Software Design Engineer in Test) tech lead on the Visual C++ Libraries team. Today I would like to share with you something that we have been doing to improve the quality of our testing in Orcas.
After we shipped VS2005, the Developer Division and the VC++ team decided to take a look at the quality of our engineering work. One of the criteria we chose was code coverage, i.e. the percentage of the code in the product that is exercised by automated tests. There are several ways to measure code coverage, including source code line coverage and binary block coverage. We chose to focus on binary block coverage, which is the percentage of basic blocks in the binary that are exercised by automated tests. A basic block is a contiguous sequence of instructions that runs sequentially: it has exactly one entry point and one exit point.
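To make the block coverage definition concrete, here is a small hand-instrumented sketch. It is only an illustration: a real coverage tool instruments the binary rather than the source, the function `clamp_to_byte` and the `hit` vector are hypothetical, and the way a compiler actually splits code into basic blocks depends on the compiler and its optimization settings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function under test. The comments mark roughly where a
// compiler might place basic block boundaries; hit[i] records that block i ran.
int clamp_to_byte(int value, std::vector<bool>& hit) {
    hit[0] = true;              // block 0: function entry and first comparison
    if (value < 0) {
        hit[1] = true;          // block 1: negative branch
        return 0;
    }
    hit[2] = true;              // block 2: second comparison
    if (value > 255) {
        hit[3] = true;          // block 3: overflow branch
        return 255;
    }
    hit[4] = true;              // block 4: fall-through return
    return value;
}

// Block coverage = blocks exercised / total blocks, as a percentage.
double block_coverage_percent(const std::vector<bool>& hit) {
    if (hit.empty()) return 0.0;
    std::size_t covered = 0;
    for (bool b : hit) {
        if (b) ++covered;
    }
    return 100.0 * covered / hit.size();
}

// A single test with an in-range value exercises blocks 0, 2 and 4 only.
double coverage_after_single_test() {
    std::vector<bool> hit(5, false);
    int result = clamp_to_byte(10, hit);
    assert(result == 10);       // validate behavior, not just execution
    return block_coverage_percent(hit);
}
```

In this sketch one test covers 3 of the 5 blocks. Reaching the remaining two takes additional tests with a negative value and a value above 255.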
Our code coverage goal was set to 70% block coverage. You may ask, why not higher, or even 100%? This goal was set after carefully considering the variety of ways in which a binary can contain code that is dead or extremely difficult to hit. Here are just a few of the situations that make higher code coverage hard to reach:
· Internal helper functions or functions from another library may be only partially exercised by the binary. The remaining blocks in those functions are dead code.
· A virtual inheritance hierarchy can bring in a lot of dead code from parent classes when a child class is used in the binary.
· Some error conditions are extremely difficult to hit, and hitting that code does not add much value to our testing.
· Some compiler generated code is extremely difficult to hit.
· Currently our code coverage tool only supports x86, so 64-bit specific code is reported as uncovered even if it is exercised.
· There is some legacy code in the source base that may be difficult to factor out.
At the beginning of the code coverage effort, some of our binaries had really low code coverage percentages. After almost a year of hard work, we are proud to say that we have reached the 70% block coverage goal for most of the binaries we ship. Some of them even reached 80% or above. Here is what we did to achieve this:
· Get rid of dead code in the binaries by examining build options. E.g. we noticed that some of our debug binaries were not built with the /Gy compiler option, which prevented the linker from removing unused functions even when the linker option /opt:ref was used.
· Get rid of dead code in the binaries by using dynamic linking of external libraries whenever feasible.
· Get rid of dead code in the source code. Due to the long history of our source code, there is code that was written many years ago but is no longer used. We cleaned up our source code to get rid of that dead code. E.g. future VC releases will not support Win9x platforms, so we removed the code that is Win9x specific.
· For difficult-to-hit error conditions, we used a mechanism called “error injection” to trigger the error conditions so that we can test the behavior of our product under them. E.g. for memory allocation failure, we intercept the memory allocation routine to call another function that returns an error code indicating that the machine is out of memory.
· For functions that raise assertions under certain error conditions, we don’t want the test to terminate whenever an assertion occurs, because then there would be no way to validate the test result. So we redirect the assertion pop-ups to the debug window and use a combination of assertion flags and counters to validate that the proper assertions occur under the error conditions.
· Of course, we wrote thousands of new tests to cover code that was previously ignored. When doing this, we made sure that the new tests not only exercise the code, but also properly validate that the product is doing the right thing. If we don’t take care of the validation part, a high code coverage number would be meaningless, or even dangerous, because it would give a false sense of testing quality.
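The error injection and assertion counting ideas from the list above can be sketched roughly as follows. This is a minimal portable illustration, not our actual test infrastructure: every name in it (`product_alloc`, `g_fail_after`, `PRODUCT_ASSERT`, `report_assert`, `duplicate_string`) is hypothetical, it assumes the product routes allocations through a wrapper instead of intercepting the real allocation routine, and it uses a plain counter in place of any CRT reporting facility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// --- Error injection (hypothetical names) ---
// -1: never fail; 0: fail every allocation;
// N > 0: let N allocations succeed, then fail.
int g_fail_after = -1;

// The product allocates through this wrapper so a test can force a failure.
void* product_alloc(std::size_t size) {
    if (g_fail_after == 0) {
        return nullptr;                  // injected out-of-memory condition
    }
    if (g_fail_after > 0) {
        --g_fail_after;
    }
    return std::malloc(size);
}

// --- Assertion capture (hypothetical names) ---
// Instead of showing a pop-up and stopping the test, count the assertion.
int g_assert_count = 0;

void report_assert(const char* /*expression*/) {
    ++g_assert_count;
}

#define PRODUCT_ASSERT(cond) \
    do { if (!(cond)) report_assert(#cond); } while (0)

// --- Code under test ---
// Returns a heap copy of text, or nullptr on bad input or allocation failure.
char* duplicate_string(const char* text) {
    PRODUCT_ASSERT(text != nullptr);
    if (text == nullptr) return nullptr;
    std::size_t len = std::strlen(text) + 1;
    char* copy = static_cast<char*>(product_alloc(len));
    if (copy == nullptr) return nullptr; // the hard-to-hit error path
    std::memcpy(copy, text, len);
    return copy;
}

// --- Example test: exercises both error paths and validates the outcome ---
void test_duplicate_string_error_paths() {
    g_fail_after = 0;                            // inject out-of-memory
    assert(duplicate_string("abc") == nullptr);  // failure is reported cleanly
    g_fail_after = -1;                           // restore normal allocation

    int before = g_assert_count;
    assert(duplicate_string(nullptr) == nullptr);
    assert(g_assert_count == before + 1);        // exactly one assertion fired
}
```

Note that the test checks the return values and the assertion counter as well as running the error paths. That validation is what keeps the resulting coverage number meaningful.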
Looking back, we found the code coverage effort was a very rewarding experience.
· Most importantly, a high code coverage percentage gives us high confidence that our product is well tested.
· The code coverage effort gives both the QA and Dev teams a better and deeper understanding of our product source base, because one really needs to dig into the source base to figure out how a particular piece of code can be exercised.
· The code coverage effort makes our source base leaner and meaner due to the removal of dead code.
Thanks for reading. I hope you find this posting interesting. If you have any questions/comments on this topic, feel free to post a comment in the VC Blog.