This year I took all three courses on Black-box Software Testing. Each of them meant an investment of four weeks of my time, usually 2-4 hours per day. It was quite a blast, and I am happy that I made it through the courses.
One thing that struck me in the first course, the Foundations course, was the discussion of the different uses and misuses of code coverage. Here is a short description of things I have seen working, and not working so well.
The main point of Brian Marick's paper How to Misuse Code Coverage is that you can either use code coverage to measure what your tests cover in your code, or use it as a hint about where you haven't tested anything at all – and then decide, brain switched on, whether to cover that particular area or not.
The former use is a measurement, and it might become a management metric. Especially in Theory X thinking, code coverage becomes a metric for the performance of individuals or teams. This has some drawbacks, as Kaner and Bond point out in their classic paper Software Engineering Metrics – What do they measure and how do we know?. The main distinction is that code coverage is a second-order measurement, a stand-in for something else. It is a substitution, as Kahneman describes in Thinking, Fast and Slow: we answer an easier question in place of the one we care about. The question we are actually trying to answer is whether this code is sufficiently tested, and how maintainable it will be in the long run.
When a code coverage metric is put in place, I have seen the dysfunctions that Kaner and Bond describe. Recently I heard from a company that demanded 100% code coverage. Though Robert C. Martin argues that this should be a goal for any great development team, once it is turned into a metric or written into a contract, it more often than not becomes a problem. I also heard about a company where the development team was given the goal of reaching at least 50% code coverage. This setting led to tests generated purely to hit the coverage number, not to useful tests that help you drive and maintain the application in the long run.
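To make the dysfunction concrete, here is a minimal sketch. The function and test names are hypothetical, invented for illustration: the first "test" executes every line and branch, so it satisfies a coverage target, yet it asserts nothing and would pass even if the logic were completely wrong. The second test covers the same code while actually pinning down behaviour.

```python
def apply_discount(price, customer_is_premium):
    """Return the price after a discount; premium customers get 20% off."""
    if customer_is_premium:
        return price * 0.8
    return price

def test_apply_discount_for_coverage_only():
    # Both branches are executed -- 100% line and branch coverage --
    # but no behaviour is checked, so any bug would slip through.
    apply_discount(100, True)
    apply_discount(100, False)

def test_apply_discount_meaningfully():
    # The same coverage, but the expected behaviour is actually asserted.
    assert apply_discount(100, True) == 80.0
    assert apply_discount(100, False) == 100
```

Both tests look identical to a coverage tool; only a human reading them can tell which one is worth having.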
On the other hand, when we use code coverage to check what has not been covered by automated checks, we get a clearer picture of our situation. With our brains switched on, we can decide whether that particular complicated flow through the application deserves some more tests, or whether to leave that empty try-catch-block as it is for now.
In the past I have applied code coverage most successfully from inside the team, as a way to spot pieces of code that we had not (yet) covered. Think of it as an information radiator: code coverage makes clear which pieces of your code are not covered yet, and the team can think through whether more tests are needed. Code coverage then becomes a first-order measurement, which the team uses to drive its development efforts forward.
Code coverage can be used well, and it can be misused. How are you using code coverage in your company?