Fight starter #66: The Exercise Suite
It's no fun to write about stuff that everyone agrees on, so I'm going to write about something that almost everyone seems to disagree with me about:
100% code coverage is almost always possible. It's a valuable target, and it's a worthwhile goal in and of itself.
The primary argument that gets dragged out repeatedly is that 100% code coverage doesn't indicate anything about the quality of your tests whatsoever. It's certainly possible to get code 100% covered without writing a single assert, or by writing incorrect or incomplete tests. If you don't actually assert everything you care about properly, code coverage doesn't seem to be very valuable at all.
While there is a lot of truth to this, unless you have dishonest developers, that code coverage percentage does indicate an attempt at quality tests. And really, an honest attempt is all we can ever ask for.
I'm willing to hypothesize, though, for the sake of argument, that the class under test is somehow 100% covered by wrong tests, incomplete tests, or tests with no asserts whatsoever. Let's call it an Exercise Suite instead, because that's all it does: it exercises the code.
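To make the hypothetical concrete, here is a minimal sketch of an Exercise Suite in Python's unittest. The function classify() and the helper run_exercise_suite() are invented for illustration; the point is only that every branch gets executed while nothing is asserted.

```python
import io
import unittest


def classify(n):
    """Hypothetical function under test with two branches."""
    if n < 0:
        return "negative"
    return "non-negative"


class ExerciseSuite(unittest.TestCase):
    """Runs every branch of classify() but asserts nothing about the results."""

    def test_negative_branch(self):
        classify(-1)  # executed, return value ignored

    def test_non_negative_branch(self):
        classify(1)  # executed, return value ignored


def run_exercise_suite():
    """Run the suite in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExerciseSuite)
    # Send the runner's report to a buffer so the sketch stays quiet.
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Both tests pass and both branches run, yet a bug in either return value would go unnoticed. That is exactly the kind of suite the rest of this post argues still has value.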
I'm still going to contend that this coverage metric is a quality indicator. The important thing to notice is that it means you've somehow managed to exercise 100% of your codebase. This is actually a huge accomplishment that means a few things:
(1) You don't have dead code in your class-under-test.
If you can't exercise every branch in a given class, you've got dead code. Dead code should always be removed to keep from cluttering up future maintenance efforts. (Think you might need it again? That's what version control systems are for -- you can always go back and get it again.)
(2) Your codebase can be loaded into a test harness.
This is actually one of the hardest parts about testing codebases written in procedural languages, or languages that allow scripts, or "includes". The contents of a given file can't be brought under test reliably and repeatably unless it is possible to load it into the test harness without it actually doing anything until it's exercised by the tests. If you've ever tried to get a legacy PHP project under test, you'll know what I'm talking about here.
(3) You've figured out how to avoid/mitigate side effects.
Code that accesses a database, the filesystem, a web service, hardware, etc. is very hard to exercise quickly and without side effects. If your exercise suite has managed to get 100% coverage while still running quickly, reliably, and repeatably, you've scored a major victory.
(4) You can detect a myriad of run-time errors.
"Failures" are generally what you have when a test fails. You care about "errors" too though: Errors are all the things that can go wrong at run-time that are obvious just by executing the code with certain inputs. Exercise alone can uncover a number of these. We routinely catch null pointer exceptions in our test suite (attempts to access properties that aren't correctly initialized yet), not through failed tests, but instead simply through exercise. The test runner catches many of them, and always shoves those directly in our face to be solved.
I think my unpopular opinion comes from two things about me that put me in a bit of a minority as far as developers go:
(1) I've spent a bunch of time actually doing TDD, and I expect unit-tests to look like they were written before the code even if they weren't. If you're writing tests before code, as TDD prescribes, you're generally not going to end up with uncovered code. At any point in the process, the only asymmetry that's likely is that there are tests that are trying to test code that doesn't yet exist. To have uncovered code in TDD generally means you've done something wrong.
(2) I'm extremely accustomed to working with dynamically-typed languages, so I've learned to rely on unit-tests for almost every aspect of verification. I use them to debug. I use them for syntax-validation. I use them to ensure I'm interacting with other objects in ways that they expect (i.e., "Did I get that parameter order right in that method call?"). There is no static type-system, so I don't have any quality safe-guard other than the tests I write. In that scenario, uncovered code is a risk I don't want to take -- it could even contain syntax errors!
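Here is a small Python sketch of the risk in uncovered code. The function shipping_cost() and the helper exercise() are hypothetical. Python does report outright syntax errors when a file is loaded, so the closest analogue is a misspelled name, which is only detected when the offending line actually runs.

```python
def shipping_cost(weight_kg, express=False):
    """Hypothetical function with a typo hidden in the express branch."""
    base = 5.0 + 1.5 * weight_kg
    if express:
        # Typo: 'bse' is undefined. Python only notices when this line runs.
        return bse * 2
    return base


def exercise(func, *args, **kwargs):
    """Call func and return the name of any exception raised, or None."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Calling exercise(shipping_cost, 2.0) reports nothing wrong, while exercise(shipping_cost, 2.0, express=True) reports a NameError. A suite that never takes the express branch ships the typo.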
In short, the hypothetical Exercise Suite has a lot of value in and of itself.
July 26th, 2010 - 16:39
I agree 100%
Does 100% coverage have any value beyond the proportion of code tested with meaningful asserts validating results? I would argue that the burden of proof lies not there, but with those who contend it doesn’t have any additional value. I would also add that for type-safe languages, or strict typing in general, there is another easily avoided error that 100% coverage handles for you (where arrays start at 0, for example):
Integer i[1];   // one element, so the only valid index is 0
i[1] = 86;      // index 1 is out of bounds and fails at run time
Of course this is easy to see here with literal indices. But in large codebases with loops and variables pointing to the n’th element of arrays, tests that simply execute the code will easily weed out these issues. Indexing a missing array element is a runtime bomb in even the most strongly typed languages, and has absolutely nothing to do with validating return values with asserts.
~ How many lines of code does it take to break a program? ~
July 26th, 2010 - 23:21
I like that question: “How many lines of code does it take to break a program?”. That really shines a light on the mindset that’s stopping people from unit-testing more, doesn’t it?