
Comment by dnautics

5 years ago

I don't think this is always true. For example, if you are writing tests, the DRY rule of three doesn't apply. It's very okay to repeat code if it prevents a layer of indirection for the person who is reading the test.

I used to think this and have come to realise that this is definitely not true. The problem that a thorough automated test suite can cause is that it becomes very painful to refactor code.

As you add code, the best structure for that code changes and you want to refactor. I'm not just talking here about pulling some shared code into a new function, I'm talking about moving responsibilities between modules, changing which data lives in which data structures etc. These changes are the key to ensuring your code stays maintainable and makes sense. Every unit test you add 'pins' the boundary of your module (or class or whatever is appropriate to your language). If you have lots of tests with repeated code, it can take five times as long to fix the tests as to make the actual refactor. That makes refactors painful, which usually means people don't do them as readily (because the subconscious cost-benefit analysis has shifted).

If - on the other hand - you treat your test suite as a bit of software to be designed and maintained like any other, then you improve this situation. Multiple tests hitting the same interface are probably doing it through a common helper function that you can adjust in one place, rather than in 20 tests. Your 'fixtures' live in one place that can be updated and are reused in multiple places. This usually means that your test suite helps more with the transition too - you get more confidence you've refactored correctly.
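To make the idea concrete, here is a minimal sketch (hypothetical names, not from the thread) of tests that go through one shared setup helper, so a change to the interface is absorbed in a single place rather than in every test:

```python
# Hypothetical sketch: several tests exercise the same interface through
# one helper, so an interface change is absorbed in one place.

class UserStore:
    """Toy module under test."""
    def __init__(self):
        self._users = {}

    def add(self, name, email):
        self._users[name] = {"email": email}

    def get(self, name):
        return self._users.get(name)


def make_store_with_user(name="alice", email="alice@example.com"):
    # The one place that knows how to build the fixture. If add() grows
    # a new required argument, only this helper changes, not 20 tests.
    store = UserStore()
    store.add(name, email)
    return store


def test_lookup_returns_user():
    store = make_store_with_user()
    assert store.get("alice") == {"email": "alice@example.com"}


def test_lookup_missing_user():
    store = make_store_with_user()
    assert store.get("bob") is None
```

If `UserStore.add` is later refactored, only `make_store_with_user` needs updating, which is the property the comment is describing.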

The other part of this problem (which is maybe more controversial?) is that I try not to rely too much on lots of unit tests, and lean more on testing sets of modules together. These tests prove that modules interact with each other correctly (which unit tests do not), and are also changed less when you refactor (and give confidence you didn't break anything when you refactor).

  • I was mostly referring to integration tests. And yes, there are basics like fixtures which do get DRY'd out, but they really need to be as unambiguous as possible in their mental model, e.g. `insert(<table>, [<c:v>])` for a database entry.

    I guess my point was not that you never DRY in tests, just that you should be very picky about when to DRY, more so than in code, and that is necessarily in opposition to the advice in OP.
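A sketch of the kind of fixture helper the bullet above gestures at (the `insert`/`rows` names and the in-memory dict are assumptions for illustration): the call site reads like the row it creates, so there is no indirection to chase.

```python
# Hypothetical sketch: a fixture helper whose call site reads like the
# data it creates. The dict-of-lists is a stand-in for a real database.

_DB = {}

def insert(table, row):
    """Insert a column->value mapping into `table`."""
    _DB.setdefault(table, []).append(dict(row))

def rows(table):
    """Return all rows inserted into `table`."""
    return _DB.get(table, [])


# In a test, the data is spelled out right where it is used:
insert("users", {"id": 1, "name": "alice"})
insert("orders", {"id": 10, "user_id": 1, "total": 25})

assert rows("users") == [{"id": 1, "name": "alice"}]
```

The point is the unambiguous mental model: reading the test tells you exactly which rows exist, without looking up a fixture file elsewhere.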

I puzzled about that for years and concluded that tests are a completely different kind of system, best thought of as executable requirements or executable documentation. For tests, you don't want a well-factored graph of abstractions—you want a flat set of concrete examples, each independently understandable. Duplication helps with that, and since the tests are executable, the downsides of duplication don't bite as hard.

A test suite with a lot of factored-out common bits makes the tests harder to understand. It's similar to the worked examples in a math textbook. If half a dozen similar examples factored out all the common bits (a la "now go do sub-example 3.3 and come back here", and so on), they would be harder to understand than repeating the similar steps each time. They would also start to use up the brain's capacity for abstraction, which is needed for understanding the math that the exercises illustrate.

These are two different cognitive styles: the top-down abstract approach of definitions and proofs, and the bottom-up concrete approach of examples and specific data. The brain handles these differently and they complement one another nicely as long as you keep them distinct. Most of us secretly 'really' learn the abstractions via the examples. Something clicks in your head as you grok each example, which gives you a mental model for 'free', which then allows you to understand the abstract description as you read it. Good tests do something like this for complex software.

Years ago when I used to consult for software teams, I would sometimes see test systems that had been abstracted into monstrosities that were as complicated as the production systems they were trying to test, and even harder to understand, because they weren't the focus of anybody's main attention. No one really cared about them, and customers didn't depend on them working, so they became a twilight zone. Bugs in such test layers were hard to track down because no one was fresh on how they worked. Sometimes it would turn out that the production system wasn't even being tested, only the magic in the monster middle layer.

An example would be factory code to initialize objects for testing, which gradually turns into a complex network of different sorts of factory routines, each of which contributes some bits and not others. Then one day there's a problem because object A needs something from both factory B and factory C, but other bits aren't compatible, so let's make a stub bit instead and pass that in... All of this builds up ad hoc into one of those AI-generated paintings that look sort of like reality but also like a nightmare or a bad trip. The solution in such cases was to gradually dissolve the middle layer by making the tests as 'naked' as possible, and the best technique we had for that was to shamelessly duplicate whatever data, and even code, we needed into each concrete test. But the same technique would be disastrous in the production system.
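A minimal sketch of what a 'naked' test looks like (the function and data are hypothetical): each test states its whole input inline, deliberately repeating data rather than composing it from factories, so each one reads on its own.

```python
# Hypothetical sketch of 'naked' tests: each test spells out all of its
# input data inline instead of composing it from shared factories.

def total_owed(orders):
    """Toy function under test: sum the amounts of unpaid orders."""
    return sum(o["amount"] for o in orders if not o["paid"])


def test_unpaid_orders_are_summed():
    # All the data this test needs, written in the test itself.
    orders = [
        {"amount": 10, "paid": False},
        {"amount": 25, "paid": False},
    ]
    assert total_owed(orders) == 35


def test_paid_orders_are_ignored():
    # Deliberate duplication: the same shape of data, repeated rather
    # than pulled from a factory, so the test is independently readable.
    orders = [
        {"amount": 10, "paid": True},
        {"amount": 25, "paid": False},
    ]
    assert total_owed(orders) == 25
```

The duplication costs little here because the tests are executable and the data is small; the same repetition in production code would be the maintenance hazard the comment warns about.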