Comment by 0xffff2

5 years ago

N pieces of code that are mostly the same but have subtle differences isn't repetition and probably shouldn't be combined into a single function, especially if the process of doing so is non-obvious.

> N pieces of code that are mostly the same but have subtle differences isn’t repetition

Often, they are.

IME, a very common pattern is divergent copypasta: because there is no connection once the copying occurs, a bug is noticed and fixed in one place and not in others, later noticed separately and fixed a slightly different way in some of the others, while in still others a different thing that needs doing in the same function gets inserted between parts of the copypasta, etc. It's still essentially duplication. Moreover, it's still the same logical function being performed in different places, but in slightly different ways, creating a significant multiplier on maintenance cost. That cost, not literal code duplication in and of itself, is the actual problem addressed by DRY, which is explicitly not about duplication of code but about a single source of truth for knowledge: “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”. Divergent implementations of the same logical function are different representations of the same knowledge.
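
To make the divergence concrete, here is a minimal sketch. The function names and the email-normalization rule are hypothetical, invented purely for illustration: two copies of one rule drifted apart after a fix landed in only one of them, and a single shared function restores one authoritative representation.

```python
# Hypothetical example: two copies of the same logical rule that drifted apart.

def normalize_email_signup(raw: str) -> str:
    # This copy later received a fix for surrounding whitespace.
    return raw.strip().lower()


def normalize_email_login(raw: str) -> str:
    # This copy was pasted before the fix and never updated.
    return raw.lower()


def normalize_email(raw: str) -> str:
    """Single authoritative representation of the normalization rule."""
    return raw.strip().lower()


sample = "  Alice@Example.com "

# The two copies now disagree on the same input.
copies_agree = normalize_email_signup(sample) == normalize_email_login(sample)
```

Neither copy is a literal duplicate of the other any more, yet both encode the same piece of knowledge, which is why a fix in one silently fails to reach the other.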

Often it is. Earlier in this thread I used the example of a unit system - one example of where there can be a ton of repetition to remove, but there are fundamental differences between liters and meters that make removing the duplication hard if you didn't realize upfront you had a problem. Once you get it right, converting meters to chains isn't that hard (I wonder how many readers even know the chain was a unit of measure - much less have any idea how big it is), but there are a ton of choices to make for it to work.
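
As a rough sketch of one such choice (not the design from the earlier comment, and with a hypothetical `UNITS` table and `Quantity` class), each unit can carry a dimension and a factor to a base unit, so meters to chains is a single shared code path while meters to liters is rejected:

```python
from dataclasses import dataclass

# Hypothetical table: unit -> (dimension, factor to that dimension's base unit).
# Base units assumed here: metre for length, cubic metre for volume.
UNITS = {
    "m": ("length", 1.0),
    "chain": ("length", 20.1168),  # one chain is 66 feet
    "L": ("volume", 0.001),
}


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

    def to(self, target: str) -> "Quantity":
        src_dim, src_factor = UNITS[self.unit]
        dst_dim, dst_factor = UNITS[target]
        if src_dim != dst_dim:
            # Liters and meters measure different things; refuse to convert.
            raise ValueError(f"cannot convert {src_dim} to {dst_dim}")
        return Quantity(self.value * src_factor / dst_factor, target)


one_chain = Quantity(20.1168, "m").to("chain")
```

Adding another length unit is then one table entry rather than another pasted conversion function, but deciding on the base units and the dimension check up front is exactly the kind of decision that is hard to retrofit.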

I think they mean if the code does the same thing but has syntax differences. Variables are named differently, one uses a for loop while the other uses functional list operations, etc.
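
A small illustration of that reading, with made-up function and field names: the two functions below compute the same result, but different variable names and a loop versus `map`/`filter` make the duplication easy to miss.

```python
# Hypothetical example: the same behavior in two different syntactic styles.

def active_names_loop(users):
    result = []
    for u in users:
        if u["active"]:
            result.append(u["name"].upper())
    return result


def shout_enabled(accounts):
    enabled = filter(lambda acct: acct["active"], accounts)
    return list(map(lambda acct: acct["name"].upper(), enabled))


people = [
    {"name": "ann", "active": True},
    {"name": "bob", "active": False},
    {"name": "cy", "active": True},
]
```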

You never know if these subtle differences were intentional to begin with. The copies might have been identical once upon a time, but then during an emergency outage someone made a quick fix in one place and forgot to update all the other copies of the code.

Repeating what others already mentioned, often it can be the same thing but written in a slightly different way. Even basic stuff like string formatting vs string concatenation can make it non-obvious that two pieces of code are copies.
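
For instance (hypothetical function names, just to show the point), these two produce identical strings, yet a text search for one would not find the other:

```python
# Hypothetical example: the same label built by concatenation and by formatting.

def label_concat(name: str, count: int) -> str:
    return name + ": " + str(count) + " items"


def label_format(name: str, count: int) -> str:
    return f"{name}: {count} items"
```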