
Comment by allo37

5 years ago

I think where DRY trips people up is when you have what I call "incidental repetition". Basically, two bits of logic seem to do exactly the same thing, but the contexts are slightly different. So you make a nice abstraction that works well until you need to modify the functionality in one context and not the other...

If you mostly deduplicate by writing functions, fixing this problem is never very hard: duplicate the function, rename it and change the call-site.
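To make the exchange above concrete, here is a minimal sketch of "incidental repetition" and the copy-and-rename fix. All function names and formatting rules are invented for illustration: two call sites start out sharing `format_label`, then the shipping context needs different behavior, so the helper is copied, renamed, and only that one call site is changed.

```python
# Hypothetical example: every name and rule below is invented for illustration.

def format_label(text: str) -> str:
    """Shared helper, originally used for both invoice and shipping labels."""
    return text.strip().upper()[:20]


# Later requirement (assumed): shipping labels must keep their original case,
# while invoices must stay exactly as they were.
# Fix: copy the function, rename it, and change only the shipping call site.
def format_shipping_label(text: str) -> str:
    """Diverged copy: still trims and truncates, but no longer upper-cases."""
    return text.strip()[:20]


def invoice_line(customer: str) -> str:
    # Unchanged call site: still uses the original helper.
    return f"INVOICE: {format_label(customer)}"


def shipping_line(customer: str) -> str:
    # The only call site that was edited.
    return f"SHIP TO: {format_shipping_label(customer)}"
```

This only stays cheap while the shared piece is a plain function. Once it has grown flags and branches, working out which path each caller relies on is the hard part.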

The interesting thing about DRY is that opinions about it seem to depend on what project you’ve worked on most recently: I inherited a codebase written by people skeptical of DRY, and we had a lot of bugs that resulted from “essential duplication”. Other people inherit code written by “architecture astronauts”, and assume that the grass is greener on the WET side of the fence.

Personally, having been in both situations, I’d almost always prefer to untangle a bad abstraction rather than maintain a WET codebase.

  • Conversely, fixing duplication is never hard. Just move the duplicated code into a function. Going in reverse can be much tougher if the function has become an abstraction, where you have to figure out what path each function call was actually taking.

    Or, put another way: I'd much rather deal with duplication than with coupling problems.

    • The problem with duplication is that it is hard to spot and fix. Converting liters to ml or quarts isn't hard, but the factors are different, and there are other units too. If you only do a few of these, it isn't a big deal, but if you suddenly realize that you have tons of different conversions scattered around and you really need to implement a good unit conversion system, it will be really hard to retrofit everything. Note that even if you have LiterToMl, LiterToQuart and MileToKm functions, retrofitting the right system will be hard. (Where I work we have gone through 4 different designs of an uber unit system module before we actually got all the requirements right, and each transition was a major problem.)

    • > Conversely, fixing duplication is never hard. Just move the duplicated code into a function.

      I think the single biggest factor determining the difficulty of a code change is the size of the codebase. Codebases with a lot of duplication are larger, and the scale itself makes them harder to deal with. You may not even realize when duplication exists because it may be spread throughout the code. It may be difficult to tell which code is a duplicate, which is different for arbitrary reasons, and which is different in subtle but important ways.

      Once you get to a huge sprawl of code that has a ton of mostly-pointless repetition, it is a nightmare to tame it. I would much rather be handed a smaller but more intertwined codebase that only says something once.

    • I think the opposite is true. Bad abstractions can be automatically removed with copy/paste and constant propagation. N pieces of code that are mostly the same but have subtle differences have no automatic way to combine them into a single function.


    • The issue I have is that duplication is a coupling problem, but there’s no evidence in the coupled parts of the code that the coupling exists. It can be ok on a single-developer project or if confined to a single file, but my experience is that it’s a major contributor to unreliability, especially when all the original authors of the code have left.

    • If you find a bug in the duplicated part and have no idea that it was actually duplicated (or, even if you do, where the other copies are), you still have multiple bugs lurking around.

    • Fixing duplication is hard because, by nature, duplicated code will drift over time even if it shouldn't have. So it's technically not "duplicate" anymore, even though the copies are supposed to do the exact same thing.

      Fixing a bad abstraction is only hard because there's some weird culture about not tearing down abstractions. Rip them apart and toss them on the heap. It's a million times easier than finding duplicate code that has inadvertently drifted apart over time.

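The unit-conversion reply in the thread above contrasts scattered pairwise helpers with one central conversion system. A rough sketch of the table-driven idea follows. The `UNITS` table, the unit keys, and the `convert` signature are assumptions made for illustration, not the design that commenter's team ended up with: each unit stores its dimension and its factor to a base unit, so adding a unit means adding one row, not one function per pair.

```python
# Hypothetical sketch: the table layout and unit keys are assumptions.
# Each unit maps to (dimension, factor to that dimension's base unit).
UNITS = {
    "l": ("volume", 1.0),              # base unit for volume: liter
    "ml": ("volume", 0.001),
    "us_qt": ("volume", 0.946352946),  # US liquid quart, in liters
    "km": ("length", 1.0),             # base unit for length: kilometer
    "mi": ("length", 1.609344),        # international mile, in kilometers
}


def convert(value: float, src: str, dst: str) -> float:
    """Convert value from unit src to unit dst via the shared base unit."""
    src_dim, src_factor = UNITS[src]
    dst_dim, dst_factor = UNITS[dst]
    if src_dim != dst_dim:
        # Mixing dimensions (e.g. liters to kilometers) is rejected outright.
        raise ValueError(f"cannot convert {src_dim} to {dst_dim}")
    return value * src_factor / dst_factor
```

Even with something like this in place, the retrofit cost the commenter describes remains: every existing pairwise call has to be found and rewritten as, for example, `convert(x, "l", "ml")`.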

Yep. Re-use is difficult. When you overdo it you cause a whole new set of problems. I once watched a person write a Python decorator that tried to unify HTTP header based caching with ad hoc application caching done using memcached.

When I asked exactly what they were trying to accomplish, they kept saying they "didn't want caching in two places". I think anyone with experience can see that these items are unrelated beyond both having the word "cache".

This is what was actually hanging them up ... the language. Particularly the word "cache". I've seen this trap walked into over and over, where a person tries to "unify" logic simply because two sets of logic contain some of the same words.
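A small sketch of why those two things resist unification. Everything here is hypothetical: the names are invented and a plain dict stands in for memcached. HTTP header caching only tells browsers and proxies how long they may reuse a response, while application caching means the server itself stores and expires computed values. The two pieces share no inputs, outputs, or state.

```python
import time

# Hypothetical sketch: names are invented, and a dict stands in for memcached.


def http_cache_headers(max_age_seconds: int) -> dict:
    """HTTP caching: a hint to clients and proxies; nothing is stored here."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


class AppCache:
    """Application caching: the server stores computed values with a TTL."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested
        self._store = {}      # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]     # still fresh: reuse the stored value
        value = compute()     # missing or expired: recompute and store
        self._store[key] = (now + self._ttl, value)
        return value
```

A decorator that tried to cover both would have to return headers in one case and manage stored values in the other, which is roughly the mismatch described above.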

So much this. Especially early in a project's life, when you aren't sure what the contexts/data model/etc. really need to be, so much logic looks the same. It becomes so hard to untangle later.

Then you copy the function and modify it in one place? I don't get what is so hard about it. The IDE will even tell you all the places where the function is called, so there is no way to miss one.