Comment by anonyfox

5 years ago

> WET (Write everything twice), figure out the abstraction after you need something a 3rd time

So much this. it is _much_ easier to refactor copy pasta code, than to untangle a mess of "clean code abstractions" for things that aren't even needed _once_. Premature Abstraction is the biggest problem in my eyes.

Write Code. Mostly functions. Not too much.

I think where DRY trips people up is when you have what I call "incidental repetition". Basically, two bits of logic seem to do exactly the same thing, but the contexts are slightly different. So you make a nice abstraction that works well until you need to modify the functionality in one context and not the other...

  • If you mostly deduplicate by writing functions, fixing this problem is never very hard: duplicate the function, rename it and change the call-site.

    The interesting thing about DRY is that opinions about it seem to depend on what project you’ve worked on most recently: I inherited a codebase written by people skeptical of DRY, and we had a lot of bugs that resulted from “essential duplication”. Other people inherit code written by “architecture astronauts”, and assume that the grass is greener on the WET side of the fence.

    Personally, having been in both situations, I’d almost always prefer to untangle a bad abstraction rather than maintain a WET codebase.

    • Conversely, fixing duplication is never hard. Just move the duplicated code into a function. Going in reverse can be much tougher if the function has become an abstraction, where you have to figure out what path each function call was actually taking.

      Or, put another way: I'd much rather deal with duplication than with coupling problems.


  • Yep. Re-use is difficult. When you overdo it you cause a whole new set of problems. I once watched a person write a Python decorator that tried to unify HTTP header based caching with ad hoc application caching done using memcached.

    When I asked exactly what they were trying to accomplish, they kept saying they "didn't want caching in two places". I think anyone with experience can see that these items are unrelated beyond both having the word "cache".

    This is what was actually hanging them up ... the language. Particularly the word "cache". I've seen people walk into this trap over and over, trying to "unify" logic simply because two sets of logic contain some of the same words.

  • So much this. Especially early in a project's life, when you aren't sure what the contexts/data model/etc really need to be, so much logic looks the same. It becomes so hard to untangle later.

  • Then you copy the function and modify it in one place? I don't get what is so hard about it. The IDE will even tell you about all the places where the function is called, there is no way to miss one.
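To make the "duplicate the function, rename it and change the call-site" fix concrete, here is a minimal sketch. The function names and the fee rules are invented for illustration and are not taken from anyone's codebase in this thread.

```python
# Hypothetical example: all names and numbers here are made up.

def shipping_fee(weight_kg: float) -> float:
    """The shared helper both contexts used while their rules were identical."""
    return 5.0 + 1.5 * weight_kg


# Later, returns under 2 kg must become free while checkout stays the same.
# Instead of adding a flag to shipping_fee, copy it, rename both copies after
# their context, and change only the call site that needs the new rule.

def checkout_shipping_fee(weight_kg: float) -> float:
    return 5.0 + 1.5 * weight_kg


def return_shipping_fee(weight_kg: float) -> float:
    if weight_kg < 2.0:
        return 0.0
    return 5.0 + 1.5 * weight_kg


print(checkout_shipping_fee(1.0))  # 6.5
print(return_shipping_fee(1.0))    # 0.0
```

This stays cheap only while the shared thing is a plain function. Once callers depend on flags or subclass hooks, splitting it means working out which path each caller actually takes.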

> it is _much_ easier to refactor copy pasta code

So long as it remains identical. Refactoring almost identical code requires lots of extremely detailed staring to determine whether or not two things are subtly different. Especially if you don't have good test coverage to start with.

  • I personally love playing the game called "reconcile these very-important-but-utterly-without-tests sequences of gnarly regexes that were years ago copy-pasted in seven places and since then refactored, but only in three of the seven places, and in separate efforts each time".
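One way to cut down on the "detailed staring" is to let a diff tool point at the drift before anything gets merged. The sketch below uses Python's standard `difflib`; the two `normalize` copies are hypothetical stand-ins for copy-pasted code that was later changed in only one place.

```python
import difflib

# Two hypothetical copies of the "same" helper, stored as lists of source lines.
copy_a = [
    "def normalize(s):",
    "    s = s.strip()",
    "    return s.lower()",
]
copy_b = [
    "def normalize(s):",
    "    s = s.strip()",
    "    return s.casefold()",
]

# ndiff prefixes lines unique to the first input with "- " and lines unique
# to the second with "+ "; everything else is common text or hint lines.
changed = [
    line
    for line in difflib.ndiff(copy_a, copy_b)
    if line.startswith(("- ", "+ "))
]

for line in changed:
    print(line)
```

A diff only shows where the copies differ. Whether the difference is a bug fix that never got propagated or an intentional change still needs tests or a human to decide.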

There's a problem with being overly zealous. It's entirely possible to write bad code either way, by being overly DRY or by copy-pasting galore. I think we are prone to these zealous rules because they are concrete. We want an "objective" measure to judge whether something is good or not.

DRY and WET are terms often used as objective measures of implementations, but that doesn't mean that they are rock solid foundations. What does it mean for something to be "repeated"? Without claiming to have TheGreatAnswer™, some things come to mind.

Chaining methods can be very expressive, easy to follow and maintain. They also lead to a lot of repetition. In an effort to be "DRY", some might embark on a misguided effort to combine them. Maybe start replacing

  `map(x => x).reduce((y, z) => v)`

with

  `mapReduce(x => x, (y,z) => v)`

This would be a bad idea, also known as Suck™.
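A rough Python rendering of the same point, as a sketch only: `map_reduce` is a hypothetical wrapper, not a standard library function. It takes exactly the pieces the chained version takes, so it adds a name to learn without removing any work from the call site.

```python
from functools import reduce

prices = [3, 4, 5]

# Chained version: each step is a well-known building block.
doubled = map(lambda x: x * 2, prices)
total = reduce(lambda acc, x: acc + x, doubled, 0)


# "DRY" version: a hypothetical wrapper that only forwards its arguments.
def map_reduce(map_fn, reduce_fn, items, initial):
    return reduce(reduce_fn, map(map_fn, items), initial)


total_wrapped = map_reduce(
    lambda x: x * 2,
    lambda acc, x: acc + x,
    prices,
    0,
)

print(total, total_wrapped)  # 24 24
```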

But there may equally be situations where consolidation makes sense. For example, if we're in an ORM helper class and we're always querying the database for an object like so

  `objectContext.Orders.Where(e => e.id == y).Include(e => e.Customers).Include(e => e.Bills).Include(e => e.AwesomeDogs)...`

then it would make sense to consolidate that into

  `orderIncludingCustomersBillsAndDogs(id) => ...`
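As a sketch of that kind of consolidation, here is a self-contained Python version. It swaps the ORM for the standard `sqlite3` module so it can run as-is, and the schema, table names and `order_with_customer_and_bills` helper are all assumptions made up for the example.

```python
import sqlite3


def order_with_customer_and_bills(conn, order_id):
    """One named home for a lookup that every call site would otherwise repeat."""
    row = conn.execute(
        "SELECT o.id, c.name FROM orders o "
        "JOIN customers c ON c.id = o.customer_id WHERE o.id = ?",
        (order_id,),
    ).fetchone()
    if row is None:
        return None
    bills = [
        amount
        for (amount,) in conn.execute(
            "SELECT amount FROM bills WHERE order_id = ? ORDER BY id",
            (order_id,),
        )
    ]
    return {"id": row[0], "customer": row[1], "bills": bills}


# Hypothetical in-memory schema and data, only here to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE bills (id INTEGER PRIMARY KEY, order_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO bills VALUES (100, 10, 25.0), (101, 10, 5.5);
    """
)

order = order_with_customer_and_bills(conn, 10)
print(order)
```

The repetition here is essential rather than incidental: every caller wants the same order with the same related rows, so a change to what "a full order" means should land in one place.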

My $0.02:

Don't needlessly copy-paste that which is abstractable.

Don't over-abstract at the cost of simplicity and flexibility.

Don't be a zealot.

>it is _much_ easier to refactor copy pasta code

I totally agree, assuming that there will be time to get to the second pass of the "write everything twice" approach... Some of my least favorite refactoring work has been on older code that was liberally copy-pasted by well-intentioned developers who expected a chance to come back through later but never got it. All too often the winds of corporate decision making will change and send attention elsewhere at the wrong moment, and all those copy-pasted bits will slowly but surely drift apart as unfamiliar new developers come through making small tweaks.

I worked on a small team with a very "code bro" culture. Not toxic, but definitely making non-PC jokes. We would often say "Ask your doctor about Premature Abstractuation" or "Bad news, dr. says this code has abstractual dysfunction" in code reviews when someone would build an AbstractFactoryFactoryTemplateConstructor for a one-off item.

When we got absorbed by a larger team and were going to have to "merge" our code review / git history into a larger org's repos, we learned that a sister team had gotten in big trouble with the language cops in HR when they discovered similar language in their git commit history. This brings back memories of my team panicked over trying to rewrite a huge amount of git history and code review stuff to sanitize our language before we were caught too.

  • Wait whaat? I am not from USA and hail from a much more blunt culture, what's wrong with those 2 statements?

    To me they're harmless jokes but maybe someone will point out they're ableist?

    • Those aren't particularly bad, but dick jokes have no place in a professional workplace imo. I'm so very weary of it. And yes, in the past I was part of the problem. Then I grew up.

    • It wasn't just these phrases, but we took them out to be safe (thinking that them being jokes about male... "performance", like from pharma ads on TV with an old man throwing a football through a tire swing).

      The biggest thing is that we used non-PC language like "retarded" very casually, not to mention casually swearing in commit messages e.g. "un-fucking up my prior commit". Our sister team got in trouble for "swears in the git commit history", so we wanted to get ahead of that if possible.

      In a healthy company culture, we'd just say "okay we'll stop using these terms", but the effort was made to erase their existence because this was a company where non-engineering factors (e.g. how well managers and HR liked you) played a big part in getting promoted.

      Once I realized how messed up that whole situation was, I left as fast as I could.

  • If you were "caught", what would happen? You probably have a couple of zoom calls and get forced to watch sensitivity training videos. Who cares.

> it is _much_ easier to refactor copy pasta code,

It's easy to refactor if it's nondivergent copypasta and you do it everywhere it is used, no later than the third iteration.

If the refactoring gets delayed, the code diverges because different bugs are noticed and fixed (or the same bug is noticed and fixed in different ways) in different iterations. By then there are dozens of instances across the code base (possibly in different projects, because it was copypasted across projects rather than refactored into a reusable library), and the code has in many cases gotten intermixed with code addressing other concerns...

> Write Code. Mostly functions. Not too much.

Think about data structures (types) first. Mostly immutable structures. Then add your functions working on those structures. Not too many.

  • OMG. This is exactly my experience after trying to write code first for 10+ years. (Yes, I am a terrible [car] driver, and a totally average programmer!)

    "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." - Linus Torvalds

    He wasn't kidding!

    And the bit about "immutable structures". I doubted for infinity-number-of-years ("oh, waste of memory/malloc!"). Then suddenly your code needs to be multi-threaded. Now, immutable structures look genius!
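A minimal sketch of "types first, mostly immutable, then a few functions", using Python's frozen dataclasses. The `LineItem` and `Order` types and both functions are hypothetical, chosen only to illustrate the shape.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Tuple


# Data structures first: frozen dataclasses reject attribute assignment.
@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass(frozen=True)
class Order:
    items: Tuple[LineItem, ...]  # a tuple, so the collection is immutable too


# Then a small number of functions that work on those structures.
def order_total_cents(order: Order) -> int:
    return sum(item.quantity * item.unit_price_cents for item in order.items)


def with_quantity(item: LineItem, quantity: int) -> LineItem:
    """Return a changed copy; the original item is left untouched."""
    return replace(item, quantity=quantity)


item = LineItem(sku="A", quantity=2, unit_price_cents=150)
bigger = with_quantity(item, 3)
print(item.quantity, bigger.quantity)                    # 2 3
print(order_total_cents(Order(items=(item, bigger))))    # 750
```

Because nothing can be modified in place, values like these can be handed to other threads without locks, which is the multi-threading payoff described above. The cost is an extra allocation per change.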