Comment by twic
5 years ago
I don't hate the rule of 3. But i think it's missing the point.
You want to extract common code if it's the same now, and will always be the same in the future. If it's not going to be the same and you extract it, you now have the pain of making it do two things, or splitting. But if it is going to be the same and you don't extract it, you have the risk of only updating one copy, and then having the other copy do the wrong thing.
For example, i have a program where one component gets data and writes it to files of a certain format in a certain directory, and another component reads those files and processes the data. The code for deciding where the directory is, and what the columns in the files are, must be the same, otherwise the programs cannot do their job. Even though there are only two uses of that code, it makes sense to extract them.
Once you think about it this way, you see that extraction also serves a documentation function. It says that the two call sites of the shared code are related to each other in some fundamental way.
Taking this approach, i might even extract code that is only used once! In my example, if the files contain dates, or other structured data, then it makes sense to have the matching formatting and parsing functions extracted and placed right next to each other, to highlight the fact that they are intimately related.
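A minimal sketch of what that shared code could look like in Python. Everything here is made up for illustration (the directory, the column names, the date format and the function names are assumptions, not details of the actual program); the point is only that the writer and the reader both import the same definitions, and that each formatter sits right next to its parser.

```python
from datetime import date, datetime
from pathlib import Path

# Hypothetical shared definitions: both the writer and the reader import these,
# so the location and the file layout cannot drift apart.
DATA_DIR = Path("data") / "exports"
COLUMNS = ("recorded_on", "sensor", "value")
DATE_FORMAT = "%Y%m%d"


def format_date(d: date) -> str:
    """Turn a date into the text stored in the file."""
    return d.strftime(DATE_FORMAT)


def parse_date(text: str) -> date:
    """Inverse of format_date; kept adjacent so the pair is changed together."""
    return datetime.strptime(text, DATE_FORMAT).date()


def format_row(recorded_on: date, sensor: str, value: float) -> str:
    """Writer side: one tab-separated line, fields in COLUMNS order."""
    return "\t".join([format_date(recorded_on), sensor, str(value)])


def parse_row(line: str) -> dict:
    """Reader side: inverse of format_row."""
    fields = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    fields["recorded_on"] = parse_date(fields["recorded_on"])
    fields["value"] = float(fields["value"])
    return fields
```

Even if format_date and parse_date each had a single caller, keeping them side by side makes it hard to change one without noticing the other.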
> You want to extract common code if it's the same now, and will always be the same in the future.
I suppose I take that as a presumption before the Rule of 3 applies. I generally take for granted that all "exact duplicates" that "will always be the same in future" are going to be a single shared function anyway. The duplication I'm concerned about, where I think the Rule of 3 comes into play, is the duplicated-but-diverging kind. ("I need to do this thing like X does it, but…")
If it's a simple divergence, you can sometimes add a flag, but the Rule of 3 suggests that sometimes duplicating the code and letting it diverge that second time "is just fine" (avoiding potential "flag soup") until you have a better handle on why it is diverging and on what abstraction you might be missing in this code.
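To make the trade-off concrete, here is a small Python sketch with invented names (render_flagged, render_csv and render_email are hypothetical, not from any real codebase). The first function absorbs the second use case behind a flag; the other two are the "duplicate and diverge" alternative.

```python
def render_flagged(rows, *, email=False):
    """One function, two behaviours: each new variant adds another flag."""
    lines = [] if email else ["name,total"]
    separator = ": " if email else ","
    for name, total in rows:
        lines.append(f"{name}{separator}{total:.2f}")
    return "\n".join(lines)


def render_csv(rows):
    """Duplicated variant 1: CSV with a header line."""
    lines = ["name,total"]
    for name, total in rows:
        lines.append(f"{name},{total:.2f}")
    return "\n".join(lines)


def render_email(rows):
    """Duplicated variant 2: free to diverge without touching render_csv."""
    return "\n".join(f"{name}: {total:.2f}" for name, total in rows)
```

With one flag the first version is still readable. The worry is the third and fourth variant, where each new flag interacts with the others, while the duplicated versions stay simple until the shared pattern becomes clear.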
The rule of three is a guideline or principle, not a strict rule. There's nothing about it that misses the point. If, from your experience and judgement, the code can be reused, reuse it. Don't duplicate it (copy/paste or write it a second time). If, from your experience and judgement, it oughtn't be reused, but you later see that you were wrong, refactor.
In your example, it's about modularity. The directory logic makes sense as its own module. If you wrote the code that way from the start, and had already decoupled it from the writer, then reuse is obvious. But if the code were tightly coupled (embedded in some fashion) within the writer, then rewriting it would be the obvious step, because reuse wouldn't be practical without refactoring. And unless you can already see how to refactor it, writing it the second time (or third) can help you discover the actual structure you want/need.
As people become more experienced programmers, they (the good ones at least) tend to use modular designs and keep things decoupled, which promotes reuse over copy/paste. In that case, the rule of three gets used less often by them because they have fewer occurrences of real duplication.
I think the point you and a lot of other commenters make is that applying hard and fast rules without referring to context is simply wrong. Surely if all we had to do was apply the rules, somebody would have long ago written a program to write programs. ;-)
You have a point for extracting exact duplicates that you know will remain the same.
But the point of the rule of 3 remains. Humans do a horrible job of abstracting from one or two examples, and the act of encountering an abstraction makes code harder to understand.