Comment by narnarpapadaddy

9 months ago

Implicitly, IIRC, the optimal ratio is 5-20:1. Your interface must cover 5-20 cases for it to have value. Any fewer, and the additional abstraction is unneeded complexity. Any more, and your abstraction is likely too broad to be useful/understandable. The specific example he gives considers the number of subclasses in a hierarchy.

It’s like a secret unlock code for domain modeling. Or deciding how long functions should be (5-20 lines, with exceptions).

I agree, hugely useful principle.

This is a good rule of thumb, but what would be a good response to the argument for adding interfaces because "what if a new scenario comes up in the future"?

  • The scenario NEVER comes up in the future the way it was originally expected. You'll end up having to remove and refactor a lot of code. Abstractions are useful only when used sparingly and when they don't try to handle something that doesn't even exist yet.

  • When doing the initial design start in the middle of the complexity to abstraction budget. If you have 100 “units of complexity” (lines of code, conditions, states, classes, use cases, whatever) try to find 10 subdivisions of 10 units each. Rarely, you’ll have a one-off. Sometimes, you’ll end up with more than 20 in a group. Mostly, you should have 5-20 groups of 5-20 units.

    If you start there, you have room for your abstraction to bend before it becomes too brittle and you need to refactor.

    Almost never is an interface worth it for 1 implementation, sometimes for 3, often for 5-20, sometimes for >20.

    The trick is recognizing both a “unit of complexity” and how many “units” a given abstraction covers. And, of course, different units might be in tension and you have to make a judgement call. It’s not a silver bullet. Just a useful (for me at least) framing for thinking about how to manage complexity.

    • Even one use case may be enough e.g., if one class accepts another then a protocol (using Python parlance) SupportsSomething could be used to decouple two classes, to carve out the exact boundary. The protocol may be used for creating a test double (a fake) too.

  • If you own the code base, refactor. It's true that, if you're offering a stable interface to users whose code you can't edit, you need to plan carefully for backward compatibility.

  • "We'll extract interfaces as and when we need them - and when we know what the requirements are we'll be more able to design interfaces that fit them. Extracting them now is premature, unless we really don't have any other feature work to be doing?"
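The Python protocol suggestion in the replies above (a `SupportsSomething`-style protocol that decouples two classes and doubles as the seam for a fake) might look roughly like the sketch below. All names here (`SupportsSend`, `SmtpTransport`, `FakeTransport`, `Notifier`) are hypothetical and chosen only for illustration; the point is that `Notifier` depends on the one method it needs rather than on a concrete class.

```python
from typing import Protocol


class SupportsSend(Protocol):
    """Hypothetical protocol: the only thing Notifier needs from a transport."""

    def send(self, recipient: str, body: str) -> None: ...


class SmtpTransport:
    """Stand-in for a real implementation; here it only prints."""

    def send(self, recipient: str, body: str) -> None:
        print(f"SMTP -> {recipient}: {body}")


class FakeTransport:
    """Test double: records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))


class Notifier:
    """Depends on the protocol, not on any concrete transport class."""

    def __init__(self, transport: SupportsSend) -> None:
        self._transport = transport

    def welcome(self, user: str) -> None:
        self._transport.send(user, f"Welcome, {user}!")


# Neither transport inherits from SupportsSend; matching the method is enough.
fake = FakeTransport()
Notifier(fake).welcome("ada")
```

Even with a single production implementation, the protocol marks the exact boundary between the two classes, which is the one-use-case scenario that reply describes.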

Maybe some examples would clarify your intent, because all the candidate interpretations I can think of are absurd.

The sin() function in the C standard library covers 2⁶⁴ cases, because it takes one argument which is, on most platforms, 64 bits. Are you suggesting that it should be separated into 2⁶⁰ separate functions?

Are you saying you should pass in boolean and enum parameters to tell a subroutine or class which of your 5–20 use cases the caller needs? I couldn't disagree more. Make them separate subroutines or classes.
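To make that contrast concrete, here is a small sketch with hypothetical names (`Shape`, `area`, `circle_area`, `square_area`). The first function takes an enum that tells it which case the caller wants; the second pair does the same work as separate subroutines, which is the alternative being argued for.

```python
import math
from enum import Enum


class Shape(Enum):
    CIRCLE = "circle"
    SQUARE = "square"


# Flag-driven version: one entry point, and the caller selects the case.
def area(shape: Shape, size: float) -> float:
    if shape is Shape.CIRCLE:
        return math.pi * size * size
    if shape is Shape.SQUARE:
        return size * size
    raise ValueError(f"unsupported shape: {shape}")


# Separate subroutines: each handles exactly one case, with no dispatch
# and a parameter name that says what the number means.
def circle_area(radius: float) -> float:
    return math.pi * radius * radius


def square_area(side: float) -> float:
    return side * side
```

In the flag-driven version every new case adds a branch and the meaning of `size` shifts with the flag; in the second version each function stays a single case.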

If you have 5–20 lines of code in a subroutine, but no conditionals or possibly-zero-iteration loops, those lines of code are all the same case. The subroutine doesn't run some of them in some cases and others in other cases.

  • That function covers 2⁶⁴ inputs, not cases. It handles only one case: converting an angular value to (half of) a cartesian coordinate.

    • Sounds like you haven't ever tried to implement it. But if the "case" you're thinking of is the "case" narnarpapadaddy was referring to, that takes us to their clause, "Any fewer [cases], the additional abstraction is unneeded complexity." This is obviously absurd when we're talking about the sin() function. Therefore, that can't possibly have been their intended meaning.

  • Think of it more like a “complexity distribution.”

    Rarely, a function with a single line or an interface with a single element or a class hierarchy with a single parent and child is useful. Mostly, that abstraction is overhead.

    Often, a function with 5-20 lines or an interface with 5-20 members or a class hierarchy with 5-20 children is a useful abstraction. That’s the sweet spot between too broad (function “doStuff”) and too narrow (function “callMomOnTheLandLine”).

    Sometimes, any of the above with a >20:1 complexity ratio is useful.

    It’s not a hard and fast rule. If your complexity ratio falls outside that range, think twice about your abstraction.

    • And with respect to function behavior, I’d view it through the lens of cyclomatic complexity.

      Do I need 5-20 non-trivial test cases to cover the range of inputs this function accepts?

      If yes, the function is probably at about the right level of behavioral complexity to add value and not overhead.

      If I need only 1 test, it’s probably doing too little; if I need 200 tests, it’s probably doing too much.
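As a rough illustration of the test-count heuristic above, consider a hypothetical `shipping_cost` function (the name, rates, and thresholds are all invented for this sketch). It has four decision points, and covering its branches and boundaries takes a handful of non-trivial cases, which puts it near the low end of the 5-20 range.

```python
def shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical pricing rule with three weight bands and an express option."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= 1:
        base = 5.0
    elif weight_kg <= 10:
        base = 5.0 + 2.0 * (weight_kg - 1)
    else:
        base = 23.0 + 1.5 * (weight_kg - 10)
    return base * 2 if express else base


# One case per branch or boundary, plus the express option.
CASES = [
    ((0.5, False), 5.0),    # lightest band
    ((1.0, False), 5.0),    # boundary of the lightest band
    ((4.0, False), 11.0),   # middle band
    ((10.0, False), 23.0),  # boundary of the middle band
    ((12.0, False), 26.0),  # heaviest band
    ((4.0, True), 22.0),    # express doubles the base price
]

for args, expected in CASES:
    assert shipping_cost(*args) == expected

# The invalid-input branch is the seventh case.
try:
    shipping_cost(0, False)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for non-positive weight")
```

A function needing only one such case would likely be a thin wrapper, and one needing hundreds would likely be bundling several responsibilities.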
