Comment by fouric

5 years ago

>> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.

> This isn't always true in my experience. Often when I need to dig into the details of a function it's because how it works is more important than what it says it's doing. There are implementation concerns you can't fit into a function name.

Both of these things are not quite right. Yes, if you have to dig into the details of a function to understand what it does, it hasn't been explained well enough. No, the prototype cannot contain enough information to explain it. No, you shouldn't look at the implementation either - that leads to brittle code where you start to rely on the implementation behavior of a function that isn't part of the interface.

The interface and implementation of a function are separate. The former should be clearly-documented - a descriptive name is good, but you'll almost always also need docstrings/comments/other documentation - while you should rarely rely on details of the latter, because if you are, that usually means that the interface isn't defined clearly enough and/or the abstraction boundaries are in the wrong places (modulo things like looking under the hood to refactor, improve performance, etc - all abstractions are somewhat leaky, but you shouldn't be piercing them regularly).

> If you define complexity by the sum total of possible interactions (which is itself a problematic definition, but I'll talk about that below), then complexity always increases when you factor out functions, because the interfaces, error-handling, and boilerplate code around those functions increases the number of possible interactions happening during your function call.

This - this is what everyone who advocates for "small functions" doesn't understand.

1 comment

fouric

danShumway 5 years ago

> all abstractions are somewhat leaky, but you shouldn't be piercing them regularly).

I think this gets back to the old problem of "documentation is code that doesn't run." I'm not saying get rid of documentation -- I comment my code to an almost excessive degree, because I need to be able to remember in the future why I made certain decisions, I need to know what the list of tradeoffs were that went into a decision, I need to know if there are any potential bugs or edge-cases that I haven't tested for yet.

But what I am saying is that it is uncommon for a interface to be perfectly documented -- not just in code I write, but especially in 3rd-party libraries. It's not super-rare for me to need to dip into library source code to figure out behaviors that they haven't documented, or interfaces that changed between versions and aren't described anywhere. People struggle with good documentation.

Sometimes that's performance: if a 3rd-party library is slow, sometimes it's because of how it's implemented. I've run into that with d3 addons in the past, where changing how my data is formatted results in large performance gains, and only the implementation logic revealed that to me. Is that a leaky abstraction? Sure, I suppose, but it doesn't seem to be uncommon. Is it fragile? Sure, a bit, but I can't release charts that drop frames whenever they zoom just because I refuse to pay attention to the implementation code.

So I get what you're saying, but to me "abstractions shouldn't be leaking" is a bit like saying "code shouldn't have bugs", or "minor semvar increases should have no breaking changes." I completely agree, but... it does, and they do. Relying on undocumented behavior is a problem, but sometimes documented behavior diverges from implementation. Sometimes the abstractions are so leaky that you don't have a choice.

And that's not just a problem with 3rd-party code, because I'm also not a perfect programmer, and sometimes my own documentation on internal methods diverges from my implementation. I try very hard not to have that happen, but I also try hard to compensate for the fact that I'm a human being who makes mistakes. I try to build systems that are less work to maintain and less prone to having their documentation decay over time. I've found that in code that I'm personally writing, it can be useful to sidestep the entire problem and inline the entire abstraction. Then I don't have to worry about fragility at all.

If you're not introducing a 3rd-party library or a separate interface for every measly 50 lines of code, and instead you just embed your single-use chunk of logic into the original function you want to call it in, then you never have to worry about whether the abstraction is leaky. That can have a tangible effect on the maintainability of your program, because it reduces the number of opportunities you have to mess up an interface or its documentation.

For perfect abstractions, I agree with you. I'm not saying get rid of all abstractions. I just think that perfect abstractions are more difficult and rarer than people suppose, and sometimes for some kinds of logic, a perfect abstraction might not exist at all.