Comment by jghn
5 years ago
I have found this to be one of those A-or-B developer personas that are hard for someone to change, and it causes a lot of disagreement. I personally agree 100%, but I've known other people who couldn't disagree more; it is what it is.
I've always felt it has a strong correlation with top-down vs. bottom-up thinkers in terms of software design. The top-down folks tend to agree with your stance and the bottom-up group do not. If you naturally want to understand all of the nitty-gritty details, you want to be able to wrap your head around those as quickly as possible. If you're willing to think in terms of the abstractions, you want to remove as many of those details from sight as possible to reduce visual noise.
I wish there was an "auto-flattener"/"auto-inliner" tool that would allow you to automagically turn code that was written top-down, with lots of nice high-level abstractions, into equivalent code with all the actions mushed together and the infrastructure layers peeled away as much as possible.
Have you ever seen a codebase with infrastructure and piping taking up about 70% of the code, with tiny pieces of business logic thrown in here and there? It's impossible to figure out where the actual job is being done (and what it actually is): all you can see is an endless chain of methods that mostly just delegate responsibility further and further. What could've been a 100-line loop of the "foreach item in worklist, do A, B, C" kind is instead split over seven tightly cooperating classes that devote 45% of their code to multiplexing/load-balancing/messaging/job-spooling/etc., another 45% to building trivial auxiliary structures and instantiating each other, and only 10% to the actual data processing. And good luck finding that 10%, because there is a never-ending chain of calls: A.do_work() calls B.process_item() which calls A.on_item_processing() which calls B.on_processed()... wait, shouldn't there have been some work done between "on_item_processing" and "on_processed"? Yes, it was done by an inconspicuously named "prepare_next_worklist_item" function.
Ah, and the icing on the cake: the looping is actually done from the very bottom of this call chain, by a recursive call to the top-most method, which at this point is about 20 stack frames above the current one. Just so you can walk down this path again, now with feeling.
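A minimal sketch of the shape being described, with invented names (not from any real codebase), next to the direct loop it could have been:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // Hypothetical sketch: A and B do nothing but hand the work back and forth,
    // and the "loop" is a recursive call from the bottom of the chain back to the top.
    public class WorklistDemo {
        static class A {
            B b;
            void do_work() {                                  // top of the chain
                if (!b.worklist.isEmpty()) b.process_item();
            }
            void on_item_processing(String item) {
                b.prepare_next_worklist_item(item);           // the only real work in the chain
                b.on_processed();
            }
        }

        static class B {
            final Deque<String> worklist = new ArrayDeque<>();
            A a;
            void process_item() { a.on_item_processing(worklist.poll()); }
            void prepare_next_worklist_item(String item) { System.out.println("processed " + item); }
            void on_processed() { a.do_work(); }              // recurse back up to "loop"
        }

        // The same job written as the plain loop the comment wishes for.
        static void direct(Iterable<String> worklist) {
            for (String item : worklist) System.out.println("processed " + item);
        }

        public static void main(String[] args) {
            A a = new A(); B b = new B();
            a.b = b; b.a = a;
            b.worklist.addAll(List.of("x", "y"));
            a.do_work();                                      // delegation-chain version
            direct(List.of("x", "y"));                        // direct version
        }
    }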
Your comment gives me emotional flashbacks. Years ago I took Java off my resume because I never want to interact with this sort of thing again. (I’m sure it exists in other languages, but I’ve never seen it quite as bad as in Java.)
I think the best “clean code” programming advice is the advice writers have been giving for centuries. Find your voice. Be direct and be brief. But not too brief. Programming is a form of expression. Step 1 is to figure out what you’re trying to say (e.g. the business logic). Then say it in its most natural form (a switch statement? an if-else chain? whatever). Then write the simplest scaffold around it you can, so it gets called with the data it needs.
The 0th step is stepping away from your computer and naming what you want your program to express in the first place. I like to go for walks. Clear code is an expression of clear thoughts. You’ll usually know when you’ve found it because it will seem obvious. “Oh yeah, this code is just X. Now I just have to type it up.”
>I wish there was an "auto-flattener"/"auto-inliner" tool
I'm as big an advocate of "top-down" design as anyone, and I have also wished for such a tool. When you just want to know "what behavior comes next", all the abstractions do get in the way. The IDE should be able to "flatten" the execution path from current context and give you a linear view of the code. Sort of like a trace of a debug session, but generated on-the-fly. But still, I don't think this is the best way to write code.
Most editors have code folding. I've noticed this helps when there are comments or when it's easy to figure out the branching or whatnot.
However, what you're asking for is, I think, a design style that's hard to implement without language tooling (for example, identifying effectful methods).
GP is asking for the opposite. They're asking for code unfolding.
That is, given "clean code" like:
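    // Hypothetical caller sketch; HandleRequest, Request and Sth are invented,
    // only the three helper names are taken from the description below.
    void HandleRequest(Request req) {
        Sth sth = ProcessSth(req);
        ValidateSthElse(sth);
        DoSth(sth);
    }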
The tool would inline all the function calls. That is, for each of ProcessSth(), ValidateSthElse() and DoSth(), it would automatically perform the task of "copy the function body, paste it at the call site, and massage the caller to make it work". It's sometimes called the "inline function" refactoring - the inverse of "extract function"/"extract method" refactoring.
I'd really, really want such a tool. Particularly one where the changes were transient - not modifying the source code, just overlaying it with a read-only replacement. Also interactive. My example session is:
- Take the "clean code" function that just calls a bunch of other functions. With one key combination, inline all these functions.
- In the read-only inlined overlay, mark some other function calls and inline them too.
- Rinse, repeat, until I can read the overlay top-to-bottom and understand what the code is actually doing.
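For example, a hypothetical overlay for the caller sketched above, assuming trivial (invented) bodies for the three helpers, might read:

    // Read-only overlay: the three calls inlined in place (all bodies invented).
    void HandleRequest(Request req) {
        // inlined from ProcessSth(req)
        Sth sth = new Sth(req.payload());
        // inlined from ValidateSthElse(sth)
        if (sth.isEmpty()) throw new IllegalArgumentException("nothing to do");
        // inlined from DoSth(sth)
        repository.save(sth);
    }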
> I wish there was an "auto-flattener"/"auto-inliner" tool that would allow you to automagically turn code that was written top-down, with lots of nice high-level abstractions, into equivalent code with all the actions mushed together and the infrastructure layers peeled away as much as possible.
Learn to read assembly and knock yourself out.
That's not a very helpful response. Unless the code is compiled to native machine code and is all inlined, this won't help one bit.
While I think you are onto something about top-down vs. bottom-up thinkers, one of the issues with a large codebase is that literally nobody can do the whole thing bottom-up. So you need some reasonable conventions and abstractions, or the whole thing falls apart under its own weight.
Yep, absolutely.
That's another aspect of my grand unifying theory of developers. Those same personas seem to have correlations in other ways: dynamic vs. static typing, languages, monolith vs. microservices. How one perceives complexity, what causes one to complain about complexity, etc. all vary based on these things. It's easy to arrive in circumstances where people are arguing past each other.
If you need to be able to keep all the details in your head, you're going to need smaller codebases. Similarly, if you're already keeping track of everything, things like static typing become less important to you. And the opposite is true.
> Those same personas seem to have correlations in other ways: dynamic vs. static typing, languages, monolith vs. microservices.
Your theory needs to account for progression over time. For example, the first programming languages I learned were C++ and Java, so I believed in static typing. Then I worked a lot in PHP, Erlang and Lisp, and became a dynamic typing proponent. Later on, with much more experience behind me, I became a static typing fan again - to the point that my Common Lisp code is thoroughly typed (to the point of being non-idiomatic), and I wish C++'s type system were more expressive.
Curiously, at every point of this journey, I was really sure I had it all figured out, and that the kind of typing I liked was the best way to manage complexity.
--
EDIT: your hypothesis about correlated "frames of mind" reminds me of a discussion I had with 'adnzzzzZ here, who claimed something similar but broader: https://github.com/a327ex/blog/issues/66 - it also touched on the static/dynamic typing debate.
Huh. There's something to this.
I've often wondered why certain people feel so attached to static typing when in my experience it's rarely the primary source of bugs in any of the codebases I work with.
But it's true, I do generally feel like a codebase that's so complex or fractured that no one can understand any sizable chunk of it is just already going to be a disaster regardless of what kind of typing it uses. I don't hate microservices, they're often the right decision, but I feel they're almost always more complicated than a monolith would be. And I do regularly end up just reading implementation code, even in 3rd-party libraries that I use. In fact in some libraries, sometimes reading the source is quicker and more reliable than trying to find the relevant documentation.
I wouldn't extrapolate too much based on that, but it's interesting to hear someone make those connections.
I’m reminded of an earlier HN discussion about an article called The Wrong Abstraction, where I argued¹ that abstractions have both a benefit and a cost, and that their ratio may change as a program evolves: which of those “nitty gritty details” are immediately relevant, and which can helpfully be hidden behind abstractions, shifts over time.
¹ https://news.ycombinator.com/item?id=23742118
The point is that bottom-up code is a siren song. It never scales. It makes it a lot easier to get started, but given enough complexity it inevitably breaks down.
Once your codebase gets to somewhere around the 10,000-line mark, it becomes impossible for a single mind to hold the entire program in their head at once. The only way to survive past that point is with carefully thought-out, watertight layers of abstraction. That almost never happens with bottom-up. Bottom-up is a lot like natural selection: you get a lot of kludges that work great for solving their immediate problem, but behave in undefined and unpredictable ways when you extend them outside their original environment.
Bottom-up can work when you're inside well-encapsulated modular components with bounded scope and size. But there's no way to keep those modules loosely coupled unless you have an elegant top-down architecture imposing order on the large-scale structure.
But the reverse is also true. Top-down programming doesn't work well for smaller programs, and it definitely doesn't work well when you're dealing with small, highly performance-critical or complex tasks.
So sure, I'll grant that when your program reaches the 10,000 line mark, you need to have some serious abstractions. I'll even give you that you might need to start abstracting things when a file reaches 1,000 lines.
But when we start talking about the rule of 30 -- that's not managing complexity, that's alphabetizing a sock drawer and sewing little permanent labels on each sock. That approach also doesn't scale to large programs, because it turns rewrites and refactors into hell and makes new features extremely cumbersome to iterate on quickly. Your 10,000-line program becomes 20,000 lines because you're throwing interfaces and boilerplate all over the place.
Note that this isn't theoretical: I have worked on programs that did everything from building an abstraction layer over the database in case we wanted to use Mongo and SQL at the same time (we didn't), to having a dependency management system that meant we had to edit 5 files every time we wanted to add a new class, to having a page-lifecycle framework so complicated that half of our internal support requests were about figuring out when it was safe to start adding customer data to the page.
The benefit of a good, long, single-purpose function that contains all of its logic in one place is that you know exactly what the dependencies are, you know exactly what the function is doing, you know that no one else is calling into the inlined logic you're editing, and you can easily move that code around and change it without worrying about updating names or changing interfaces.
Abstract your code, but abstract your code when or shortly before you hit complexity barriers and after you have enough knowledge to make informed decisions about which abstractions will be helpful -- don't create a brand new interface every time you write a single function. It's fine to have a function that's longer than a couple hundred lines. If you're building something like a rendering or update loop, in many cases I would say it's preferable.
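As a rough illustration (all names invented), a rendering/update loop kept as one direct, single-purpose function might look like this; the frame's steps read top to bottom and the inlined logic has exactly one caller:

    import java.util.List;

    // Sketch only: minimal types so the loop reads in one place.
    interface Entity {
        void applyGravity(double dt);
        void integrateVelocity(double dt);
    }

    interface World {
        List<Entity> entities();
        void resolveCollisions();
        void expireDeadEntities();
    }

    class GameLoop {
        // One direct update function: every step of the frame is visible in order.
        void update(World world, double dt) {
            for (Entity e : world.entities()) {
                e.applyGravity(dt);
                e.integrateVelocity(dt);
            }
            world.resolveCollisions();
            world.expireDeadEntities();
        }
    }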
It's funny how these things are literally what the Clean Code book advocates. Sure, there's mention of a lot of stuff that's no longer needed and was a band-aid over the deficiencies of a particular language. But the ideas are timeless; I used them before I even knew the book, and I used them in Perl.
As mainly a bottom-up person, I completely agree with your analysis but I wonder if you might be using "top-down architecture" here in an overloaded way?
My personal style is bottom up, maximally direct code, aiming for monolithic modules under 10kloc, combined with module coupling over very narrow interfaces. Generally the narrow interfaces emerge from finding the "natural grain" of the module after writing it, not from some a priori top-down idea of how the communication pathways should be shaped.
Edit: an example of a narrow interface might be having a 10kloc quantitative trading strategy module that communicates with some larger system only by reading off a queue of things that might need to be traded, and writing to a queue of desired actions.
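A minimal sketch of that shape (types invented for illustration): the strategy module's entire surface area is the two queues it is handed.

    import java.util.Optional;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical narrow interface: candidates in, desired actions out.
    record Candidate(String symbol, double price) {}
    record Order(String symbol, int quantity) {}

    class Strategy implements Runnable {
        private final BlockingQueue<Candidate> candidates; // things that might need trading
        private final BlockingQueue<Order> actions;        // desired actions

        Strategy(BlockingQueue<Candidate> in, BlockingQueue<Order> out) {
            this.candidates = in;
            this.actions = out;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    Candidate c = candidates.take();
                    decide(c).ifPresent(actions::add);      // the 10kloc of strategy hide behind decide()
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private Optional<Order> decide(Candidate c) {
            return Optional.empty();                        // placeholder for the actual strategy logic
        }
    }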
I never thought of things this way but it is a useful perspective.