Comment by dcolkitt

5 years ago

The point is that bottom-up code is a siren song. It never scales. It makes it a lot easier to get started, but given enough complexity it inevitably breaks down.

Once your codebase gets to somewhere around the 10,000 line mark, it becomes impossible for a single mind to hold the entire program in their head at a single time. The only way to survive past that point is with carefully thought out, water tight layers of abstractions. That almost never happens with bottom-up. Bottom-up is a lot like natural selection. You get a lot of kludges that work great to solve their immediate problem, but behave in undefined and unpredictable ways when you extend them outside their original environment.

Bottom-up can work when you're inside well-encapsulated modular components with bounded scope and size. But there's no way to keep those modules loosely coupled unless you have a elegant top-down architecture imposing order at the large-scale structure.

But the reverse is also true. Top-down programming doesn't really work well for smaller programs, it definitely doesn't work well when you're dealing with small, highly performance-critical or complex tasks.

So sure, I'll grant that when your program reaches the 10,000 line mark, you need to have some serious abstractions. I'll even give you that you might need to start abstracting things when a file reaches 1,000 lines.

But when we start talking about the rule of 30 -- that's not managing complexity, that's alphabetizing a sock drawer and sewing little permanent labels on each sock. That approach also doesn't scale to large programs because it makes rewrites and refactors into hell, and it makes new features extremely cumbersome to quickly iterate on. Your 10,000 line program becomes 20,000 lines because you're throwing interfaces and boilerplate all over the place.

Note that this isn't theoretical, I have worked in programs that did everything from building an abstraction layer over the database in case we wanted to use Mongo and SQL at the same time (we didn't), to having a dependency management system in place that meant we had to edit 5 files every time we wanted to add a new class, to having a page lifecycle framework that was so complicated that half of our internal support requests were trying to figure out when it was safe to start adding customer data to the page.

The benefit of a good, long, single-purpose function that contains all of its logic in one place is that you know exactly what the dependendencies are, you know exactly what the function is doing, you know that no one else is calling into the inlined logic that you're editing, and you can easily move that code around and change it without worrying about updating names or changing interfaces.

Abstract your code, but abstract your code when or shortly before you hit complexity barriers and after you have enough knowledge to make informed decisions about which abstractions will be helpful -- don't create a brand new interface every time you write a single function. It's fine to have a function that's longer than a couple hundred lines. If you're building something like a rendering or update loop, in many cases I would say it's preferable.

  • It's funny how these things are literally what the Clean Code book advocates for. Sure there is mention of a lot of stuff that's no longer needed and was a band aid over language deficiencies of a particular language. But the ideas are timeless and I used them before I even knew the book and I used them in Perl.

    • > these things are literally what the Clean Code book advocates for

      I'm not sure I understand what you're saying, I might be missing your point. The Clean Code book advocates that the ideal function is a single digit number of lines, double digits at the absolute most.

      In my mind, the entire process of writing functions that short involves abstracting almost everything your code does. It involves passing data around all over the place and attaching state to objects that get constructed over multiple methods.

      How do you create a low-abstraction, bottom-up codebase when every coroutine you need to write is getting turned into dozens of separate functions? I think this is showcased in the code examples that the article author critiques from Clean Code. They're littered with side effects and state mutations. This stuff looks like it would be a nightmare to maintain, because it's over-abstracted.

      Martin is writing one-line functions whose entire purpose is to call exactly one other function passing in a boolean. I don't even know if I would call that top-down programming, it feels like critiquing that kind of code or calling it characteristic of their writing style is almost unfair to top-down programmers.

      7 replies →

As mainly a bottom-up person, I completely agree with your analysis but I wonder if you might be using "top-down architecture" here in an overloaded way?

My personal style is bottom up, maximally direct code, aiming for monolithic modules under 10kloc, combined with module coupling over very narrow interfaces. Generally the narrow interfaces emerge from finding the "natural grain" of the module after writing it, not from some a priori top-down idea of how the communication pathways should be shaped.

Edit: an example of a narrow interface might be having a 10kloc quantitative trading strategy module that communicates with some larger system only by reading off a queue of things that might need to be traded, and writing to a queue of desired actions.