There's a lot of bad advice being tossed around in this thread. If you are worried about having to jump through multiple files to understand what some code is doing, you should consider that your naming conventions are the problem, not the fact that code is hidden behind functional boundaries.
Coding at scale is about managing complexity. The best code is code you don't have to read because of well named functional boundaries. Without these functional boundaries, you have to understand how every line of a function works, and then mentally model the entire graph of interactions at once, because of the potential for interactions between lines within a functional boundary. The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows. The cognitive load to understand code grows as the number of possible interactions grows. Keeping methods short and hiding behavior behind well named functional boundaries is how you manage complexity in code.
The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects. If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
This is the problem right here. I don't just read code I've written, and I don't only read perfectly abstracted code. When I am stuck reading the code of someone who loves the book and tries their best to follow its conventions, I find it far more difficult. Because I am usually reading their code to fully understand it myself (i.e., in a review) or to fix a bug, I find it infuriating that I am jumping through dozens of files just so everything looks nice on a slide. Names are great, and I fully appreciate good naming, but pretending that using a ton of extra files just to improve naming slightly isn't a hindrance is wild.
I will take the naming hit in return for locality. I'd like to be able to hold more than 5 lines of code in my head, but leaping all over the filesystem just to see 3- or 5-line classes that delegate to yet another class is too much.
Carmack once suggested that people in-line their functions more often, in part so they could “see clearly the full horror of what they have done” (paraphrased from memory) as code gets more complicated. Many helper functions can be replaced by comments and the code inlined. I tried this last year and it led to overall more readable code, imho.
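A sketch of what that looks like (hypothetical TypeScript; the report function and its former helpers are invented):

    // Was three single-use helpers: validateRows(), buildHeader(), buildBody().
    // Inlined, with comments standing in for the old function names:
    function renderReport(rows: string[][]): string {
      // validate input (was validateRows)
      if (rows.length === 0) throw new Error("report has no rows");

      // build the header line (was buildHeader)
      const header = rows[0].join(" | ");

      // build the body (was buildBody)
      const body = rows.slice(1).map((r) => r.join(" | ")).join("\n");

      return header + "\n" + body;
    }

The whole transformation is visible in one screenful, which seems to be Carmack's point.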
The idea is that without proper boundaries, finding the line that needs to be changed may be a lot harder than clicking through files with an IDE. Smaller components also help with code reviews, since it's a lot easier to understand a line within the context of a component (or method name) without having to understand what the huge globs of code before it are doing. Also, like you said, a lot of the time a developer has to read code they didn't write, so there are other factors to consider, like how easy it is for someone from another team to make a change or whether a new employee could easily digest the code base.
I would extend this one level higher to say managing complexity is about managing risk. Risk is usually what we really care about.
From the article:
>any one person's opinions about another person's opinions about "clean code" are necessarily highly subjective.
At some point CS as a profession has to find the right balance of art and science. There's room for both. Codifying certain standards is the domain of professions (in the truest sense of the word) and not art.
Software often likens itself to traditional engineering disciplines. Those traditional engineering disciplines manage risk through codified standards built through industry consensus. Somebody may build a pressure system that doesn't conform to standards. They don't get to say "well your idea of 'good' is just an opinion so it's subjective". By "professional" standards they have built something outside the acceptable risk envelope and, if it's a regulated engineering domain, they can't use it.
This isn't to say a coder would have to follow rigid rules constantly or that it needs a regulatory body, but that the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
A lot of "best practices" in engineering were established empirically, after root cause analysis of failures and successes. Software is more or less evolving along the same path (structured programming, OOP, higher-than-assembly languages, version control, documented ISAs).
Go back to earlier machines and each one had its own assembly language and instruction set. Nobody would ever go back to that era.
OOP was pitched as a one-size-fits-all solution to all problems, and as a checklist of items that would turn a cheap offshored programmer into a real software engineer thanks to design patterns and abstractions dictated by a "Software Architect". We all know that to be false, and bordering on snake oil, but it still had some good ideas. Having a class encapsulate complexity and defining interfaces is neat. It forces you to think in terms of abstractions and helps readability.
> This isn't to say a coder would have to follow rigid rules constantly or that it needs a regulatory body, but that the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
As more and more years pass, I'm less and less against a regulatory body. It would help with getting rid of snake oil salesmen in the industry and limit offshoring to barely qualified coders. And it would simplify hiring too, by having a known certification that tells you someone at least meets a certain bar.
>the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
The problem I see with this is that programming could be described as a kind of general problem solving. Other engineering disciplines standardize methods that are far more specific, e.g. how to tighten screws.
It's hard to come up with specific rules for general problems though. Algorithms are just solution descriptions in a language the computer and your colleagues can understand.
When we look at specific domains, e.g. finance and accounting software, we see industry standards have already emerged, like dealing with fixed point numbers instead of floating point to make calculation errors predictable.
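For example (a minimal sketch in TypeScript; the tax rate and rounding rule are invented):

    // Binary floating point can't represent most decimal fractions exactly:
    0.1 + 0.2 === 0.3; // false: the left side is 0.30000000000000004

    // Fixed point (integer cents) keeps money arithmetic exact and makes
    // rounding an explicit, auditable decision instead of an accident:
    const priceCents = 1099;                          // $10.99
    const taxCents = Math.round(priceCents * 0.0825); // 91, by a stated rule
    const totalCents = priceCents + taxCents;         // exactly 1190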
If we now start codifying general software engineering, I'm worried we will just codify subjective opinions about general problem solving. And that will stop any kind of improvement.
Instead we have to accept that our discipline is different from the others, and more of a design or craft discipline.
Yes, coding at scale is about managing complexity. No, "Keeping methods short" is not a good way to manage complexity, because...
> then mentally model the entire graph of interactions at once
...partially applies even if you have well-named functional boundaries. You said it yourself:
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows. The cognitive load to understand code grows as the number of possible interactions grow.
Programs have a certain essential complexity. Making a function "simpler" means making it less complex, which means that that complexity has to go somewhere else. If you make all of your functions simple, then you simply need more functions to represent the same program, which increases the total number of possible interactions between nodes and therefore the cognitive load of understanding the whole graph/program.
Allowing more complexity in your functions makes them individually harder to understand, but reduces the total number of functions needed and therefore makes the entire program more comprehensible.
Also note that just because a function's implementation is complex doesn't mean that its interface also has to be complex.
And, functions with complex implementations are only themselves difficult to understand - functions with complex interfaces make the whole system more difficult to understand.
This is where Occam's Razor applies - do not multiply entities unnecessarily.
Having hundreds or thousands of simple functions is the opposite of this advice.
You can also consider this in more scientific terms.
Code is a mental model of a set of operations. The best possible model has as few moving parts as possible, there are as few connections between the parts as possible, each part is as simple as possible, and both the parts and the connections between them are as intuitively obvious as possible.
Making parts as simple as possible is just one design goal, and not a very satisfactory or useful one in its own terms.
All of this turns out to be incredibly hard, and is a literal IQ test. Mediocre developers will always, always create overcomplicated solutions. Top developers have a magical ability to combine a 10,000 foot overview with ground level detail, and will tear through complex problems and reduce them to elegant simplicity.
IMO we should spend less time teaching algorithms and testing algorithmic specifics, and more on analysing complex systems and implementing them with minimal, elegant, intuitive models.
>If you make all of your functions simple, then you simply need more functions to represent the same program
The semantics of the language and the structure of the code help hide irrelevant functional units from the global namespace. Methods attached to an object only need to be considered when operating on some object, for example. Private methods do not pollute the global namespace nor do they need to be present in any mental model of the application unless it is relevant to the context.
While I do think you can go too far with adding functions for their own sake, I don't see that they add to the cognitive load in the same way that possible interactions within a functional unit do. If you're just polluting a global namespace with functions and tiny objects, then that does similarly increase cognitive load and should be avoided.
> No, "Keeping methods short" is not a good way to manage complexity
Agreed
> Allowing more complexity in your functions makes them individually harder to understand
I think that can mostly be avoided by sometimes creating local scopes {...} to avoid too much state inside a function, combined with whitespace and some section "header" comments (instead of what would have been sub-function names).
Can be quite readable, I think. And it's nice to not have to jump back and forth between myriads of files and functions.
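Something like this, roughly (a sketch; TypeScript block scoping standing in for C-style {..}, all names invented):

    import { readFileSync } from "node:fs";

    function summarizeScores(path: string): { mean: number; max: number } {
      // --- load and parse ---
      let scores: number[];
      {
        const raw = readFileSync(path, "utf8");
        scores = raw.split("\n").filter(Boolean).map(Number);
      } // `raw` is out of scope below: less state to keep in your head

      // --- aggregate ---
      let total = 0;
      let max = -Infinity;
      for (const s of scores) {
        total += s;
        if (s > max) max = s;
      }
      return { mean: total / scores.length, max };
    }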
I have found this to be one of those A or B developer personas that are hard for someone to change, and causes much disagreement. I personally agree 100%, but have known other people who couldn't disagree more, it is what it is.
I've always felt it had a strong correlation to top-down vs bottom-up thinkers in terms of software design. The top-down folks tend to agree with your stance and the bottom-up group do not. If you're naturally going to want to understand all of the nitty gritty details you want to be able to wrap your head around those as quickly as possible. If you're willing to think in terms of the abstractions you want to remove as many of those details from sight as possible to reduce visual noise.
I wish there was an "auto-flattener"/"auto-inliner" tool that would allow you to automagically turn code that was written top-down, with lots of nicely high-level abstractions, into an equivalent code with all the actions mushed together and with infrastructure layers peeled away as much as possible.
Have you ever seen a codebase with infrastructure and piping taking up about 70% of the code, with tiny pieces of business logic thrown here and there? It's impossible to figure out where the actual job is being done (and what it actually is): all you can see is an endless chain of methods that mostly just delegate the responsibility further and further. What could've been a 100-line loop of the "foreach item in worklist, do A, B, C" kind is instead split over seven tightly cooperating classes that devote 45% of their code to multiplexing/load-balancing/messaging/job-spooling/etc, another 45% to building trivial auxiliary structure and instantiating each other, and only 10% to the actual data processing. But good luck finding those 10%, because there is a never-ending chain of calls: A.do_work() calls B.process_item() which calls A.on_item_processing() which calls B.on_processed()... wait, shouldn't there have been some work done between "on_item_processing" and "on_processed"? Yes, it was done by an inconspicuously named "prepare_next_worklist_item" function.
Ah, and the icing on the cake: looping is actually done from the very bottom of this call chain by doing a recursive call to the top-most method which at this point is about 20 layers above the current stack frame. Just so you can walk down this path again, now with the feeling.
While I think you are onto something about top-down vs. bottom-up thinkers, one of the issues with a large codebase is that literally nobody can do the whole thing bottom-up. So you need some reasonable conventions and abstractions, or the whole thing falls apart under its own weight.
I’m reminded of an earlier HN discussion about an article called The Wrong Abstraction, where I argued¹ that abstractions have both a benefit and a cost and that their ratio may change as a program evolves and which of those “nitty gritty details” are immediately relevant and which can helpfully be hidden behind abstractions changes.
The point is that bottom-up code is a siren song. It never scales. It makes it a lot easier to get started, but given enough complexity it inevitably breaks down.
Once your codebase gets to somewhere around the 10,000 line mark, it becomes impossible for a single mind to hold the entire program in their head at a single time. The only way to survive past that point is with carefully thought out, watertight layers of abstraction. That almost never happens with bottom-up. Bottom-up is a lot like natural selection. You get a lot of kludges that work great to solve their immediate problem, but behave in undefined and unpredictable ways when you extend them outside their original environment.
Bottom-up can work when you're inside well-encapsulated modular components with bounded scope and size. But there's no way to keep those modules loosely coupled unless you have an elegant top-down architecture imposing order on the large-scale structure.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
That's only 1 part of the complexity equation.
When you have 100 lines in 1 function you know exactly the order in which each line will happen and under which conditions by just looking at it.
If you split it into 10 functions of 10 lines each, you now have 10! possible orderings of calling these functions (ignoring loops and branches). And since this ordering is spread across multiple places, you have to keep it in your mind. Good luck inventing naming that will make it obvious which of the 3,628,800 possible orderings is happening without reading through them.
Short functions are good when they fit the problem. Often they don't.
I feel like this is only a problem if the small functions share a lot of global state. If each one acts upon its arguments and returns values without side effects, ordering is much less of an issue IMO.
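A sketch of the difference (invented report-building code in TypeScript):

    // Order-sensitive: these helpers share mutable state, so the reader
    // must reconstruct the exact call sequence to know what `buffer` holds.
    const buffer: string[] = [];
    function addHeader(): void { buffer.unshift("header"); }
    function addRow(row: string): void { buffer.push(row); }

    // Pure: inputs in, outputs out. Any composition that type-checks
    // gives the same result, so call order stops being a puzzle.
    const withHeader = (rows: string[]): string[] => ["header", ...rows];
    const withRow = (rows: string[], row: string): string[] => [...rows, row];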
>If you split it into 10 functions of 10 lines each, you now have 10! possible orderings of calling these functions (ignoring loops and branches). And since this ordering is spread across multiple places, you have to keep it in your mind. Good luck inventing naming that will make it obvious which of the 3,628,800 possible orderings is happening without reading through them.
It's easy to make this argument in the abstract, but harder to demonstrate with a concrete example. Do you happen to have any 100 lines of code that you could provide that would show this as a challenge to compare to the refactored code?
You're likely missing one or more techniques that make this work well:
1. Depth first function ordering, so the execution order of the lines in the function is fairly similar to that of the expanded 100 lines. This makes top to bottom readability reasonable.
2. Explicit naming of the functions to make it clear what they do, not just part1(); part2() etc.
3. Similar levels of abstraction in each function (e.g. not having a mix of a for loop, several if statements based on variables defined in the function, and 3 method calls; instead having 4-5 method calls doing the same thing).
4. Explicit pre/post conditions in each method are called out due to the passing in of parameters and the return values. This more effectively helps a reader understand the lifecycle of relevant variables etc.
In your example of 100 lines, the counterpoint is that now I have a method that has at least 100 ways it could work / fail. By breaking that up, I have the ability to reason about each use case / failure mode.
I am surprised that this is the top answer (Edit: at the moment, was)
How does splitting code into multiple functions suddenly change the order of the code?
I would expect that these functions would still be called in a very specific order.
And sometimes it does not even make sense to keep this order.
But here is a little example (in made-up pseudocode):
    function positiveInt calcMeaningOfLife(positiveInt[] values)
        positiveInt total = 0
        positiveInt max = 0
        for (positiveInt i = 0; i < values.length; i++)
            total = total + values[i]
            max = values[i] > max ? values[i] : max
        return total - max

===>

    function positiveInt max(positiveInt[] values)
        positiveInt max = 0
        for (positiveInt i = 0; i < values.length; i++)
            max = values[i] > max ? values[i] : max
        return max

    function positiveInt total(positiveInt[] values)
        positiveInt total = 0
        for (positiveInt i = 0; i < values.length; i++)
            total = total + values[i]
        return total

    function positiveInt calcMeaningOfLife(positiveInt[] values)
        return total(values) - max(values)
There's certainly some difference in priorities between massive 1000-programmer projects where complexity must be aggressively managed and, say, a 3-person team making a simple web app. Different projects will have a different sweet spot in terms of structural complexity versus function complexity. I've seen code that, IMO, misses the sweet spot in either direction.
Sometimes there is too much code in mega-functions, poor separation of concerns and so on. These are easy mistakes to make, especially for beginners, so there are a lot of warnings against them.
Other times you have too many abstractions and too much indirection to serve any useful purpose. The ratio of named things, functional boundaries, and interface definitions to actual instructions can easily get out of hand when people dogmatically apply complexity-managing patterns to things that aren't very complex. Such over-abstraction can fall under YAGNI and waste time/$ as the code becomes slower to navigate, slower to understand in depth, and possibly slower to modify.
I think in software engineering we suffer more from the former problem than the latter problem, but the latter problem is often more frustrating because it's easier to argue for applying nifty patterns and levels of indirection than omitting them.
Just for a tangible example: if I have to iterate over a 3D data structure with X, Y, and Z dimensions, and use 3 nested loops to do so, is that too complex a function? I'd say no. It's at least as clear without introducing more functional boundaries, which is effort with no benefit.
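Concretely, a sketch:

    // Three nested loops over X, Y, Z read clearly inline; extracting
    // sumPlane()/sumRow() helpers would add boundaries without adding clarity.
    function sumVoxels(grid: number[][][]): number {
      let sum = 0;
      for (const plane of grid)     // X
        for (const row of plane)    // Y
          for (const voxel of row)  // Z
            sum += voxel;
      return sum;
    }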
Well named functions are only half (or maybe a quarter) of the battle. Function documentation is paramount in complex codebases, since documentation should describe various parameters in detail and outline any known issues, side-effects, or general points about calling the function. It's also a good idea to document when a parameter is passed to another function/method.
Yeah, it's a lot of work, but working on recent projects has really taught me the value of good documentation. Naming a function send_records_to_database is fine, but the name can't tell you how it determines which database to send the records to, or how it deals with failed records (if at all), or the various alternative use cases for the function. All of that must come from documentation (or reading the source of the function).
Plus, I've found that forcing myself to write function documentation, and justify my decisions, has resulted in me putting more consideration into design. When you have to say, "this function reads <some value> name from <environmental variable>" then you have to spend some time considering if future users will find that to be a sound decision.
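For example (a sketch; the behavior described in the doc comment is invented for illustration):

    /**
     * Sends `records` to the database named by the RECORDS_DB environment
     * variable, falling back to "default_records" if it is unset.
     *
     * Failed records are NOT retried; they are written to the dead-letter
     * table and counted in the return value.
     *
     * @param records rows to insert, assumed already validated upstream
     * @returns the number of records that could not be inserted
     */
    function send_records_to_database(records: object[]): number {
      // ... implementation elided ...
      return 0;
    }

None of that is recoverable from the name alone, which is the point.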
> documentation should describe various parameters in detail and outline any known issues, side-effects, or general points about calling the function. It's also a good idea to document when a parameter is passed to another function/method.
I'd argue that writing that much documentation about a single function suggests that the function is a problem and the "send_records_to_database" example is a bad name. It's almost inevitable that the function doing so much and having so much behavior that needs documentation will, at some point, be changed and make the documentation subtly wrong, or at least incomplete.
Yikes, I hope I don't have to read documentation to understand how the code deals with failed records or other use cases. Good code would have the use cases separated from the send_records_to_database so it would be obvious what the records were and how failure conditions are handled.
"Plus, I've found that forcing myself to write function documentation, and justify my decisions, has resulted in me putting more consideration into design."
This, this, and... this.
Sometimes, I step back after writing documentation and realise, this is a bunch of baloney. It could be much simpler, or this is a terrible decision! My point: Writing documentation is about expressing the function a second time -- the first time was code, the second time was natural language. Yeah, it's not a perfect 1:1 (see: the law in any developed country!), but it is a good heuristic.
Documentation is only useful if it is up to date and correct. I ignore documentation because I've never found the above to be true.
There are contract/proof systems that seem like they might help. At least the tool ensures the contract is correct. However, I'm not sure if such systems are readable. (I've never used one in the real world.)
> The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects.
Code telling a story is a fallacy that programmers keep telling themselves and which fails to die. Code doesn't tell stories, programmers do. Code can't explain why it exists; it can't tell you about the buggy API it relies on and which makes its implementation weird and not straight-forward; it can't say when it's no longer needed.
Good names are important, but it's false that having well-chosen function and arguments names will tell a programmer everything they need to know.
Code can't tell every relevant story, but it can tell a story about how it does what it does. Code is primarily written for other programmers. Writing code in such a way that other people with some familiarity with the problem space can understand easily should be the goal. But this means telling a story to the next reader, the story of how the inputs to some functional unit are translated into its outputs or changes in state. The best way to explain this to another human is almost never the best way to explain it to a computer. But since we have to communicate with other humans and to the computer from the same code, it takes some effort to bridge the two paradigms. Having the code tell a story at the high level by way of the modules, objects and methods being called is how we bridge this gap. But there are better and worse ways to do this.
Software development is a process of translating the natural language-spec of the system into a code-spec. But you can have the natural language-spec embedded in the structure of the code to a large degree. The more, the better.
Your argument falls apart once you need to actually debug one of these monstrosities, as often the bug itself also gets spread out over half a dozen classes and functions, and it's not obvious where to fix it.
More code, more bugs. More hidden code, more hidden bugs. There's a reason those who have worked in software development longer tend to prefer less abstraction: most of them are those who have learned from their experiences, and those who aren't are "architects" optimising for job security.
If a function is only called once, it should just be inlined; the IDE can collapse it. A descriptive comment can replace the function name. It can be a lambda with an immediate call and explicit captures if you need to know which local variables it interacts with as it grows significantly, or, if the concern is later code using leftover variables, its locals can go into a plain scope. Making you jump to a different area of the code just breaks up linear reading flow for no gain, especially when you often have to read the function anyway to make sure it doesn't have global side effects; might as well read it in the single place it is used.
If it is going to be used more than once, and actually is, then make it a function (unless it is so trivial that the explicit inline version is more readable). If you are designing a public API where it may need to be overridden, count that as more than once.
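In TypeScript-ish terms, a sketch (lambdas there capture implicitly, so passing locals as parameters plays the role of an explicit capture list; all names invented):

    const userInput = "  HeLLo  ";
    const userLocale = "en-US";

    // Immediately-invoked lambda: its inputs are visible at the call site,
    // and `trimmed` cannot leak into (or read from) the enclosing scope.
    const normalized = ((raw: string, locale: string): string => {
      const trimmed = raw.trim();
      return trimmed.toLocaleLowerCase(locale);
    })(userInput, userLocale);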
> The best code is code you don't have to read because of well named functional boundaries.
I don't know which is harder. Explaining this about code, or about tests.
The people with no sense of DevX see nothing wrong with writing tests that fail as:
    Expected undefined to be "foo"
If you make me read the tests to modify your code, I'm probably going to modify the tests. Once I modify the tests, you have no idea if the new tests still cover all of the same concerns (especially if you wrote tests like the above).
Make the test red before you make it green, so you know what the errors look like.
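For example, with node:assert (a sketch; getGreeting is invented):

    import assert from "node:assert";

    const getGreeting = (locale: string): string | undefined =>
      locale === "en" ? "hello" : undefined;

    // Fails with "getGreeting should fall back to the default greeting
    // for unknown locales" instead of "Expected undefined to be 'hello'".
    assert.strictEqual(
      getGreeting("xx"),
      "hello",
      "getGreeting should fall back to the default greeting for unknown locales",
    );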
“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”
― C. A. R. Hoare
This quote does not scale. Software contains essential complexity because it was built to fulfill a need. You can make all of the beautiful, feature-impoverished designs you want - they won't make it to production, and I won't use them, because they don't do the thing.
If your software does not do the thing, then it's not useful, it's a piece of art - not an artifact of software engineering that is meant to fulfill a purpose.
But not everybody codes “at scale”. If you have a small, stable team, there is a lot less to worry about.
Secondly it is often better to start with less abstractions and boundaries, and add them when the need becomes apparent, rather than trying to remove ill conceived boundaries and abstractions that were added at earlier times.
Coding at scale is not dependent on the number of people, but on the essential complexity of the problem. One can fail at a one-man project due to lack of proper abstraction with a sufficiently complex problem. Like, try to write a compiler.
> The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects. If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
That's fine in theory and I still sort-of believe that, but in practice, I came to believe most programming languages are insufficiently expressive for this vision to be true.
Take, as a random example, this bit of C++:
    // ...
    const auto foo = Frobnicate(bar, Quuxify);
Ok, I know what Frobnification is. I know what Quuxify does, it's defined a few lines above. From that single line, I can guess it Frobs every member of bar via Quuxify. But is bar modified? Gotta check the signature of Frobnicate! That means either getting an IDE help popup, or finding the declaration.
    template<typename Stuffs, typename Fn>
    auto Frobnicate(const std::vector<Stuffs>&, Fn)
        -> std::vector<Stuffs>;
From the signature, I can see that bar full of Bars isn't going to be modified. But then I think, is foo.size() going to be equal to bar.size()? What if bar is empty? Can Frobnicate throw an exception? Are there any special constraints on the function Fn passed to it? Does Fn have to be a funcallable thing? Can't tell that until I pop into definition of Frobnicate.
I'll omit the definition here. But now that I see it, I realize that Fn has to be a function of a very particular signature, that Fn is applied to every other element of the input vector (and not all of them, as I assumed), that the code has a bug and will crash if the input vector has less than 2 elements, and it calls three other functions that may or may not have their own restrictions on arguments, and may or may not throw an exception.
If I don't have a fully-configured IDE, I'll likely just ignore it and bear the risk. If I have, I'll routinely jump-to-definition into all these functions, quickly eye them for any potential issues... and, if I have the time, I'll put a comment on top of Frobnicate declaration, documenting everything I just learned - because holy hell, I don't want to waste my time doing the same thing next week. I would rename the function itself to include extra details, but then the name would be 100+ characters long...
Some languages are better at this than others, but my point is, until we have programming languages that can (and force you to) express the entire function contract in its signature and enforce this at compile-time, it's unsafe to assume a given function does what you think it does. Comments would be a decent workaround, if most programmers could be arsed to write them. As it is, you have to dig into the implementation of your dependencies, at least one level deep, if you want to avoid subtle bugs creeping in.
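To illustrate in TypeScript (a sketch of the same Frobnicate, not a full fix):

    // `readonly` answers "is the input modified?" in the signature itself,
    // and the required shape of Fn is explicit:
    function frobnicate<T>(items: readonly T[], fn: (item: T) => T): T[] {
      return items.map(fn);
    }

Even here, nothing in the signature says whether it can throw or whether the output length equals the input length, which is exactly the gap described above.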
This is a good point and I agree. In fact, I think this really touches on why I always had a hard time understanding C++ code. I first learned to program with C/C++ so I have no problem writing C++, but understanding other people's code has always been much more difficult than other languages. Its facilities for abstraction were (historically) subpar, and even things like aliased variables where you have to jump to the function definition just to see if the parameter will be modified really get in the way of easy comprehension. And then the nested template definitions. You're right that how well relying on well named functional boundaries works depends on the language, and languages aren't at the point where it can be completely relied on.
This is true but having good function names will at least help you avoid going two levels deep. Or N levels. Having a vague understanding of a function call’s purpose from its name helps because you have to trim the search tree somewhere.
Though, if you’re in a nest of tiny forwarding functions, who knows how deep you’ll have to go?
> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
Which is often unavoidable; many functions are insufficiently explained by those alone, unless you want four-word camelcase monstrosities for names. The code of the function should be right-sized, and size and complexity need to be balanced there: simpler and easier-to-follow is sometimes larger. I work on compilers, query processors and compute engines; cognitive load from the subject domains is bad enough without making the code arbitrarily shaped.
[edit] oh yes, what jzoch says below. Locality helps with taming the network of complexity between functions and data.
I think we need to recognize the limits of this concept. To reach for an analogy, both Dr. Seuss and Tolstoy wrote well but I'd much rather inherit source code that reads like 10 pages of the former over 10 pages of the latter. You could be a genuine code-naming artist but at the end of the day all I want to do is render the damn HTML.
> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
This isn't always true in my experience. Often when I need to dig into the details of a function it's because how it works is more important than what it says it's doing. There are implementation concerns you can't fit into a function name.
Additionally, I have found that function names become outdated at about the same rate as comments do. If the common criticism of code commenting is that "comments are code you don't run", function names also fall into that category.
I don't have a universal rule on this, I think that managing code complexity is highly application-dependent, and dependent on the size of the team looking at the code, and dependent on the age of the code, and dependent on how fast the code is being iterated on and rewritten. However, in many cases I've started to find that it makes sense to inline certain logic, because you get rid of the risk of names going out of date just like code comments, and you remove any ambiguity over what the code actually does. There are some other benefits as well, but they're beyond the scope of the current conversation.
Perfect abstractions are relatively rare, so in instances where abstractions are likely to be very leaky (which happens more often than people suspect), it is better to be extremely transparent about what the code is doing, rather than hiding it behind a function name.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
I'll also push back against this line of thought. The sum total of possible interactions does not decrease when you move code out into a separate function. The same number of lines of code still gets run, and each line carries the same potential to have a bug. In fact, in many cases, adding additional interfaces between components and generalizing them can increase the number of code paths and potential failure points.
If you define complexity by the sum total of possible interactions (which is itself a problematic definition, but I'll talk about that below), then complexity always increases when you factor out functions, because the interfaces, error-handling, and boilerplate code around those functions increases the number of possible interactions happening during your function call.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
What I've come to understand is that complexity is relative. A solution that makes a codebase less complex for one person in an organization may make a codebase more complex for someone else in the organization who has different responsibilities over the codebase.
If you are building an application with a large team, and there are clear divisions of responsibilities, then functional boundaries are very helpful because they hide the messy details about how low-level parts of the code work.
However, if you are responsible for maintaining both the high-level and low-level parts of the same codebase, then separating that logic can sometimes make the program harder to manage, because you still have to understand how both parts of the codebase work, but now you also have to understand how the interfaces and abstractions between them fit together and what their limitations are.
In single-person projects where I'm the only person touching the codebase I do still use abstractions, but I often opt to limit the number of abstractions, and I inline code more often than I would in a larger project. This is because if I'm the only person working on the code, I need to be able to hold almost the entire codebase in my head at the same time in order to make informed architecture decisions, and managing a large number of abstractions on top of their implementations makes the code harder to reason about and increases the number of things I need to remember. This was a hard-learned lesson for me, but has made (I think) an observable difference in the quality and stability of the code I write.
>> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
> This isn't always true in my experience. Often when I need to dig into the details of a function it's because how it works is more important than what it says it's doing. There are implementation concerns you can't fit into a function name.
Both of these things are not quite right. Yes, if you have to dig into the details of a function to understand what it does, it hasn't been explained well enough. No, the prototype cannot contain enough information to explain it. No, you shouldn't look at the implementation either - that leads to brittle code where you start to rely on the implementation behavior of a function that isn't part of the interface.
The interface and implementation of a function are separate. The former should be clearly-documented - a descriptive name is good, but you'll almost always also need docstrings/comments/other documentation - while you should rarely rely on details of the latter, because if you are, that usually means that the interface isn't defined clearly enough and/or the abstraction boundaries are in the wrong places (modulo things like looking under the hood to refactor, improve performance, etc - all abstractions are somewhat leaky, but you shouldn't be piercing them regularly).
> If you define complexity by the sum total of possible interactions (which is itself a problematic definition, but I'll talk about that below), then complexity always increases when you factor out functions, because the interfaces, error-handling, and boilerplate code around those functions increases the number of possible interactions happening during your function call.
This - this is what everyone who advocates for "small functions" doesn't understand.
Finally! I'm glad to hear I'm not the only one. I've gone against 'Clean Code' zealots that end up writing painfully warped abstractions in the effort to adhere to what is in this book. It's OK to duplicate code in places where the abstractions are far enough apart that the alternative is worse. I've had developers use the 'partial' feature in C# to meet Martin's length restrictions to the point where I have to look through 10-15 files to see the full class.
The examples in this post are excellent examples of the flaws in Martin's absolutism.
"How do you develop good software? First, be a good software developer. Then develop some software."
The problem with all these lists is that they require a sense of judgement that can only be learnt from experience, never from checklists. That's why Uncle Bob's advice is simultaneously so correct, and yet so dangerous with the wrong fingers on the keyboard.
I've also never agreed completely with Uncle Bob. I was an OOP zealot for maybe a decade, and now I'm a Rust convert. The biggest "feature" of Rust is that it probably brought semi-functional concepts to the "OOP masses." I found that, with Rust, I spent far more time solving the problem at hand...
Instead of solving how I am going to solve the problem at hand ("Clean Coding"). What a fucking waste of time, my brain power, and my lifetime keystrokes[1].
I'm starting to see that OOP is more suited to programming literal business logic. The best use for the tool is when you actually have a "Person", "Customer" and "Employee" entities that have to follow some form of business rules.
In contradiction to your "Uncle Sassy's" rules, I'm starting to understand where "Uncle Beck" was coming from:
1. Make it work.
2. Make it right.
3. Make it fast.
The amount of understanding that you can garner from making something work leads very strongly into figuring out the best way to make it right. And you shouldn't be making anything fast unless you have a profiler and other measurements telling you to do so.
"Clean Coding" just perpetuates all the broken promises of OOP.
These two fall under requirements gathering. It's so often forgotten that software has a specific purpose, a specific set of things it needs to do, and that it should be crafted with those in mind.
> 3. Understand not just where you are but where you are headed
And this is the part that breaks down so often. Because software is simultaneously so easy and so hard to change, people fall into traps both left and right, assuming some dimension of extensibility that never turns out to be important, or assuming something is totally constant when it is not.
I think the best advice here is that YAGNI, don't add functionality for extension unless your requirements gathering suggests you are going to need it. If you have experience building a thing, your spider senses will perk up. If you don't have experience building the thing, can you get some people on your team that do? Or at least ask them? If that is not possible, you want to prototype and fail fast. Be prepared to junk some code along the way.
If you start out not knowing any of these things, and also never junking any code along the way, what are the actual odds you got it right?
>Write code that takes the above 3 into account and make sensible decisions. When something feels wrong ... don't do it.
The problem is that people often need specifics to guide them when they're less experienced. Something that "feels wrong" is usually vast experience incorporated into your subconscious aesthetic judgement. But you can't rely on your subconscious until you've had enough trials to hone your senses. Hard rules can be and often are overapplied, but that's usually better than the opposite case of someone without good judgement attempting to make unguided judgement calls.
My last company was very into Clean Code, to the point where all new hires were expected to do a book club on it.
My personal takeaway was that there were a few good ideas, all horribly mangled. The most painful one I remember was his treatment of the Law of Demeter, which, as I recall, was so shallow that he didn't really even thoroughly explain what the law was trying to accomplish. (Long story short, bounded contexts don't mean much if you're allowed to ignore the boundaries.) So most everyone who read the book came to earnestly believe that the Law of Demeter is about period-to-semicolon ratios, and proceeded to convert something like
    val frobnitz = Frobnitz.builder()
        .withPixieDust()
        .withMayonnaise()
        .withTarget(worstTweetEver)
        .build();

into

    var frobnitzBuilder = Frobnitz.builder();
    frobnitzBuilder = frobnitzBuilder.withPixieDust();
    frobnitzBuilder = frobnitzBuilder.withMayonnaise();
    frobnitzBuilder = frobnitzBuilder.withTarget(worstTweetEver);
    val frobnitz = frobnitzBuilder.build();
and somehow convince themselves that doing this was producing tangible business value, and congratulate themselves for substantially improving the long-term maintainability of the code.
Meanwhile, violations of the actual Law of Demeter ran rampant. They just had more semicolons.
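For contrast, the kind of thing the law is actually about (invented names):

    interface Address { getCity(): string }
    interface Customer { getAddress(): Address }
    interface Order {
      getCustomer(): Customer;
      getShippingCity(): string; // delegates internally
    }
    declare const order: Order;

    // Violation: reaching through two collaborators couples this code
    // to the internal structure of both Customer and Address.
    const city = order.getCustomer().getAddress().getCity();

    // Respecting the boundary: ask the nearest collaborator directly.
    const sameCity = order.getShippingCity();

The builder chain above, by contrast, never pierces a boundary: every call returns the builder itself.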
On that note, I've never seen an explanation of Law of Demeter that made any kind of sense to me. Both the descriptions I read and the actual uses I've seen boiled down to the type of transformation you just described, which is very much pointless.
> Long story short, bounded contexts don't mean much if you're allowed to ignore the boundaries.
I'd like to read more. Do you know of a source that covers this properly?
I love how this is clearly a contextual recommendation.
I'm not a software developer, but a data scientist. In pandas, writing your manipulations in this chained-methods fashion is highly encouraged, IMO. It's even called "pandorable" code.
I think following some ideas in the book but ignoring others, like the ones behind the Law of Demeter, can be a recipe for a mess. The book is very opinionated, but if followed well I think it can produce pretty dead simple code. At the same time, as with any coding, experience plays massively into how well code is written: code can be written well or badly whether you use his methods or ignore them.
>his treatment of the Law of Demeter, which, as I recall, was so shallow that he didn't even really even thoroughly explain what the law was trying to accomplish.
oof. I mean, yeah, at least explain what the main thing you’re talking about is about, right? This is a pet peeve.
> It's OK to duplicate code in places where the abstractions are far enough apart that the alternative is worse.
I don't recall where I picked up from, but the best advice I've heard on this is a "Rule of 3". You don't have a "pattern" to abstract until you reach (at least) three duplicates. ("Two is a coincidence, three is pattern. Coincidences happen all the time.") I've found it can be a useful rule of thumb to prevent "premature abstraction" (an understandable relative of "premature optimization"). It is surprising sometimes the things you find out about the abstraction only happen when you reach that third duplicate (variables or control flow decisions, for instance, that seem constants in two places for instance; or a higher level idea of why the code is duplicated that isn't clear from two very far apart points but is clearer when you can "triangulate" what their center is).
I don't hate the rule of 3. But I think it's missing the point.
You want to extract common code if it's the same now, and will always be the same in the future. If it's not going to be the same and you extract it, you now have the pain of making it do two things, or splitting it. But if it is going to be the same and you don't extract it, you have the risk of only updating one copy, and then having the other copy do the wrong thing.
For example, I have a program where one component gets data and writes it to files of a certain format in a certain directory, and another component reads those files and processes the data. The code for deciding where the directory is, and what the columns in the files are, must be the same, otherwise the programs cannot do their job. Even though there are only two uses of that code, it makes sense to extract them.
Once you think about it this way, you see that extraction also serves a documentation function. It says that the two call sites of the shared code are related to each other in some fundamental way.
Taking this approach, I might even extract code that is only used once! In my example, if the files contain dates, or other structured data, then it makes sense to have the matching formatting and parsing functions extracted and placed right next to each other, to highlight the fact that they are intimately related.
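A sketch of that last case (names invented):

    // Extracted even though each is used once: the reader must mirror the
    // writer exactly, and placing them side by side documents that contract.
    function formatRowDate(d: Date): string {
      return d.toISOString().slice(0, 10); // YYYY-MM-DD, always UTC
    }

    function parseRowDate(s: string): Date {
      return new Date(s + "T00:00:00Z"); // must mirror formatRowDate
    }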
Also known as, “AbstractMagicObjectFactoryBuilderImpl” that builds exactly one (1) factory type that generates exactly (1) object type with no more than 2 options passed into the builder and 0 options passed into the factory. :-)
The Go proverb is "A little copying is better than a little dependency." Also don't deduplicate 'text' because it's the same, deduplicate implementations if they match in both mechanism 'what' it does as well as their semantic usage. Sometimes the same thing is done with different intents which can naturally diverge and the premature deduplication is debt.
I'm coming to think that the rule of three is important within a fairly constrained context, but that other principle is worthwhile when you're working across contexts.
For example, when I did work at a microservices shop, I was deeply dissatisfied with the way the shared utility library influenced our code. A lot of what was in there was fairly throw-away and would not have been difficult to copy/paste, even to four or more different locations. And the shared nature of the library meant that any change to it was quite expensive. Technically, maybe, but, more importantly, socially. Any change to some corner of the library needed to be negotiated with every other team that was using that part of the library. The risk of the discussion spiraling away into an interminable series of bikesheddy meetings was always hiding in the shadows. So, if it was possible to leave the library function unchanged and get what you needed with a hack, teams tended to choose the hack. The effects of this phenomenon accumulated, over time, to create quite a mess.
At a previous company, there was a Clean Code OOP zealot. I heard him discussing with another colleague about the need to split up a function because it was too long (it was 10 lines). I said, from the sidelines, "yes, because nothing enhances readability like splitting a 10 line function into 10, 1-line functions". He didn't realize I was being sarcastic and nodded in agreement that it would be much better that way.
There seems to be a lot of overlap between the Clean Coders and the Neo Coders [0]. I wish we could get rid of both.
[0] People who strive for "The One" architecture that will allow any change no matter what. Seriously, abstraction out the wazoo!
Honestly. If you're getting data from a bar code scanner and you think, "we should handle the case where we get data from a hot air balloon!" because ... what if?, you should retire.
The problem is that `partial` in C# should never even have been considered as a "solution" to write small, maintainable classes. AFAIK partial was introduced for code-behind files, not to structure human written code.
Anyways, you are not alone with that experience - a common mistake I see, no matter what language or framework, is that people fall for the fallacy "separation into files" is the same as "separation of concerns".
Seriously? That's an abuse of partial and just a way of following the rules without actually following them. That code must have been fun to navigate...
Many years ago I worked on a project that had a hard “no hard coded values” rule, as requested by the customer. The team routinely wrote the equivalent to
    const char_a = "a";
And I couldn’t get my manager to understand why this was a problem.
> where I have to look through 10-15 files to see the full class
The Magento 2 codebase is a good example of this. It's both well written and horrible at the same time. Everything is so spread out into constituent technical components, that the code loses the "narrative" of what's going on.
I started OOP in '96 and I was never able to wrap my head around the code these "Clean Code" zealots produced.
Case in point: Bob Martin's "Video Store" example.
My best guess is that clean code, to them, was as little code on the screen as possible, not necessarily "intention revealing" code either; instead everything is abstracted until it looks like it does nothing.
I have had the experience of trying to understand how a feature in a C++ project worked (both Audacity and Aegisub I think) only to find that I actually could not find where anything was implemented, because everything was just a glue that called another piece of glue.
Also sat in their IRC channel for months and the lead developer was constantly discussing how he'd refactor it to be cleaner but never seemed to add code that did something.
SOLID code is a very misleading name for a technique that seems to shred the code into confetti.
I personally don't feel all that productive spending like half my time just navigating the code rather than actually reading it, but maybe it's just me.
What I find most surprising is that most developers are trying to obey the "rules". Code containing even minuscule duplication must be DRYed; everyone agrees that code must be clean and professional.
Yet it is never enough: bugs keep showing up, and stuff that was written by others is always "bad".
I'm starting to think that 'Uncle Bob' and 'Clean Code' zealots are actually harmful, because they prevent people from taking two steps back and thinking about what they are doing: making microservices/components/classes/functions that end up never being reused, and making DRY the holy grail.
Personally I am YAGNI > DRY and a lot of times you are not going to need small functions or magic abstractions.
I think the problem is not the book itself, but people thinking that all the rules apply to all the code, all the time. A length restriction is interesting because it makes you think about whether you should split your function into more than one, as you might be doing too much in one place. Now, if splitting will make things worse, then just don't.
In C# and .NET specifically, we find ourselves having a plethora of services when they are "human-readable" and short.
A service has 3 "helper" services it calls, which may, in turn, have helper services, or worse, depend on a shared repo project.
The only solution I have found is to move these helpers into their own project, and mark the helpers as internal. This achieves 2 things:
1. The "sub-services" are not confused as stand-alone and only the "main/parent" service can be called.
2. The "module" can now be deployed independently if micro-services ever become a necessity.
I would like feedback on this approach. I do honestly think files over 100 lines long are unreadable trash, and we have achieved a lot by re-using modular services.
We are 1.5 years into a project and our code re-use is sky-rocketing, which allows us to keep errors low.
Of course, a lot of dependencies also make testing difficult, but allow easier mocks if there are no globals.
>I would like feedback on this approach. I do honestly think files over 100 lines long are unreadable trash
Dunno if this is the feedback you are after, but I would try to not be such an absolutist. There is no reason that a great 100 line long file becomes unreadable trash if you add one line.
What do you say to convince someone? It’s tricky to review a large carefully abstracted PR that introduces a bunch of new logic and config with something like: “just copy paste lol”
Yes, the first working implementation comes before good boundaries are known. After a while it becomes familiar, and natural conceptual boundaries arise that lead to 'factoring' and shouldn't require 'refactoring' because you prematurely guessed the wrong boundaries.
I'm all for the 100-200 line working version--can't say I've had a 500. I did once have a single SQL query that was about 2 full pages pushing the limits of DB2 (needed multiple PTFs just to execute it)--the size was largely from heuristic scope reductions. In the end, it did something in about 3 minutes that had no previous solution.
> WET (Write everything twice), figure out the abstraction after you need something a 3rd time
so much this. it is _much_ easier to refactor copy-pasta code than to entangle a mess of "clean code abstractions" for things that aren't even needed _once_. Premature Abstraction is the biggest problem in my eyes.
I think where DRY trips people up is when you have what I call "incidental repetition". Basically, two bits of logic seem to do exactly the same thing, but the contexts are slightly different. So you make a nice abstraction that works well until you need to modify the functionality in one context and not the other...
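A sketch of incidental repetition (made-up business rules):

    // Today these look identical, so DRY says "merge them"...
    const memberDiscount = (total: number): number => total * 0.10;
    const referralBonus = (total: number): number => total * 0.10;

    // ...but they change for different business reasons. Once merged into a
    // shared tenPercent(), the first one-sided requirement (say, a cap on
    // bonuses only) forces you to fork it again or grow a flag argument.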
So long as it remains identical. Refactoring almost identical code requires lots of extremely detailed staring to determine whether or not two things are subtly different. Especially if you don't have good test coverage to start with.
There's a problem with being overly zealous. It's entirely possible to write bad code by being either overly DRY or copy-paste galore. I think we are prone to these zealous rules because they are concrete. We want an "objective" measure to judge whether something is good or not.
DRY and WET are terms often used as objective measures of implementations, but that doesn't mean that they are rock solid foundations. What does it mean for something to be "repeated"? Without claiming to have TheGreatAnswer™, some things come to mind.
Chaining methods can be very expressive, easy to follow and maintain. They also lead to a lot of repetition. In an effort to be "DRY", some might embark on a misguided effort to combine them. Maybe start replacing
`map(x => x).reduce(y, z => v)`
with
`mapReduce(x => x, (y,z) => v)`
This would be a bad idea, also known as Suck™.
But there may equally be situations where consolidation makes sense. For example, if we're in an ORM helper class and we're always querying the database for an object like so:
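Something like this hypothetical sketch (TypeScript, names invented):

```typescript
// Hypothetical sketch: the "query, check for null, throw" dance that gets
// repeated at every call site is consolidated into one helper.
interface Repo<T> {
  findOne(query: Partial<T>): Promise<T | null>;
}

async function findOneOrThrow<T>(repo: Repo<T>, query: Partial<T>): Promise<T> {
  const row = await repo.findOne(query);
  if (row === null) {
    throw new Error(`No row matching ${JSON.stringify(query)}`);
  }
  return row;
}

// Usage at a call site:
//   const user = await findOneOrThrow(userRepo, { id: "42" });
```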
I totally agree, assuming that there will be time to get to the second pass of the "write everything twice" approach. Some of my least favorite refactoring work has been on older code that was liberally copy-pasted by well-intentioned developers who expected a chance to come back through later but never got it. All too often the winds of corporate decision-making change and send attention elsewhere at the wrong moment, and all those copy-pasted bits slowly but surely drift apart as unfamiliar new developers come through making small tweaks.
I worked on a small team with a very "code bro" culture. Not toxic, but definitely making non-PC jokes. We would often say "Ask your doctor about Premature Abstractuation" or "Bad news, dr. says this code has abstractual dysfunction" in code reviews when someone would build an AbstractFactoryFactoryTemplateConstructor for a one-off item.
When we got absorbed by a larger team and were going to have to "merge" our code review / git history into a larger org's repos, we learned that a sister team had gotten in big trouble with the language cops in HR when they discovered similar language in their git commit history. This brings back memories of my team panicking over trying to rewrite a huge amount of git history and code review records to sanitize our language before we were caught too.
> it is _much_ easier to refactor copy pasta code,
It's easy to refactor if it's nondivergent copypasta and you do it everywhere it is used, not later than the third iteration.
If the refactoring gets delayed, the code diverges because different bugs are noticed and fixed (or the same bug is noticed and fixed in different ways) in different iterations, there are dozens of instances across the code base (possibly in different projects, because it was copypasta'd across projects rather than refactored into a reusable library), and the code has in many cases gotten intermixed with code addressing other concerns...
I think this ties in to something I've been thinking, though it might be project specific.
Good code should be written to be easy to delete.
'Clever' abstractions work against this. We should be less precious about our code and realise it will probably need to change beyond all recognition multiple times. Code should do things simply so the consequences of deleting it are immediately obvious. I think your recommendations fit with this.
Aligns with my current meta-principle, which is that good code is malleable (easily modified, which includes deletion). A lot of design principles simply describe this principle from different angles. Readable code is easy to modify because you can understand it. Terse code is more easily modified because there’s less of it (unless you’ve sacrificed readability for terseness). SRP limits the scope of changes and thus enhances modifiability. Code with tests is easier to modify because you can refactor with less fear. Immutability makes code easier to modify because you don’t have to worry about state changes affecting disparate parts of the program.
Etc... etc...
(Not saying that this is the only quality of good code or that you won’t have to trade some of the above for performance or whatnot at times).
The unpleasant implication of this is that code has a tendency to become worse over time: the code that is good enough to be easy to delete or change gets deleted or changed, while the code that is too bad to be touched remains.
> * WET (Write everything twice), figure out the abstraction after you need something a 3rd time
There are two opposite situations. One is when several things are viewed as one thing while they're actually different (too much abstraction), and another, where a thing is viewed as different things, when it's actually a single one (when code is just copied over).
In my experience, the best way to solve this is to better analyse and understand the requirements. Do these two pieces of code look the same because they actually mean the same thing in the domain of the product? Or do they just happen to look the same at this particular moment in time, and can continue to develop in completely different directions as the product grows?
I read Clean Code in 2010 and trying out and applying some of the principles really helped to make my code more maintainable.
Now, over 10 years later, I have come to realize that you should not set up too many rules on how to structure and write code. It is like forcing all authors to apply the same writing style, or all artists to paint with the exact same technique.
With that analogy in mind, I think that one of the biggest contributors to messy code is having a lot of developers, all with different preferences, working in the same code base. Just imagine having 100 different writers trying to write a book, this is the challenge we are trying to address.
I'm not sure that's really true. Any publication with 100 different writers almost certainly has some kind of style guide that they all have to follow.
If it's really abstractable it shouldn't be difficult to refactor. It should literally be a substitution. If it's not, then you have varied cases that you'd have to go back and tinker with the abstraction to support.
It's a similar design and planning principle to building sidewalks. You have buildings, but you don't know exactly the best paths between everything and how to correctly path things out. You can come up with your own design, but people will end up ignoring it if it doesn't fit their needs. Ultimately, you put in some obvious direct-connection sidewalks and then wait to see the paths people take. You've now established where you need connections and how they need to be formed.
I do a lot of prototyping work, and if I had to sit down and think out a clean abstraction every time I wanted to get to a functional prototype, I'd never have a functional prototype--plus I'd waste a lot of cognitive capacity on an abstraction instead of solving the problem my code is addressing. It's best, from my experience, to save that time and write messy code but tuck in budget to refactor later (the key is you have to actually refactor later, not just say you will).
Once you've built your prototype, iterated on it several times had people break designs forcing hacked out solutions, and now have something you don't touch often, you usually know what most the product/service needs to look like. You then abstract that out and get 80-90% of what you need if there's real demand.
The expanded features beyond that can be costly if they require significant redesign, but at that point you hopefully have a stable enough product that it can warrant continued investment to refactor. If it doesn't, you saved yourself a lot of time and energy trying to create a good abstract design, which tends to fail multiple times at early stages. There's a balance point of knowing when to build technical debt, when to pay it off, and when to nullify it.
Again, the critical trick is you have to actually pay off the tech debt if that time comes. The product investor can't look at progress so far bright-eyed and linearly extrapolate, thinking they saved a boatload of capital; they have to understand shortcuts were taken, with the rationale being to fix them if serious money came along, or chuck them in the bin if not.
WET is great until JSON token parsing breaks and a junior dev fixes it in one place and then I am fixing the same exact problem somewhere else and moving it into a shared file. If it's the exact same functionality, move it into a service/helper.
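For instance, a hypothetical sketch of that shared helper (TypeScript/Node, names invented):

```typescript
// Hypothetical sketch: the duplicated parsing moves into one shared helper,
// so the next bug fix lands everywhere at once.
interface TokenPayload {
  sub: string;
  exp: number;
}

export function parseTokenPayload(rawToken: string): TokenPayload {
  // The fix for the parsing bug lives here, once, instead of in two handlers.
  const segments = rawToken.split(".");
  if (segments.length !== 3) {
    throw new Error("Malformed token: expected three segments");
  }
  const json = Buffer.from(segments[1], "base64url").toString("utf8");
  return JSON.parse(json) as TokenPayload;
}
```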
How do you deal with colleagues who have all the energy and time to push for these practices, when you feel they make things worse than the current state?
Explain that the wrong abstraction makes code more complicated than copy-paste and that before you can start factoring out common code you need to be sure the relationship is fundamental and not coincidental.
> figure out the abstraction after you need something a 3rd time
That's still too much of a "rule".
Whenever I feel (or know) two functions are similar, these are the factors that determine whether I should merge them:
- I see significant benefit to doing so, usually the benefit of a utility that saves writing the same thing in the future, or debugging the same/similar code repeatedly.
- How likely the code is to diverge. Sometimes I just mark things for de-duping, but leave them around a while to see if one of the functions changes.
- The function is big enough that it cannot just be in-lined where it is called, and the benefit of de-duplication is not outweighed by added complexity in the call stack.
Documentation is rarely adequately maintained, and nothing enforces that it stay accurate and maintained.
Comments in code can lie (they're not functional); can be misplaced (in most languages, they're not attached to the code they document in any enforced way); are most frequently used to describe things that wouldn't require documenting if they were just named properly; and are often little more than noise. Code comments should be exceedingly rare, and only used to describe exceptional situations or logic that can't be made clearer through the use of better identifiers or better-composed functions.
External documentation is usually out-of-sight, out-of-mind. Over time, it diverges from reality, to the point that it's usually misleading or wrong. It's not visible in the code (and this isn't an argument in favor of in-code comments). Maintaining it is a burden. There's no agreed-upon standard for how to present or navigate it.
The best way to document things is to name identifiers well, write functions that are well-composed and small enough to understand, and stick to single-responsibility principles.
API documentation is important and valuable, especially when your IDE can provide it readily at the point of use. Whenever possible, it should be part of the source code in a formal way, using annotations or other mechanisms tied to the code it describes. I wish more languages would formally include annotation mechanisms for this specific use case.
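TSDoc/JSDoc is one such mechanism; a small sketch with a hypothetical function:

```typescript
/**
 * Transfers funds between two accounts atomically.
 *
 * @param fromId - Account to debit; must exist and be active.
 * @param toId - Account to credit.
 * @param amountCents - Positive integer amount, in cents.
 * @throws RangeError if amountCents is not a positive integer.
 */
export function transfer(fromId: string, toId: string, amountCents: number): void {
  if (!Number.isInteger(amountCents) || amountCents <= 0) {
    throw new RangeError("amountCents must be a positive integer");
  }
  // ...ledger update elided; the point is the annotation above travels
  // with the code and is surfaced by the IDE at every call site...
}
```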
One of the most difficult to argue comments in code reviews: “let’s make it generic in case we need it some place else”.
First of all, the chances that we need it some place else aren't exactly high, unless you are writing a library and code explicitly designed to be shared. And even if such a need arises, the chances of getting it right by generalizing from one example are slim.
Regarding the book though, I have participated in one of the workshops with the author, and he seemed to be in favor of WET and against "architecting" levels of abstraction before having concrete examples.
You can disagree over what exactly is clean code. But you will learn to distinguish what dirty code is when you try to maintain it.
As a person that has had to maintain dirty code over the years, hearing someone say dirty code doesn't exist is really frustrating. No one wants to clean up your code, but doing it is better than allowing the code to become unmaintainable; that's why people bring up that book. If you do not care about what clean code is, stop making life difficult for people who do.
I think it's more that clean code doesn't exist because there's no objective measure of it (and those services that claim there is one are just as dangerous as Clean Code, the book); anyone can come along and find something about the code that could be tidied up. And legacy is legacy; it's a different problem space to the one a greenfield project exists in.
> As a person that has to maintain dirty code
This is a strange credential to present and then use as a basis to be offended. Are you saying that you have dirty code and have to keep it dirty?
This is what I'm doing even while creating new code. There are a few instances, for example, where the "execution" comes down to a single argument - one of "activate", "reactivate" and "deactivate". But I've made them into three distinct, separate code paths so that I can work error and feedback messages into everything without adding complexity via arguments.
I mean yes it's more verbose, BUT it's also super clear and obvious what things do, and they do not leak the underlying implementation.
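Roughly this shape (a hedged TypeScript sketch with invented names, not the actual code):

```typescript
// Hypothetical sketch: three explicit entry points instead of one function
// taking an "action" argument, so each path owns its messages and checks.
export function activateAccount(id: string): string {
  // activation-specific validation and feedback live here, unshared
  return `Account ${id} activated.`;
}

export function reactivateAccount(id: string): string {
  return `Welcome back! Account ${id} reactivated.`;
}

export function deactivateAccount(id: string): string {
  return `Account ${id} deactivated. Sorry to see you go.`;
}
```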
I’ve never heard the term WET before but that’s exactly what I do.
The other key thing, I think, is not to over-engineer abstractions you don't need yet, but to try and leave 'seams' where it's obvious how to tease the code apart if you need to start building abstractions.
My recent experience interviewing a number of consultants with only a few years of experience was that the more they mumbled about clean code, the less they knew what they were doing.
That's not what WET means. The GP is saying you shouldn't isolate logic in a function until you've cut-and-pasted the logic in at least two places and plan to do so in a third.
> Put logic closest to where it needs to live (feature folders)
Can you say more about this?
I think I may have stumbled on a similar insight myself. In a side project (a roguelike game), I've been experimenting with a design that treats features as first-class, composable design units. Here is a list of the subfolder called game-features in the source tree:
actions
collision
control
death
destructibility
game-feature.lisp
hearing
kinematics
log
lore
package.lisp
rendering
sight
simulation
transform
An extract from the docstring of the entire game-feature package:
"A Game Feature is responsible for providing components, events,
event handlers, queries and other utilities implementing a given
aspect of the game. It's primarily an organization tool for gameplay code.
Each individual Game Feature is represented by a class inheriting
from `SAAT/GF:GAME-FEATURE'. To make use of a Game Feature,
an object of such class should be created, preferably in a
system description (see `SAAT/DI').
This way, all rules of the game are determined by a collection of
Game Features loaded in a given game.
Game Features may depend on other Game Features; this is represented
through dependencies of their classes."
The project is still very much work-in-progress (procrastinating on HN doesn't leave me much time to work on it), and most of the above features are nowhere near completion, but I found the design to be mostly sound. Each game feature provides code that implements its own concerns, and exports various functions and data structures for other game features to use. This is an inversion of traditional design, and is more similar to the ECS pattern, except I bucket all conceptually related things in one place. ECS Components and Systems, utility code, event definitions, etc. that implement a single conceptual game aspect live in the same folder. Inter-feature dependencies are made explicit, and game "superstructure" is designed to allow GFs to wire themselves into appropriate places in the event loop, datastore, etc. - so in game startup code, I just declare which features I want to have enabled.
(Each feature also gets its set of integration tests that use synthetic scenarios to verify a particular aspect of the game works as I want it to.)
One negative side effect of this design is that the execution order of handlers for any given event is hard to determine from code. That's because, to have game features easily compose, GFs can request particular ordering themselves (e.g. "death" can demand its event handler to be executed after "destructibility" but before "log") - so at startup, I get an ordering preference graph that I reconcile and linearize (via topological sorting). I work around this and related issues by adding debug utilities - e.g. some extra code that can, after game startup, generate a PlantUML/GraphViz picture of all events, event handlers, and their ordering.
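The reconciliation step itself is just Kahn-style topological sorting; a rough sketch of the idea (in TypeScript rather than the project's Lisp, with hypothetical feature names):

```typescript
// Rough sketch of linearizing handler-ordering preferences.
type Constraint = { before: string; after: string }; // "before" runs first

function linearize(handlers: string[], constraints: Constraint[]): string[] {
  // Kahn's algorithm: count incoming edges, repeatedly emit zero-indegree nodes.
  const indegree = new Map<string, number>();
  const edges = new Map<string, string[]>();
  for (const h of handlers) {
    indegree.set(h, 0);
    edges.set(h, []);
  }
  for (const { before, after } of constraints) {
    edges.get(before)!.push(after);
    indegree.set(after, indegree.get(after)! + 1);
  }
  const ready = handlers.filter((h) => indegree.get(h) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const h = ready.shift()!;
    order.push(h);
    for (const next of edges.get(h)!) {
      indegree.set(next, indegree.get(next)! - 1);
      if (indegree.get(next) === 0) ready.push(next);
    }
  }
  if (order.length !== handlers.length) {
    throw new Error("Cycle in ordering preferences");
  }
  return order;
}

// linearize(["log", "death", "destructibility"],
//           [{ before: "destructibility", after: "death" },
//            { before: "death", after: "log" }])
// => ["destructibility", "death", "log"]
```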
(I apologize for a long comment, it's a bit of work I always wanted to talk about with someone, but never got around to. The source of the game isn't public right now because I'm afraid of airing my hot garbage code.)
I'd be interested in how you attempt this. Is it all in lisp?
It might be hard to integrate related things, e.g. physical simulation/kinematics <- related to collisions, and maybe sight/hearing <- related to rendering; Which is all great if information flows one way, as a tree, but maybe complicated if it's a graph with intercommunication.
I thought about this before, and figured maybe the design could be initially very loose (and inefficient), but then a constraint-solver could wire things up as needed, i.e. pre-calculate concerns/dependencies.
Another idea, since you mention "log" as a GF: AOP - using "join points" to declaratively annotate code. This better handles code that is less of a "module" (appropriate for functions and libraries) and more of a cross-cutting "aspect", like logging. This can also get hairy, though: could you treat "(bad-path) exception handling" as an aspect? What about "security"?
I've gone down roads similar to this. Long story short - the architecture solves for a lower-priority class of problem with respect to games, so it doesn't pay a great dividend, and you add a combination of boilerplate and dynamism that slows down development.
Your top issue in the runtime game loop is always with concurrency and synchronization logic - e.g. A spawns before B, if A's hitbox overlaps with B, is the first frame that a collision event occurs the frame of spawning or one frame after? That's the kind of issue that is hard to catch, occurs not often, and often has some kind of catastrophic impact if handled wrongly. But the actual effect of the event is usually a one-liner like "set a stun timer" - there is nothing to test with respect to the event itself! The perceived behavior is intimately coupled to when its processing occurs and when the effects are "felt" elsewhere in the loop - everything's tied to some kind of clock, whether it's the CPU clock, the rendered frame, turn-taking, or an abstracted timer. These kinds of bugs are a matter of bad specification, rather than bad implementation, so they resist automated testing mightily.
The most straightforward solution is, failing pure functions, to write more inline code (there is a John Carmack posting on inline code that I often use as a reference point). Enforce a static order of events as often as possible. Then debugging is always a matter of "does A happen before B?" It's there in the source code, and you don't need tooling to spot the issue.
The other part of this is, how do you load and initialize the scene? And that's a data problem that does call for more complex dependency management - but again, most games will aim to solve it statically in the build process of the game's assets, and reduce the amount of game state being serialized to save games, reducing the complexity surface of everything related to saves (versioning, corruption, etc). With a roguelike there is more of an impetus to build a lot of dynamic assets (dungeon maps, item placements, etc.) which leads to a larger serialization footprint. But ultimately the focus of all of this is on getting the data to a place where you can bring it back up and run queries on it, and that's the kind of thing where you could theoretically use SQLite and have a very flexible runtime data model with a robust query system - but fully exploiting it wouldn't have the level of performance that's expected for a game.
Now, where can your system make sense? Where the game loop is actually dynamic in its function - i.e. modding APIs. But this tends to be a thing you approach gradually and grudgingly, because modders aren't any better at solving concurrency bugs and they are less incentivized to play nice with other mods, so they will always default to hacking in something that stomps the state, creating intermittent race conditions. So in practice you are likely to just have specific feature points where an API can exist (e.g. add a new "on hit" behavior that conditionally changes the one-liner), and those might impose some generalized concurrency logic.
The other thing that might help is to have a language that actually understands that you want to do this decoupling and has the tooling built in to do constraint logic programming and enforce the "musts" and "cannots" at source level. I don't know of a language that really addresses this well for the use case of game loops - it entails having a whole general-purpose language already and then also this other feature. Big project.
I've been taking the approach instead of aiming to develop "little languages" that compose well for certain kinds of features - e.g. instead of programming a finite state machine by hand for each type of NPC, devise a subcategory of state machines that I could describe as a one-liner, with chunks of fixed-function behavior and a bit of programmability. Instead of a universal graphics system, have various programmable painter systems that can manipulate cursors or selections to describe an image. The concurrency stays mostly static, but the little languages drive the dynamic behavior, and because they are small, they are easy to provide some tooling for.
I think one should always be careful not to throw out the baby with the bathwater[0].
Do I force myself to follow every single crazy rule in Clean Code? Heck no. Some of them I don't agree with. But do I find myself to be a better coder because of what I learned from Bob Martin? Heck yes. Most of the points he makes are insightful and I apply them daily in my job.
Being a professional means learning from many sources and knowing that there's something to learn from each of them- and some things to ignore from each of them. It means trying the things the book recommends, and judging the results yourself.
So I'm going to keep recommending Clean Code to new developers, in the hopes that they can learn the good bits, and learn to ignore the bad bits. Because so far, I haven't found a book with more good bits (from my perspective) and fewer bad bits (from my perspective).
I'm completely with you here. Until I read Clean Code, I could never really figure out why my personal projects were so unreadable a year later but the code I read at work was so much better even though it was 8 years old. Sure, I probably took things too far for a while and made my functions too small, or my classes too small, or was too nitpicky on code reviews. But I started to actually think about where I should break a function. I realized that a good name could eliminate almost all the comments I had been writing before, leaving only comments that were actually needed. And as I learned how to break down my code, I was forced to learn how to use my IDE to navigate around. All of a sudden new files weren't a big deal, and that opened up a whole new set of changes that I could start making.
I see a lot of people in here acting like all the advice in Clean Code is obviously true or obviously false, and they claim to know how to write a better book. But, like you, I will continue to recommend Clean Code to new developers on my team. It's the fastest way (that I've found so far, though I see other suggestions in the comments here) to get someone to transition from writing "homework code" (that never has to be revisited) to writing maintainable code. Obviously, there are bad parts of Clean Code, but if that new dev is on my team, I'll talk through why certain parts are less useful than others.
Perfect. It's definitely my personal impression, but reading the post, it looks like the author was looking for a "one size fits all" book and was disappointed they did not find it.
And to be honest, that book will never exist. All knowledge contributes to growing as a professional; just make sure to understand it, discuss it, and use it (or not) for a real reason, not just because it's in book A or B.
It's not like people need to choose one book and follow it blindly for the rest of their lives. Read more books :D
In my opinion the problem is not that the rules are not one size fits all, but that they are so misguided that Martin himself couldn't come up with a piece of code where they would lead to a good result.
One mistake I think people like the author make is treating these books as some sort of bible that you must follow to the letter. People who evangelised TDD were the worst offenders of this. "You HAVE to do it like this, it's what the book says!"
You're not supposed to take it literally for every project, these are concepts that you need to adapt to your needs. In that sense I think the book still holds up.
For me this maps so clearly to the Dreyfus model of skill acquisition. Novices need strict rules to guide their behavior. Experts are able to use intuition they have developed. When something new comes along, everyone seems like a novice for a little while.
The Dreyfus model identifies 5 skill levels:
Novice
- Wants to achieve a goal, and is not particularly interested in learning.
- Requires context-free rules to follow.
- When something unexpected happens, will get stuck.
Advanced Beginner
- Beginning to break away from fixed rules.
- Can accomplish tasks on their own, but still has difficulty troubleshooting.
- Wants information fast.
Competent
- Has developed a conceptual model of the task environment.
- Able to troubleshoot.
- Beginning to solve novel problems.
- Seeks out and solves problems.
- Shows initiative and resourcefulness.
- May still have trouble determining which details to focus on when solving a problem.
Proficient
- Needs the big picture.
- Able to reflect on their approach in order to perform better next time.
- Learns from the experience of others.
- Applies maxims and patterns.
Expert
- A primary source of knowledge and information in the field.
- Constantly looks for better ways of doing things.
- Writes books and articles and does the lecture circuit.
- Works from intuition.
- Knows the difference between irrelevant and important details.
> A primary source of knowledge and information in the field. Constantly looks for better ways of doing things. Writes books and articles and does the lecture circuit.
Meh. I'm probably being picky, but it doesn't surprise me that a Thought Leader would put themselves and what they do as Thought Leader in the Expert category. I see them more as running along a parallel track. They write books and run consulting companies and speak at conferences and create a brand, and then there are those of us who get good at writing code because we do it every day, year after year. Kind of exactly the difference between sports commentators and athletes.
The problem is that the book presents things that are at best 60/40 issues as hard rules, which leads novices++ to follow them to the detriment of everything else.
Uncle Bob himself acts like it is a bible, so if you buy into the rest of his crap then you'll likely buy into that too.
If treated as guidelines, you are correct: Clean Code is only "eh" instead of garbage. But taken in the full context of how it is presented/intended to be taken by the author, it is damaging to the industry.
I've read his blog and watched his videos. While his attitude comes off as evangelical, his actual advice is very often "Do it when it makes sense", "There are exceptions - use engineering judgment", etc.
Yup. I see the book as a guide toward a general goal, not a specific objective that can be defined. Actually reaching that goal is sometimes completely impossible, and in many other cases it introduces too much complexity.
However, in most cases heading towards that goal is a beneficial thing--you just have to recognize when you're getting too close and bogging down in complying with every detail.
I still consider it the best programming book I've ever read.
I understand that the people who follow Clean Code religiously are annoying, but the author seems to be doing the same thing in reverse: because some advice is nuanced or doesn't apply all the time, we should stop recommending the book and forget it altogether.
I agree with the sentiment that you don't want to over abstract, but Bob doesn't suggest that (as far as I know). He suggests extract till you drop, meaning simplify your functions down to doing one thing and one thing only and then compose them together.
Hands down, one of the best bits I learned from Bob was the "your code should read like well-written prose." That has enabled me to write some seriously easy to maintain code.
That strikes me as being too vague to be of practical use. I suspect the worst programmers can convince themselves their code is akin to poetry, as bad programmers are almost by definition unable to tell the difference. (Thinking back to when I was learning programming, I'm sure that was true of me.) To be valuable, advice needs to be specific.
If you see a pattern of a junior developer committing unacceptably poor quality code, I doubt it would be productive to tell them Try to make it read more like prose. Instead you'd give more concrete advice, such as choosing good variable names, or the SOLID principles, or judicious use of comments, or sensible indentation.
Perhaps I'm missing something though. In what way was the code should read like well-written prose advice helpful to you?
Specifically in relation to naming. I was caught up in the dogma of "things need to be short" (e.g., using silly variable names like getConf instead of getWebpackConfig). The difference is subtle, but that combined with reading my code aloud to see if it reads like a sentence ("prose") is helpful.
"This module is going to generate a password reset token. First, it's going to make sure we have an emailAddress as an input, then it's going to generate a random string which I'll refer to as token, and then I want to set that token on the user with this email address."
I'm in the "code should read like poetry" camp. Poetry is the act of conveying meaning that isn't completely semantic - meter and rhyme being the primary examples. In code, that can mean maintaining a cadence of variable names, use of whitespace that helps illuminate structure, or writing blocks or classes where the appearance of the code itself has some mapping to what it does. You can kludge a solution together, or craft a context in which the suchness of what you are trying to convey becomes clear in a narrative climax.
Code should be simple and tight and small. It should also, however, strive for an eighth grade reading level.
You shouldn't try to make your classes so small that you're abusing something like nested ternary operators which are difficult to read. You shouldn't try to break up your concepts so much that while the sentences are easy, the meaning of the whole class becomes muddled. You should stick with concepts everyone knows and not try to invent your own domain specific language in every class.
Less code is always more, right up until it becomes difficult to read; then you've gone too far. On the other hand, if you extract a helper method from a method which read fine to begin with, then you've made the code harder to read, not easier, because it's now bigger, with an extra concept. But if that was a horrible conditional with four clauses which you can express with a "NeedsFrobbing" method and a comment about it, then carry on (generating four methods from that conditional to "simplify" it is usually worse, though, due to the introduction of four concepts that could often be better addressed with just some judicious whitespace to separate them).
And I need to learn how to write in English more like Hemingway, particularly before I've digested coffee. That last paragraph got away from me a bit.
Absolutely this. Code should tell a story, the functions and objects you use are defined by the context of the story at that level of description. If you have to translate between low-level operations to reconstruct the high level behavior of some unit of work, you are missing some opportunities for useful abstractions.
Natural language is our facility for managing complexity generally, and coding at scale is largely an exercise in managing complexity, so it shouldn't be surprising that the two are mutually beneficial.
I tried to write code with small functions and was dissuaded from doing that at both my teams over the past few years. The reason is that it can be hard to follow the logic if it's spread out among several functions. Jumping back and forth breaks your flow of thought.
I think the best compromise is small summary comments at various points of functions that "hold the entire thought".
The point of isolating abstractions is that you don't have to jump back and forth. You look at a function, and from its contract and calling convention you immediately know what it does. The specific details aren't relevant for the layer of abstraction you're looking at.
Because of well structured abstractions, thoughtful naming conventions, documentation where required, and extensive testing you trust that the function does what it says. If I'm looking at a function like commitPending(), I simply see writeToDisk() and move on. I'm in the object representation layer, and jumping down into the details of the I/O layers breaks flow by moving to a different level of abstraction. The point is I trust writeToDisk() behaves reasonably and safely, and I don't need to inspect its contents, and definitely don't want to inline its code.
If you find that you frequently need to jump down the tree from sub-routine to sub-routine to understand the high level code, then that's a definite code smell. Most likely something is fundamentally broken in your abstraction model.
Check out the try/catch and logging pattern I use in the linked post. I added that specifically so I could identify where errors were occurring without having to guess.
When I get the error in the console/browser, the path to the error is included for me, like "[generatePasswordResetToken.setTokenOnUser] Must pass value to $set to perform an update."
With that, I know exactly where the error is occurring and can jump straight into debugging it.
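The pattern itself is tiny; a hypothetical TypeScript sketch of the wrapper (not the actual code from the post):

```typescript
// Each step re-throws with a "[module.step]" path so the console message
// pinpoints where the failure happened.
function tagged<T>(path: string, step: () => T): T {
  try {
    return step();
  } catch (err) {
    throw new Error(`[${path}] ${(err as Error).message}`);
  }
}

// Inside the module:
//   tagged("generatePasswordResetToken.setTokenOnUser", () =>
//     setTokenOnUser(emailAddress, token));
```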
Nice! However, none of this is required for this endpoint. Here's why:
1. The connect action could be replaced by doing the connection once on app startup.
2. The validation could be replaced with middleware like express-joi.
3. The stripe/email steps should be asynchronous (ex: simple crons). This way, you create the user and that's it. If Stripe is down, or the email provider is down, you still create the user. If the server restarts while someone calls the endpoint, you don't end up with a user with invalid Stripe config. You just create a user with stripeSetup=false and welcomeEmailSent=false and have some crons that query for these users every 5 seconds and do their work. Also, ensure you query for false and not "not equal to true" here, as the latter is not efficient.
Off topic but is connecting to Mongo on every API hit best practice? I abstract my connection to a module and keep that open for the life of the application.
Yes, that one did a lot for me too. Especially when business logic gets complicated, I want to be able to skip parts by roughly reading the meaning of a section without seeing the details.
One long stream of commands is OK to read if you are the author or already know what it should do. But otherwise it forces you to read too many irrelevant details on the way toward what you need.
Robert Martin and his aura always struck me as odd, in part because of how revered he always was at organizations where I worked. Senior developers would use his work to end arguments, and many code review discussions would be judged by how closely they adhered to Clean Code.
Of course, reading Clean Code left me more confused than enlightened, due precisely to what he presents as good examples of code. The author of the article really does hit the nail on the head about Martin's code style - it's borderline unreadable a lot of the time.
Who the f. even is Robert Martin?! What has he built? As far as I am able to see he is famous and revered because he is famous and revered.
He ran a consultancy and knew how to pump out books into a world of programmers that wanted books.
I was around from the early 90s through to the early 2000s, when a lot of the ideas came about and slowly got morphed by consultants who were selling this stuff to companies as essentially "religion". The nuanced thinking of a lot of the people who had most of the original core ideas is mostly lost.
It's a tricky situation. At the core of things, there are some really good ideas, but the messaging by people like "uncle bob" seems to fail to communicate the mindset in a way that develops thinking programmers. Mainly because he, and people like Ron Jeffries, really didn't actually build anything serious once they became consultants and started giving out all these edicts. If you watched them on forums/blogs at the time, they were really not that good. There were lots of people who were building real things and had great perspectives, but their nuanced perspectives were never really captured into books - and it would be hard to, as it is more about the mentality of using ideas and principles, making good pragmatic choices, adapting things, and not being limited by "rules", but about incorporating the essence of the ideas into your thinking processes.
So many of those people walked away from a lot of those communities when it morphed into "Agile" and started being dominated by the consultants.
10 or so years ago, when I first got into development, I looked to people like Martin for how I should write code.
But I had more and more difficulty reconciling bizarrely optimistic patterns with reality. This from the article perfectly sums it up:
>Martin says that functions should not be large enough to hold nested control structures (conditionals and loops); equivalently, they should not be indented to more than two levels.
Back then as now I could not understand how one person can make such confident and unambiguous statements about business logic across the spectrum of use cases and applications.
It's one thing to say how something should be written in ideal circumstances, it's another to essentially say code is spaghetti garbage because it doesn't precisely align to a very specific dogma.
This is the point that I have the most trouble understanding in critiques of Fowler, Bob, and all writers who write about coding: in my reading, I had always assumed that they were writing about the perfect-world ideal that needs to be balanced with real-world situations. There's a certain level of bluster and over-confidence required in that type of technical writing that I understood to be a necessary evil in order to get points across. After all, a book full of qualifications will fail to inspire confidence in its own advice.
This is true only for people first coming to development. If you're just starting your journey, you are likely looking for quantifiable absolutes as to what is good and what isn't.
After you're a bit more seasoned, I think qualified comments are probably far more welcome than absolutes.
> After all, a book full of qualifications will fail to inspire confidence in its own advice.
I don't think that's true at all. One of the old 'erlang bibles' is "Learn You Some Erlang", and it is full of qualifications titled "don't drink the kool-aid" (notably not there in the Haskell inspiration for the book). It does not fail to inspire confidence to have qualifications scattered throughout; to me it actually gives MORE confidence that the content is applicable and the tradeoffs are worth it.
> Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program.
(That's from the Linux kernel coding style document.)
The big problem that I have with Clean Code -- and with its sequel, Clean Architecture -- is that for its most zealous proponents, it has ceased to be a means to an end and has instead become an end in itself. So they'll justify their approach by citing one or other of the SOLID principles, but they won't explain what benefit that particular SOLID principle is going to offer them in that particular case.
The point that I make about patterns and practices in programming is that they need to justify their existence in terms of value that they provide to the end user, to the customer, or to the business. If they can't provide clear evidence that they actually provide those benefits, or if they only provide benefits that the business isn't asking for, then they're just wasting time and money.
One example that Uncle Bob Martin hammers home a lot is separation of concerns. Separation of concerns can make your code a lot easier to read and maintain if it's done right -- unit testing is one good example here. But when it ceases to be a means to an end and becomes an end in itself, or when it tries to solve problems that the business isn't asking for, it degenerates into speculative generality. That's why you'll find project after project after project after project after project with cumbersome and obstructive data access layers just because you "might" want to swap out your database for some unknown mystery alternative some day.
I don’t disagree with the overall message or choice of examples behind this post, but one paragraph stuck out to me:
> Martin says that it should be possible to read a single source file from top to bottom as narrative, with the level of abstraction in each function descending as we read on, each function calling out to others further down. This is far from universally relevant. Many source files, I would even say most source files, cannot be neatly hierarchised in this way.
The relevance is a fair criticism but most programs in most languages can in fact be hierarchized this way, with the small number of mutually interdependent code safely separated. Many functional languages actually enforce this.
As an F# developer it can be very painful to read C# programs even though I often find C# files very elegant and readable: it just seems like a book, presented out of order, and without page numbers. Whereas an .fsproj file provides a robust reading order.
> "with the level of abstraction in each function descending as we read on, each function calling out to others further down." ...
> Many functional languages actually enforce this.
Don't they enforce the opposite? In ML languages (I don't know F# but I thought it was an ML dialect), you can generally only call functions that were defined previously.
Of course, having a clear hierarchy is nice whether it goes from most to least abstract, or the other way around, but I think Martin is recommending the opposite from what you are used to.
Hmm, perhaps I am misreading this? Your understanding of ML languages is correct. I have always found “Uncle Bob” condescending and obnoxious so I can’t speak to the actual source material.
I am putting more emphasis on the “reading top-to-bottom” aspect and less on the level of abstraction itself (might be why I’m misreading it). My understanding was that Bob sez a function shouldn’t call any “helper” functions until the helpers have been defined - if it did, you wouldn’t be able to “read” it. But with your comment, maybe he meant that you should define your lower-level functions as prototypes, implement the higher-level functions completely, then fill in the details for the lower functions at the bottom. Which is situationally useful but yeah, overkill as a hard rule.
In ML and F# you can certainly call interfaces before providing an implementation, as long as you define the interface first. Whereas in C# you can define the interface last and call it all you want beforehand. This is what I find confusing, to the point of being bad practice in most cases.
So even if I misread specifically what (the post said) Bob was saying, I think the overall idea is what Bob had in mind.
> In ML languages, you can generally only call functions that were defined previously.
Hum... At least not in Haskell.
Starting with the most dependent code makes a large difference in readability. It's much better to open your file and see the overall functions first. The alternative is browsing to find them, even when they're at the bottom. Since you read functions from top to bottom, locating the bottom of a function isn't much help in reading it.
1 - The dependency order does not imply any ordering in abstraction. Both can change in opposite directions just as well as in the same one.
We follow this approach closely - the problem is that people confuse helper services for first-order services and call them directly leading to confusion. I don't know how to avoid this without moving the "main" service to a separate project and having `internal` helper services. DI for class libraries in .NET Core is also hacky if you don't want to import every single service explicitly.
Is there a reason why private/internal qualifiers aren’t sufficient? Possibly within the same namespace / partial class if you want to break it up?
As I type this out, I suppose “people don’t use access modifiers when they should” is a defensible reason.... I also think the InternalsVisibleTo attribute should be used more widely for testing.
> But mixed into the chapter there are more questionable assertions. Martin says that Boolean flag arguments are bad practice, which I agree with, because an unadorned true or false in source code is opaque and unclear versus an explicit IS_SUITE or IS_NOT_SUITE... but Martin's reasoning is rather that a Boolean argument means that a function does more than one thing, which it shouldn't.
I see how this can be polemic because most code is littered w/ flags, but I tend to agree that boolean flags can be an anti-pattern (even though it's apparently idiomatic in some languages).
Usually the flag is there to introduce a branching condition (effectively breaking "a function should do one thing") but doesn't carry any semantics of its own. I find the same can be achieved w/ polymorphism and/or pattern-matching, the benefit being that now your behaviour is part of the data model (the first argument), which is easier to reason about, document, and extend to new cases (you don't need to keep passing flags down the call chain).
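A hedged TypeScript sketch of what that can look like with a discriminated union (names invented):

```typescript
// The variant is part of the data model, so the behaviour can be
// pattern-matched instead of flag-checked.
type Viewer =
  | { kind: "admin"; auditLog: string[] }
  | { kind: "guest" };

function render(viewer: Viewer, page: string): string {
  switch (viewer.kind) {
    // the "branch" now lives with the data, not as an opaque boolean
    case "admin":
      viewer.auditLog.push(`viewed ${page}`);
      return `${page} [admin tools]`;
    case "guest":
      return page;
  }
}
```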
As with anything, I don't think we can say "I recommend / don't recommend X book"; all knowledge and experience is useful. Just use your judgment and don't treat programming books as holy books.
> Usually the flag is there to introduce a branching condition (effectively breaking "a function should do one thing")...
But if you don't let the function branch, then the parent function is going to have to decide which of two different functions to call. Which is going to require the parent function to branch. Sooner or later, someone has to branch. Put the branch where it makes the most sense, that is, where the logical "one-ness" of the function is preserved even with the branch.
> I find the same can be achieved w/ polymorphism and/or pattern-matching, the benefit being now your behaviour is part of the data model (the first argument) which is easier to reason about, document, and extend to new cases (don't need to keep passing flags down the call chain).
You just moved the branch. Polymorphism means that you moved the branch to the point of construction of the object. (And that's a perfectly fine way to do it, in some cases. It's a horrible way to try to deal with all branches, though.) Pattern-matching means that you moved the branch to when you created the data. (Again, that can be a perfectly fine way to do it, in some cases.)
> As with anything, I don't think we can say "I recommend / don't recommend X book"; all knowledge and experience is useful. Just use your judgment and don't treat programming books as holy books.
People don't want to go through the trouble of reading several opposing points of view and synthesize that using their own personal experience. They want to have a book tell them everything they need to do and follow that blindly, and if that ever bites them back then that book was clearly trash. This is the POV the article seems to be written from IMHO.
As far as the boolean flag argument goes, I've seen it justified in terms of data-oriented design, where you want to lift your data dependencies to the top level as much as possible. If a function branches on some argument, and further up the stack that argument is constant, maybe you didn't need that branch at all if only you could invoke the right logic directly.
Notably, this argument has very little to do with readability. I do prefer consolidating data and extracting data dependencies -- I think it makes it easier to get a big-picture view, as in Brooks' "Show me your spreadsheets" -- but this argument is rooted specifically in not making the machine do redundant work.
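A minimal sketch of that idea (TypeScript, hypothetical names): when the flag is constant higher up, select the logic once instead of re-branching on every call:

```typescript
function sumAbs(xs: number[]): number {
  return xs.reduce((acc, x) => acc + Math.abs(x), 0);
}

function sumPlain(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0);
}

// The branch happens once, where the flag is known, not once per batch:
function makeSummer(useAbs: boolean): (xs: number[]) => number {
  return useAbs ? sumAbs : sumPlain;
}
```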
> This is done as part of an overall lesson in the virtue of inventing a new domain-specific testing language for your tests. I was left so confused by this suggestion. I would use exactly the same code to demonstrate exactly the opposite lesson. Don't do this!
This example (code is in the article) was very telling of the book author's core philosophy.
Best I can tell, the OOP movement of the 2000s (I wasn't a professional in 2008, though I was learning Java at the time) was at its heart rooted in the idea that abstractions are nearly always a win; the very idealistic perspective that anything you can possibly give a name to, should be given a name. That programmers down the line will thank you for handing them a named entity instead of perhaps even a single line of underlying code.
This philosophy greatly over-estimates the value, and greatly under-estimates the cost, of idea-creation. I don't just write some code, I create an idea, and then I write a bit of code as implementation details for it. This is a very tantalizing vision of development: all messy details are hidden away, what we're left with is a beautiful constellation of ideas in their purest form.
The problem is that when someone else has to try and make sense of your code, they first have to internalize all of your ideas, instead of just reading the code itself which may be calling out to something they already understand. It is the opposite of self-documenting code: it's code that requires its own glossary in addition to the usual documentation. "wayTooCold()" may read more naturally to the person who wrote it, but there's a fallacy where they assume that that also applies to other minds that come along and read it later.
Establishing a whole new concept with its own terminology in your code is costly. It has to be done with great care and only when absolutely necessary, and then documented thoroughly. I think as an industry we have more awareness of this nowadays. We don't just make UML diagrams and kick them across the fence for all those mundane "implementation details" to be written.
This thread is full of people saying what's wrong with the book without posing alternatives. I get that it's dogmatic, but do people seriously take it as gospel? I'd read it along with other things. Parts are great and others are not. It's not terrible.
I agree. Trying to apply the lessons in there leads to code that is more difficult to read and reason about. Making it "read like a book" and keeping functions short sound good on the surface but they lead to lines getting taken up entirely by function names and a nightmare of tracking call after call after call.
It's been years since I read the book, and I'm still having trouble with the bad ideas from it, because they've stuck with me so well that I feel like I'm doing things wrong if I don't follow its guidelines. Sometimes I'll actually write something in a sensible way, change it to the Clean Code way, and then revert it back to where it was when I realize my own code is confusing me when written like that.
This isn't just a Robert C Martin issue. It's a cultural issue. People need to stop shaming others if their code doesn't align with Clean Code. People need to stop preaching from the book.
I make my code "read like a book" with a line comment for each algorithmic step inside a function, and adding line-ending comments to clarify. So functions are just containers of steps designed to reduce repetition, increase visibility, and minimize data passing and globals.
I recently read this cover to cover and left a negative review on Amazon. I'm happy to see I'm not the only one, and this goes into it in a whole lot more detail.
The author seems like they took a set of rules that are good for breaking beginning programmers' bad habits and then applied them to the extreme. There are a whole lot of rules which aren't bad, right up until you try to apply them like laws of gravity that must always be followed. Breaking up big clunky methods that do way too much is great for readability, right up until you're spraying one-line helper methods all over your classes and making them harder to read, because now you're inventing your own domain-specific language everyone has to learn (often with the wrong abstractions, which get extended through the years and wind up needing a massive refactoring down the road that would have been simpler with fewer methods and abstractions involved at the start).
A whole lot of my job is taking classes, un-DRY'ing them completely so there's duplication all over the place, then extracting the right (or at least more correct) abstractions to make the whole thing simple and readable and tight.
My biggest gripe: Functions shouldn't be short, they should be of appropriate size. They should contain all the logic that isn't supposed to be exposed to the outside for someone else to call. If that means your function is 3000 lines long, so be it.
Realize that your whole program is effectively one big function and you achieve nothing by scattering its guts out into individual sub-functions just to make the pieces smaller.
If something is too convoluted and does too much, or has too much redundancy, you'll know, because it'll cause problems. It'll bother you. You shouldn't pre-empt this case by just writing small functions by default. That'll just cause its own problems.
This is an interesting article because as I was reading Martin's suggestions I agreed with every single one of them. 5 lines of code per function is ideal. Non-nested whenever possible. Don't mix query/pure and commands/impure. Then I got to the code examples and they were dreadful. Those member variables should be readonly.
Using Martin's suggestions with Functional Hexagonal Architecture would lead to beautiful code. I know, because that's what I've been writing for the past 3 years.
Great! While we're on it, can we retire the gang of four as well? I mean, the authors are obviously great software engineers, and the Patterns have helped to design, build, and most importantly read, a lot of software. But as we move forward, more and more of these goals can be achieved much more elegantly and sustainably with new languages and more functional approaches. Personally, I find re-teaching junior programmers, who are still trying to make everything into a class, very tiring.
I don’t understand the amount of hate that Clean Code gets these days…it’s a relatively straightforward set of principles that can help you create a software system maintainable by humans for a very long time. Of course it’s not an engineering utopia, there’s no such thing.
I get the impression that it's about the messengers and not the messages, and that people have had horrible learning experiences that have calcified into resistance to anything to do with "clean". But valuable insights are being lost, and they will have to be re-learned in a new guise at a later date.
Development trends are cyclical and even the most sound principle has an exception. Even if something is good advice 99% of the time, it will eventually be criticized with that 1% of the time being used as a counter.
For me Clean Code is not about slavishly adhering to the rules therein, but about guidelines to help make your code better if you follow them, in most circumstances. On his blog Bob Martin himself says about livable code vs pristinely clean code: "Does this rule apply to code? It absolutely does! When I write code I fight very hard to keep it clean. But there are also little places where I break the rules specifically because those breakages create affordances for transient issues."
I've found the Clean Code guidelines very useful. Your team's mileage may vary. As always: use what works, toss the rest, give back where you can.
I never recommended Clean Code, but I became a strong advocate against it on teams that I lead after reading opinions by Bob Martin such as this one: https://blog.cleancoder.com/uncle-bob/2017/01/11/TheDarkPath.... That whole article reads as someone who is stuck in their old ways and inflexible, and who, given their large soapbox, tries to justify their discomfort and frustration. I consider Swift, Kotlin (and Rust) to be among the most important language evolutions, ones that dramatically improved software quality on the projects I've worked on.
I've seen so many real world counter-examples to arguments made in that article and his other blog posts that I'm puzzled why this guy has such a large and devoted following.
Actually, I found the post you linked to fairly logical. He’s saying that humans are inherently lazy, and that a language that gives us the option between being diligent (strong types) or being reckless (opt-out of strong types) will lead to the worst form of recklessness: opting out while not writing tests, giving the misimpression of safety.
His point is that you can't practically force programmers to be diligent through safety features of a language itself, since edge cases require escape hatches from those safety features, and those escape hatches will be exploited by our natural tendency to avoid "punishment".
I’m not sure I agree with his point, but I don’t find it an unreasonable position. I’d be curious if Rust has escape hatches that are easily and often abused.
My favorite example here, and a counterpoint to Bob, is React's dangerouslySetInnerHTML attribute. I haven't seen it used in years, and perhaps it has been removed at some point. But it made the escape hatch really painful to use. And so the pain of using the escape hatch made it less painful to actually write React in the right manner. Coming from Angular, I think I struggled at first with thinking I had to write some dangerous HTML, but over time I forgot the choice of writing poor React code even existed.
So I guess I disagree with Bob’s post here. It is possible to have safety features in languages that are less painful than the escape-hatches from those safety features. And no suite of tests will ever be as powerful as those built-in safety features.
He actually misunderstands and mischaracterizes the features of the languages he complains about. These features remove the need for a developer to keep track of invariants in their code, so they should be embraced and welcomed by lazy developers, who no longer have to simulate the boring parts of the code in their head to make sure it works. The "if it type-checks, then it works" philosophy really goes a long way toward relieving a developer's stress.
For example, if I'm using C or Java I have to take into account that every pointer or reference can be null, at every place where they are used. I should write null checks (or error checks, say, from opening a file handle), but I usually don't, because I'm lazy, or I forget, or it's hard to keep track of all possible error conditions. So I'm stressed during a release because I can't predict the input that may crash my code.
In a language like Swift I am forced to do a null or an error check once in the code, and for that small effort the compiler will guarantee I will never have to worry about these error conditions again. This type system means I can refactor code drastically and with confidence, and I don't have to spend time worrying about all code paths to see if one of them would result in an unexpected null reference. On a professional development team, it should be a no-brainer to adopt a new technology to eliminate all null-reference exceptions at runtime, or use a language to setup guarantees that will hold under all conditions and in the future evolution of the code.
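Java can only approximate this with library types rather than compiler enforcement, but for contrast, a rough sketch of the "check once, then it's guaranteed" shape using Optional (names hypothetical):
    import java.util.Optional;

    class Lookup {
        // Hypothetical lookup: the signature itself says the value may be absent.
        static Optional<String> hostFor(String key) {
            return "prod".equals(key) ? Optional.of("example.com") : Optional.empty();
        }

        public static void main(String[] args) {
            // Handle the absent case once; after this line `host` is a plain,
            // non-null String for the rest of the scope.
            String host = hostFor("prod").orElse("localhost");
            System.out.println(host.length()); // no null check needed here
        }
    }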
Worse than that, he sets up a patronizing and misguided mental image of a common developer who, he imagines, will use a language with type safety just to disable and abuse all the safety features. Nobody does that, in my experience of professional Swift, Kotlin or Rust development.
He advocates for unit tests only, above all else. That is also painfully misguided: a test tells you only that the code passes for one given example of input. In comparison, a good type system guarantees that your code will work for ALL values of a given type. Of course type systems can't express all invariants, so there is a need for both approaches. But that lack of nuance and plain bad advice turned me into an anti-Uncle-Bob advocate.
I find that these two books are in many recommended lists, but I found them entirely forgettable, entirely too long, and without any "meat."
So much of the advice given is trivial things you'll just figure out in the first few months of coding professionally.
Code for more than a week and you'll figure out how to name classes, how to use variables, how scope works, etc.
The code examples are only in C++, Java, and Visual Basic (ha!). Completely ignoring non-OO and dynamic languages.
Some of the advice is just bad (like prefixing global variables with g_) or incredibly outdated (like, "avoid goto"? Thanks 1968!).
Work on a single software project, or any problem ever, and you'll know that you need to define the problem first. It's not exactly sage advice.
These are cherry-picked examples, but overall Code Complete manages to be too large, going into overly specific detail in some areas while giving vague advice in others.
All books are written in a time and a few become timeless. Software books have an especially short half-life. I think Code Complete was a book Software Engineering needed in 2004, but has since dropped in value.
I will say, that Code Complete does have utility as a way to prop up your monitor for increased ergonomics, which is something you should never ignore.
I have similar issues with Clean Code. One is better off just googling "SOLID Principles", then programming using small interfaces more often and using fewer subclasses.
A better alternative is (from above) The Pragmatic Programmer (2019), plus a good style guide, and/or getting your code reviewed by literally anyone.
Another thing Martin advocates for is not putting your name in comments, e.g. "Fixed a bug here; there could still be problematic interactions with subsystem foo -- ericb". He says, "Source control systems are very good at remembering who added what, when." (p. 68, 2009 edition)
Rubbish! Multiple times I've had to track down the original author of code that was auto-refactored, reformatted, changed locations, changed source control, etc. "git blame" and such are useless in these cases; it ends up being a game of Whodunit that involves hours of search, Slack pings, and what not. Just put your name in the comment, if it's documenting something substantial and is the result of your own research and struggle. And if you're in such a position, allow and encourage your colleagues to do this too.
Better to put an explanation there thorough enough that your name isn't needed any more. Because if it is your name that makes the difference, chances are you will have left the company by the time someone comes across that comment and needs access to your brain.
Sometimes what is interesting is that you have found that another engineer—whom you might not know in a large enough organization—has put time and thought into the code you are working on, and you can be a lot more efficient and less likely to break things if you can talk to that engineer first. It's not always the comment itself.
Or have gotten busy since. I worked at Coinbase in 2019 and saw a comment at the top of a file saying that something should probably be changed. I git-blamed and saw it was written six years earlier by Brian Armstrong.
I think most of what Martin says is rubbish, but this is not. I have never had `git blame` fail... ever. I know what user is responsible for every line of code, and that record is made contemporaneously. Putting your name in a comment is right up there with commenting out blocks of code so you don't lose them.
I don't know what to say, this is a real problem I have encountered in actual production code multiple times. Any code that lives longer than your company's source control of choice, style of choice, or code structure of choice is vulnerable. Moreover, what's the harm? It's more information, not less.
The parent's comment holds when reformatting, especially in languages with suspect formatting practices like Go, where visibility rules are dictated by the case of an identifier's first letter (wat?), or where the formatter aligns groups of constants or struct fields depending on the length of the longest field name. That ends up producing completely unnecessary changes that divert attention from the main diff.
What's the downside of adding a few extra characters!?
Of course, this view is already available to people via `git blame`, and it works the same for comments, so there is no need.
The exception is "notes to future self" during the development of a feature (to be removed before review), in which case the most useful place for them to appear is at the _start_ of the comment with a marker:
// TODO(jen20): also implement function X for type Y
I think your comment is controversial, for a number of reasons. One, I think nobody should own code. Code should be obvious, tested, documented and reviewed (bringing the number of people involved to at least two), and the story behind it should live either in the commit messages or in a reference to e.g. a task management system. Code ownership just creates islands.
I mean by all means assign a "domain expert" to a PART of your code, but no individual segment of code should belong to anyone.
Second: There's something to be said about avoiding churn. Everybody loves refactoring and rewriting code, present company included, but it muddles the version control waters. I've seen a few github projects where the guidelines stated not to create PRs for minor refactorings, because they create churn and version control noise.
Anyway, that's all "ideal world" thinking, I know in practice it doesn't work like that.
Either the code is recent (in which case 'git blame' works better since someone changing a few characters may or may not decide to add their name to the file) or it's old and the author has either left the company or has forgotten practically everything about the code.
But sometimes it is bad, and not fixable within the author's control. I occasionally leave author notes, as a shortcut. If I'm no longer here, yeah, you've got to figure it all out the hard way. But if I am, I can probably save you a week, maybe a month. And obviously if it's something you can succinctly describe, you'd just leave a comment. This is the domain of "based on being here a few years, on a few teams, and across three services between this one, a few migrations, etc.". Some business problems have a lot of baggage that isn't easily documented or described; that's the hard thing about professional development, especially in a changing business. There are also cases where I _didn't_ author the code, but did purposefully not change something that looks like it should be changed. In those cases, without my name in a comment, git blame wouldn't point you to me. YMMV.
A 1000 times this. We never use git blame - who cares? The code should be self-explanatory, and if it's not, the author doesn't remember why they did it 5 years down the line either.
> First, the class name, SetupTeardownIncluder, is dreadful. It is, at least, a noun phrase, as all class names should be. But it's a nouned verb phrase, the strangled kind of class name you invariably get when you're working in strictly object-oriented code, where everything has to be a class, but sometimes the thing you really need is just one simple gosh-danged function.
Moving from Java as my only language to JavaScript and Rust, this point was driven home in spades. A programming language can be dysfunctional, causing its users to implement harmful practices.
SetupTeardownIncluder is a good example of the kind of code you get when there are no free-standing functions. It's also one path on the slippery slope to FactoryFactoryManager code.
The main problem is that the intent of the code isn't even clear. Compare it with something you might write in Rust:
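Something along these lines (a sketch; the types are hypothetical):
    // render.rs -- the unit of work is just a function, not a class.
    pub struct PageData { pub content: String }
    pub struct RenderError;

    pub fn render(page_data: PageData) -> Result<String, RenderError> {
        // Decorate the raw page content into a full HTML page.
        Ok(format!("<html>{}</html>", page_data.content))
    }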
If you saw that function at the top of a file, or if you saw render.rs in a directory listing, you'd have a pretty good idea of what's going on before you even dug into the code.
Just randomly searching the FitNesse repo, there's this:
    // Copyright (C) 2003,2004,2005 by Object Mentor, Inc. All rights reserved.
    // Released under the terms of the GNU General Public License version 2 or later.
    package fitnesse.html;

    public class HtmlPageFactory
    {
        public HtmlPage newPage()
        {
            return new HtmlPage();
        }

        public String toString()
        {
            return getClass().getName();
        }
    }
For me, all this kind of stuff exists only to sell books, conferences and consulting services, and it becomes a big headache when working in teams whose architects have bought too much into it.
The problem is not really with this book IMHO. Most of its advice and guidelines are perfectly sensible, at least for its intended domain.
The problem is people applying principles dogmatically without seeing the larger picture or considering the context and purpose of the rules in the first place.
This book, or any book, cannot be blamed for people applying its advice blindly. But it is a pervasive problem in the industry. It runs much deeper than any particular book. I suspect it has something to do with how CS education typically happens, but I'm not sure.
Why is it that software engineering is so against comments?
I know nothing of clean code. When I read the link, I assumed that clean code meant very simple and well commented code. I hit cmd+f # and nothing came up. Not one comment saying "this function is an example of this" or "note the use of this line here, it does this" etc, on a blog no less where you'd expect to see these things. The type of stuff I put in my own code, even the code that only I read, because in two weeks I'm going to forget everything unless I write full sentences to paragraph comments, and spend way more time trying to get back in the zone than the time it took me to write those super descriptive comments in the first place.
I hate looking at other people's scripts because, once again, they never write comments. Practically ever. What they do write is often entirely useless, to the point where they shouldn't have even bothered writing those two words or whatever. Most people's code is just keyboard diarrhea of syntax and regex and patterns that you can't exactly google for, written assuming whoever is looking at the code has the exact same knowledge base as the author and knows everything that went into the script. Maybe it's a side effect of CS major training, where you don't write comments on your homework because the grader is going to know what is happening. Stop doing that with your code and actually make a write-up to save others (and often yourself) mountains of time and effort.
> Why is it that software engineering is so against comments?
Good question. Funny thing is, I worked for a company that mandated that every method be documented, which gets you a whole bunch of "The method GetThing(name) gets a Thing, and the argument 'name' is the name of the Thing". Plus 4 lines of Doxygen boilerplate. Oof.
Of course, I've seen my share of uncommented, unreadable code. And also code with some scattered comments that have become so mangled over 10 years of careless maintenance and haphazard copy-pasting that their existence is detrimental. Many of the comments I come across that might be useful are incoherent ungrammatical ramblings. In large projects, often some of the engineers aren't native English speakers.
My point being that writing consistently useful comments (and readable, properly organized code) is hard. Very, very hard. It requires written communication skills that only a small percentage of engineers (or even humans in general) are capable of. And the demand for software engineers is too high to filter out people who aren't great at writing. So I guess many people just try to work around the problem instead.
There's something bad about going over the top halfway. Those sorts of strict rules that everyone follows half-assed are so common on software teams (and in the rest of business and society, but whatevs). It seems like they have all the downsides of both strictness and laxness. It would work better if you just let devs do their thing. It would also work better if you went all the way insane, where the first time you write some garbage docstring like that, the CTO comes to your desk and tells you off. I'm not saying that would be the right move in this case, but at least it's something.
One reason is that comments get stale. People need to maintain them but probably won't. Second reason is that they think the code should be self-documenting. If it's not then you just need better names and code structure. Many books like clean code advocate this approach, and that's where I first learned the idea of don't write comments as well.
Personally now I've held both sides of the argument at different times. I think in the end it's a trade-off. There's no hard and fast rule, you need to use your best judgement about what's going to be easiest for the reader and also for the people maintaining the code. Usually I just try to strike a balance that I think my coworkers won't complain about. The other thing I've realized that makes this tricky is that people will almost always overestimate their (or others) commitment to maintaining comments, and/or overestimate how "self-documenting" their code is.
It's also probably time to stop recommending TDD, object-oriented programming, and a host of other anti-patterns in software development, and get serious about treating it like a real engineering profession instead of a circus of personality- and company-driven paradigms.
It is interesting that he uses a FitNesse example.
Years ago we started using FitNesse at a place I was working, and we needed something that was not included; I think it was being able to make a table of HTTP basic auth tests/requests.
The code base seemed large and complex at first, but I was able to very quickly add this feature with minimal changes and was pretty confident it was correct. Also, I had little experience in Java at the time. All in all it was a pretty big success.
Interesting is probably the wrong word. I should say interesting to me, because I had a different experience with it. And it was not any sort of theoretical analysis; it was a feature I needed to get done.
Like everything else: it's fine in moderation. Newbies should practice clean code, everybody else gets to make their own decisions. Treating anything as dogma whether it is Clean Code, TDD, Agile or whatever is the flavor of madness of the day is going to lead to suboptimal outcomes. And they're also a great way to get rid of your most productive and knowledgeable team members.
So apply with caution and care and you'll be fine.
There's a word, in other comments, that I expected to find: zealots. Zealots aren't sufficiently critical, and they don't want to think for themselves; a reasonable person should be able to, and a professional should be constantly itching to, step back, look at code, and decide whether some refactoring or rewriting is an improvement, taking a book like Clean Code as a source of general principles and good examples, not of rules.
All the "bad" examples discussed in the article are rather context dependent, representing uninspired examples or extreme tastes in the book rather than bad or obsolete ideas.
Shredding medium length meaningful operations into many very small and quite ad hoc functions can reduce redundancy at the expense of readability, which might or might not be an improvement; a little DSL that looks silly if used in a couple of test cases can be readable and efficient if more extensive usage makes it familiar; a function with boolean arguments can be an accretion of special cases, mature for refactoring or a respectable way to organize otherwise repetitive code.
Most of these types of books approach things from the wrong direction. Any recommendation should look at the way well designed, maintainable systems are actually written and draw their conclusions from there. Otherwise you allow too much theorizing to sneak in. Lots of good options to choose from and everyone will have their own pet projects, but something like SQLite is probably exemplary of what a small development team could aim for, Postgres or some sort of game engine would maybe be good for a larger example (maybe some of the big open source projects from major web companies would be better, I don't know).
There are books that have done something like this[0], but they are a bit high level. There is room for something at a lower level.
"Promote I/O to management (where it can't do any damage)" is the actionably good thing i've taken from Brandon Rhoades' talk based on this: https://www.youtube.com/watch?v=DJtef410XaM
Living in a world where people regularly write single functions that 1. load data from a hardcoded string path of a file location, 2. do all the analysis inside the same loop that iterates over the file contents, and 3. plot the results... that cleavage plane is a meaningfully good one.
The rest of the ideas fall into "all things in moderation, including moderation", and can and should be special-cased judiciously as long as you know what you're doing. But oh god please can we stop writing _that_ function already.
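A minimal sketch of that cleavage plane in Java (names hypothetical): the analysis is a pure function, and the file reading and output sit at the edge where they can't do any damage.
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    class Report {
        // Pure core: no file paths, no plotting, trivially testable.
        static double average(List<String> lines) {
            return lines.stream().mapToDouble(Double::parseDouble).average().orElse(0.0);
        }

        public static void main(String[] args) throws Exception {
            List<String> lines = Files.readAllLines(Path.of(args[0])); // I/O at the edge
            System.out.println("average = " + average(lines));         // I/O at the edge
        }
    }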
Let's not throw the baby out with the bathwater. We can still measure how quickly new (average) developers become proficient, average team velocity over time, and a host of other metrics that tell us if we are increasing or decreasing the quality of our code over time. Ignoring it all because it's somewhat subjective is selfish and bad for your business.
Leave off the word "clean" or whatever... DO have metrics and don't ignore them. You have people on your team that make it easier for the others, and people who take their "wins" at the expense of their teammates' productivity.
I know that I'm late to this party, but what would Clean Coders think about algorithm-heavy code like "TimSort.java"[1]? (This is the Java port of the famous Python stable sort.) Since Java doesn't have mutable references (pointers) or allow multiple return values, it gets very tricky to manage all the local variables across different functional scopes. I guess you could put all your locals into a struct/POJO, and then pass it around to endless tiny functions. (Honestly, the Java regex library basically does this... successfully.) Somehow, I feel it would be objectively worse if this algo code were split into endless 1/5/10-line functions! (Yes yes, that is an _opinion_... not a fact!)
Come to think of it, is the original C code for Python's timsort equally non-Clean Code-ish?[2] Probably not!
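To make the struct/POJO idea above concrete, a hypothetical sketch (not how TimSort.java is actually organized):
    // All the "locals" live in one mutable state object, so each small helper
    // can read and effectively return several values by updating it.
    final class SortState {
        int[] a;         // the array being sorted
        int lo, hi;      // boundaries of the current run
        int minGallop;   // tuning knob that several helpers adjust
    }

    class Helpers {
        static void countRun(SortState s) { /* reads s.a, updates s.lo and s.hi */ }
        static void merge(SortState s)    { /* merges runs, updates s.minGallop */ }
    }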
Articles like these make me feel better about never having read any of the 'how to code' books. Mainly substituting them by reading TheDailyWTF during the formative years.
I have the same complaint with Code Complete. I read bits in college and I'm not sure I follow most of its advice today (e.g. putting constants on the left side of a comparison).
However, the book also presents the psych study about people not remembering more than 7 (+/- 2) things at a time (therefore you should simplify your code so readers don't have to keep track of too much stuff) and it stuck with me. I must be one of the people with only 5 available slots in their brain...
That study was done for specific stimuli (words, digits), and doesn't generalize to e.g. statements. There are studies that show that rate of presentation, complexity, and processing load have an effect. However, STM capacity is obviously limited, so it's good to keep that in mind when you're worried about readability. And I think it's also safe to assume that expert programmers can "chunk" more than novices, and have a lower processing load.
> putting constants on the left side of a comparison
Yoda conditions? I hate those; they are difficult to read. Yes, they are useful in languages that allow assignments in conditionals, but even then it's not really worth it; accidental assignment is a very novice mistake to make. For me, equality rarely appears in conditionals anyway; it's either a numeric comparison or a check for existence.
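For context, the Yoda order comes from C-family languages where an accidental assignment inside a condition can compile; in Java that mistake is usually already a compile error, which is part of why the reversed order buys you so little:
    class Yoda {
        public static void main(String[] args) {
            int x = 5;
            // if (x = 5) { }  // does not compile in Java: int is not a boolean
            if (x == 5) { }    // the natural order is already safe here
            if (5 == x) { }    // "Yoda" order: legal, just harder to read
        }
    }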
Also, the tooling and compiler speed aren't fucked like they are in Scala or Kotlin. I like Kotlin, especially the null-safety, but the ecosystem's quality is kinda shoddy. Everything is just faster and less buggy in Java.
Honestly Clean Code probably isn't worth recommending anymore. We've taken the good bits and absorbed it into best practice. I think it has been usurped by books like "Software Engineering at Google" and "Building Secure and Reliable Systems".
I don't believe in being prescriptive to anyone about how they write their code, because I think people have different preferences and forcing someone to write small functions when they tend to write large functions well is a unique form of torture. Just leave people alone and let them do their job!
I don't think it is the perfect solution, but a lot of people assert "we can't do better, no point in trying, just write whatever you feel like" and I think that is a degenerate attitude. We CAN find better ways to construct and organize our code, and I don't think we should stop trying because people don't want to update their pull requests.
I've heard this before, and I agree, but don't let the name put you off. I agree that designing and iterating for google scale is a bad idea, but there is a lot in that book that is applicable to all software teams.
Maybe there is no such thing as achieving clean code by following a set of rules. I know the author never advocates his book as a "bible", but it does give the reader that feeling.
There are only years, decades even, of deep experience in certain domains (e.g. game rendering engine programming, or customer relationship backends), extra hours spent reading proven high-quality code, and countless rounds of reflection on existing code (which also means extra hours reviewing it), driven by a strong will to improve it, based not on some set of rules but on common sense about programming and many attempts at rewriting the code into another form.
I think ultimately it comes down to something similar to the 10,000-hour rule: we need to put a lot of time into X, and not only that, we also need to challenge ourselves at every step.
I think the book is still useful with all its flaws, mainly because "overengineering code" is like exercising too much: sure, it happens to some people and can be a big issue, but for the vast majority it's the opposite that is the problem.
What well-regarded codebases has this author written, so you can see his principles in action? OTOH, if you’re wondering about the quality of John Ousterhout’s advice in _A Philosophy of Software Design_, you can just read the Tcl source.
The article quotes a code sample from FitNesse – the author has apparently maintained that codebase since then. You can check out the code for the current version at https://github.com/unclebob/fitnesse, or browse the code in a Monaco editor using https://github1s.com/unclebob/fitnesse/blob/HEAD/src/. (I have no idea if that code is “well-regarded”, but as you wrote, you can read it for yourself.)
I'm surprised by the number of detractors.
We know from history that any book of advice should not be taken too literally. Reading the comments here, it feels almost like I read a different book (about 10 years ago).
I actually found the SetupTeardownIncluder class the author complains about easier to read and interpret than the original version. I know from the start what the intention was and where I should go if I have an issue with some part of the code.
I don't even take issue with the name. It makes it easy to find the class just by remembering what it should do. I don't really care all that much about verbs vs nouns. I want to be able to find the right place by having a rough idea about what it should do. I want to get a hint about the class's functionality from its name too.
'Clean Code' is a style, and not all of its practices are best practices. I feel that a good software team understands each other's styles, making it easier to read each other's code within the context of a style. However, when people disagree on code style it has a way of creating cliques within teams, so sometimes it's just easier to pick a style that is already well documented and be done with the mainly petty disagreements. Clean Code fits the definition of well documented and is a lazy way of defining a team-wide style.
I am interested in reading books about software development and best practices like Clean Code and The Pragmatic Programmer [0]. I have coded for about eight years, but I would like to do it better. I would like to know your opinion about [0], since Clean Code has been significantly criticized.
How about we throw Clean Architecture in this while we're at it. And also realize that the only rule in SOLID that isn't defined subjectively or partially is the "L".
This is the first time I've heard of this book. I certainly agree some of these recommendations are way off the mark.
One guideline I've always tried to keep in mind is that statistically speaking, the number of bugs in a function goes way up when the code exceeds a page or so in length. I try to keep that in mind. I still routinely write functions well over a page in length but I give them extra care when they do, lots of comments and I make sure there's a "narrative flow" to the logic.
The big one to keep an eye on is cyclomatic complexity with respect to function length. Just 3 conditional statements in your code gives you no less than 8 ways through your code and it only goes up from there.
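To see where the 8 comes from, here's a sketch: three independent conditionals double the path count three times, 2^3 = 8.
    class Paths {
        static int f(boolean a, boolean b, boolean c) {
            int n = 0;
            if (a) n += 1;  // 2 possible paths so far
            if (b) n += 2;  // 4
            if (c) n += 4;  // 8 distinct ways through the function
            return n;       // one of 8 values, 0..7, one per path
        }
    }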
All of these 'clean code' style systems have the same flaw: people follow them without understanding why the system was made. It is why you see companies put in ping-pong tables that no one uses. They saw what someone else was doing, and that someone was successful, so they copied them, without understanding why the ping-pong table was there. They ignore the reason Chesterton's fence was built, which is just as important if you are removing it. Clean code by itself is 'ok'. I personally am not very good at that particular style of coding. I do like that it makes things very nice to decompose into testing units.
A downside to this style of coding is that it can hide complexity within an even more complex framework. It seems to have a nasty side effect of smearing the code across dozens of functions/methods, which makes it harder in some ways to get the big picture. You can wander into a meeting and say 'my method has a CC of 1', but the reality is that the thing is called at the bottom of a for loop, inside two other if conditions. But you 'pass' because your function is short.
Four-line functions everywhere is insanity. Yes, you should aim for short functions that do one thing, but in the real world readability and maintainability would suffer greatly if you fragmented everything down to an arbitrarily small number of lines.
Number of bugs per line also goes way up when the average length of functions goes below 5, and the effect in most studies is larger than the effect of too large functions.
> Why are we using both int[] and ArrayList<Integer>? (Answer: because replacing the int[] with a second ArrayList<Integer> causes an out-of-bounds exception.)
Isn't it because one is pre-allocated with a known size of n and the other is grown dynamically?
> And what of thread safety?
Indeed. If he had written the prime number class like the earlier example, with public static methods creating a new instance for each call and all the other methods being private instance methods, this wouldn't be an issue.
Pardon me for stating this, but the pundits of the Clean Code mantra I've worked with tend to be those consultants who bill enormous amounts of money and ensure they get lengthy contracts, justified by wrapping code in so many classes to abstract it that it's considered *CLEAN* and *TESTABLE*.
They will preach the awesomeness of clean code in terms of maintainability, scalability and all those fancy enter-pricey terms that, at the end of the day, bring too little value to justify their cost.
Welcome to another episode of "X, as per my definition of X, is bad - Let's talk about Y, which is another definition of X, but not the one I disagree with".
So many coding recommendations trip up when they fail to take into account Ron's First Law: all extreme positions are wrong. Functions that are too long are bad, but functions that are too short are equally bad. 2-4 lines? Good grief! That's not even enough for a single properly formatted if-then-else!
IMHO it's only when you can't see the top and bottom of a function on the screen at the same time that you should start to worry.
I don't completely disagree, but his point about the irrelevance of SOLID, OO, and Java in this supposedly grand new age of FP ignores that OO is still the pre-eminent paradigm for most applications and that Java remains one of the largest and most utilized languages in the world. Also, I would say that excitement around FP has waned more than it has for Java.
I often hear that people should read Clean Code and that it is necessary in large projects. I would say that there is no direct correlation between how large and complex the business logic is and the difficulty of understanding and maintaining the code. I have seen small simple applications that are not maintainable because people have followed SOLID to the extreme.
One of the biggest issues I have found is that I can sometimes not easily create a test for code that I have modified because it is part of a larger class (I'm coding in C++). This normally happens when I cannot extract the function out of the class and it relies on internal state, and the class is not already being tested.
Love to know if there is an easy way of doing this!
A lot of Robert C. Martin's pieces are just variations on his strong belief that ill-defined concepts like "craftsmanship" and "clean code" (which are basically just whatever his opinions are on any given day) are how to reduce defects and increase quality, not built-in safety and better tools, and that if you think built-in safety and better tools are desirable, you're not a Real Programmer (tm).
I'm not the only one who is skeptical of this toxic, holier-than-thou and dangerous attitude.
Removing braces from if statements is a great example of another dangerous thing he advocates for no justifiable reason.
> The current state of software safety discussion resembles the state of medical safety discussion 2, 3 decades ago (yeah, software is really, really behind the times).
>
> Back then, too, the thoughts on medical safety were divided into 2 schools: the professionalism-oriented and the process-oriented. The former school argues more or less what Uncle Bob argues: blame the damned and * who made the mistakes; be more careful, damn it.
>
> But of course, that stupidity fell out of favor. After all, when mistakes kill, people are serious about it. After a while, serious people realized that blaming and clamoring for care backfires big time. That's when they applied, you know, science and statistics to safety.
>
> So, tools were upgraded: better color-coded medicine boxes, for example, or checklists in surgery. But it's more than that. They figured out which trainings and processes provide high impact and do them rigorously. Nurses are taught (I am not kidding you) how to question doctors when weird things happen; identity verification (ever notice why nurses ask your birthday like a thousand times a day?) got extremely serious; etc.
>
> My take: give it a few more years, and software, too, will probably follow the same path. We need more data, though.
Clean Code is not a blind religion; Uncle Bob is trying to make a point with the concepts behind the book, teaching you to consider and question whether you're falling into bad-code traps.
This book was written to make developers think and consider their choices; it is not a script for good code.
An output argument is when you pass an argument to a function, the function makes changes, and after returning you examine the argument you passed to see what happened.
Example: the caller could pass an empty list, and the method adds items to the list.
Why not return the list? Well, maybe the method computes more things than just the list.
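A sketch of that shape in Java (names hypothetical): the caller supplies the list, and the return value stays free for something else, here a count.
    import java.util.ArrayList;
    import java.util.List;

    class Tokenizer {
        // Output argument: `tokens` is filled in by the callee;
        // the return value carries a second result, the token count.
        static int tokenize(String input, List<String> tokens) {
            for (String t : input.split("\\s+")) {
                if (!t.isEmpty()) tokens.add(t);
            }
            return tokens.size();
        }

        public static void main(String[] args) {
            List<String> tokens = new ArrayList<>(); // the caller owns the list
            int count = tokenize("clean code debate", tokens);
            System.out.println(count + " tokens: " + tokens);
        }
    }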
> Why not return the list? Well, maybe the method computes more things than just the list.
Or in C you want to allocate the list yourself in a particular way and the method should not concern with doing the allocation itself. And the return value is usually the error status/code since C doesn't have exceptions.
That's a C/C++ trick where a location for the output is passed as an argument to the function. This makes functions impure and leads to all kinds of nastiness, such as buffer overruns, if you are not very careful.
It's wrong to call output parameters a "C/C++ trick" because the concept really has nothing to do with C, C++, buffer overruns, purity, or "other nastiness".
The idea is that the caller tells the function it's calling where to store results, rather than having the results returned as values.
For example, Ada and Pascal both have 'out' parameters:
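    -- Ada: the parameter mode is part of the signature.
    procedure Read_Integer (N : out Integer);

    { Pascal: a var parameter is written through rather than copied. }
    procedure ReadInteger(var n: Integer);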
Theoretically, other than different calling syntax, there's conceptually no difference between "out" parameters and returning values.
In practice, though, many languages (C, C++, Java, Python, ...) support "out" parameters accidentally by passing references to non-constant objects, and that's where things get ugly.
Not only in C land; C# has "ref" (pass by reference, usually implying you want to overwrite it) and "out" (like ref but you _must_ set it in all code paths). Both are a bit of a code smell and you're nearly always better off with tuples.
Unfortunately in C land for all sorts of important system APIs you have to use output arguments.
An output argument (or parameter) is assigned a result. In Pascal, for instance, a procedure like ReadInteger(n) would assign the result to n. In C (which does not have variable parameters) you need to pass the address of the argument, so the function call is instead ReadInteger(&n). The example function ReadInteger has a side effect, so it is therefore preferable to use an output parameter rather than to return a result.
1. Clean Code is not a fixed destination to which you'll ever arrive. It's a way of life.
2. We might not use the same methods to write clean code. But when you see clean code, you know it is clean.
3. Some traits clean code can have:
- When you read the function name, you understand what the function does
- When you read the content of a function, you can understand what it is about without reading line by line.
- When you try to refactor clean code, you find yourself sometimes ending up only changing one cog in the whole system.
I worked at a large billion dollar company in the Bay Area (who is in the health space) and they religiously followed Clean Code. Their main architect was such a zealot for it.
My problem is not with the book and author itself but the seniority that peddles this as some gospel to the more junior engineers.
Clean code is not the end-all, be-all. Be your own person and develop WHAT IS RIGHT FOR YOUR ORG; don't peddle some book as gospel.
So glad I work at a company now where we actually THINK about the right abstractions instead of peddling some book.
It's an ok book to read and think about, but understand it is written by someone that hasn't really built a lot of great software, but rather is paid to consult and give out sage advice that is difficult to verify.
Read with great skepticism, but don't feel bad if you decide not to read it at all.
I'd like to say that software engineering is a lot like playing jazz. It's really hard for the beginner to know where to start, and there're also endless sources for the "right" way to do things.
In truth however, like playing jazz, the only thing that really matters is results, and even those can be subjective. You can learn and practice scales all day long, but that doesn't really tell you how to make music.
I developed a style of software engineering that works really well for me. It's fast, accurate, and naturally extensible and easily refactorable. However, for various reasons, I've never been able to explain it to junior (or even senior) engineers when asked about why I code a certain way. At a certain point, it's not the material that matters, but the audience's ability to really get what's at the heart of the lesson.
Technically, you could say something that's indisputable accurate, like, "there're only 12 notes in an (western) octave, and you just mix them until they sound good", but that's obviously true to a degree that's fundamentally unhelpful. At the same time, you could say "A good way to think about how to use a scale is to focus less on the notes that you play and more on the ones you don't". This is better advice but it may altogether be unhelpful, because it doesn't really yet strike at the true heart of what holds people back.
So at a certain point, I don't really know if anyone can be taught something as fundamentally "artful" (I.e. a hard thing largely consisting of innumerable decisions that are larger matters of taste - which is a word that should not be confused with mere "opinion") as software engineering or jazz music. This is because teaching alone is just not enough. At a certain point people just have to learn for themselves, and obviously the material that's out there is helpful, but I'm not sure if anything can ever be explained so well as to remove the need for the student at a certain point to simply "feel" what sounding good sounds like, or what good software engineering feels like.
I'll add one last thing, going back to what I was saying about not being able to explain things to "junior (or even senior)" engineers. Were the same "lesson" to happen with someone who is very, very advanced, like a seasoned principal engineer who's built and delivered many things, time and time again, across many different engineering organizations and technologies (someone like a jazz great, for example), anything I would have to say about my approach would be treated as obvious and boring, and they'd probably much rather talk about something else. I don't say this because I mean to imply that whatever I would have to say is wrong or incorrect, but rather that at a certain level of advancement, you forget everything that you know and don't remember what it took to get there. There are a few who have a specific passion for teaching, but that's orthogonal to the subject.
I think it was Bill Evans who said something like "it takes years of study and practice to learn theory and technique, but it takes still a lifetime to forget it all". When you play like you've forgotten it all, that is when you achieve that certain sound in jazz music. Parenthetically, I'll add that doesn't mean you can't sound good being less advanced, but there's a certain sound that I'm trying to tie together with this metaphor that's parallel to great software engineering from knowledge, practice, and then the experience to forget it all and simply do.
I think that's fundamentally what's at the heart of the matter, not that it takes anyone any closer to getting there. You just have to do it, because we don't really know how to teach how to do really hard things in a way that produces reproducible results.
Uncle "Literally who?" Bob claims you should separate your code into as many small functions spread across as many classes as you can and makes a living selling (proverbial) shovels. John Carmack says you should keep functions long and have the business logic all be encapsulated together for mental cohesion. Carmack makes a living writing software.
I happen to agree with you and have posted in various HN threads over the years about the research on this, which (for what it's worth) showed that longer functions were less error prone. However, the snarky and nasty way that you made the point makes the comment a bad one for HN, no matter how right you are. Can you please not post that way? We're trying for something quite different here: https://news.ycombinator.com/newsguidelines.html.
It's even more important to stick to the site guidelines when you're right, because otherwise you discredit the truth and give people a reason to reject it, which harms all of us.
> The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it. If a function is only called in a single place, the decision is fairly simple.
> In almost all cases, code duplication is a greater evil than whatever second order problems arise from functions being called in different circumstances, so I would rarely advocate duplicating code to avoid a function, but in a lot of cases you can still avoid the function by flagging an operation to be performed at the properly controlled time. For instance, having one check in the player think code for health <= 0 && !killed is almost certain to spawn less bugs than having KillPlayer() called in 20 different places.
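A sketch of the difference (hypothetical Java-ish game code): one controlled check in the think function, instead of a killPlayer() call at every site that can reduce health.
    class Player {
        int health;
        boolean killed;

        void think() {
            // ...gameplay code above may have reduced health in many places...
            // One controlled check, instead of 20 scattered killPlayer() calls:
            if (health <= 0 && !killed) {
                killPlayer();
            }
        }

        void killPlayer() {
            killed = true;
            // ...drop inventory, play death animation, schedule respawn...
        }
    }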
On the spectrum you've described, I'm progressively shifting from Uncle Bob's end to Carmack's the further I get into my career. I think of it as code density. I've found that high density code is often easier to grok because there's less ceremony to keep in my head (e.g. many long method names that may or may not be named well, jumping around a bunch of files). Of course, there's a point at which code becomes so dense that it again becomes difficult to grok.
Or perhaps the length of the function is orthogonal to the quality of the author's code. Make the function as long as necessary to be readable and maintainable by the people most likely to read and maintain it. But that's not a very sellable snippet, nor a rule that can be grokked in 5 minutes.
Carmack is literally the top .1% (or higher) of ability and experience. Not to mention has mostly worked in a field with different constraints than most. I don't think looking to him for general development advice is all that useful.
Read the Doom source code and you can see that he didn't mess around with trying to pull everything out into some nonsense function just because some part of a larger function could be scoped and named.
The way he wrote programs even back then is very direct. You don't have to jump around into lots of different functions and files for no reason. There aren't many specialized data structures or overly clever syntax tricks to prove how smart he is.
There also aren't attempts to write overly general libraries with the idea that they will be reused 100 times in the future. Everything just does what it needs to do directly.
The type of software John writes is different (much more conceptually challenging), and I don't recall him being as big of a proponent of TDD (which is the biggest benefit to small functions).
I think the right answer depends on a number of other factors.
The problem with Clean Code is also the problem with saying to ignore Clean Code. If you treat everything as a dogmatic rule on how to do things, you're going to have a bad time.
Because, they're more like guidelines. If you try not to repeat yourself, you'll generally wind up with better code. If you try to make your methods short, you'll generally wind up with better code.
However, if you abuse partial just to meet some arbitrary length requirement, then you haven't really understood the reason for the guideline.
But the problem isn't so much because the book has a mix of good and bad recommendations. We as an evolutionary race have been pretty good at selectively filtering out bad recommendations over the long term.
The problem is that Uncle Bob has a delusional cult following (that he deliberately cultivated), which takes everything he says at face value, and are willing to drown out any dissenting voices with a non-stop barrage of bullshit platitudes.
There are plenty of ideas in Clean Code that are great, and there are plenty that are terrible... but the religiosity of adherence to it prevents us from separating the two.
Clean Code is fine. It's a little dated, as you would expect, and for the most part, everything of value in it has been assimilated into the best practices and osmotic ether that pervades software development now. It's effectively all the same stuff as you see in Refactoring or Code Complete or Pragmatic Programmer.
I suspect a lot of backlash against it centers around Uncle Bob's less than progressive political and social stances in recent years.
I never read Clean Code and know nothing about its author so I'm willing to trust you on the first part, but the second paragraph is really uncalled for IMO. The article is long and gives precise examples of its issues with the book. Assuming an ulterior motive is unwarranted.
This article is garbage. The argument is basically like saying "famous scientist X was wrong about Y, let's stop doing science. Clearly there is no point to it."
I cannot believe what I am reading here.
My open source community knows exactly what good code looks like and we've delivered great products in very short timeframes repeatedly and often beating our own expectations.
These kinds of articles make me feel like I must have discovered something revolutionary... But in reality I'm just following some very simple principles which were invented by other people several decades ago.
Too many coders these days have been misled into all sorts of goofy trends. Most coders don't know how to code. The vast majority of the people who claim to be experts and who write books about it don't know what they're talking about. That's the real problem. The industry has been hijacked by people who simply aren't wise or clever enough to be sharing any kind of complex knowledge. There absolutely is such a thing as good code.
I'm tired of hearing developers who have never read a single word of Alan Kay (the father of OOP) tell everyone else how bad OOP is and why FP is the answer. It's like watching someone drive a nail straight into their own hand and then complain to everyone that hammer and nails are not the right tool for attaching two pieces of wood together... That instead, the answer is clearly to tie them together with a small piece of string because nobody can get hurt that way.
Just read the manual written by the inventor of the tool.
Alan Kay said "The Big Idea is Messaging"... Yet almost none of the OOP code I read designs their components in such a way that they're "communicating" together... Instead, all the components try to use methods to micromanage each other's internal state... Passing around ridiculously complex instances to each other (clearly a whole object instance is not a message).
> The argument is basically like saying "famous scientist X was wrong about Y, let's stop doing science. Clearly there is no point to it."
In my opinion the argument is more "famous paper X by scientist Y was wrong, let's stop citing it". Except that Clean Code isn't science and doesn't pretend to be.
If the article only attacked that specific book "Clean Code", then I would not be as critical. But the first line in the article suggests that it is an attack against the entire idea of writing good quality code:
'It may not be possible for us to ever reach empirical definitions of "good code" or "clean code"'
It might seem far-fetched that someone would question the benefits of writing high-quality code (readable, composable, maintainable, succinct, efficient...), but I've been in this industry long enough (and worked for enough different kinds of companies) to realize that there is an actual agenda to push the industry in that direction.
Some people in the corporate sphere really believe that the best way to implement software is to brute force it by throwing thousands of engineers at a giant ball of spaghetti code then writing an even more gargantuan spaghetti ball of tests to ensure that the monstrosity actually works.
Martin is, and always has been, a plagiarizing, ghost-written, clueless idiot with a way of convincing other know-nothings that he knows something. At one time he tried to set up a reputation on StackOverflow, and was rapidly seen off.
Yeah I agree with the author, and I would go further, it's a nice list of reasons why Uncle Bob is insufferable.
Because of stuff like this:
> Martin's reasoning is rather that a Boolean argument means that a function does more than one thing, which it shouldn't.
Really? Really? Not even for dependency injection? Or, you know, you should duplicate your function into two very similar things to have one with the flag and another one without. Oh but DRY. Sure.
> He says that an ideal function has zero arguments (but still no side effects??), and that a function with just three arguments is confusing and difficult to test
Again, really?
I find it funny that people treat him as a guru. Or maybe that's the right way to treat him: like those self-help gurus with meaningless guidance and wishy-washy feel-good statements.
> Every function in this program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order.
Wow, illumination! Salvation! Right here!
Until, of course, you have to actually maintain this and have to chase down, 3 or 4 levels of functions deep, what it is that the code is actually doing. And think of a function signature for every minor thing. And pass all the arguments you need (ignoring that "perfect functions have zero arguments" above - good luck with that).
Again, it sounds like self-help BS and not much more than that.
> Until, of course, you have to actually maintain this and have to chase down, 3 or 4 levels of functions deep, what it is that the code is actually doing.
The art is to chain your short functions like a paragraph, not to nest them a mile deep, where the "shortness" is purely an illusion and the outer ones are doing tons of things by calling the inner ones.
That's a lot harder, though.
But it fits much better with the spirit of "don't have a lot of args for your functions" - if you're making deeply nested calls, you're gonna have to pass all the arguments the inner ones need through the outer ones. Or else do something to obfuscate how much you're passing around (global deps/state, crazy amounts of deep DI, etc...) which doesn't make testing any easier.
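A sketch of the two shapes (names hypothetical). In the first, the coordinator reads like a paragraph and the data flow is visible; in the second, every function looks short but the outer ones do everything transitively.
    class Pipeline {
        static String load(String path)    { return "raw contents of " + path; }
        static String validate(String raw) { return raw.trim(); }
        static String summarize(String s)  { return "summary of " + s; }

        // "Paragraph" style: one flat coordinator, visible data flow.
        static String run(String path) {
            String raw = load(path);
            String clean = validate(raw);
            return summarize(clean);
        }

        // Nested style: runDeep looks tiny only because summarizeDeep
        // hides the whole chain inside itself.
        static String runDeep(String path)       { return summarizeDeep(path); }
        static String summarizeDeep(String path) { return "summary of " + validateDeep(path); }
        static String validateDeep(String path)  { return load(path).trim(); }
    }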
> Really? Really? Not even for dependency injection? Or, you know, you should duplicate your function into two very similar things to have one with the flag and another one without. Oh but DRY. Sure.
I'm not sure dependency injection has anything to do with boolean flags or method args. I think the key point here is that he is a proponent of object-oriented programming. I think he touches on dependency injection later in the book, but it's been a while since I've read it. He suggests your dependencies get passed at object initialization, not passed as method options. That lets you easily mock stuff without needing to make any modifications to the method that uses that dependency.
> Until, of course, you have to actually maintain this and have to chase down, 3 or 4 levels of functions deep, what it is that the code is actually doing. And think of a function signature for every minor thing. And pass all the arguments you need (ignoring that "perfect functions have zero arguments" above - good luck with that)
I myself find it easier to read and understand simple functions than large ones with multiple indentation levels. Also, it definitely does not make sense to pass many arguments along with those many small functions. He recommends making them object instance properties so that you don't need to do that.
It may not be for everyone, but I'll take reading code that follows his principles instead of code that had no thought about design put into it any day of the week. It's not some dogmatic law that should be followed in all cases, but to me it's a set of pretty great ideas to keep in mind to lay out code that is easy to maintain and test.
> I'm not sure dependency injection has anything to do with boolean flags or method args.
DI can be abused as a way to get around long function signatures. "I don't take a lot of arguments here" (I'm just inside of an object that has ten other dependencies injected in). Welcome back to testing hell.
It's actually a good marketing trick. He can sell something slightly different and more "pure", make promises about it, and then sell books, trainings and merchandise.
That's what the wellness industry does all the time.
The boolean issue is probably the one that's caused me the most pain. That contradiction with DRY has actually had me go back and forth between repeating myself and using a flag, wasting a ton of time on something incredibly pointless to be thinking that hard about. I feel like the best thing for my career would have been to not read that book right when I started my first professional programming job.
It's been a while since I've read it, but I think to handle boolean flag type logic well he suggests relying on object subclassing instead. So, for an example that uses a dry run flag for scary operations, you can have your normal object (a) and all of its methods that actually perform those scary operations. Then you can subclass that object (a) to create a dry run subclass (b). That object (b) can override only the methods that perform the scary operations you want to dry run, while still inheriting all of the non-scary methods. That would let you avoid having if dry_run == true; then dry_run_method() else scary_method() scattered in lots of different methods.
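In code, the shape would be something like this (my reconstruction, with invented names):

    #include <cstdio>

    class Deployer {
    public:
        void announce() { std::puts("deploying..."); }                 // not scary
        virtual void deleteOldRelease() { std::puts("rm -rf old/"); }  // scary
        void run() { announce(); deleteOldRelease(); }
        virtual ~Deployer() = default;
    };

    // Dry-run subclass: overrides only the scary operation; announce()
    // and the run() sequence are reused as-is, with no dry_run flags
    // scattered through the methods.
    class DryRunDeployer : public Deployer {
    public:
        void deleteOldRelease() override { std::puts("[dry run] would rm -rf old/"); }
    };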
It might make sense to divide your function with a boolean flag into two functions and extract the common code into a third, private function. Or maybe it'll make things ugly.
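E.g. (invented names):

    #include <vector>

    using Records = std::vector<int>;

    static void writeOut(const Records&) { /* the extracted common core */ }
    static void check(const Records&)    { /* validation only */ }

    // Two entry points instead of one save(records, bool validated):
    void saveChecked(const Records& rs) { check(rs); writeOut(rs); }
    void saveRaw(const Records& rs)     { writeOut(rs); }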
I treat those books as something to show me how other people do things. I learn from them and add it to my skill book. Then I'll apply it and see if I like it. If I don't like it in this particular case, I'll not apply it. IMO it's all about having insight into every possible solution. If you can implement something 10 different ways, you can choose the best one among them, but you have to learn those 10 different ways first.
There's a lot of bad advice being tossed around in this thread. If you are worried about having to jump through multiple files to understand what some code is doing, you should consider that your naming conventions are the problem, not the fact that code is hidden behind functional boundaries.
Coding at scale is about managing complexity. The best code is code you don't have to read because of well named functional boundaries. Without these functional boundaries, you have to understand how every line of a function works, and then mentally model the entire graph of interactions at once, because of the potential for interactions between lines within a functional boundary. The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows. The cognitive load to understand code grows as the number of possible interactions grow. Keeping methods short and hiding behavior behind well named functional boundaries is how you manage complexity in code.
The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects. If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
> you have failed to sufficiently explain
This is the problem right here. I don't just read code I've written and I don't only read perfectly abstracted code. When I am stuck reading someone's code who loves the book and tries their best to follow those conventions I find it far more difficult - because I am usually reading their code to fully understand it myself (ie in a review) or to fix a bug I find it infuriating that I am jumping through dozens of files just so everything looks nice on a slide - names are great, I fully appreciate good naming but pretending that using a ton of extra files just to improve naming slightly isnt a hindrance is wild.
I will take the naming hit in return for locality. I'd like to be able to hold more than 5 lines of code in my head but leaping all over the filesystem just to see 3 line or 5 line classes that delegate to yet another class is too much.
Carmack once suggested that people in-line their functions more often, in part so they could “see clearly the full horror of what they have done” (paraphrased from memory) as code gets more complicated. Many helper functions can be replaced by comments and the code inlined. I tried this last year and it led to overall more readable code, imho.
The idea is that without proper boundaries, finding the line that needed to be changed may be a lot harder than clicking through files with an IDE. Smaller components also help with code reviews since it’s a lot easier to understand a line within the context of a component (or method name) without having to understand what the huge globs of code before it is doing. Also, like you said a lot of the times a developer has to read code they didn’t write so there are other factors to consider like how easy it is for someone from another team to make a change or whether a new employee could easily digest the code base.
>Coding at scale is about managing complexity.
I would extend this one level higher to say managing complexity is about managing risk. Risk is usually what we really care about.
From the article:
>any one person's opinions about another person's opinions about "clean code" are necessarily highly subjective.
At some point CS as a profession has to find the right balance of art and science. There's room for both. Codifying certain standards is the domain of professions (in the truest sense of the word) and not art.
Software often likens itself to traditional engineering disciplines. Those traditional engineering disciplines manage risk through codified standards built through industry consensus. Somebody may build a pressure system that doesn't conform to standards. They don't get to say "well your idea of 'good' is just an opinion so it's subjective". By "professional" standards they have built something outside the acceptable risk envelope and, if it's a regulated engineering domain, they can't use it.
This isn't to say a coder would have to follow rigid rules constantly or that it needs a regulatory body, but that the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
A lot of "best practices" in engineering were established empirically, after root cause analysis of failures and successes. Software is more or less evolving along the same path (structured programming, OOP, higher-than-assembly languages, version control, documented ISAs).
Go back to earlier machines and each version had its own assembly language and instruction set. Nobody would ever go back to that era.
OOP was pitched as a one-size-fits-all solution to all problems, and as a checklist of items that would turn a cheap offshored programmer into a real software engineer thanks to design patterns and abstractions dictated by a "Software Architect". We all know that to be false, and bordering on snake oil, but it still had some good ideas. Having a class encapsulate complexity and define interfaces is neat. It forces you to think in terms of abstractions and helps readability.
> This isn't to say a coder would have to follow rigid rules constantly or that it needs a regulatory body, but that the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
As more and more years pass, I'm less and less against a regulatory body. It would help with getting rid of snake oil salesmen in the industry and limit offshoring to barely qualified coders. And it would simplify hiring too, by having a known certification that tells you someone at least meets a certain bar.
>the practice of deviating from standardized best-practices should be communicated in terms of the risk rather than claiming it's just subjective.
The problem I see with this is that programming could be described as a kind of general problem solving. Other engineering disciplines standardize methods that are far more specific, e.g. how to tighten screws.
It's hard to come up with specific rules for general problems though. Algorithms are just solution descriptions in a language the computer and your colleagues can understand.
When we look at specific domains, e.g. finance and accounting software, we see industry standards have already emerged, like dealing with fixed point numbers instead of floating point to make calculation errors predictable.
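For instance, a minimal illustration of the fixed-point idea (not from any particular standard):

    #include <cstdio>

    // Store money as integer cents; addition is exact, unlike 0.1 + 0.2
    // in binary floating point.
    using Cents = long long;

    int main() {
        Cents price = 1999;              // $19.99
        Cents tax   = 160;               // $1.60
        Cents total = price + tax;       // exactly 2159, every time
        std::printf("$%lld.%02lld\n", total / 100, total % 100);
    }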
If we now start codifying general software engineering, I'm worried we will just codify subjective opinions about general problem solving. And that will stop any kind of improvement.
Instead we have to accept that our discipline is different from the others, and more of a design or craft discipline.
> At some point CS as a profession has to find the right balance of art and science.
That seems like such a hard problem. Why not tackle a simpler one?
Yes, coding at scale is about managing complexity. No, "Keeping methods short" is not a good way to manage complexity, because...
> then mentally model the entire graph of interactions at once
...partially applies even if you have well-named functional boundaries. You said it yourself:
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows. The cognitive load to understand code grows as the number of possible interactions grow.
Programs have a certain essential complexity. Making a function "simpler" means making it less complex, which means that that complexity has to go somewhere else. If you make all of your functions simple, then you simply need more functions to represent the same program, which increases the total number of possible interactions between nodes and therefore the cognitive load of understanding the whole graph/program.
Allowing more complexity in your functions makes them individually harder to understand, but reduces the total number of functions needed and therefore makes the entire program more comprehensible.
Also note that just because a function's implementation is complex doesn't mean that its interface also has to be complex.
And, functions with complex implementations are only themselves difficult to understand - functions with complex interfaces make the whole system more difficult to understand.
This is where Occam's Razor applies - do not multiply entities unnecessarily.
Having hundreds or thousands of simple functions is the opposite of this advice.
You can also consider this in more scientific terms.
Code is a mental model of a set of operations. The best possible model has as few moving parts as possible, there are as few connections between the parts as possible, each part is as simple as possible, and both the parts and the connections between them are as intuitively obvious as possible.
Making parts as simple as possible is just one design goal, and not a very satisfactory or useful one in its own terms.
All of this turns out to be incredibly hard, and is a literal IQ test. Mediocre developers will always, always create overcomplicated solutions. Top developers have a magical ability to combine a 10,000 foot overview with ground level detail, and will tear through complex problems and reduce them to elegant simplicity.
IMO we should spend less time teaching algorithms and testing algorithmic specifics, and more on analysing complex systems and implementing them with minimal, elegant, intuitive models.
>If you make all of your functions simple, then you simply need more functions to represent the same program
The semantics of the language and the structure of the code help hide irrelevant functional units from the global namespace. Methods attached to an object only need to be considered when operating on some object, for example. Private methods do not pollute the global namespace nor do they need to be present in any mental model of the application unless it is relevant to the context.
While I do think you can go too far with adding functions for its own sake, I don't see that they add to the cognitive load in the same way that possible interactions within a functional unit does. If you're just polluting a global namespace with functions and tiny objects, then that does similarly increase cognitive load and should be avoided.
> No, "Keeping methods short" is not a good way to manage complexity
Agreed
> Allowing more complexity in your functions makes them individually harder to understand
I think that can mostly be avoided by sometimes creating local scopes {..} to avoid too much state inside a function, combined with whitespace and some section "header" comments (instead of what would have been sub function names).
It can be quite readable, I think. And it's nice not to have to jump back and forth between myriads of files and functions.
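Something like this (invented example):

    #include <fstream>
    #include <string>
    #include <vector>

    void importNames(const std::string& path, std::vector<std::string>& out) {
        // ---- read the file ----
        {
            std::ifstream in(path);        // `in` exists only in this scope
            std::string line;
            while (std::getline(in, line)) out.push_back(line);
        }
        // ---- normalize ----
        {
            for (auto& s : out)            // scratch state can't leak onward
                if (!s.empty() && s.back() == '\r') s.pop_back();
        }
    }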
I have found this to be one of those A or B developer personas that are hard for someone to change, and causes much disagreement. I personally agree 100%, but have known other people who couldn't disagree more, it is what it is.
I've always felt it had a strong correlation to top-down vs bottom-up thinkers in terms of software design. The top-down folks tend to agree with your stance and the bottom-up group do not. If you're naturally going to want to understand all of the nitty gritty details you want to be able to wrap your head around those as quickly as possible. If you're willing to think in terms of the abstractions you want to remove as many of those details from sight as possible to reduce visual noise.
I wish there was an "auto-flattener"/"auto-inliner" tool that would allow you to automagically turn code that was written top-down, with lots of nicely high-level abstractions, into an equivalent code with all the actions mushed together and with infrastructure layers peeled away as much as possible.
Have you ever seen a codebase with infrastructure and piping taking about 70% of the code, with tiny pieces of business logic thrown here and there? It's impossible to figure out where the actual job is being done (and what it actually is): all you can see is just an endless chain of methods that mostly just delegate the responsibility further and further. What could've been a 100-line loop of the "foreach item in worklist, do A, B, C" kind is instead split over seven tightly cooperating classes that devote 45% of their code to multiplexing/load-balancing/messaging/job-spooling/etc, another 45% to building trivial auxiliary structure and instantiating each other, and only 10% actually devoted to the actual data processing, but good luck finding those 10%, because there is a never-ending chain of methods calling each other: A.do_work() calls B.process_item() which calls A.on_item_processing() which calls B.on_processed()... wait, shouldn't there have been some work done between "on_item_processing" and "on_processed"? Yes, it was done by an inconspicuously named "prepare_next_worklist_item" function.
Ah, and the icing on the cake: looping is actually done from the very bottom of this call chain by doing a recursive call to the top-most method which at this point is about 20 layers above the current stack frame. Just so you can walk down this path again, now with the feeling.
While I think you are onto something about top-down vs. bottom-up thinkers, one of the issues with a large codebase is literally nobody can do the whole thing bottom-up. So you need some reasonable conventions and abstraction, or the whole thing falls apart under its own weight.
I’m reminded of an earlier HN discussion about an article called The Wrong Abstraction, where I argued¹ that abstractions have both a benefit and a cost and that their ratio may change as a program evolves and which of those “nitty gritty details” are immediately relevant and which can helpfully be hidden behind abstractions changes.
¹ https://news.ycombinator.com/item?id=23742118
The point is that bottom-up code is a siren song. It never scales. It makes it a lot easier to get started, but given enough complexity it inevitably breaks down.
Once your codebase gets to somewhere around the 10,000 line mark, it becomes impossible for a single mind to hold the entire program in their head at a single time. The only way to survive past that point is with carefully thought out, water tight layers of abstractions. That almost never happens with bottom-up. Bottom-up is a lot like natural selection. You get a lot of kludges that work great to solve their immediate problem, but behave in undefined and unpredictable ways when you extend them outside their original environment.
Bottom-up can work when you're inside well-encapsulated modular components with bounded scope and size. But there's no way to keep those modules loosely coupled unless you have an elegant top-down architecture imposing order on the large-scale structure.
I never thought of things this way but it is a useful perspective.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
That's only 1 part of the complexity equation.
When you have 100 lines in 1 function you know exactly the order in which each line will happen and under which conditions by just looking at it.
If you split it into 10 functions 10-lines-long each now you have 10! possible orderings of calling these functions (ignoring loops and branches). And since this ordering is separated into multiple places - you have to keep it in your mind. Good luck inventing naming that will make it obvious which of the 3628800 possible orderings is happening without reading through them.
Short functions are good when they fit the problem. Often they don't.
I feel like this is only a problem if the small functions share a lot of global state. If each one acts upon its arguments and returns values without side effects, ordering is much less of an issue IMO.
>If you split it into 10 functions 10-lines-long each now you have 10! possible orderings of calling these functions (ignoring loops and branches). And since this ordering is separated into multiple places - you have to keep it in your mind. Good luck inventing naming that will make it obvious which of the 3628800 possible orderings is happening without reading through them.
It's easy to make this argument in the abstract, but harder to demonstrate with a concrete example. Do you happen to have any 100 lines of code that you could provide that would show this as a challenge to compare to the refactored code?
You're likely missing one or more techniques that make this work well (sketch after the list):
1. Depth first function ordering, so the execution order of the lines in the function is fairly similar to that of the expanded 100 lines. This makes top to bottom readability reasonable.
2. Explicit naming of the functions to make it clear what they do, not just part1(); part2() etc.
3. Similar levels of abstraction in each function (e.g. not having both a for loop, several if statements based on variables defined in the function, and 3 method calls, but instead having 4-5 method calls doing the same thing).
4. Explicit pre/post conditions in each method are called out due to the passing in of parameters and the return values. This more effectively helps a reader understand the lifecycle of relevant variables etc.
In your example of 100 lines, the counterpoint is that now I have a method that has at least 100 ways it could work / fail. By breaking that up, I have the ability to reason about each use case / failure mode.
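A small sketch of 1-4 together (invented example):

    #include <string>
    #include <vector>

    struct Order { std::vector<int> items; };

    // Helpers appear in depth-first / execution order (1), are named for
    // what they do (2), and exchange data only through parameters and
    // return values, making pre/post conditions explicit (4).
    static bool inStock(const std::vector<int>& items) { return !items.empty(); }
    static int  totalOf(const std::vector<int>& items) {
        int t = 0;
        for (int i : items) t += i;
        return t;
    }

    // The coordinator stays at one level of abstraction (3) and reads
    // top to bottom in roughly the order the work happens (1).
    std::string checkout(const Order& order) {
        if (!inStock(order.items)) return "out of stock";
        return "charged " + std::to_string(totalOf(order.items));
    }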
I am surprised that this is the top answer (Edit: at the moment, was)
How does splitting code into multiple functions suddenly change the order of the code?
I would expect that these functions would be still called in a very specific order.
And sometimes it does not even make sense to keep this order.
But here is a little example (in made-up pseudo code):
===>
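    // (a sketch of the idea: named steps, still exactly one fixed order)
    struct Order {};
    void validate(Order&) {}
    void reserveStock(Order&) {}
    void chargeCustomer(Order&) {}
    void ship(Order&) {}

    // Ten lines split into named steps: the call site still pins down
    // a single sequence; there is no 10! explosion to keep in your head.
    void processOrder(Order& o) {
        validate(o);
        reserveStock(o);
        chargeCustomer(o);
        ship(o);
    }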
Better? No?
There's certainly some difference in priorities between massive 1000-programmer projects where complexity must be aggressively managed and, say, a 3-person team making a simple web app. Different projects will have a different sweet spot in terms of structural complexity versus function complexity. I've seen code that, IMO, misses the sweet spot in either direction.
Sometimes there is too much code in mega-functions, poor separation of concerns and so on. These are easy mistakes to make, especially for beginners, so there are a lot of warnings against them.
Other times you have too many abstractions and too much indirection to serve any useful purpose. The ratio of named things, functional boundaries, and interface definitions to actual instructions can easily get out of hand when people dogmatically apply complexity-managing patterns to things that aren't very complex. Such over-abstraction can fall under YAGNI and waste time/$ as the code becomes slower to navigate, slower to understand in depth, and possibly slower to modify.
I think in software engineering we suffer more from the former problem than the latter problem, but the latter problem is often more frustrating because it's easier to argue for applying nifty patterns and levels of indirection than omitting them.
Just for a tangible example: If I have to iterate over a 3D data structure with an X Y and Z dimension, and use 3 nested loops to do so, is that too complex a function? I'd say no. It's at least as clear without introducing more functional boundaries, which is effort with no benefit.
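Concretely, something like:

    #include <vector>
    using Grid = std::vector<std::vector<std::vector<double>>>;

    double total(const Grid& grid) {
        double sum = 0.0;
        // one loop per dimension - clear as-is, no helper functions needed
        for (const auto& plane : grid)
            for (const auto& row : plane)
                for (double cell : row)
                    sum += cell;
        return sum;
    }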
Well named functions are only half (or maybe a quarter) of the battle. Function documentation is paramount in complex codebases, since documentation should describe various parameters in detail and outline any known issues, side-effects, or general points about calling the function. It's also a good idea to document when a parameter is passed to another function/method.
Yeah, it's a lot of work, but working on recent projects has really taught me the value of good documentation. Naming a function send_records_to_database is fine, but it can't tell you how it determines which database to send the records to, or how it deals with failed records (if at all), or various alternative use cases for the function. All of that must come from documentation (or reading the source of that function).
Plus, I've found that forcing myself to write function documentation, and justify my decisions, has resulted in me putting more consideration into design. When you have to say, "this function reads <some value> name from <environmental variable>" then you have to spend some time considering if future users will find that to be a sound decision.
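For example, a signature documented in that spirit (the env-var, retry, and timeout details here are invented for illustration):

    #include <vector>

    struct Record { int id; };

    /// Sends `records` to the database named by the RECORDS_DB_URL
    /// environment variable. Records that fail are retried once, then
    /// dropped with a log line - callers needing stronger guarantees
    /// should not use this. May block for up to `timeout_ms` milliseconds.
    void send_records_to_database(const std::vector<Record>& records,
                                  int timeout_ms);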
> documentation should describe various parameters in detail and outline any known issues, side-effects, or general points about calling the function. It's also a good idea to document when a parameter is passed to another function/method.
I'd argue that writing that much documentation about a single function suggests that the function is a problem and the "send_records_to_database" example is a bad name. It's almost inevitable that the function doing so much and having so much behavior that needs documentation will, at some point, be changed and make the documentation subtly wrong, or at least incomplete.
Yikes, I hope I don't have to read documentation to understand how the code deals with failed records or other use cases. Good code would have the use cases separated from the send_records_to_database so it would be obvious what the records were and how failure conditions are handled.
"Plus, I've found that forcing myself to write function documentation, and justify my decisions, has resulted in me putting more consideration into design."
This, this, and... this.
Sometimes, I step back after writing documentation and realise, this is a bunch of baloney. It could be much simpler, or this is a terrible decision! My point: Writing documentation is about expressing the function a second time -- the first time was code, the second time was natural language. Yeah, it's not a perfect 1:1 (see: the law in any developed country!), but it is a good heuristic.
Documentation is only useful if it is up to date and correct. I ignore documentation because I've never found those things to be true.
There are contract/proof systems that seem like they might help. At least the tool ensures the contract is correct. However, I'm not sure if such systems are readable. (I've never used one in the real world.)
> The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects.
Code telling a story is a fallacy that programmers keep telling themselves and which fails to die. Code doesn't tell stories, programmers do. Code can't explain why it exists; it can't tell you about the buggy API it relies on and which makes its implementation weird and not straight-forward; it can't say when it's no longer needed.
Good names are important, but it's false that having well-chosen function and arguments names will tell a programmer everything they need to know.
>Code doesn't tell stories, programmers do. Code can't explain why it exists;
Code can't tell every relevant story, but it can tell a story about how it does what it does. Code is primarily written for other programmers. Writing code in such a way that other people with some familiarity with the problem space can understand easily should be the goal. But this means telling a story to the next reader, the story of how the inputs to some functional unit are translated into its outputs or changes in state. The best way to explain this to another human is almost never the best way to explain it to a computer. But since we have to communicate with other humans and to the computer from the same code, it takes some effort to bridge the two paradigms. Having the code tell a story at the high level by way of the modules, objects and methods being called is how we bridge this gap. But there are better and worse ways to do this.
Software development is a process of translating the natural language-spec of the system into a code-spec. But you can have the natural language-spec embedded in the structure of the code to a large degree. The more, the better.
> Code doesn't tell stories, programmers do
It is like saying books do not tell stories, writers do.
Is code just a byproduct of specs then? Any thoughts on literate programming?
Your argument falls apart once you need to actually debug one of these monstrosities, as often the bug itself also gets spread out over half a dozen classes and functions, and it's not obvious where to fix it.
More code, more bugs. More hidden code, more hidden bugs. There's a reason those who have worked in software development longer tend to prefer less abstraction: most of them are those who have learned from their experiences, and those who aren't are "architects" optimising for job security.
If a function is only called once, it should just be inlined; the IDE can collapse it. A descriptive comment can replace the function name. It can be a lambda with an immediate call and explicit captures if you need to prevent the issue of not knowing which local variables it interacts with as the function grows significantly; or, if the concern is others using leftover variables, its body can go into a plain scope. Making you jump to a different area of code to read it just breaks up the linear flow for no gain, especially when you often have to read it anyway to make sure it doesn't have global side effects - might as well read it in the single place it is used.
If it is going to be used more than once, and actually is, then make it a function (unless it is so trivial that the explicit inline version is more readable). If you are designing a public API where it may need to be overridden, count that as more than once.
Some of the above is language dependent.
> The best code is code you don't have to read because of well named functional boundaries.
I don't know which is harder. Explaining this about code, or about tests.
The people with no sense of DevX see nothing wrong with writing tests that fail as:
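    #include <cassert>

    int checkout() { return 7; }       // stand-in for the code under test

    void testCheckout() {
        assert(checkout() == 12);
        // fails with something like: Assertion `checkout() == 12' failed.
        // - no actual value, no expected value, no hint of which
        // requirement broke.
    }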
If you make me read the tests to modify your code, I'm probably going to modify the tests. Once I modify the tests, you have no idea if the new tests still cover all of the same concerns (especially if you wrote tests like the above).
Make the test red before you make it green, so you know what the errors look like.
Oh god. Or just the tests that are walls of text, mixes of mocks and initializers and constructors and method calls.
Like good god, extract that boilerplate into a function. Use comments and white space to break it up and explain the workflow.
> Make the test red before you make it green, so you know what the errors look like.
Oh! I like this. I never considered this particular reason why making tests fail first might be a good idea.
“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.” ― C. A. R. Hoare
this quote scales
This quote does not scale. Software contains essential complexity because it was built to fulfill a need. You can make all of the beautiful, feature-impoverished designs you want - they won't make it to production, and I won't use them, because they don't do the thing.
If your software does not do the thing, then it's not useful, it's a piece of art - not an artifact of software engineering that is meant to fulfill a purpose.
But not everybody codes “at scale”. If you have a small, stable team, there is a lot less to worry about.
Secondly, it is often better to start with fewer abstractions and boundaries, and add them when the need becomes apparent, rather than trying to remove ill-conceived boundaries and abstractions that were added at earlier times.
Coding at scale is not dependent on the number of people, but on the essential complexity of the problem. One can fail at a one-man project due to lack of proper abstraction with a sufficiently complex problem. Like, try to write a compiler.
> The idea of code telling a story is that a unit of work should explain what it does through its use of well named variables, function/object names, and how data flows between function/objects. If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
That's fine in theory and I still sort-of believe that, but in practice, I came to believe most programming languages are insufficiently expressive for this vision to be true.
Take, as a random example, this bit of C++:
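    auto foo = Frobnicate(bar, Quuxify);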
Ok, I know what Frobnification is. I know what Quuxify does, it's defined a few lines above. From that single line, I can guess it Frobs every member of bar via Quuxify. But is bar modified? Gotta check the signature of Frobnicate! That means either getting an IDE help popup, or finding the declaration.
From the signature, I can see that bar full of Bars isn't going to be modified. But then I think, is foo.size() going to be equal to bar.size()? What if bar is empty? Can Frobnicate throw an exception? Are there any special constraints on the function Fn passed to it? Does Fn have to be a funcallable thing? Can't tell that until I pop into definition of Frobnicate.
I'll omit the definition here. But now that I see it, I realize that Fn has to be a function of a very particular signature, that Fn is applied to every other element of the input vector (and not all of them, as I assumed), that the code has a bug and will crash if the input vector has less than 2 elements, and it calls three other functions that may or may not have their own restrictions on arguments, and may or may not throw an exception.
If I don't have a fully-configured IDE, I'll likely just ignore it and bear the risk. If I have, I'll routinely jump-to-definition into all these functions, quickly eye them for any potential issues... and, if I have the time, I'll put a comment on top of Frobnicate declaration, documenting everything I just learned - because holy hell, I don't want to waste my time doing the same thing next week. I would rename the function itself to include extra details, but then the name would be 100+ characters long...
Some languages are better at this than others, but my point is, until we have programming languages that can (and force you to) express the entire function contract in its signature and enforce this at compile-time, it's unsafe to assume a given function does what you think it does. Comments would be a decent workaround, if most programmers could be arsed to write them. As it is, you have to dig into the implementation of your dependencies, at least one level deep, if you want to avoid subtle bugs creeping in.
This is a good point and I agree. In fact, I think this really touches on why I always had a hard time understanding C++ code. I first learned to program with C/C++, so I have no problem writing C++, but understanding other people's code has always been much more difficult than in other languages. Its facilities for abstraction were (historically) subpar, and even things like aliased variables, where you have to jump to the function definition just to see if the parameter will be modified, really get in the way of easy comprehension. And then the nested template definitions. You're right that how well relying on well named functional boundaries works depends on the language, and languages aren't at the point where it can be completely relied on.
This is true but having good function names will at least help you avoid going two levels deep. Or N levels. Having a vague understanding of a function call’s purpose from its name helps because you have to trim the search tree somewhere.
Though, if you’re in a nest of tiny forwarding functions, who knows how deep you’ll have to go?
Function names are comments, and have similar failure modes.
Comments that are limited to only two or three dozen characters at most, so worse than comments in my experience.
But it's easier to notice they're outdated, because you don't see them only when looking at the implementation.
> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
Which is often unavoidable; many functions are insufficiently explained by those alone unless you want four-word camelcase monstrosities for names. The code of the function should be right-sized. Size and complexity need to be balanced there - simpler and easier-to-follow is sometimes larger. I work on compilers, query processors and compute engines; cognitive load from the subject domains is bad enough without making the code arbitrarily shaped.
[edit] oh yes, what jzoch says below. Locality helps with taming the network of complexity between functions and data.
[edit] oh no, here come the downvotes!
> ...many functions are insufficiently explained by [naming and set of arguments] alone unless you want four-word camelcase monstrosities for names.
Come now, is four words really all that "monstrously" much?
> The code of the function should be right-sized.
Feels like that should go for its name too.
> Size and complexity need to be balanced there - simpler and easier-to-follow is sometimes larger.
The longer the code, the longer the name?
I think we need to recognize the limits of this concept. To reach for an analogy, both Dr. Seuss and Tolstoy wrote well but I'd much rather inherit source code that reads like 10 pages of the former over 10 pages of the latter. You could be a genuine code-naming artist but at the end of the day all I want to do is render the damn HTML.
> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
This isn't always true in my experience. Often when I need to dig into the details of a function it's because how it works is more important than what it says it's doing. There are implementation concerns you can't fit into a function name.
Additionally, I have found that function names become outdated at about the same rate as comments do. If the common criticism of code commenting is that "comments are code you don't run", function names also fall into that category.
I don't have a universal rule on this, I think that managing code complexity is highly application-dependent, and dependent on the size of the team looking at the code, and dependent on the age of the code, and dependent on how fast the code is being iterated on and rewritten. However, in many cases I've started to find that it makes sense to inline certain logic, because you get rid of the risk of names going out of date just like code comments, and you remove any ambiguity over what the code actually does. There are some other benefits as well, but they're beyond the scope of the current conversation.
Perfect abstractions are relatively rare, so in instances where abstractions are likely to be very leaky (which happens more often than people suspect), it is better to be extremely transparent about what the code is doing, rather than hiding it behind a function name.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
I'll also push back against this line of thought. The sum total of possible interactions does not decrease when you move code out into a separate function. The same number of lines of code still gets run, and each line carries the same potential to have a bug. In fact, in many cases, adding additional interfaces between components and generalizing them can increase the number of code paths and potential failure points.
If you define complexity by the sum total of possible interactions (which is itself a problematic definition, but I'll talk about that below), then complexity always increases when you factor out functions, because the interfaces, error-handling, and boilerplate code around those functions increases the number of possible interactions happening during your function call.
> The complexity (sum total of possible interactions) grows as the number of lines within a functional boundary grows.
What I've come to understand is that complexity is relative. A solution that makes a codebase less complex for one person in an organization may make a codebase more complex for someone else in the organization who has different responsibilities over the codebase.
If you are building an application with a large team, and there are clear divisions of responsibilities, then functional boundaries are very helpful because they hide the messy details about how low-level parts of the code work.
However, if you are responsible for maintaining both the high-level and low-level parts of the same codebase, then separating that logic can sometimes make the program harder to manage, because you still have to understand how both parts of the codebase work, but now you also have to understand how the interfaces and abstractions between them fit together and what their limitations are.
In single-person projects where I'm the only person touching the codebase I do still use abstractions, but I often opt to limit the number of abstractions, and I inline code more often than I would in a larger project. This is because if I'm the only person working on the code, I need to be able to hold almost the entire codebase in my head at the same time in order to make informed architecture decisions, and managing a large number of abstractions on top of their implementations makes the code harder to reason about and increases the number of things I need to remember. This was a hard-learned lesson for me, but has made (I think) an observable difference in the quality and stability of the code I write.
>> If you have to dig into the details of a function to understand what it does, you have failed to sufficiently explain what the function does through its naming and set of arguments.
> This isn't always true in my experience. Often when I need to dig into the details of a function it's because how it works is more important than what it says it's doing. There are implementation concerns you can't fit into a function name.
Both of these things are not quite right. Yes, if you have to dig into the details of a function to understand what it does, it hasn't been explained well enough. No, the prototype cannot contain enough information to explain it. No, you shouldn't look at the implementation either - that leads to brittle code where you start to rely on the implementation behavior of a function that isn't part of the interface.
The interface and implementation of a function are separate. The former should be clearly-documented - a descriptive name is good, but you'll almost always also need docstrings/comments/other documentation - while you should rarely rely on details of the latter, because if you are, that usually means that the interface isn't defined clearly enough and/or the abstraction boundaries are in the wrong places (modulo things like looking under the hood to refactor, improve performance, etc - all abstractions are somewhat leaky, but you shouldn't be piercing them regularly).
> If you define complexity by the sum total of possible interactions (which is itself a problematic definition, but I'll talk about that below), then complexity always increases when you factor out functions, because the interfaces, error-handling, and boilerplate code around those functions increases the number of possible interactions happening during your function call.
This - this is what everyone who advocates for "small functions" doesn't understand.
Finally! I'm glad to hear I'm not the only one. I've gone against 'Clean Code' zealots that end up writing painfully warped abstractions in the effort to adhere to what is in this book. It's OK to duplicate code in places where the abstractions are far enough apart that the alternative is worse. I've had developers use the 'partial' feature in C# to meet Martin's length restrictions to the point where I have to look through 10-15 files to see the full class. The examples in this post are excellent examples of the flaws in Martin's absolutism.
You were never alone Juggles. We've been here with you the whole time.
I have witnessed more people bend over backwards and do the most insane things in the name of avoiding "Uncle Bob's" baleful stare.
It turns out that following "Uncle Sassy's" rules will get you a lot further.
1. Understand your problem fully
2. Understand your constraints fully
3. Understand not just where you are but where you are headed
4. Write code that takes the above 3 into account and make sensible decisions. When something feels wrong ... don't do it.
Quality issues are far more often planning, product management, strategic issues than something as easily remedied as the code itself.
"How do you develop good software? First, be a good software developer. Then develop some software."
The problem with all these lists is that they require a sense of judgement that can only be learnt from experience, never from checklists. That's why Uncle Bob's advice is simultaneously so correct, and yet so dangerous with the wrong fingers on the keyboard.
I've also never agreed completely with Uncle Bob. I was an OOP zealot for maybe a decade, and now I'm a Rust convert. The biggest "feature" of Rust is that it probably brought semi-functional concepts to the "OOP masses." I found that, with Rust, I spent far more time solving the problem at hand...
Instead of solving how I am going to solve the problem at hand ("Clean Coding"). What a fucking waste of time, my brain power, and my lifetime keystrokes[1].
I'm starting to see that OOP is more suited to programming literal business logic. The best use for the tool is when you actually have "Person", "Customer" and "Employee" entities that have to follow some form of business rules.
In contradiction to your "Uncle Sassy's" rules, I'm starting to understand where "Uncle Beck" was coming from:
1. Make it work.
2. Make it right.
3. Make it fast.
The amount of understanding that you can garner from make something work leads very strongly into figuring out the best way to make it right. And you shouldn't be making anything fast, unless you have a profiler and other measurements telling you to do so.
"Clean Coding" just perpetuates all the broken promises of OOP.
[1]: https://www.hanselman.com/blog/do-they-deserve-the-gift-of-y...
> 1. Understand your problem fully
> 2. Understand your constraints fully
These two fall under requirements gathering. It's so often forgotten that software has a specific purpose, a specific set of things it needs to do, and that it should be crafted with those in mind.
> 3. Understand not just where you are but where you are headed
And this is the part that breaks down so often. Because software is simultaneously so easy and so hard to change, people fall into traps both left and right, assuming some dimension of extensibility that never turns out to be important, or assuming something is totally constant when it is not.
I think the best advice here is that YAGNI, don't add functionality for extension unless your requirements gathering suggests you are going to need it. If you have experience building a thing, your spider senses will perk up. If you don't have experience building the thing, can you get some people on your team that do? Or at least ask them? If that is not possible, you want to prototype and fail fast. Be prepared to junk some code along the way.
If you start out not knowing any of these things, and also never junking any code along the way, what are the actual odds you got it right?
>Write code that takes the above 3 into account and make sensible decisions. When something feels wrong ... don't do it.
The problem is that people often need specifics to guide them when they're less experienced. Something that "feels wrong" is usually due to vast experience being incorporated into your subconscious aesthetic judgement. But you can't rely on your subconscious until you've had enough trials to hone your senses. Hard rules can be, and often are, overapplied, but it's usually better than the opposite case of someone without good judgement attempting to make unguided judgement calls.
> and make sensible decisions
well there goes the entire tech industry
My last company was very into Clean Code, to the point where all new hires were expected to do a book club on it.
My personal take away was that there were a few good ideas, all horribly mangled. The most painful one I remember was his treatment of the Law of Demeter, which, as I recall, was so shallow that he didn't even really thoroughly explain what the law was trying to accomplish. (Long story short, bounded contexts don't mean much if you're allowed to ignore the boundaries.) So most everyone who read the book came to earnestly believe that the Law of Demeter is about period-to-semicolon ratios, and proceeded to convert something like
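    // (representative, invented example)
    auto balance = order.getCustomer().getWallet().getBalance();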
into
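    auto customer = order.getCustomer();
    auto wallet   = customer.getWallet();
    auto balance  = wallet.getBalance();
    // same transitive coupling as before - just a better
    // period-to-semicolon ratio, with more lines to read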
and somehow convince themselves that doing this was producing tangible business value, and congratulate themselves for substantially improving the long-term maintainability of the code.
Meanwhile, violations of the actual Law of Demeter ran rampant. They just had more semicolons.
On that note, I've never seen an explanation of Law of Demeter that made any kind of sense to me. Both the descriptions I read and the actual uses I've seen boiled down to the type of transformation you just described, which is very much pointless.
> Long story short, bounded contexts don't mean much if you're allowed to ignore the boundaries.
I'd like to read more. Do you know of a source that covers this properly?
I love how this is clearly a contextual recommendation. I'm not a software developer, but a data scientist. In pandas, writing your manipulations in this chained-methods fashion is highly encouraged, IMO. It's even called "pandorable" code
Every time I see a builder pattern, I see a failure to adopt modern programming languages. Use named parameters, for f*ck's sake!
I think following some ideas in the book but ignoring others, like the ones applicable to the Law of Demeter, can be a recipe for a mess. The book is very opinionated, but if followed well I think it can produce pretty dead simple code. But at the same time, just like with any coding, experience plays massively into how well code is written. Code can be written well when using his methods or when ignoring his methods, and it can be written badly when trying to follow some of his methods or when not using his methods at all.
>his treatment of the Law of Demeter, which, as I recall, was so shallow that he didn't even really even thoroughly explain what the law was trying to accomplish.
oof. I mean, yeah, at least explain what the main thing you’re talking about is about, right? This is a pet peeve.
wow this is a nightmare
> It's OK to duplicate code in places where the abstractions are far enough apart that the alternative is worse.
I don't recall where I picked it up from, but the best advice I've heard on this is a "Rule of 3". You don't have a "pattern" to abstract until you reach (at least) three duplicates. ("Two is a coincidence, three is a pattern. Coincidences happen all the time.") I've found it can be a useful rule of thumb to prevent "premature abstraction" (an understandable relative of "premature optimization"). It is surprising sometimes how much you find out about the abstraction only when you reach that third duplicate (variables or control flow decisions that seemed constant in two places, for instance; or a higher level idea of why the code is duplicated that isn't clear from two very far apart points but is clearer when you can "triangulate" what their center is).
I don't hate the rule of 3. But I think it's missing the point.
You want to extract common code if it's the same now, and will always be the same in the future. If it's not going to be the same and you extract it, you now have the pain of making it do two things, or splitting. But if it is going to be the same and you don't extract it, you have the risk of only updating one copy, and then having the other copy do the wrong thing.
For example, I have a program where one component gets data and writes it to files of a certain format in a certain directory, and another component reads those files and processes the data. The code for deciding where the directory is, and what the columns in the files are, must be the same; otherwise the programs cannot do their job. Even though there are only two uses of that code, it makes sense to extract it.
Once you think about it this way, you see that extraction also serves a documentation function. It says that the two call sites of the shared code are related to each other in some fundamental way.
Taking this approach, I might even extract code that is only used once! In my example, if the files contain dates, or other structured data, then it makes sense to have the matching formatting and parsing functions extracted and placed right next to each other, to highlight the fact that they are intimately related.
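For instance (a sketch of that example, with invented names):

    #include <cstdio>
    #include <string>

    // Shared by the writer and the reader - extracted even though there
    // are only two users, because the two sides *must* agree.
    inline std::string spoolDir() { return "/var/spool/myjob"; }

    // The matching pair lives side by side: touching one immediately
    // confronts you with the other.
    inline std::string formatDate(int y, int m, int d) {
        char buf[11];
        std::snprintf(buf, sizeof buf, "%04d-%02d-%02d", y, m, d);
        return buf;
    }
    inline bool parseDate(const std::string& s, int& y, int& m, int& d) {
        return std::sscanf(s.c_str(), "%d-%d-%d", &y, &m, &d) == 3;
    }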
> “premature abstraction”
Also known as, “AbstractMagicObjectFactoryBuilderImpl” that builds exactly one (1) factory type that generates exactly (1) object type with no more than 2 options passed into the builder and 0 options passed into the factory. :-)
The Go proverb is "A little copying is better than a little dependency." Also, don't deduplicate "text" just because it looks the same; deduplicate implementations if they match in both mechanism (what they do) and semantic usage (why they're called). Sometimes the same thing is done with different intents, which can naturally diverge, and the premature deduplication becomes debt.
I'm coming to think that the rule of three is important within a fairly constrained context, but that other principle is worthwhile when you're working across contexts.
For example, when I did work at a microservices shop, I was deeply dissatisfied with the way the shared utility library influenced our code. A lot of what was in there was fairly throw-away and would not have been difficult to copy/paste, even to four or more different locations. And the shared nature of the library meant that any change to it was quite expensive. Technically, maybe, but, more importantly, socially. Any change to some corner of the library needed to be negotiated with every other team that was using that part of the library. The risk of the discussion spiraling away into an interminable series of bikesheddy meetings was always hiding in the shadows. So, if it was possible to leave the library function unchanged and get what you needed with a hack, teams tended to choose the hack. The effects of this phenomenon accumulated, over time, to create quite a mess.
> I don't recall where I picked up from, but the best advice I've heard on this is a "Rule of 3"
I know this as AHA = Avoid Hasty Abstractions:
- https://kentcdodds.com/blog/aha-programming
- https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
> It's OK to duplicate code in places where the abstractions are far enough apart that the alternative is worse.
Something I’ve mentioned to my direct reports during code reviews: Sometimes, code is duplicated because it just so happens to do something similar.
However, these are independent widgets and changes to one should not affect the other; in other words, not suitable for abstraction.
This type of reasoning requires understanding the problem domain (i.e., use-case and the business functionality ).
I've gone through Java code where I need to open 15 different files, with one-line pieces of code, just to find out it's a "hello world" class.
I like abstraction as much as the next guy, but this is closer to obfuscation than abstraction.
At a previous company, there was a Clean Code OOP zealot. I heard him discussing with another colleague about the need to split up a function because it was too long (it was 10 lines). I said, from the sidelines, "yes, because nothing enhances readability like splitting a 10 line function into 10, 1-line functions". He didn't realize I was being sarcastic and nodded in agreement that it would be much better that way.
Spaghetti code is bad, but lasagna code is just as bad IMO.
There seems to be a lot of overlap between the Clean Coders and the Neo Coders [0]. I wish we could get rid of both.
[0] People who strive for "The One" architecture that will allow any change no matter what. Seriously, abstraction out the wazoo!
Honestly. If you're getting data from a bar code scanner and you think, "we should handle the case where we get data from a hot air balloon!" because ... what if?, you should retire.
I like to say "the machine that does everything, does nothing".
The problem is that `partial` in C# should never even have been considered a "solution" for writing small, maintainable classes. AFAIK partial was introduced for code-behind files, not to structure human-written code.
Anyways, you are not alone with that experience - a common mistake I see, no matter the language or framework, is that people fall for the fallacy that "separation into files" is the same as "separation of concerns".
Seriously? That's an abuse of partial and just a way of following the rules without actually following them. That code must have been fun to navigate...
Many years ago I worked on a project that had a hard "no hard coded values" rule, as requested by the customer. The team routinely wrote the equivalent of
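    // (representative reconstruction)
    const int    TEN   = 10;
    const double SIXTY = 60.0;

    int gridWidth() { return TEN; }   // no literal in sight, but the name
                                      // just restates the value - change
                                      // TEN to 11 and every use becomes a lie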
And I couldn’t get my manager to understand why this was a problem.
Can I please forward your contact info to my developer? Maybe you can do a better job convincing him haha ;)
> where I have to look through 10-15 files to see the full class
The Magento 2 codebase is a good example of this. It's both well written and horrible at the same time. Everything is so spread out into constituent technical components, that the code loses the "narrative" of what's going on.
I started OOP in '96 and I was never able to wrap my head around the code these "Clean Code" zealots produced.
Case in point: Bob Martin's "Video Store" example.
My best guess is that clean code, to them, meant as little code on the screen as possible, not necessarily "intention revealing" code either; instead, everything is abstracted until it looks like it does nothing.
I have had the experience of trying to understand how a feature in a C++ project worked (both Audacity and Aegisub I think) only to find that I actually could not find where anything was implemented, because everything was just a glue that called another piece of glue.
Also sat in their IRC channel for months and the lead developer was constantly discussing how he'd refactor it to be cleaner but never seemed to add code that did something.
SOLID code is a very misleading name for a technique that seems to shred the code into confetti.
I personally don't feel all that productive spending like half my time just navigating the code rather than actually reading it, but maybe it's just me.
People to people dealt this fate ...
What I find most surprising is that most developers do try to obey the "rules". Code containing even minuscule duplication must be DRYed; everyone agrees that code must be clean and professional.
Yet it is never enough: bugs still show up, and the stuff that was written by others is always bad.
I am starting to think that 'Uncle Bob' and the 'Clean Code' zealots are actually harmful, because they prevent people from taking two steps back and thinking about what they are doing: making microservices/components/classes/functions that end up never being reused, and making DRY the holy grail.
Personally I am YAGNI > DRY and a lot of times you are not going to need small functions or magic abstractions.
I think the problem is not the book itself, but people thinking that all the rules apply to all the code, all the time. A length restriction is interesting because it makes you think about whether you should split your function into more than one, as you might be doing too much in one place. Now, if splitting will make things worse, then just don't.
In C# and .NET specifically, we find ourselves with a plethora of services when each is kept "human-readable" and short.
A service has 3 "helper" services it calls, which may, in turn, have helper services, or worse, depend on a shared repo project.
The only solution I have found is to move these helpers into their own project, and mark the helpers as internal. This achieves 2 things:
1. The "sub-services" are not confused as stand-alone and only the "main/parent" service can be called. 2. The "module" can now be deployed independently if micro-services ever become a necessity.
I would like feedback on this approach. I do honestly think files over 100 lines long are unreadable trash, and we have achieved a lot by re-using modular services.
We are 1.5 years into a project and our code re-use is sky-rocketing, which allows us to keep errors low.
Of course, a lot of dependencies also make testing difficult, but allow easier mocks if there are no globals.
>I would like feedback on this approach. I do honestly thing files over 100 lines long are unreadable trash
Dunno if this is the feedback you are after, but I would try to not be such an absolutist. There is no reason that a great 100 line long file becomes unreadable trash if you add one line.
Partial classes are an ugly hack to mix human- and machine-generated source code. IMHO they should be avoided.
> I've had developers use the 'partial' feature in C# to meet Martin's length restrictions
That is not the fault of this book or any book. The problem is people treating the guidelines as rituals instead of understanding their purpose.
What do you say to convince someone? It’s tricky to review a large carefully abstracted PR that introduces a bunch of new logic and config with something like: “just copy paste lol”
... and here I was thinking I was alone!
Sometimes you really just do need a 500 line function.
Yes. It's the first working implementation, written while good boundaries are not yet known. After a while the problem becomes familiar, and natural conceptual boundaries arise that lead to 'factoring', which shouldn't require 'refactoring' because you prematurely guessed the wrong boundaries.
I'm all for the 100-200 line working version--can't say I've had a 500. I did once have a single SQL query that was about 2 full pages pushing the limits of DB2 (needed multiple PTFs just to execute it)--the size was largely from heuristic scope reductions. In the end, it did something in about 3 minutes that had no previous solution.
Nah mate, you never do. Nor 500 1-liners.
Yes, let's please do this. I'm tired of this book being brought up at work.
My clean code book:
* Put logic closest to where it needs to live (feature folders)
* WET (Write everything twice), figure out the abstraction after you need something a 3rd time
* Realize there's no such thing as "clean code"
> WET (Write everything twice), figure out the abstraction after you need something a 3rd time
so much this. it is _much_ easier to refactor copy pasta code than to untangle a mess of "clean code abstractions" for things that aren't even needed _once_. Premature Abstraction is the biggest problem in my eyes.
Write Code. Mostly functions. Not too much.
I think where DRY trips people up is when you have what I call "incidental repetition". Basically, two bits of logic seem to do exactly the same thing, but the contexts are slightly different. So you make a nice abstraction that works well until you need to modify the functionality in one context and not the other...
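A tiny, entirely hypothetical illustration:

    static class Weights
    {
        // Incidental repetition: identical formulas today, different meanings.
        public static decimal ShippingWeight(decimal itemKg) => itemKg * 1.10m; // packaging allowance
        public static decimal BilledWeight(decimal itemKg) => itemKg * 1.10m;   // carrier surcharge

        // Merge them into one AdjustedWeight() and it looks DRY -- until the
        // carrier surcharge moves to 15% and the shared abstraction must grow
        // a flag or be split apart again.
    }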
> it is _much_ easier to refactor copy pasta code
So long as it remains identical. Refactoring almost identical code requires lots of extremely detailed staring to determine whether or not two things are subtly different. Especially if you don't have good test coverage to start with.
> Write Code. Mostly functions. Not too much.
Just wanted to appreciate the nod to good nutritional advice here ("Eat food. Mostly plants. Not too much"), well done
There's a problem with being overly zealous. It's entirely possible to write bad code in either direction, overly DRY or copy-paste galore. I think we are prone to these zealous rules because they are concrete. We want an "objective" measure to judge whether something is good or not.
DRY and WET are terms often used as objective measures of implementations, but that doesn't mean that they are rock solid foundations. What does it mean for something to be "repeated"? Without claiming to have TheGreatAnswer™, some things come to mind.
Chaining methods can be very expressive, easy to follow, and easy to maintain. They also lead to a lot of repetition. In an effort to be "DRY", some might embark on a misguided effort to combine them, replacing the readable chains with a single consolidated helper, along the lines of the sketch below.
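(A hedged C# reconstruction; the original snippets were lost, and these names are invented:)

    using System.Collections.Generic;
    using System.Linq;

    record User(string Name, bool IsAdmin);

    static class Report
    {
        static void Build(List<User> users)
        {
            // Expressive, mildly repetitive chains: easy to read, easy to change.
            var admins = users.Where(u => u.IsAdmin).OrderBy(u => u.Name).ToList();
            var others = users.Where(u => !u.IsAdmin).OrderBy(u => u.Name).ToList();

            // The misguided "DRY" consolidation: one opaque do-everything helper.
            var admins2 = FilterSortList(users, filter: "IsAdmin", sortBy: "Name");
        }

        // Now needs string arguments (and internal branching) to cover every
        // combination the chains expressed directly.
        static List<User> FilterSortList(List<User> users, string filter, string sortBy)
            => /* switch on strings, reflection, ... */ users;
    }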
This would be a bad idea, also known as Suck™.
But there may equally be situations where consolidation makes sense. For example, if we're in an ORM helper class and we're always querying the database for an object with the same few lines, then it might make sense to consolidate that repeated query into a single helper, as sketched below.
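(Again a hypothetical, EF-style reconstruction; `db.Users`, the entity, and the exception choice are invented:)

    // Repeated at many call sites:
    var user = db.Users.SingleOrDefault(u => u.Id == id);
    if (user == null)
        throw new KeyNotFoundException($"No user with id {id}.");

    // Consolidated once in the ORM helper class:
    public User GetRequiredUser(int id) =>
        db.Users.SingleOrDefault(u => u.Id == id)
        ?? throw new KeyNotFoundException($"No user with id {id}.");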
My $0.02:
Don't needlessly copy-paste that which is abstractable.
Don't over abstract at the cost of simplicity and flexibility.
Don't be a zealot.
>it is _much_ easier to refactor copy pasta code
I totally agree, assuming that there will be time to get to the second pass of the "write everything twice" approach. Some of my least favorite refactoring work has been on older code that was liberally copy-pasted by well-intentioned developers expecting a chance to come back through later, but who never got the chance. All too often the winds of corporate decision making will change and send attention elsewhere at the wrong moment, and all those copy-pasted bits will slowly but surely drift apart as unfamiliar new developers come through making small tweaks.
I worked on a small team with a very "code bro" culture. Not toxic, but definitely making non-PC jokes. We would often say "Ask your doctor about Premature Abstractuation" or "Bad news, dr. says this code has abstractual dysfunction" in code reviews when someone would build an AbstractFactoryFactoryTemplateConstructor for a one-off item.
When we got absorbed by a larger team and were going to have to "merge" our code review / git history into a larger org's repos, we learned that a sister team had gotten in big trouble with the language cops in HR when they discovered similar language in their git commit history. This brings back memories of my team panicking over trying to rewrite a huge amount of git history and code review material to sanitize our language before we were caught too.
> it is _much_ easier to refactor copy pasta code,
It's easy to refactor if it's nondivergent copypasta and you do it everywhere it is used, no later than the third iteration.
If the refactoring gets delayed, the code diverges: different bugs are noticed and fixed (or the same bug is noticed and fixed in different ways) in different copies, there are dozens of instances across the code base (possibly in different projects, because it was copypasta'd across projects rather than refactored into a reusable library), and in many cases the code has gotten intermixed with code addressing other concerns...
> Write Code. Mostly functions. Not too much.
Think about data structures (types) first. Mostly immutable structures. Then add your functions working on those structures. Not too many.
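One hedged C# reading of that advice (types invented): immutable records first, then a few functions over them.

    using System;

    public sealed record Money(decimal Amount, string Currency);

    public static class MoneyOps
    {
        // Functions return new values instead of mutating the old ones.
        public static Money Add(Money a, Money b) =>
            a.Currency == b.Currency
                ? a with { Amount = a.Amount + b.Amount }
                : throw new InvalidOperationException("Currency mismatch.");
    }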
"Write Code. Mostly Functions. Not too much" made my day and is excellent advice.
I think this ties in to something I've been thinking, though it might be project specific.
Good code should be written to be easy to delete.
'Clever' abstractions work against this. We should be less precious about our code and realise it will probably need to change beyond all recognition multiple times. Code should do things simply so the consequences of deleting it are immediately obvious. I think your recommendations fit with this.
>code should be written to be easy to delete
This article tends to make the rounds on here every once in a while: https://programmingisterrible.com/post/139222674273/how-to-w...
Aligns with my current meta-principle, which is that good code is malleable (easily modified, which includes deletion). A lot of design principles simply describe this principle from different angles. Readable code is easy to modify because you can understand it. Terse code is more easily modified because there’s less of it (unless you’ve sacrificed readability for terseness). SRP limits the scope of changes and thus enhances modifiability. Code with tests is easier to modify because you can refactor with less fear. Immutability makes code easier to modify because you don’t have to worry about state changes affecting disparate parts of the program.
Etc... etc...
(Not saying that this is the only quality of good code or that you won’t have to trade some of the above for performance or whatnot at times).
The unpleasant implication of this is that code has a tendency to become worse over time: the code that is good enough to be easy to delete or change gets deleted or changed, while the code that is too bad to be touched remains.
https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...
> * WET (Write everything twice), figure out the abstraction after you need something a 3rd time
There are two opposite situations. One is when several things are viewed as one thing while they're actually different (too much abstraction), and another, where a thing is viewed as different things, when it's actually a single one (when code is just copied over).
In my experience, the best way to solve this is to better analyse and understand the requirements. Do these two pieces of code look the same because they actually mean the same thing in the product? Or do they just happen to look the same at this particular moment in time, and can continue to develop in completely different directions as the product grows?
Solving the former is generally way uglier/more obnoxious IMO than solving the latter, esp. if you were not the person who designed the former.
I read Clean Code in 2010, and trying out and applying some of the principles really helped make my code more maintainable. Now, over 10 years later, I have come to realize that you should not set up too many rules for how to structure and write code. It is like forcing all authors to apply the same writing style, or all artists to paint with the exact same technique. With that analogy in mind, I think one of the biggest contributors to messy code is having a lot of developers, all with different preferences, working in the same code base. Just imagine 100 different writers trying to write a book together; this is the challenge we are trying to address.
I'm not sure that's really true. Any publication with 100 different writers almost certainly has some kind of style guide that they all have to follow.
> WET (Write everything twice)
In practice (time pressure) you might end up duplicating it many times, at which point it becomes difficult to refactor.
If it's really abstractable it shouldn't be difficult to refactor. It should literally be a substitution. If it's not, then you have varied cases that you'd have to go back and tinker with the abstraction to support.
It's a similar design and planning principle to building sidewalks. You have buildings but you don't know exactly the best paths between everything and how to correctly path things out. You can come up with your own design but people will end up ignoring them if they don't fit their needs. Ultimately, you put some obvious direct connection side walks and then wait to see the paths people take. You've now established where you need connections and how they need to be formed.
I do a lot of prototyping work, and if I had to sit down and think out a clean abstraction every time I wanted to get to a functional prototype, I'd never have a functional prototype; plus I'd waste a lot of cognitive capacity on an abstraction instead of solving the problem my code is addressing. It's best, from my experience, to save that time and write messy code, but tuck in budget to refactor later (the key is you have to actually refactor later, not just say you will).
Once you've built your prototype, iterated on it several times had people break designs forcing hacked out solutions, and now have something you don't touch often, you usually know what most the product/service needs to look like. You then abstract that out and get 80-90% of what you need if there's real demand.
The expanded features beyond that can be costly if they require significant redesign, but at that point you hopefully have a stable enough product to warrant continued investment to refactor. If it doesn't, you've saved yourself a lot of time and energy trying to create a good abstract design, something that tends to fail multiple times at the early stages anyway. There's a balance point of knowing when to take on technical debt, when to pay it off, and when to write it off.
Again, the critical trick is that you have to actually pay off the tech debt when that time comes. The product investor can't look on bright-eyed and linearly extrapolate the progress so far, thinking they saved a boatload of capital; they have to understand that shortcuts were taken, and the rationale was to fix them if serious money came along, or chuck them in the bin if not.
WET is great until JSON token parsing breaks and a junior dev fixes it in one place and then I am fixing the same exact problem somewhere else and moving it into a shared file. If it's the exact same functionality, move it into a service/helper.
How do you deal with other colleagues that have all the energy and time to push for these practices and I feel makes things worse than the current state?
Explain that the wrong abstraction makes code more complicated than copy-paste and that before you can start factoring out common code you need to be sure the relationship is fundamental and not coincidental.
> figure out the abstraction after you need something a 3rd time
That's still too much of a "rule".
Whenever I feel (or know) two functions are similar, the factors that determine if I should merge them:
- I see significant benefit to doing so, usually the benefit of a utility that saves writing the same thing in the future, or debugging the same/similar code repeatedly.
- How likely the code is to diverge. Sometimes I just mark things for de-duping, but leave it around a while to see if one of the functions change.
- The function is big enough it cannot just be in-lined where it is called, and the benefit of de-duplication is not outweighed by added complexity to the call stack.
repeat after me:
Document.
Your.
Shit.
everything else can do one. Just fucking write documentation as if you're the poor bastard trying to maintain this code with no context or time.
Documentation is rarely adequately maintained, and nothing enforces that it stay accurate and maintained.
Comments in code can lie (they're not functional); can be misplaced (in most languages, they're not attached to the code they document in any enforced way); are most-frequently used to describe things that wouldn't require documenting if they were just named properly; are often little more than noise. Code comments should be exceedingly-rare, and only used to describe exception situations or logic that can't be made more clear through the use of better identifiers or better-composed functions.
External documentation is usually out-of-sight, out-of-mind. Over time, it diverges from reality, to the point that it's usually misleading or wrong. It's not visible in the code (and this isn't an argument in favor of in-code comments). Maintaining it is a burden. There's no agreed-upon standard for how to present or navigate it.
The best way to document things is to name identifiers well, write functions that are well-composed and small enough to understand, stick to single-responsibility principles.
API documentation is important and valuable, especially when your IDE can provide it readily at the point of use. Whenever possible, it should be part of the source code in a formal way, using annotations or other mechanisms tied to the code it describes. I wish more languages would formally include annotation mechanisms for this specific use case.
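For what it's worth, C# has exactly this mechanism; a small hypothetical example of in-source API documentation (XML doc comments) that IDEs surface at the point of use:

    using System.IO;

    public class Store
    {
        /// <summary>Commits all pending changes to durable storage.</summary>
        /// <param name="flushCache">Whether to also drop the in-memory cache.</param>
        /// <exception cref="IOException">Thrown when the disk write fails.</exception>
        public void CommitPending(bool flushCache)
        {
            // implementation elided
        }
    }

Because the documentation is attached to the member it describes, it at least has a fighting chance of being updated in the same review that changes the member.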
> WET
One of the most difficult comments to argue with in code reviews: "let's make it generic in case we need it some place else". First of all, the chances that we need it some place else aren't exactly high, unless you are writing a library and code explicitly designed to be shared. And even if such a need arises, the chances of getting it right generalizing from one example are slim.
Regarding the book though, I have participated in one of the workshops with the author, and he seemed to be in favor of WET and against "architecting" levels of abstraction before having concrete examples.
> Realize there's no such thing as "clean code"
You can disagree over what exactly is clean code. But you will learn to distinguish what dirty code is when you try to maintain it.
As a person that has had to maintain dirty code over the years, hearing someone saying dirty code doesn't exist is really frustrating. No one wants to clean up your code, but doing it is better than allowing the code to become unmaintainable; that's why people bring up that book. If you do not care about what clean code is, stop making life difficult for the people who do.
> hearing someone saying dirty code doesn't exist is really frustrating
Not sure why you're being downvoted, but as an unrelated aside, the quote you're responding to literally did not say this.
> that's why people bring up that book
I think the point is that following that book does not really lead to Clean Code.
I think it's more that clean code doesn't exist because there's no objective measure of it (and the services that claim to provide one are just as dangerous as Clean Code, the book); anyone can come along and find something about the code that could be tidied up. And legacy is legacy; it's a different problem space from the one a greenfield project exists in.
> As a person that has to maintain dirty code
This is a strange credential to present and then use as a basis to be offended. Are you saying that you have dirty code and have to keep it dirty?
This is what I'm doing even while creating new code. There are a few instances, for example, where the "execution" comes down to a single argument: one of "activate", "reactivate", and "deactivate". But I've made them into three distinct, separate code paths so that I can work error and feedback messages into everything without adding complexity via arguments.
I mean yes it's more verbose, BUT it's also super clear and obvious what things do, and they do not leak the underlying implementation.
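The shape, roughly (a hedged C# sketch; the action names come from the comment above, everything else is invented):

    public sealed record Account(int Id);

    public sealed class AccountActions
    {
        // Instead of one Execute(string mode) that branches internally,
        // three distinct entry points, each owning its own feedback and
        // error messages:
        public string Activate(Account a)   => $"Account {a.Id} activated.";
        public string Reactivate(Account a) => $"Account {a.Id} reactivated; welcome back.";
        public string Deactivate(Account a) => $"Account {a.Id} deactivated.";
    }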
Your name WET is kind of funny, but it is usually referred to as the rule of 3: https://en.m.wikipedia.org/wiki/Rule_of_three_(computer_prog...
I’ve never heard the term WET before but that’s exactly what I do.
The other key thing, I think, is not to over-engineer abstractions you don't need yet, but to try and leave 'seams' where it's obvious how to tease code apart if you need to start building abstractions.
My experience interviewing recently a number of consultants with only a few years experience was the more they mumbled clean code the less they knew what they were doing.
Ah hahha. I love WET. I always say the only way to write something correctly is to re-write it.
That's not what WET means. The GP is saying you shouldn't isolate logic in a function until you've cut-and-pasted the logic in at least two places and plan to do so in a third.
> Put logic closest to where it needs to live (feature folders)
Can you say more about this?
I think I may have stumbled on a similar insight myself. In a side project (a roguelike game), I've been experimenting with a design that treats features as first-class, composable design units. Here is a list of the subfolder called game-features in the source tree:
An extract from the docstring of the entire game-feature package:
The project is still very much work-in-progress (procrastinating on HN doesn't leave me much time to work on it), and most of the above features are nowhere near completion, but I found the design to be mostly sound. Each game feature provides code that implements its own concerns, and exports various functions and data structures for other game features to use. This is an inversion of traditional design, and is more similar to the ECS pattern, except I bucket all conceptually related things in one place. ECS Components and Systems, utility code, event definitions, etc. that implement a single conceptual game aspect live in the same folder. Inter-feature dependencies are made explicit, and game "superstructure" is designed to allow GFs to wire themselves into appropriate places in the event loop, datastore, etc. - so in game startup code, I just declare which features I want to have enabled.
(Each feature also gets its set of integration tests that use synthetic scenarios to verify a particular aspect of the game works as I want it to.)
One negative side effect of this design is that the execution order of handlers for any given event is hard to determine from code. That's because, to have game features easily compose, GFs can request particular ordering themselves (e.g. "death" can demand its event handler to be executed after "destructibility" but before "log") - so at startup, I get an ordering preference graph that I reconcile and linearize (via topological sorting). I work around this and related issues by adding debug utilities - e.g. some extra code that can, after game startup, generate a PlantUML/GraphViz picture of all events, event handlers, and their ordering.
(I apologize for a long comment, it's a bit of work I always wanted to talk about with someone, but never got around to. The source of the game isn't public right now because I'm afraid of airing my hot garbage code.)
I'd be interested in how you attempt this. Is it all in lisp?
It might be hard to integrate related things, e.g. physical simulation/kinematics <- related to collisions, and maybe sight/hearing <- related to rendering; Which is all great if information flows one way, as a tree, but maybe complicated if it's a graph with intercommunication.
I thought about this before, and figured maybe the design could be initially very loose (and inefficient), but then a constraint-solver could wire things up as needed, i.e. pre-calculate concerns/dependencies.
Another idea, since you mention "logs" as a GF: AOP, using "join points" to declaratively annotate code. This better handles code that is less of a "module" (appropriate for functions and libraries) and more of a cross-cutting "aspect" like logging. This can also get hairy though: could you treat "(bad-path) exception handling" as an aspect? What about "security"?
Sounds interesting. Is the game open source? Published anywhere?
I've gone down roads similar to this. Long story short - the architecture solves for a lower priority class of problem, w/r to games, so it doesn't pay a great dividend, and you add a combination of boilerplate and dynamism that slows down development.
Your top issue in the runtime game loop is always with concurrency and synchronization logic - e.g. A spawns before B, if A's hitbox overlaps with B, is the first frame that a collision event occurs the frame of spawning or one frame after? That's the kind of issue that is hard to catch, occurs not often, and often has some kind of catastrophic impact if handled wrongly. But the actual effect of the event is usually a one-liner like "set a stun timer" - there is nothing to test with respect to the event itself! The perceived behavior is intimately coupled to when its processing occurs and when the effects are "felt" elsewhere in the loop - everything's tied to some kind of clock, whether it's the CPU clock, the rendered frame, turn-taking, or an abstracted timer. These kinds of bugs are a matter of bad specification, rather than bad implementation, so they resist automated testing mightily.
The most straightforward solution is, failing pure functions, to write more inline code(there is a John Carmack posting on inline code that I often use as a reference point). Enforce a static order of events as often as possible. Then debugging is always a matter of "does A happen before B?" It's there in the source code, and you don't need tooling to spot the issue.
The other part of this is, how do you load and initialize the scene? And that's a data problem that does call for more complex dependency management - but again, most games will aim to solve it statically in the build process of the game's assets, and reduce the amount of game state being serialized to save games, reducing the complexity surface of everything related to saves(versioning, corruption, etc). With a roguelike there is more of an impetus to build a lot of dynamic assets(dungeon maps, item placements etc.) which leads to a larger serialization footprint. But ultimately the focus of all of this is on getting the data to a place where you can bring it back up and run queries on it, and that's the kind of thing where you could theoretically use SQLite and have a very flexible runtime data model with a robust query system - but fully exploiting it wouldn't have the level of performance that's expected for a game.
Now, where can your system make sense? Where the game loop is actually dynamic in its function - i.e. modding APIs. But this tends to be a thing you approach gradually and grudgingly, because modders aren't any better at solving concurrency bugs and they are less incentivized to play nice with other mods, so they will always default to hacking in something that stomps the state, creating intermittent race conditions. So in practice you are likely to just have specific feature points where an API can exist(e.g. add a new "on hit" behavior that conditionally changes the one-liner), and those might impose some generalized concurrency logic.
The other thing that might help is to have a language that actually understands that you want to do this decoupling and has the tooling built in to do constraint logic programming and enforce the "musts" and "cannots" at source level. I don't know of a language that really addresses this well for the use case of game loops - it entails having a whole general-purpose language already and then also this other feature. Big project.
I've been taking the approach instead of aiming to develop "little languages" that compose well for certain kinds of features - e.g. instead of programming a finite state machine by hand for each type of NPC, devise a subcategory of state machines that I could describe as a one-liner, with chunks of fixed-function behavior and a bit of programmability. Instead of a universal graphics system, have various programmable painter systems that can manipulate cursors or selections to describe an image. The concurrency stays mostly static, but the little languages drive the dynamic behavior, and because they are small, they are easy to provide some tooling for.
I think one should always be careful not to throw out the baby with the bathwater[0].
Do I force myself to follow every single crazy rule in Clean Code? Heck no. Some of them I don't agree with. But do I find myself to be a better coder because of what I learned from Bob Martin? Heck yes. Most of the points he makes are insightful and I apply them daily in my job.
Being a professional means learning from many sources and knowing that there's something to learn from each of them- and some things to ignore from each of them. It means trying the things the book recommends, and judging the results yourself.
So I'm going to keep recommending Clean Code to new developers, in the hopes that they can learn the good bits, and learn to ignore the bad bits. Because so far, I haven't found a book with more good bits (from my perspective) and fewer bad bits (from my perspective).
[0]https://en.wikipedia.org/wiki/Don%27t_throw_the_baby_out_wit...
I'm completely with you here. Until I read Clean Code, I could never really figure out why my personal projects were so unreadable a year later but the code I read at work was so much better even though it was 8 years old. Sure, I probably took things too far for a while and made my functions too small, or my classes too small, or was too nitpicky on code reviews. But I started to actually think about where I should break a function. I realized that a good name could eliminate almost all the comments I had been writing before, leaving only comments that were actually needed. And as I learned how to break down my code, I was forced to learn how to use my IDE to navigate around. All of a sudden new files weren't a big deal, and that opened up a whole new set of changes that I could start making.
I see a lot of people in here acting like all the advice in Clean Code is obviously true or obviously false, and they claim to know how to write a better book. But, like you, I will continue to recommend Clean Code to new developers on my team. It's the fastest way (that I've found so far, though I see other suggestions in the comments here) to get someone to transition from writing "homework code" (that never has to be revisited) to writing maintainable code. Obviously, there are bad parts of Clean Code, but if that new dev is on my team, I'll talk through why certain parts are less useful than others.
I would recommend Code Complete 2 over Clean Code.
Por que no los dos? Read them both. Take the good parts of each. Ignore the bad parts of each.
Perfect. It's definitely my personal impression, but while reading the post it looks like the author was looking for a "one size fits all" book and was disappointed they did not find it.
And to be honest, that book will never exist. All knowledge contributes to growing as a professional; just make sure to understand it, discuss it, and use it (or not) for a real reason, not just because it's in book A or B.
It's not like people need to choose one book and follow it blindly for the rest of their lives. Read more books :D
In my opinion the problem is not that the rules are not one size fits all, but that they are so misguided that Martin himself couldn't come up with a piece of code where they would lead to a good result.
One mistake I think people like the author make is treating these books as some sort of bible that you must follow to the letter. People who evangelised TDD were the worst offenders of this. "You HAVE to do it like this, it's what the book says!"
You're not supposed to take it literally for every project, these are concepts that you need to adapt to your needs. In that sense I think the book still holds up.
For me this maps so clearly to the Dreyfus model of skill acquisition. Novices need strict rules to guide their behavior. Experts are able to use intuition they have developed. When something new comes along, everyone seems like a novice for a little while.
The Dreyfus model identifies 5 skill levels:
Novice
Wants to achieve a goal, and is not particularly interested in learning. Requires context-free rules to follow. When something unexpected happens, will get stuck.
Advanced Beginner
Beginning to break away from fixed rules. Can accomplish tasks on own, but still has difficulty troubleshooting. Wants information fast.
Competent
Has developed a conceptual model of the task environment. Able to troubleshoot. Beginning to solve novel problems. Seeks out and solves problems. Shows initiative and resourcefulness. May still have trouble determining which details to focus on when solving a problem.
Proficient
Needs the big picture. Able to reflect on approach in order to perform better next time. Learns from experience of others. Applies maxims and patterns.
Expert
Primary source of knowledge and information in a field. Constantly looks for better ways of doing things. Writes books and articles and does the lecture circuit. Works from intuition. Knows the difference between irrelevant and important details.
> Primary source of knowledge and information in a field. Constantly look for better ways of doing things. Write books and articles and does the lecture circuit.
Meh. I'm probably being picky, but it doesn't surprise me that a Thought Leader would put themselves and what they do as Thought Leader in the Expert category. I see them more as running along a parallel track. They write books and run consulting companies and speak at conferences and create a brand, and then there are those of us who get good at writing code because we do it every day, year after year. Kind of exactly the difference between sports commentators and athletes.
The problem is that the book presents things that are at best 60/40 issues as hard rules, which leads novices++ to follow them to the detriment of everything else.
Uncle Bob himself acts like it is a bible, so if you buy into the rest of his crap then you'll likely buy into that too.
If treated as guidelines, you are correct: Clean Code is only "eh" instead of garbage. But taken in the full context of how it is presented and intended to be taken by the author, it is damaging to the industry.
I've read his blog and watched his videos. While his attitude comes off as evangelical, his actual advice is very often "Do it when it makes sense", "There are exceptions - use engineering judgment", etc.
In no way is he treating his book as a bible.
Yup. I see the book as guide to a general goal, not a specific objective that can be defined. To actually reach that goal is sometimes completely impossible and in many other cases it introduces too much complexity.
However, in most cases heading towards that goal is a beneficial thing--you just have to recognize when you're getting too close and bogging down in complying with every detail.
I still consider it the best programming book I've ever read.
But that's what some people are doing - they take this book as the programming bible.
I understand that the people that follow Clean Code religiously are annoying, but the author seems to be doing the same thing in reverse: because some advice is nuanced or doesn't apply all the time then we should stop recommending the book and forget it altogether.
Sure, but it doesn't mean the book itself is bad. It's that beginners should be aware that what's "right" differs from project to project.
The whole point of "You HAVE to do it like this, it's what the book says!" is to sell more books or consulting.
I agree: Clean Coders and TDDers are cut from the same cloth.
Agreed. It’s one of the best books on programming there is. Like any book, probably 20% I don’t agree with. But 80% of it is gold.
FWIW the clean code approach led me to this pattern which has allowed me to build some seriously complex systems in JS/Node: https://ponyfoo.com/articles/action-pattern-clean-obvious-te....
I agree with the sentiment that you don't want to over abstract, but Bob doesn't suggest that (as far as I know). He suggests extract till you drop, meaning simplify your functions down to doing one thing and one thing only and then compose them together.
Hands down, one of the best bits I learned from Bob was the "your code should read like well-written prose." That has enabled me to write some seriously easy to maintain code.
> your code should read like well-written prose
That strikes me as being too vague to be of practical use. I suspect the worst programmers can convince themselves their code is akin to poetry, as bad programmers are almost by definition unable to tell the difference. (Thinking back to when I was learning programming, I'm sure that was true of me.) To be valuable, advice needs to be specific.
If you see a pattern of a junior developer committing unacceptably poor quality code, I doubt it would be productive to tell them "Try to make it read more like prose." Instead you'd give more concrete advice, such as choosing good variable names, or the SOLID principles, or judicious use of comments, or sensible indentation.
Perhaps I'm missing something though. In what way was the "code should read like well-written prose" advice helpful to you?
Specifically in relation to naming. I was caught up in the dogma of "things need to be short" (e.g., using silly variable names like getConf instead of getWebpackConfig). The difference is subtle, but that combined with reading my code aloud to see if it reads like a sentence ("prose") is helpful.
For example, using what I learned I read this code (https://github.com/cheatcode/nodejs-server-boilerplate/blob/...) as:
"This module is going to generate a password reset token. First, it's going to make sure we have an emailAddress as an input, then it's going to generate a random string which I'll refer to as token, and then I want to set that token on the user with this email address."
I'm in the "code should read like poetry" camp. Poetry is the act of conveying meaning that isn't completely semantic - meter and rhyme being the primary examples. In code, that can mean maintaining a cadence of variable names, use of whitespace that helps illuminate structure, or writing blocks or classes where the appearance of the code itself has some mapping to what it does. You can kludge a solution together, or craft a context in which the suchness of what you are trying to convey becomes clear in a narrative climax.
I like to think of the way Hemingway writes.
Code should be simple and tight and small. It should also, however, strive for an eighth grade reading level.
You shouldn't try to make your classes so small that you're abusing something like nested ternary operators which are difficult to read. You shouldn't try to break up your concepts so much that while the sentences are easy, the meaning of the whole class becomes muddled. You should stick with concepts everyone knows and not try to invent your own domain specific language in every class.
Less code is always more, right up until it becomes difficult to read; then you've gone too far. On the other hand, if you extract a helper method from a method which read fine to begin with, then you've made the code harder to read, not easier, because it's now bigger, with an extra concept. But if that was a horrible conditional with four clauses which you can express with a "NeedsFrobbing" method and a comment about it, then carry on (generating four methods from that conditional to "simplify" it is usually worse, though, due to the introduction of four concepts that could often be better addressed with just some judicious whitespace to separate them).
And I need to learn how to write English more like Hemingway, particularly before I've digested coffee. That last paragraph got away from me a bit.
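The NeedsFrobbing case above, sketched in C# (all names hypothetical):

    public sealed record Widget(bool IsDirty, int Age, bool Pinned, string? Owner);

    public static class Frobber
    {
        const int MaxAge = 30;

        public static void Process(Widget widget)
        {
            // Rather than the four-clause conditional inline, one name
            // that states the intent:
            if (NeedsFrobbing(widget))
                Frob(widget);
        }

        static bool NeedsFrobbing(Widget w) =>
            w.IsDirty && w.Age > MaxAge && !w.Pinned && w.Owner == null;

        static void Frob(Widget w) { /* ... */ }
    }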
Absolutely this. Code should tell a story, the functions and objects you use are defined by the context of the story at that level of description. If you have to translate between low-level operations to reconstruct the high level behavior of some unit of work, you are missing some opportunities for useful abstractions.
Coding at scale is about managing complexity. The best code is code you don't have to read because of well named functional boundaries. Natural language is our facility for managing complexity generally. It shouldn't be surprising that the two are mutually beneficial.
I tried to write code with small functions and was dissuaded from doing that at both my teams over the past few years. The reason is that it can be hard to follow the logic if it's spread out among several functions. Jumping back and forth breaks your flow of thought.
I think the best compromise is small summary comments at various points of functions that "hold the entire thought".
The point of isolating abstractions is that you don't have to jump back and forth. You look at a function and, from its contract and calling convention, you immediately know what it does. The specific details aren't relevant for the layer of abstraction you're looking at.
Because of well structured abstractions, thoughtful naming conventions, documentation where required, and extensive testing you trust that the function does what it says. If I'm looking at a function like commitPending(), I simply see writeToDisk() and move on. I'm in the object representation layer, and jumping down into the details of the I/O layers breaks flow by moving to a different level of abstraction. The point is I trust writeToDisk() behaves reasonably and safely, and I don't need to inspect its contents, and definitely don't want to inline its code.
If you find that you frequently need to jump down the tree from sub-routine to sub-routine to understand the high level code, then that's a definite code smell. Most likely something is fundamentally broken in your abstraction model.
> I think the best compromise is small summary comments at various points of functions that "hold the entire thought".
I think this is the best, honestly, it reaps most benefits of small single-use functions, without compromising readability or requiring jumping.
This is also what John Carmack recommends in his popular essay: http://number-none.com/blow/john_carmack_on_inlined_code.htm...
Check out the try/catch and logging pattern I use in the linked post. I added that specifically so I could identify where errors were occurring without having to guess.
When I get the error in the console/browser, the path to the error is included for me like "[generatePasswordResetToken.setTokenOnUser] Must pass value to $set to perform an update."
With that, I know exactly where the error is occurring and can jump straight into debugging it.
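Not the linked pattern verbatim, just a hedged C# sketch of the idea; the names mirror the example above:

    using System;

    public static class PasswordReset
    {
        public static void GenerateToken(string emailAddress)
        {
            try
            {
                SetTokenOnUser(emailAddress, Guid.NewGuid().ToString("N"));
            }
            catch (Exception ex)
            {
                // Tag the error with its logical call path so the log line
                // itself says where to start debugging.
                throw new InvalidOperationException(
                    $"[generatePasswordResetToken.setTokenOnUser] {ex.Message}", ex);
            }
        }

        private static void SetTokenOnUser(string emailAddress, string token)
        {
            if (string.IsNullOrEmpty(emailAddress))
                throw new ArgumentException("Must pass an emailAddress.");
            // persist the token against the user record here
        }
    }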
Nice! However, none of this is required for this endpoint. Here's why:
1. The connect action could be replaced by doing the connection once on app startup.
2. The validation could be replaced with middleware like express-joi.
3. The stripe/email steps should be asynchronous (ex: simple crons). This way, you create the user and that's it. If Stripe is down, or the email provider is down, you still create the user. If the server restarts while someone calls the endpoint, you don't end up with a user with invalid Stripe config. You just create a user with stripeSetup=false and welcomeEmailSent=false and have some crons that every 5 seconds query for these users and do their work. Also, ensure you query for false and not "not equal to true" here as it's not efficient.
Off topic but is connecting to Mongo on every API hit best practice? I abstract my connection to a module and keep that open for the life of the application.
> "your code should read like well-written prose."
This is a good assertion but ironically it's not Bob Martin's line. He was quoting Grady Booch.
Yes, that one did a lot for me too. Especially when business logic gets complicated, I want to be able to skip parts by roughly reading the meaning of a section without seeing the details.
One long stream of commands is OK to read if you are the author or already know what it should do. But otherwise it forces you to read too many irrelevant details on the way toward what you need.
Robert Martin and his aura always struck me as odd. In part because of how revered he always was at organizations I worked. Senior developers would use his work to end arguments, and many code reviews discussions would be judged by how closely they adhere to Clean Code.
Of course reading Clean Code left me more confused than enlightened due precisely to what he presents as good examples of Code. The author of the article really does hit the nail on the head about Martin's code style - it's borderline unreadable a lot of times.
Who the f. even is Robert Martin?! What has he built? As far as I am able to see he is famous and revered because he is famous and revered.
I think part of his appeal lies in his utter certainty that he is correct. This is also the problem with him.
He ran a consultancy and knew how to pump out books into a world of programmers that wanted books
I was around from the early 90s through to the early 2000s, when a lot of these ideas came about and slowly got morphed by consultants who were selling this stuff to companies as, essentially, religion. The nuanced thinking of a lot of the people who had the original core ideas is mostly lost.
It's a tricky situation. At the core of things there are some really good ideas, but the messaging by people like "uncle bob" seems to fail to communicate the mindset in a way that develops thinking programmers. Mainly because he, and people like Ron Jeffries, really didn't build anything serious once they became consultants and started giving out all these edicts. If you watched them on forums/blogs at the time, they were really not that good. There were lots of people who were building real things and had great perspectives, but their nuanced perspectives were never really captured in books, and it would be hard to, as it is more about the mentality of using ideas and principles, making good pragmatic choices, adapting things, and not being limited by "rules" -- about incorporating the essence of the ideas into your thinking processes.
So many of those people walked away from a lot of those communities when it morphed into "Agile" and started being dominated by the consultants.
10 or so years ago, when I first got into development, I looked to people like Martin for how I should write code.
But I had more and more difficulty reconciling bizarrely optimistic patterns with reality. This from the article perfectly sums it up:
>Martin says that functions should not be large enough to hold nested control structures (conditionals and loops); equivalently, they should not be indented to more than two levels.
Back then as now I could not understand how one person can make such confident and unambiguous statements about business logic across the spectrum of use cases and applications.
It's one thing to say how something should be written in ideal circumstances, it's another to essentially say code is spaghetti garbage because it doesn't precisely align to a very specific dogma.
This is the point that I have the most trouble understanding in critiques of Fowler, Bob, and all writers who write about coding: in my reading, I had always assumed that they were writing about the perfect-world ideal that needs to be balanced with real-world situations. There's a certain level of bluster and over-confidence required in that type of technical writing that I understood to be a necessary evil in order to get points across. After all, a book full of qualifications will fail to inspire confidence in its own advice.
This is true only for people first coming to development. If you're just starting your journey, you are likely looking for quantifiable absolutes as to what is good and what isn't.
After you're a bit more seasoned, I think qualified comments are probably far more welcome than absolutes.
> After all, a book full of qualifications will fail to inspire confidence in its own advice.
I don't think that's true at all. One of the old 'erlang bibles' is "Learn You Some Erlang", and it is full of qualifications titled "don't drink the kool-aid" (notably not there in the Haskell book that inspired it). It does not fail to inspire confidence to have qualifications scattered throughout; to me it actually gives MORE confidence that the content is applicable and the tradeoffs are worth it.
https://learnyousomeerlang.com/introduction#about-this-tutor...
Can you provide a source where he said that? Or did he actually say something more like "thats when I consider refactoring it"?
My recollection of the Clean Code book and the Fowler books is very much "I think these are smells, but smells in the code are also fine".
Note: Robert Martin and Martin Fowler are different people. Are you saying Fowler said this?
It's directly from the article, and Clean Code was one of the first books I purchased.
Fixed the typo, thanks.
https://www.kernel.org/doc/Documentation/process/coding-styl...
All codebases are spaghetti garbage, but some are useful.
> it's another to essentially say code is spaghetti garbage because it doesn't precisely align to a very specific dogma
Does it say that though?
The big problem that I have with Clean Code -- and with its sequel, Clean Architecture -- is that for its most zealous proponents, it has ceased to be a means to an end and has instead become an end in itself. So they'll justify their approach by citing one or other of the SOLID principles, but they won't explain what benefit that particular SOLID principle is going to offer them in that particular case.
The point that I make about patterns and practices in programming is that they need to justify their existence in terms of value that they provide to the end user, to the customer, or to the business. If they can't provide clear evidence that they actually provide those benefits, or if they only provide benefits that the business isn't asking for, then they're just wasting time and money.
One example that Uncle Bob Martin hammers home a lot is separation of concerns. Separation of concerns can make your code a lot easier to read and maintain if it's done right -- unit testing is one good example here. But when it ceases to be a means to an end and becomes an end in itself, or when it tries to solve problems that the business isn't asking for, it degenerates into speculative generality. That's why you'll find project after project after project after project after project with cumbersome and obstructive data access layers just because you "might" want to swap out your database for some unknown mystery alternative some day.
I whole-heartedly agree with basically everything but feel compelled to point out that it's extremely hard to predict future needs.
Nobody thinks their house is going to burn down until it does.
Also, DALs are IMHO best practice for anything but the most rudimentary projects, and not solely -- or even mainly -- for facilitating switching databases.
You should have data access layers if you're working on any decently sizable app.
I don’t disagree with the overall message or choice of examples behind this post, but one paragraph stuck out to me:
> Martin says that it should be possible to read a single source file from top to bottom as narrative, with the level of abstraction in each function descending as we read on, each function calling out to others further down. This is far from universally relevant. Many source files, I would even say most source files, cannot be neatly hierarchised in this way.
The relevance is a fair criticism, but most programs in most languages can in fact be hierarchized this way, with the small amount of mutually interdependent code safely separated out. Many functional languages actually enforce this.
As an F# developer it can be very painful to read C# programs even though I often find C# files very elegant and readable: it just seems like a book, presented out of order, and without page numbers. Whereas an .fsproj file provides a robust reading order.
> "with the level of abstraction in each function descending as we read on, each function calling out to others further down." ...
> Many functional languages actually enforce this.
Don't they enforce the opposite? In ML languages (I don't know F# but I thought it was an ML dialect), you can generally only call functions that were defined previously.
Of course, having a clear hierarchy is nice whether it goes from most to least abstract, or the other way around, but I think Martin is recommending the opposite from what you are used to.
Hmm, perhaps I am misreading this? Your understanding of ML languages is correct. I have always found “Uncle Bob” condescending and obnoxious so I can’t speak to the actual source material.
I am putting more emphasis on the “reading top-to-bottom” aspect and less on the level of abstraction itself (might be why I’m misreading it). My understanding was that Bob sez a function shouldn’t call any “helper” functions until the helpers have been defined - if it did, you wouldn’t be able to “read” it. But with your comment, maybe he meant that you should define your lower-level functions as prototypes, implement the higher-level functions completely, then fill in the details for the lower functions at the bottom. Which is situationally useful but yeah, overkill as a hard rule.
In ML and F# you can certainly call interfaces before providing an implementation, as long as you define the interface first. Whereas in C# you can define the interface last and call it all you want beforehand. This is what I find confusing, to the point of being bad practice in most cases.
So even if I misread specifically what (the post said) Bob was saying, I think the overall idea is what Bob had in mind.
> In ML languages, you can generally only call functions that were defined previously.
Hum... At least not in Haskell.
Starting with the most dependent code makes a large difference in readability. It's much better to open your file and see the overall functions first; the alternative is browsing to find them, even when they're at the bottom. Since you read functions from top to bottom, merely knowing they're at the bottom isn't much help in reading them.
1 - Dependency order does not imply any ordering in abstraction. The two can change in opposite directions just as well as in the same direction.
ML languages are not functional ("functional" in the original sense of the word - pure functional). They are impure and thus don't enforce it.
We follow this approach closely - the problem is that people confuse helper services for first-order services and call them directly leading to confusion. I don't know how to avoid this without moving the "main" service to a separate project and having `internal` helper services. DI for class libraries in .NET Core is also hacky if you don't want to import every single service explicitly.
Is there a reason why private/internal qualifiers aren’t sufficient? Possibly within the same namespace / partial class if you want to break it up?
As I type this out, I suppose “people don’t use access modifiers when they should” is a defensible reason.... I also think the InternalsVisibleTo attribute should be used more widely for testing.
> But mixed into the chapter there are more questionable assertions. Martin says that Boolean flag arguments are bad practice, which I agree with, because an unadorned true or false in source code is opaque and unclear versus an explicit IS_SUITE or IS_NOT_SUITE... but Martin's reasoning is rather that a Boolean argument means that a function does more than one thing, which it shouldn't.
I see how this can be polemical, because most code is littered w/ flags, but I tend to agree that boolean flags can be an anti-pattern (even though they're apparently idiomatic in some languages).
Usually the flag is there to introduce a branching condition (effectively breaking "a function should do one thing") but doesn't carry any semantics of its own. I find the same can be achieved w/ polymorphism and/or pattern-matching, the benefit being now your behaviour is part of the data model (the first argument) which is easier to reason about, document, and extend to new cases (don't need to keep passing flags down the call chain).
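A minimal Java sketch of that move (all names hypothetical): the branch happens once, where the object is constructed, instead of being threaded down the call chain as an opaque true/false:

    // Flag version: render(page, true) is opaque at the call site.
    // Polymorphic version: the behaviour rides along with the type.
    interface Renderer {
        String render(String page);
    }

    class SuiteRenderer implements Renderer {
        public String render(String page) { return "<suite>" + page + "</suite>"; }
    }

    class SingleTestRenderer implements Renderer {
        public String render(String page) { return "<test>" + page + "</test>"; }
    }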
As with anything, I don't think we can say "I recommend / don't recommend X book"; all knowledge and experience is useful. Just use your judgment and don't treat programming books as holy books.
> Usually the flag is there to introduce a branching condition (effectively breaking "a function should do one thing")...
But if you don't let the function branch, then the parent function is going to have to decide which of two different functions to call. Which is going to require the parent function to branch. Sooner or later, someone has to branch. Put the branch where it makes the most sense, that is, where the logical "one-ness" of the function is preserved even with the branch.
> I find the same can be achieved w/ polymorphism and/or pattern-matching, the benefit being now your behaviour is part of the data model (the first argument) which is easier to reason about, document, and extend to new cases (don't need to keep passing flags down the call chain).
You just moved the branch. Polymorphism means that you moved the branch to the point of construction of the object. (And that's a perfectly fine way to do it, in some cases. It's a horrible way to try to deal with all branches, though.) Pattern-matching means that you moved the branch to when you created the data. (Again, that can be a perfectly fine way to do it, in some cases.)
> As with anything, I don't think we can say "I recommend / don't recommend X book"; all knowledge and experience is useful. Just use your judgment and don't treat programming books as holy books.
People don't want to go through the trouble of reading several opposing points of view and synthesize that using their own personal experience. They want to have a book tell them everything they need to do and follow that blindly, and if that ever bites them back then that book was clearly trash. This is the POV the article seems to be written from IMHO.
Not even that, this book gets recommended to newbies who don't yet have the experience to read it critically like that.
As far as the boolean flag argument goes, I've seen it justified in terms of data-oriented design, where you want to lift your data dependencies to the top level as much as possible. If a function branches on some argument, and further up the stack that argument is constant, maybe you didn't need that branch at all if only you could invoke the right logic directly.
Notably, this argument has very little to do with readability. I do prefer consolidating data and extracting data dependencies -- I think it makes it easier to get a big-picture view, as in Brooks's "show me your tables" -- but this argument is rooted specifically in not making the machine do redundant work.
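Concretely, a hypothetical sketch of that lift: when the flag is constant at the call site, the branch can move out of the hot path entirely:

    class Pipeline {
        // Before: a loop-invariant flag is re-tested on every iteration.
        static int[] process(int[] xs, boolean halve) {
            int[] out = new int[xs.length];
            for (int i = 0; i < xs.length; i++) {
                out[i] = halve ? xs[i] / 2 : xs[i];
            }
            return out;
        }

        // After: the caller that knows the flag is constant picks the
        // specialized loop once, up front.
        static int[] processHalved(int[] xs) {
            int[] out = new int[xs.length];
            for (int i = 0; i < xs.length; i++) out[i] = xs[i] / 2;
            return out;
        }
    }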
> This is done as part of an overall lesson in the virtue of inventing a new domain-specific testing language for your tests. I was left so confused by this suggestion. I would use exactly the same code to demonstrate exactly the opposite lesson. Don't do this!
This example (code is in the article) was very telling of the book author's core philosophy.
Best I can tell, the OOP movement of the 2000s (I wasn't a professional in 2008, though I was learning Java at the time) was at its heart rooted in the idea that abstractions are nearly always a win; the very idealistic perspective that anything you can possibly give a name to, should be given a name. That programmers down the line will thank you for handing them a named entity instead of perhaps even a single line of underlying code.
This philosophy greatly over-estimates the value, and greatly under-estimates the cost, of idea-creation. I don't just write some code, I create an idea, and then I write a bit of code as implementation details for it. This is a very tantalizing vision of development: all messy details are hidden away, what we're left with is a beautiful constellation of ideas in their purest form.
The problem is that when someone else has to try and make sense of your code, they first have to internalize all of your ideas, instead of just reading the code itself which may be calling out to something they already understand. It is the opposite of self-documenting code: it's code that requires its own glossary in addition to the usual documentation. "wayTooCold()" may read more naturally to the person who wrote it, but there's a fallacy where they assume that that also applies to other minds that come along and read it later.
Establishing a whole new concept with its own terminology in your code is costly. It has to be done with great care and only when absolutely necessary, and then documented thoroughly. I think as an industry we have more awareness of this nowadays. We don't just make UML diagrams and kick them across the fence for all those mundane "implementation details" to be written.
This thread is full of people saying what's wrong with the book without posing alternatives. I get that it's dogmatic, but do people seriously take it as gospel? I'd read it along with other things. Parts are great and others are not. It's not terrible.
I agree. Trying to apply the lessons in there leads to code that is more difficult to read and reason about. Making it "read like a book" and keeping functions short sound good on the surface but they lead to lines getting taken up entirely by function names and a nightmare of tracking call after call after call.
It's been years since I've read the book and I'm still having trouble with the bad ideas from there because they're so well stuck with me that I feel like I'm doing things wrong if I don't follow the guidelines in there. Sometimes I'll actually write something in a sensible way, change it to the Clean Code way, and then revert it back to where it was when I realize my own code is confusing me when written like that.
This isn't just a Robert C Martin issue. It's a cultural issue. People need to stop shaming others if their code doesn't align with Clean Code. People need to stop preaching from the book.
I make my code "read like a book" with a line comment for each algorithmic step inside a function, plus line-ending comments to clarify individual lines. Functions are then just containers of steps, designed to reduce repetition, increase visibility, and minimize data passing and globals.
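Something like this hypothetical sketch, say:

    class Latency {
        // Average request latency, ignoring timeout artifacts.
        static double averageLatency(double[] samplesMs) {
            // Step 1: drop obvious outliers.
            java.util.List<Double> kept = new java.util.ArrayList<>();
            for (double s : samplesMs) {
                if (s < 10_000) kept.add(s);  // anything >10s is a client timeout
            }
            // Step 2: sum what's left.
            double total = 0;
            for (double s : kept) total += s;
            // Step 3: guard the empty case.
            return kept.isEmpty() ? 0 : total / kept.size();
        }
    }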
I recently read this cover to cover and left a negative review on Amazon. I'm happy to see I'm not the only one, and this goes into it in a whole lot more detail.
The author seems to have taken a set of rules that are good for breaking beginner programmers' bad habits and applied them to the extreme. Most of the rules aren't bad until you try to apply them like laws of gravity that must always be followed. Breaking up big clunky methods that do way too much is great for readability, right up until you're spraying one-line helper methods all over your classes and making them harder to read, because now you're inventing your own domain-specific language everyone has to learn (often with the wrong abstractions, which get extended through the years and wind up needing a massive refactoring down the road that would have been simpler with fewer methods and abstractions at the start).
A whole lot of my job is taking classes, un-DRY'ing them completely so there's duplication all over the place, then extracting the right (or at least more correct) abstractions to make the whole thing simple and readable and tight.
My biggest gripe: functions shouldn't be short; they should be of appropriate size. They should contain all the logic that isn't supposed to be exposed to the outside for someone else to call. If that means your function is 3000 lines long, so be it.
Realize that your whole program is effectively one big function and you achieve nothing by scattering its guts out into individual sub-functions just to make the pieces smaller.
If something is too convoluted and does too much, or has too much redundancy, you'll know, because it'll cause problems. It'll bother you. You shouldn't pre-empt this case by just writing small functions by default. That'll just cause its own problems.
The problem is there's no way to adequately test a 3000 line function.
> they should be of appropriate size.
"An explanation should be as simple as possible, but no simpler" - Albert Einstein
This is an interesting article because as I was reading Martin's suggestions I agreed with every single one of them. 5 lines of code per function is ideal. Non-nested whenever possible. Don't mix query/pure and commands/impure. Then I got to the code examples and they were dreadful. Those member variables should be readonly.
Using Martin's suggestion with Functional Hexagonal Architecture would lead to beautiful code. I know because that's what I've been writing for the past 3 years.
Great! While we're at it, can we retire the Gang of Four as well? I mean, the authors are obviously great software engineers, and the patterns have helped people design, build, and most importantly read a lot of software. But as we move forward, more and more of these goals can be achieved much more elegantly and sustainably with new languages and more functional approaches. Personally, I find re-teaching junior programmers who are still trying to make everything into a class very tiring.
I don’t understand the amount of hate that Clean Code gets these days…it’s a relatively straightforward set of principles that can help you create a software system maintainable by humans for a very long time. Of course it’s not an engineering utopia, there’s no such thing.
I get the impression that it’s about the messengers and not the messages, and that people have had horrible learning experiences that have calcified into resistance to do with anything clean. But valuable insights are being lost, and they will have to be re-learned in a new guise at a later date.
Development trends are cyclical and even the most sound principle has an exception. Even if something is good advice 99% of the time, it will eventually be criticized with that 1% of the time being used as a counter.
TFA makes pretty good arguments.
For me Clean Code is not about slavishly adhering to the rules therein, but about guidelines to help make your code better if you follow them, in most circumstances. On his blog Bob Martin himself says about livable code vs pristinely clean code: "Does this rule apply to code? It absolutely does! When I write code I fight very hard to keep it clean. But there are also little places where I break the rules specifically because those breakages create affordances for transient issues."
I've found the Clean Code guidelines very useful. Your team's mileage may vary. As always: use what works, toss the rest, give back where you can.
For more about this see:
* Bob Martin's blog post 'Too Clean' - https://blog.cleancoder.com/uncle-bob/2018/08/13/TooClean.ht...
* Livable Code by Sarah Mei - https://www.youtube.com/watch?v=8_UoDmJi7U8
I never recommended Clean Code, but I became a strong advocate against it on teams I lead after reading opinions by Bob Martin such as this one: https://blog.cleancoder.com/uncle-bob/2017/01/11/TheDarkPath.... That whole article reads as someone stuck in their old ways and inflexible, who, given their large soapbox, tries to justify their discomfort and frustration. I consider Swift, Kotlin, and Rust to be among the most important language evolutions, ones that dramatically improved software quality on the projects I've worked on.
I've seen so many real world counter-examples to arguments made in that article and his other blog posts that I'm puzzled why this guy has such a large and devoted following.
Actually, I found the post you linked to fairly logical. He’s saying that humans are inherently lazy, and that a language that gives us the option between being diligent (strong types) or being reckless (opt-out of strong types) will lead to the worst form of recklessness: opting out while not writing tests, giving the misimpression of safety.
His point is that you can’t practically force programmers to be diligent through safety features of a language itself, since edge-cases require escape hatches from those safety features, and those safety hatches will be exploited by our natural tendency to avoid “punishment”.
I’m not sure I agree with his point, but I don’t find it an unreasonable position. I’d be curious if Rust has escape hatches that are easily and often abused.
My favorite example here, and a counterpoint to Bob, is React's dangerouslySetInnerHTML attribute. I haven't seen it used in years. But it made the escape hatch really painful to use, and so the pain of using the escape hatch made it less painful to actually write React in the right manner. Coming from Angular, I think I struggled at first with thinking I had to write some dangerous HTML, but over time I forgot the choice of writing poor React code even existed.
So I guess I disagree with Bob’s post here. It is possible to have safety features in languages that are less painful than the escape-hatches from those safety features. And no suite of tests will ever be as powerful as those built-in safety features.
He actually misunderstands and mischaracterizes the features of the languages he complains about. These features remove the need for a developer to keep track of invariants in their code, so should be embraced and welcomed by lazy developers who don't have to simulate the boring parts of code in their head to make sure it works. "If it type-checks then it works" philosophy really goes a long way toward relieving developer's stress.
For example, if I'm using C or Java I have to take into account that every pointer or reference can be null, at every place where it's used. I should write null checks (or error checks, say, from opening a file handle), but I usually don't, because I'm lazy, or I forget, or it's hard to keep track of all possible error conditions. So I'm stressed during a release because I can't predict the input that may crash my code.
In a language like Swift I am forced to do a null or an error check once in the code, and for that small effort the compiler will guarantee I will never have to worry about these error conditions again. This type system means I can refactor code drastically and with confidence, and I don't have to spend time worrying about all code paths to see if one of them would result in an unexpected null reference. On a professional development team, it should be a no-brainer to adopt a new technology to eliminate all null-reference exceptions at runtime, or use a language to setup guarantees that will hold under all conditions and in the future evolution of the code.
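The closest Java analogue I can offer (a rough sketch, not Swift, names invented) is Optional: the type admits absence, so the caller is forced to decide once, and downstream code never sees a null:

    import java.util.Optional;

    class Config {
        static Optional<String> lookup(String key) {
            return "host".equals(key) ? Optional.of("example.org")
                                      : Optional.empty();
        }

        static String hostOrDefault() {
            // The missing case is handled here, once; callers get a plain String.
            return lookup("host").orElse("localhost");
        }
    }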
Worse than that, he sets up a patronizing and misguided mental image of a common developer who, he imagines, will use a language with type safety just to disable and abuse all the safety features. Nobody does that, in my experience of professional Swift, Kotlin, or Rust development.
He advocates for unit tests only, above all else. That is also painfully misguided: a test tells you the code passes for one given example of input, whereas a good type system guarantees that your code will work for ALL values of a given type. Of course type systems can't express all invariants, so there is a need for both approaches. But that lack of nuance and plain bad advice turned me into an anti-Uncle-Bob advocate.
Here's my rant on Code Complete and Clean Code:
I find that these two books are on many recommended lists, but I found them entirely forgettable, entirely too long, and without any "meat."
So much of the advice is trivial stuff you'll just figure out in your first few months of coding professionally. Code for more than a week and you'll figure out how to name classes, how to use variables, how scope works, etc. The code examples are only in C++, Java, and Visual Basic (ha!), completely ignoring non-OO and dynamic languages. Some of the advice is just bad (like prefixing global variables with g_) or incredibly outdated (like "avoid goto" -- thanks, 1968!).
Work on a single software project, or any problem ever, and you'll know that you need to define the problem first. It's not exactly sage advice.
These are cherry-picked examples, but overall Code Complete manages to be too long, too detailed in some areas, and too vague in others.
All books are written in a time and a few become timeless. Software books have an especially short half-life. I think Code Complete was a book Software Engineering needed in 2004, but has since dropped in value.
I will say, that Code Complete does have utility as a way to prop up your monitor for increased ergonomics, which is something you should never ignore.
I have similar issues with Clean Code. One is better off just googling "SOLID principles," then programming with small interfaces more often and fewer subclasses.
A better alternative is (from above) The Pragmatic Programmer (2019), a good style guide, and/or get your code reviewed by literally anyone.
Another thing Martin advocates for is not putting your name in comments, e.g. "Fixed a bug here; there could still be problematic interactions with subsystem foo -- ericb". He says, "Source control systems are very good at remembering who added what, when." (p. 68, 2009 edition)
Rubbish! Multiple times I've had to track down the original author of code that was auto-refactored, reformatted, moved, migrated between source-control systems, etc. "git blame" and such are useless in these cases; it ends up being a game of whodunit involving hours of searching, Slack pings, and whatnot. Just put your name in the comment if it documents something substantial and is the result of your own research and struggle. And if you're in such a position, allow and encourage your colleagues to do this too.
Better to put an explanation there thorough enough that your name isn't needed any more. Because if it's your name that makes the difference, chances are you'll have left the company by the time someone comes across that comment and needs access to your brain.
Sometimes what is interesting is that you have found that another engineer—whom you might not know in a large enough organization—has put time and thought into the code you are working on, and you can be a lot more efficient and less likely to break things if you can talk to that engineer first. It's not always the comment itself.
Or have gotten busy since. I worked at Coinbase in 2019 and saw a comment at the top of a file saying that something should probably be changed. I git-blamed and saw it was written six years earlier by Brian Armstrong.
I think most of what Martin says is rubbish, but this is not. I have never had `git blame` fail... ever. I know which user is responsible for every line of code, and that record is made automatically at the time of the change. Putting your name in a comment is right up there with commenting out blocks of code so you don't lose them.
I don't know what to say, this is a real problem I have encountered in actual production code multiple times. Any code that lives longer than your company's source control of choice, style of choice, or code structure of choice is vulnerable. Moreover, what's the harm? It's more information, not less.
Code never gets moved or refactored by someone other than the original author?
The parent's comment holds for reformatting, especially in languages with suspect formatting practices like Go, where visibility rules are dictated by the case of an identifier's first letter (wat?), or where the formatter aligns groups of constants or struct fields based on the length of the longest field name. That results in completely unnecessary changes that distract from the main diff.
It does require some rigor in your version control practices, but I'm happily "git blame"-ing twenty year old code every now and then.
What is even the downside of adding a few extra characters to the end of a comment to show who wrote it?
And has Martin ever even worked on large, non-greenfield projects? That's the only way I could see anyone professing such idealism.
> What is even the downside of adding a few extra characters to the end of a comment to show who wrote it?
You're right - in fact we should do this for every line of code, so that we know of whom to ask questions!
What's the downside of adding a few extra characters!?
Of course, this view is already available to people: `git blame` - and it's the same for comments, so there is no need.
The exception is "notes to future self" during the development of a feature (to be removed before review), in which case the most useful place for them to appear is at the _start_ of the comment with a marker:
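For example (a hypothetical marker -- anything greppable works):

    // XXX don't forget to re-check the timeout handling before review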
Then they are easy to find...
I think your comment is controversial for a number of reasons. One, nobody should own code. Code should be obvious, tested, documented, and reviewed (bringing the number of people involved to at least two), and the story behind it should live either in the commit messages or in a referenced task-management system. Code ownership just creates islands.
I mean by all means assign a "domain expert" to a PART of your code, but no individual segment of code should belong to anyone.
Second: There's something to be said about avoiding churn. Everybody loves refactoring and rewriting code, present company included, but it muddles the version control waters. I've seen a few github projects where the guidelines stated not to create PRs for minor refactorings, because they create churn and version control noise.
Anyway, that's all "ideal world" thinking, I know in practice it doesn't work like that.
Maybe not exclusive ownership, but there are always going to be people more familiar with a section of code than others.
It's not really efficient to insist everyone know the codebase equally, especially with larger codebases.
I never found people adding a name useful.
Either the code is recent (in which case 'git blame' works better since someone changing a few characters may or may not decide to add their name to the file) or it's old and the author has either left the company or has forgotten practically everything about the code.
If you have to track down the author, then it is already bad. The code should not hope that the author never finds a new job.
But sometimes it is bad, and not fixable within the author's control. I occasionally leave author notes, as a shortcut. If I'm no longer here, yeah, you've got to figure it all out the hard way. But if I am, I can probably save you a week, maybe a month. And obviously if it's something you can succinctly describe, you'd just leave a comment. This is the domain of "based on being here a few years, on a few teams, across the three services that preceded this one, a few migrations, etc." Some business problems carry a lot of baggage that isn't easily documented or described; that's the hard thing about professional development, especially in a changing business. There are also cases where I _didn't_ author the code but deliberately chose not to change something that looks like it should be changed. In those cases, without my name in a comment, git blame wouldn't point you to me. YMMV.
A thousand times this. We never use git blame - who cares? The code should be self-explanatory, and if it's not, the author won't remember why they did it five years down the line either.
I have found other good articles on the same topic by Hillel Wayne, author of Practical TLA+:
Uncle Bob is ruining software - https://hillelwayne.com/post/10x/
Uncle Bob and Silver Bullets - https://hillelwayne.com/post/uncle-bob/
> First, the class name, SetupTeardownIncluder, is dreadful. It is, at least, a noun phrase, as all class names should be. But it's a nouned verb phrase, the strangled kind of class name you invariably get when you're working in strictly object-oriented code, where everything has to be a class, but sometimes the thing you really need is just one simple gosh-danged function.
Moving from Java as my only language to JavaScript and Rust, this point was driven home in spades. A programming language can be dysfunctional, causing its users to implement harmful practices.
SetupTeardownIncluder is a good example of the kind of code you get when there are no free-standing functions. It's also one path on the slippery slope to FactoryFactoryManager code.
The main problem is that the intent of the code isn't even clear. Compare it with something you might write in Rust:
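(The original snippet isn't preserved here; a reconstructed sketch of the shape, with all names invented:)

    pub struct Page;
    pub struct RenderError;

    // A free-standing function in render.rs: no SetupTeardownIncluder
    // class required, just a verb with inputs and outputs.
    pub fn render(page: &Page) -> Result<String, RenderError> {
        Ok(String::from("<html>...</html>"))
    }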
If you saw that function at the top of a file, or if you saw render.rs in a directory listing, you'd have a pretty good idea of what's going on before you even dug into the code.
Just randomly searching the FitNesse repo, there's this:
https://github.com/unclebob/fitnesse/blob/3bec390e6f8e9e3411...
Really, how does this file even justify its existence? It's a subsidy courtesy of language-design decisions made a long time ago.
For me, all this kind of stuff exists only to sell books, conferences, and consulting services, and it's a big headache when working in teams whose architects have bought too deeply into it.
The problem is not really with this book IMHO. Most of its advice and guidelines are perfectly sensible, at least for its intended domain.
The problem is people applying principles dogmatically, without seeing the larger picture or considering the context and purpose of the rules in the first place.
This book, or any book, cannot be blamed for people applying its advice blindly. But it is a pervasive problem in the industry, and it runs much deeper than any particular book. I suspect it has something to do with how CS education typically happens, but I'm not sure.
Why is it that software engineering is so against comments?
I know nothing of Clean Code. When I read the link, I assumed clean code meant very simple, well-commented code. I hit cmd+F "#" and nothing came up. Not one comment saying "this function is an example of this" or "note the use of this line here, it does this," etc., on a blog no less, where you'd expect to see these things. That's the type of stuff I put in my own code, even code only I read, because in two weeks I'll forget everything unless I write full-sentence-to-paragraph comments, and I'd spend far more time trying to get back in the zone than it took to write those super-descriptive comments in the first place.
I hate looking at other peoples scripts because, once again, they never write comments. Practically ever. What they do write is often entirely useless to the point where they shouldn't have even bothered writing those two words or whatever. Most people's code is just keyboard diarrhea of syntax and regex and patterns that you can't exactly google for, assuming whoever is looking at the code has the exact same knowledge base as you, and knows everything that you've put down into the script. Maybe it's a side effect of CS major training, where you don't write comments on your homework because the grader is going to know what is happening. Stop doing that with your code and actually make a write up to save others (and often yourself) mountains of time and effort.
> Why is it that software engineering is so against comments?
Good question. Funny thing is, I worked for a company that mandated that every method be documented, which gets you a whole bunch of "The method GetThing(name) gets a Thing, and the argument 'name' is the name of the Thing". Plus 4 lines of Doxygen boilerplate. Oof.
Of course, I've seen my share of uncommented, unreadable code. And also code with some scattered comments that have become so mangled over 10 years of careless maintenance and haphazard copy-pasting that their existence is detrimental. Many of the comments I come across that might be useful are incoherent ungrammatical ramblings. In large projects, often some of the engineers aren't native English speakers.
My point being that writing consistently useful comments (and readable, properly organized code) is hard. Very, very hard. It requires written communication skills that only a small percentage of engineers (or even humans in general) are capable of. And the demand for software engineers is too high to filter out people who aren't great at writing. So I guess many people just try to work around the problem instead.
There's something bad about going over the top halfway. Those sorts of strict rules that everyone follows half-assed are so common on software teams (and the rest of business and society, but whatevs). They seem to have all the downsides of both strictness and laxness. It would work better if you just let devs do their thing. It would also work better if you went all the way insane - like, the first time you write a garbage docstring like that, the CTO comes to your desk and tells you off. I'm not saying that would be the right move in this case, but at least it's something.
One reason is that comments go stale: people need to maintain them but probably won't. A second reason is the belief that code should be self-documenting, and that if it's not, you just need better names and code structure. Many books, Clean Code included, advocate this approach, and that's where I first learned the don't-write-comments idea as well.
Personally now I've held both sides of the argument at different times. I think in the end it's a trade-off. There's no hard and fast rule, you need to use your best judgement about what's going to be easiest for the reader and also for the people maintaining the code. Usually I just try to strike a balance that I think my coworkers won't complain about. The other thing I've realized that makes this tricky is that people will almost always overestimate their (or others) commitment to maintaining comments, and/or overestimate how "self-documenting" their code is.
The code in the website is written in Java, not Python ...
Previous discussion from 11 months ago: https://news.ycombinator.com/item?id=23671022
It's also probably time to stop recommending TDD, object-oriented programming, and a host of other anti-patterns in software development, and get serious about treating it like a real engineering profession instead of a circus of personality- and company-driven paradigms.
It is interesting that he uses a FitNesse example.
Years ago we started using FitNesse at a place I was working, and we needed something that wasn't included; I think it was the ability to make a table of HTTP basic-auth tests/requests.
The code base seems large and complex at first, but I was able to very quickly add this feature with minimal changes and was pretty confident it was correct. Also, I had little experience in Java at the time. All in all it was a pretty big success.
Fitnesse is Uncle Bob's baby so it makes sense to use that example, he can't get through a book without talking about it at length.
"Interesting" is probably the wrong word. I should say interesting to me, because I had a different experience with it. And it was not any sort of theoretical analysis; it was a feature I needed to get done.
Like everything else: it's fine in moderation. Newbies should practice clean code, everybody else gets to make their own decisions. Treating anything as dogma whether it is Clean Code, TDD, Agile or whatever is the flavor of madness of the day is going to lead to suboptimal outcomes. And they're also a great way to get rid of your most productive and knowledgeable team members.
So apply with caution and care and you'll be fine.
There's a word, in other comments, that I expected to find: zealots. Zealots aren't sufficiently critical, and they don't want to think for themselves; a reasonable person should be able to, and a professional should be constantly itching to, step back, look at code, and decide whether some refactoring or rewriting is an improvement, taking a book like Clean Code as a source of general principles and good examples, not of rules.
All the "bad" examples discussed in the article are rather context dependent, representing uninspired examples or extreme tastes in the book rather than bad or obsolete ideas.
Shredding medium-length, meaningful operations into many very small and quite ad hoc functions can reduce redundancy at the expense of readability, which might or might not be an improvement; a little DSL that looks silly if used in a couple of test cases can be readable and efficient if more extensive usage makes it familiar; a function with boolean arguments can be an accretion of special cases, ripe for refactoring, or a respectable way to organize otherwise repetitive code.
Most of these types of books approach things from the wrong direction. Any recommendation should look at the way well designed, maintainable systems are actually written and draw their conclusions from there. Otherwise you allow too much theorizing to sneak in. Lots of good options to choose from and everyone will have their own pet projects, but something like SQLite is probably exemplary of what a small development team could aim for, Postgres or some sort of game engine would maybe be good for a larger example (maybe some of the big open source projects from major web companies would be better, I don't know).
There are books that have done something like this[0], but they are a bit high level. There is room for something at a lower level.
[0]: http://aosabook.org/en/index.html for example.
I would say Code Complete is one such book.
"Promote I/O to management (where it can't do any damage)" is the actionably good thing i've taken from Brandon Rhoades' talk based on this: https://www.youtube.com/watch?v=DJtef410XaM
Living in a world where people regularly write single functions that: 1. load data from a hardcoded string path, 2. do all the analysis inside the same loop that iterates over the file contents, and 3. plot the results... that cleavage plane is a meaningfully good one.
The rest of the ideas fall into "all things in moderation, including moderation", and can and should be special-cased judiciously as long as you know what you're doing. But oh god please can we stop writing _that_ function already.
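A hypothetical sketch of the "promoted I/O" split, for the record (names invented): the analysis takes data and returns data, and only the thin top-level function touches the filesystem:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;
    import java.util.stream.Collectors;

    class Stats {
        // Pure: takes data, returns data; trivial to unit-test.
        static double mean(List<Double> values) {
            double sum = 0;
            for (double v : values) sum += v;
            return values.isEmpty() ? 0 : sum / values.size();
        }

        // The only function that touches the filesystem: I/O "in management".
        static double meanOfFile(Path path) throws IOException {
            List<Double> values = Files.readAllLines(path).stream()
                    .map(Double::parseDouble)
                    .collect(Collectors.toList());
            return mean(values);
        }
    }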
I’d say “Stop recommending Clean Code to noobs! It’s dangerous”
Because Clean Code is really a good book once you have enough experience to understand when each idea works and when does not. And why.
And noobs tend to over-engineer, so any book like CC or design patterns gives them additional excuse and reason to over-engineer and make a mess.
Let's not throw the baby out with the bathwater. We can still measure how quickly new (average) developers become proficient, average team velocity over time, and a host of other metrics that tell us if we are increasing or decreasing the quality of our code over time. Ignoring it all because it's somewhat subjective is selfish and bad for your business.
Leave off the word "clean" or whatever... DO have metrics and don't ignore them. You have people on your team that make it easier for the others, and people who take their "wins" at the expense of their teammates' productivity.
I was skipping those ugly Java code examples while reading and the book made sense.
I know that I'm late to this party, but what would Clean Coders think about algorithm-heavy code like "TimSort.java"[1]? (This is the Java port of the famous Python stable sort.) Since Java doesn't have mutable references (pointers) or allow multiple return values, it gets very tricky to manage all the local variables across different function scopes. I guess you could put all your locals into a struct/POJO and pass it around to endless tiny functions. (Honestly, the Java regex library basically does this... successfully.) Somehow, I feel it would be objectively worse if this algo code were split into endless 1/5/10-line functions! (Yes, yes, that is an _opinion_... not a fact!)
Come to think of it, is the original C code for Python's timsort equally non-Clean Code-ish?[2] Probably not!
[1] https://github.com/AdoptOpenJDK/openjdk-jdk11u/blob/master/s...
[2] https://github.com/python/cpython/blob/main/Objects/listobje...
Articles like these make me feel better about never having read any of the 'how to code' books. Mainly substituting them by reading TheDailyWTF during the formative years.
I have the same complaint about Code Complete. I read bits in college, and I'm not sure I follow most of its advice today (e.g. putting constants on the left side of a comparison).
However, the book also presents the psych study about people not remembering more than 7 (+/- 2) things at a time (therefore you should simplify your code so readers don't have to keep track of too much stuff) and it stuck with me. I must be one of the people with only 5 available slots in their brain...
(edited for clarity)
> 7 (+/- 2)
That study was done for specific stimuli (words, digits), and doesn't generalize to e.g. statements. There are studies that show that rate of presentation, complexity, and processing load have an effect. However, STM capacity is obviously limited, so it's good to keep that in mind when you're worried about readability. And I think it's also safe to assume that expert programmers can "chunk" more than novices, and have a lower processing load.
> putting constants on the left side of a comparison
Yoda conditions? I hate those, they are difficult to read. Yes they are useful for languages which allow assignments in conditionals, but even then it's not really worth it. It's a very novice mistake to make. For me equality rarely appears in conditionals, it's either a numeric comparison or checking for existence.
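For illustration, a small Java sketch (hypothetical names) of the one defensible Yoda-ism in Java, which guards against null rather than accidental assignment:

    class Yoda {
        // Constant-first order: "yes".equals(null) is simply false,
        // while answer.equals("yes") would throw a NullPointerException
        // when answer is null.
        static boolean isYes(String answer) {
            return "yes".equals(answer);
        }
    }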
Java is still popular. JDK 17 looks as much like ML (e.g. OCaml) as they can make it, as fast as they can make it, without breaking things.
Also, the tooling and compiler speed aren't fucked like they are in Scala or Kotlin. I like Kotlin, especially the null-safety, but the ecosystem's quality is kinda shoddy. Everything is just faster and less buggy in Java.
Honestly Clean Code probably isn't worth recommending anymore. We've taken the good bits and absorbed it into best practice. I think it has been usurped by books like "Software Engineering at Google" and "Building Secure and Reliable Systems".
I don't believe in being prescriptive to anyone about how they write their code, because I think people have different preferences and forcing someone to write small functions when they tend to write large functions well is a unique form of torture. Just leave people alone and let them do their job!
I don't think it is the perfect solution, but a lot of people assert "we can't do better, no point in trying, just write whatever you feel like" and I think that is a degenerate attitude. We CAN find better ways to construct and organize our code, and I don't think we should stop trying because people don't want to update their pull requests.
Software Engineering at Google (for Google, by Google, detailing issues which are issues mainly at Google) [You're not Google]
I've heard this before, and I agree, but don't let the name put you off. I agree that designing and iterating for google scale is a bad idea, but there is a lot in that book that is applicable to all software teams.
Maybe there is no such thing as achieving clean code by following a set of rules. I know the author never advocates his book as a "bible," but it does give the reader that feeling.
There are only years, decades, of deep experience in certain domains (e.g. game rendering engines, or customer-relationship backends), extra hours reading proven high-quality code, and countless rounds of reflection on existing code (which also means extra hours reviewing it), driven by a strong will to improve it -- based not on some set of rules, but on common sense about programming and many trials of rewriting the code into another form.
I think ultimately it goes down to something similar to 10,000-hour rule: We need to put down a lot of time in X, and not only that, we also need to challenge ourselves for every step.
I think the book is still useful with all its flaws, mainly because "over-engineering code" is like exercising too much: sure, it happens to some people and can be a big issue, but for the vast majority it's the opposite that's the problem.
What well-regarded codebases has this author written, so you can see his principles in action? OTOH, if you’re wondering about the quality of John Ousterhout’s advice in _A Philosophy of Software Design_, you can just read the Tcl source.
The article quotes a code sample from FitNesse – the author has apparently maintained that codebase since then. You can check out the code for the current version at https://github.com/unclebob/fitnesse, or browse the code in a Monaco editor using https://github1s.com/unclebob/fitnesse/blob/HEAD/src/. (I have no idea if that code is “well-regarded”, but as you wrote, you can read it for yourself.)
I'm surprised by the amount of detractors. We know from history that any book with advice should not be taken too literal. Reading the comments here, it feels almost like I read a different book (about 10 years ago).
Read it again critically. Maybe you will see it differently, I know I did when I read it a second time after a few more years of experience.
I actually found the SetupTeardownIncluder class the author complains about easier to read and interpret than the original version. I know from the start what the intention was and where I should go if I have an issue with some part of the code.
I don't even take issue with the name. It makes the class easy to find just by remembering what it should do. I don't really care all that much about verbs vs. nouns; I want to be able to find the right place from a rough idea of what it should do, and I want the class name to hint at its functionality too.
'Clean Code' is a style, and not all of its practices are best. I feel that a good software team understands each other's styles, making it easier to read each other's code within the context of a style. However, when people disagree on code style it has a way of creating cliques within teams, so sometimes it's easier to pick a style that is already well documented and be done with the mostly petty disagreements. Clean Code fits the definition of well documented and is a lazy way of defining a team-wide style.
I am interested in reading books about software development and best practices like Clean Code and The Pragmatic Programmer [0]. I have coded for about eight years, but I would like to do it better. I would like to know your opinion about [0], since Clean Code has been significantly criticized.
[0] https://pragprog.com/titles/tpp20/the-pragmatic-programmer-2...
How about we throw Clean Architecture in this while we're at it. And also realize that the only rule in SOLID that isn't defined subjectively or partially is the "L".
This is the first time I've heard of this book. I certainly agree some of these recommendations are way off the mark.
One guideline I've always tried to keep in mind is that statistically speaking, the number of bugs in a function goes way up when the code exceeds a page or so in length. I try to keep that in mind. I still routinely write functions well over a page in length but I give them extra care when they do, lots of comments and I make sure there's a "narrative flow" to the logic.
The big one to keep an eye on is cyclomatic complexity with respect to function length. Just 3 conditional statements in your code gives you no less than 8 ways through your code and it only goes up from there.
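To make the arithmetic concrete, a hypothetical sketch (names invented):

    class Shipping {
        // Three independent conditionals: 2 * 2 * 2 = 8 distinct paths,
        // and one test per path if you want full path coverage.
        static int cost(boolean express, boolean oversized, boolean international) {
            int cost = 5;
            if (express) cost += 10;
            if (oversized) cost += 7;
            if (international) cost += 20;
            return cost;  // eight reachable outcomes, one per path
        }
    }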
All of these 'clean code' style systems share the same flaw: people follow them without understanding why the system was made. It's why you see companies put in ping-pong tables that no one uses. They saw what someone successful was doing and copied it, without understanding why the ping-pong table was there. They ignore the reason Chesterton's fence was built, which is just as important if you're removing it. Clean Code by itself is 'ok'. I personally am not very good at that particular style of coding, but I do like that it makes things very nice to decompose into testing units.
A downside to this style of coding is that it can hide complexity behind an even more complex framework. It has a nasty side effect of smearing the code across dozens of functions/methods, which makes it harder in some ways to get the big picture. You can wander into a meeting and say 'my method has a CC of 1', but the reality is that the thing is called at the bottom of a for loop, inside two other if conditions. You 'pass' only because your function is short.
4-line functions everywhere is insanity. Yes, you should aim for short functions that do one thing, but in the real world readability and maintainability suffer greatly if you fragment everything down to an arbitrarily small size.
Number of bugs per line also goes way up when the average length of functions goes below 5, and the effect in most studies is larger than the effect of too large functions.
> Why are we using both int[] and ArrayList<Integer>? (Answer: because replacing the int[] with a second ArrayList<Integer> causes an out-of-bounds exception.)
Isn't it because one is pre-allocated with a known size of n and the other is grown dynamically?
> And what of thread safety?
Indeed. If he had written the prime number class like the earlier example, with public static methods creating a new instance for each call and all the other methods being private instance methods, this wouldn't be an issue.
Pardon me for saying this, but the proponents of the Clean Code mantra I've worked with tend to be the consultants who bill enormous amounts of money to ensure they have lengthy contracts, justified by wrapping code in so many layers of class abstraction that it can be called *CLEAN* and *TESTABLE*.
They will preach the awesomeness of clean code in terms of maintainability, scalability, and all those fancy enter-pricey terms that, at the end of the day, bring too little value to justify their cost.
Welcome to another episode of "X, as per my definition of X, is bad - Let's talk about Y, which is another definition of X, but not the one I disagree with".
So many coding recommendations trip up when they fail to take into account Ron's First Law: all extreme positions are wrong. Functions that are too long are bad, but functions that are too short are equally bad. 2-4 lines? Good grief! That's not even enough for a single properly formatted if-then-else!
IMHO it's only when you can't see the top and bottom of a function on the screen at the same time that you should start to worry.
Sad to see the top comments are "Yes clean code leads to bad code" and not "TOO MUCH clean code leads to bad code". Excluded middle much?
Somewhat relatedly (he covers some very similar concerns and IIRC even references "Clean Code"), Brian Will has a few good videos:
https://www.youtube.com/watch?v=QM1iUe6IofM
https://www.youtube.com/watch?v=IRTfhkiAqPw
I don't completely disagree, but his point about the irrelevance of SOLID, OO, and Java in this supposedly grand new age of FP ignores that OO is still the pre-eminent paradigm for most applications and that Java remains one of the largest and most utilized languages in the world. Also, I would say that excitement around FP has waned more than it has for Java.
Can we also ditch the infatuation with microservices while we're at it? They primarily serve as a revenue-generation scheme for cloud providers.
I often hear that people should read Clean Code and that it is necessary in large projects. I would say that there is no direct correlation between how large and complex the business logic is and the difficulty of understanding and maintaining the code. I have seen small simple applications that are not maintainable because people have followed SOLID to the extreme.
One of the biggest issues I have found is that I sometimes cannot easily create a test for code I have modified because it is part of a larger class (I'm coding in C++). This normally happens when I cannot extract the function out of the class because it relies on internal state, and the class is not already under test.
Love to know if there is an easy way of doing this!
A lot of Robert C. Martin's pieces are just variations on his strong belief that ill-defined concepts like "craftsmanship" and "clean code" (which are basically whatever his opinions are on any given day) are how you reduce defects and increase quality -- not built-in safety and better tools -- and that if you think built-in safety and better tools are desirable, you're not a Real Programmer (tm).
I'm not the only one who is skeptical of this toxic, holier-than-thou and dangerous attitude.
Removing braces from if statements is a great example of another dangerous thing he advocates for no justifiable reason
https://news.ycombinator.com/item?id=15440848
>The current state of software safety discussion resembles the state of medical safety discussion 2, 3 decades ago (yeah, software is really really behind time).
>
>Back then, too, the thoughts on medical safety also were divided into 2 schools: the professionalism and the process oriented. The former school argues more or less what Uncle Bob argues: blame the damned and * who made the mistakes; be more careful, damn it.
>
>But of course, that stupidity fell out of favor. After all, when mistakes kill, people are serious about it. After a while, serious people realize that blaming and clamoring for care backfires big time. That's when they applied, you know, science and statistic to safety.
>
>So, tools are upgraded: better color coded medicine boxes, for example, or checklists in surgery. But it's more. They figured out what trainings and processes provide high impacts and do them rigorously. Nurses are taught (I am not kidding you) how to question doctors when weird things happen; identity verification (ever notice why nurses ask your birthday like a thousand times a day?) got extremely serious; etc.
>
>My take: give it a few more years, and software, too, probably will follow the same path. We needs more data, though.
Weird:
> Page Not Found
> This question was removed from Software Engineering Stack Exchange for reasons of moderation.
Clean Code is not a blind religion; Uncle Bob is trying to make a point with the concepts behind the book, teaching you to consider and question whether you're falling into bad-code traps.
The book was written to make developers think and weigh their choices, not to be a script for good code.
Disagree. There are zillions of worse books than Clean Code to forget first.
> output arguments are to be avoided in favour of return values
what is an output argument?
An output argument is when you pass an argument to a function, the function makes changes, and after returning you examine the argument you passed to see what happened.
Example: the caller could pass an empty list, and the method adds items to the list.
Why not return the list? Well, maybe the method computes more things than just the list.
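A hypothetical Java illustration of the pattern (names invented), in the spirit of the list example above -- the results go into the argument while the return value carries something else:

    import java.util.List;

    class Primes {
        // Output argument: primes are appended to 'out'; the return
        // value reports how many were added.
        static int collectPrimes(int upTo, List<Integer> out) {
            int count = 0;
            for (int n = 2; n <= upTo; n++) {
                if (isPrime(n)) { out.add(n); count++; }
            }
            return count;
        }

        static boolean isPrime(int n) {
            for (int d = 2; d * d <= n; d++) {
                if (n % d == 0) return false;
            }
            return n >= 2;
        }
    }

The clean-code preference discussed here would be to just return the list (or a small result object) instead.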
> Why not return the list? Well, maybe the method computes more things than just the list.
Or in C you want to allocate the list yourself in a particular way and the method should not concern with doing the allocation itself. And the return value is usually the error status/code since C doesn't have exceptions.
That's a C/C++ trick where a location for the output is passed as an argument to the function. It makes functions impure and leads to all kinds of nastiness, such as buffer overruns, if you are not very careful.
sprintf(buffer,"formatstring", args)
'buffer' is an output argument.
It's wrong to call output parameters a "C/C++ trick" because the concept really has nothing to do with C, C++, buffer overruns, purity, or "other nastiness".
The idea is that the caller tells the function its calling where to store results, rather than returning the results as values.
For example, Ada and Pascal both have 'out' parameters:
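    -- (Reconstructed Ada sketch; the original example wasn't preserved,
    --  and the names here are invented.)
    procedure Divide (Dividend, Divisor  : in  Integer;
                      Quotient, Remainder : out Integer);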
Theoretically, other than different calling syntax, there's conceptually no difference between "out" parameters and returning values.
In practice, though, many languages (C, C++, Java, Python, ...) support "out" parameters accidentally by passing references to non-constant objects, and that's where things get ugly.
Not only in C land; C# has "ref" (pass by reference, usually implying you want to overwrite it) and "out" (like ref but you _must_ set it in all code paths). Both are a bit of a code smell and you're nearly always better off with tuples.
Unfortunately in C land for all sorts of important system APIs you have to use output arguments.
An output argument (or parameter) is assigned a result. In Pascal, for instance, a procedure like ReadInteger(n) would assign the result to n. In C (which does not have 'var'/reference parameters) you need to pass the address of the argument, so the call is instead ReadInteger(&n). The example function ReadInteger has a side effect, so it is therefore preferable to use an output parameter rather than to return a result.
Related to the topic, would folks here recommend some OSS codebases that they've found be a pleasure to read/understand?
For balance, perhaps in both statically and dynamically typed languages?
Thank you!
1. Clean Code is not a fixed destination to which you'll ever arrive. It's a way of life.
2. We might not use the same methods to write clean code. But when you see clean code, you know it is clean.
3. Some traits clean code can have
0. Real clean code does not exist.
I worked at a large billion-dollar company in the Bay Area (in the health space) that religiously followed Clean Code. Their main architect was such a zealot for it. My problem is not with the book and author itself but with the senior engineers who peddle it as gospel to the more junior ones. Clean Code is not the be-all and end-all. Be your own person and develop WHAT IS RIGHT FOR YOUR ORG; don't peddle some book as gospel.
So glad I now work at a company where we actually THINK about the right abstractions instead of peddling some book.
It's probably time to stop recommending OOP completely at this point if you want to fix those issues.
I just bought the book, god damn it
It's an ok book to read and think about, but understand it is written by someone who hasn't really built a lot of great software, but rather is paid to consult and give out sage advice that is difficult to verify.
Read with great skepticism, but don't feel bad if you decide not to read it at all.
I actually just purchased Clean Code this morning - this should be interesting.
I am glad the author meant the book Clean Code and not the concept.
Totally agree with the author about the code examples and the disparity between them and the text.
If they weren't in a published book, I'd have thought I was being trolled.
I'd like to say that software engineering is a lot like playing jazz. It's really hard for the beginner to know where to start, and there're also endless sources for the "right" way to do things.
In truth however, like playing jazz, the only thing that really matters is results, and even those can be subjective. You can learn and practice scales all day long, but that doesn't really tell you how to make music.
I developed a style of software engineering that works really well for me. It's fast, accurate, and naturally extensible and refactorable. However, for various reasons, I've never been able to explain it to junior (or even senior) engineers when asked why I code a certain way. At a certain point, it's not the material that matters, but the audience's ability to really get what's at the heart of the lesson.
Technically, you could say something that's indisputable accurate, like, "there're only 12 notes in an (western) octave, and you just mix them until they sound good", but that's obviously true to a degree that's fundamentally unhelpful. At the same time, you could say "A good way to think about how to use a scale is to focus less on the notes that you play and more on the ones you don't". This is better advice but it may altogether be unhelpful, because it doesn't really yet strike at the true heart of what holds people back.
So at a certain point, I don't really know if anyone can be taught something as fundamentally "artful" (i.e. a hard thing largely consisting of innumerable decisions that are largely matters of taste - which is a word that should not be confused with mere "opinion") as software engineering or jazz music. This is because teaching alone is just not enough. At a certain point people just have to learn for themselves, and obviously the material that's out there is helpful, but I'm not sure if anything can ever be explained so well as to remove the need for the student at a certain point to simply "feel" what sounding good sounds like, or what good software engineering feels like.
I'll add one last thing, going back to what I was saying about not being able to explain to "junior (or even senior)" engineers. Were the same "lesson" to happen with someone who is very, very advanced - a seasoned principal engineer who's built and delivered many things, time and time again, across many different engineering organizations and technologies; someone like a jazz great, for example - anything I would have to say about my approach would be treated as obvious and boring, and they'd probably much rather talk about something else. I don't say this to imply that whatever I'd have to say is wrong or incorrect, but rather that at a certain level of advancement, you forget everything that you know and don't remember what it took to get there. There are a few who have a specific passion for teaching, but that's orthogonal to the subject.
I think it was Bill Evans who said something like "it takes years of study and practice to learn theory and technique, but it takes still a lifetime to forget it all". When you play like you've forgotten it all, that is when you achieve that certain sound in jazz music. Parenthetically, I'll add that doesn't mean you can't sound good being less advanced, but there's a certain sound that I'm trying to tie together with this metaphor that's parallel to great software engineering from knowledge, practice, and then the experience to forget it all and simply do.
I think that's fundamentally what's at the heart of the matter, not that it takes anyone any closer to getting there. You just have to do it, because we don't really know how to teach how to do really hard things in a way that produces reproducible results.
Uncle "Literally who?" Bob claims you should separate your code into as many small functions spread across as many classes as you can and makes a living selling (proverbial) shovels. John Carmack says you should keep functions long and have the business logic all be encapsulated together for mental cohesion. Carmack makes a living writing software.
I happen to agree with you and have posted in various HN threads over the years about the research on this, which (for what it's worth) showed that longer functions were less error prone. However, the snarky and nasty way that you made the point makes the comment a bad one for HN, no matter how right you are. Can you please not post that way? We're trying for something quite different here: https://news.ycombinator.com/newsguidelines.html.
It's even more important to stick to the site guidelines when you're right, because otherwise you discredit the truth and give people a reason to reject it, which harms all of us.
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
Carmack's writing on the proper length of functions (although he expresses it in terms of when to inline a function): https://news.ycombinator.com/item?id=8374345
A choice quote from Carmack:
> The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it. If a function is only called in a single place, the decision is fairly simple.
> In almost all cases, code duplication is a greater evil than whatever second order problems arise from functions being called in different circumstances, so I would rarely advocate duplicating code to avoid a function, but in a lot of cases you can still avoid the function by flagging an operation to be performed at the properly controlled time. For instance, having one check in the player think code for health <= 0 && !killed is almost certain to spawn less bugs than having KillPlayer() called in 20 different places.
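A rough sketch of the pattern Carmack is describing - the names (Player, PlayerThink) are stand-ins for illustration, not his actual code:

    struct Player {
        int  health = 100;
        bool killed = false;
    };

    // One controlled check in the per-frame "think" code, instead of
    // calling KillPlayer() from 20 different places: events elsewhere
    // only modify state (health), and the kill happens here, once.
    void PlayerThink(Player &p) {
        if (p.health <= 0 && !p.killed) {
            p.killed = true;
            // drop inventory, play the death animation, schedule respawn...
        }
    }

    int main() {
        Player p;
        p.health -= 150;   // some event deals lethal damage
        PlayerThink(p);    // the kill is performed at the controlled time
    }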
On the spectrum you've described, I'm progressively shifting from Uncle Bob's end to Carmack's the further I get into my career. I think of it as code density. I've found that high density code is often easier to grok because there's less ceremony to keep in my head (e.g. many long method names that may or may not be named well, jumping around a bunch of files). Of course, there's a point at which code becomes so dense that it again becomes difficult to grok.
Uncle Bob makes a living selling snake oil. Which one should we listen to?
I work with long functions right now. It does not give mental cohesion. Instead, it makes it difficult to figure out what the author intended to happen.
Or perhaps the length of the function is orthogonal to the quality of the author's code. Make the function as long as necessary to be readable and maintainable by the people most likely to read and maintain it. But that's not a very sellable snippet, nor a rule that can be grokked in 5 minutes.
Carmack is literally the top .1% (or higher) of ability and experience. Not to mention has mostly worked in a field with different constraints than most. I don't think looking to him for general development advice is all that useful.
I think the exact opposite.
Read the Doom source code and you can see that he didn't mess around with pulling every part of a larger function out into some nonsense function just because it could be scoped and named.
The way he wrote programs even back then is very direct. You don't have to jump around into lots of different functions and files for no reason. There aren't many specialized data structures or overly clever syntax tricks to prove how smart he is.
There also aren't attempts to write overly general libraries with the idea that they will be reused 100 times in the future. Everything just does what it needs to do directly.
The type of software John writes is different (much more conceptually challenging), and I don't recall him being as big a proponent of TDD (which is the strongest argument for small functions).
I think the right answer depends on a number of other factors.
Short functions make it much harder for bugs to hide.
Bugs' favorite place to hide is in interfaces.
Long functions make it much harder for bugs to hide.
See what I did there?
TLDR: Author recommends "A Philosophy of Software Design" over "Clean Code"
clean code is a c-level/exec/sales term. No one who actually writes code truly believes in clean code.
it's probably time to stop recommending alt-right and fascist products, but you're all literally nazis, and you're going to die like ones
woah shots fired
The problem with Clean Code is also the problem with saying to ignore Clean Code. If you treat everything as a dogmatic rule on how to do things, you're going to have a bad time.
Because they're more like guidelines. If you try not to repeat yourself, you'll generally wind up with better code. If you try to make your methods short, you'll generally wind up with better code.
However, if you abuse partial just to meet some arbitrary length requirement, then you haven't really understood the reason for the guideline.
But the problem isn't so much that the book has a mix of good and bad recommendations. As a species we've been pretty good at selectively filtering out bad recommendations over the long term.
The problem is that Uncle Bob has a delusional cult following (one he deliberately cultivated), which takes everything he says at face value and is willing to drown out any dissenting voices with a non-stop barrage of bullshit platitudes.
There are plenty of ideas in Clean Code that are great, and plenty that are terrible... but the religiosity of adherence to it prevents us from separating the two.
Clean Code is fine. It's a little dated, as you would expect, and for the most part, everything of value in it has been assimilated into the best practices and osmotic ether that pervades software development now. It's effectively all the same stuff as you see in Refactoring or Code Complete or Pragmatic Programmer.
I suspect a lot of backlash against it centers around Uncle Bob's less than progressive political and social stances in recent years.
I never read Clean Code and know nothing about its author so I'm willing to trust you on the first part, but the second paragraph is really uncalled for IMO. The article is long and gives precise examples of its issues with the book. Assuming an ulterior motive is unwarranted.
I don't think it is uncalled for when there have been several instances of boycotts organized for that reason.
Nobody is that upset about anodyne OOP design patterns
TFA really seems to be saying that there's no such thing as clean code, so don't even bother trying.
This article is garbage. The argument is basically like saying "famous scientist X was wrong about Y, let's stop doing science. Clearly there is no point to it."
I cannot believe what I am reading here.
My open source community knows exactly what good code looks like and we've delivered great products in very short timeframes repeatedly and often beating our own expectations.
These kinds of articles make me feel like I must have discovered something revolutionary... But in reality I'm just following some very simple principles which were invented by other people several decades ago.
Too many coders these days have been misled into all sorts of goofy trends. Most coders don't know how to code. The vast majority of the people who claim to be experts and who write books about it don't know what they're talking about. That's the real problem. The industry has been hijacked by people who simply aren't wise or clever enough to be sharing any kind of complex knowledge. There absolutely is such a thing as good code.
I'm tired of hearing developers who have never read a single word of Alan Kay (the father of OOP) tell everyone else how bad OOP is and why FP is the answer. It's like watching someone drive a nail straight into their own hand and then complain to everyone that hammers and nails are not the right tools for attaching two pieces of wood together... That instead, the answer is clearly to tie them together with a small piece of string because nobody can get hurt that way.
Just read the manual written by the inventor of the tool.
Alan Kay said "The Big Idea is Messaging"... Yet almost none of the OOP code I read designs its components in such a way that they "communicate" with each other... Instead, the components use methods to micromanage each other's internal state, passing ridiculously complex instances around (clearly a whole object instance is not a message).
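A toy sketch of the difference, with made-up names - not Kay's own example, just an illustration:

    #include <string>

    class Order {
        std::string status_ = "open";
        double total_ = 100.0;
    public:
        // Micromanaging style: callers reach in and juggle internal
        // state, so every caller must know the right sequence of tweaks.
        void SetStatus(const std::string &s) { status_ = s; }
        void SetTotal(double t) { total_ = t; }

        // Messaging style: the caller sends one high-level request and
        // the object decides how its own state changes in response.
        void ApplyDiscount(double fraction) { total_ *= (1.0 - fraction); }
        void Cancel() { status_ = "cancelled"; }
    };

    int main() {
        Order order;
        order.ApplyDiscount(0.1);   // "discount yourself" - a message
        order.Cancel();             // vs. order.SetStatus("cancelled")
    }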
> The argument is basically like saying "famous scientist X was wrong about Y, let's stop doing science. Clearly there is no point to it."
In my opinion the argument is more "famous paper X by scientist Y was wrong, let's stop citing it". Except that Clean Code isn't science and doesn't pretend to be.
If the article only attacked that specific book "Clean Code", then I would not be as critical. But the first line in the article suggests that it is an attack against the entire idea of writing good quality code:
'It may not be possible for us to ever reach empirical definitions of "good code" or "clean code"'
It might seem far-fetched that someone would question the benefits of writing high-quality code (readable, composable, maintainable, succinct, efficient...), but I've been in this industry long enough (and worked for enough different kinds of companies) to realize that there is an actual agenda to push the industry in that direction.
Some people in the corporate sphere really believe that the best way to implement software is to brute force it by throwing thousands of engineers at a giant ball of spaghetti code then writing an even more gargantuan spaghetti ball of tests to ensure that the monstrosity actually works.
I see it as an immoral waste of human potential.
Martin is, and always has been, a plagiarizing, ghost-written, clueless idiot with a way of convincing other know-nothings that he knows something. At one time he tried to build up a reputation on StackOverflow, and was rapidly seen off.
Toxic. Avoid.
Yeah, I agree with the author, and I would go further: it's a nice list of reasons why Uncle Bob is insufferable.
Because of stuff like this:
> Martin's reasoning is rather that a Boolean argument means that a function does more than one thing, which it shouldn't.
Really? Really? Not even for dependency injection? Or, you know, you should duplicate your function into two very similar ones, one with the flag and one without. Oh, but DRY. Sure.
> He says that an ideal function has zero arguments (but still no side effects??), and that a function with just three arguments is confusing and difficult to test
Again, really?
I find it funny that people treat him as a guru. Or maybe that's the right way to treat him - like those self-help gurus with meaningless guidance and wishy-washy feel-good statements.
> Every function in this program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order.
Wow, illumination! Salvation! Right here!
Until, of course, you have to actually maintain this and chase down three or four levels of functions to find out what the code is actually doing. And think up a function signature for every minor thing. And pass along all the arguments you need (ignoring that "perfect functions have zero arguments" above - good luck with that).
Again, it sounds like self-help BS and not much more than that.
> Until, of course, you have to actually maintain this and chase down three or four levels of functions to find out what the code is actually doing.
The art is to chain your short functions like a paragraph, not to nest them a mile deep, where the "shortness" is purely an illusion and the outer ones are doing tons of things by calling the inner ones.
That's a lot harder, though.
But it fits much better with the spirit of "don't have a lot of args for your functions" - if you're making deeply nested calls, you're going to have to pass all the arguments the inner ones need through the outer ones, or else do something to obfuscate how much you're passing around (global deps/state, crazy amounts of deep DI, etc.), which doesn't make testing any easier.
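A minimal sketch of the chained style, using a hypothetical three-step pipeline:

    #include <sstream>
    #include <string>
    #include <vector>

    // Three small steps, purely for illustration.
    std::string Load(const std::string &path) { return path + ": a b c"; }

    std::vector<std::string> Parse(const std::string &text) {
        std::istringstream in(text);
        std::vector<std::string> words;
        for (std::string w; in >> w;) words.push_back(w);
        return words;
    }

    std::string Summarize(const std::vector<std::string> &words) {
        return std::to_string(words.size()) + " tokens";
    }

    // Chained, paragraph style: each short function hands its result
    // to the next, so the data flow is visible in one place. The
    // nested version would have Load() call Parse() call Summarize(),
    // with every inner argument threaded through the outer signatures.
    std::string Report(const std::string &path) {
        return Summarize(Parse(Load(path)));
    }

    int main() { (void)Report("orders.txt"); }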
> Really? Really? Not even for dependency injection? Or, you know, you should duplicate your function into two very similar ones, one with the flag and one without. Oh, but DRY. Sure.
I'm not sure dependency injection has anything to do with boolean flags or method args. I think the key point here is that he is a proponent of object-oriented programming. I think he touches on dependency injection later in the book, but it's been a while since I've read it. He suggests your dependencies get passed at object initialization, not passed as method options. That lets you mock things easily, without having to modify the method that uses the dependency.
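A minimal sketch of that construction-time injection, with a hypothetical Clock dependency (the names are made up):

    #include <iostream>
    #include <memory>
    #include <string>

    // The dependency's interface. A real one might hit the system clock.
    struct Clock {
        virtual ~Clock() = default;
        virtual std::string Now() const = 0;
    };

    struct SystemClock : Clock {   // production stand-in
        std::string Now() const override { return "2021-06-28T12:00:00Z"; }
    };

    struct FixedClock : Clock {    // a test double
        std::string Now() const override { return "1970-01-01T00:00:00Z"; }
    };

    class Logger {
        std::shared_ptr<Clock> clock_;   // injected once, at construction
    public:
        explicit Logger(std::shared_ptr<Clock> c) : clock_(std::move(c)) {}

        // The method signature stays clean: no clock argument, no flag.
        void Log(const std::string &msg) {
            std::cout << clock_->Now() << " " << msg << "\n";
        }
    };

    int main() {
        Logger prod(std::make_shared<SystemClock>());
        Logger test(std::make_shared<FixedClock>());   // swap in the mock
        prod.Log("hello");
        test.Log("hello");
    }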
> Until, of course, you have to actually maintain this and chase down three or four levels of functions to find out what the code is actually doing. And think up a function signature for every minor thing. And pass along all the arguments you need (ignoring that "perfect functions have zero arguments" above - good luck with that).
I myself find it easier to read and understand simple functions than large ones with multiple indentation levels. Also, it definitely does not make sense to pass many arguments along through all those small functions; he recommends making them object instance properties so that you don't need to.
It may not be for everyone, but I'll take reading code that follows his principles instead of code that had no thought about design put into it any day of the week. It's not some dogmatic law that should be followed in all cases, but to me it's a set of pretty great ideas to keep in mind to lay out code that is easy to maintain and test.
> I'm not sure dependency injection has anything to do with boolean flags or method args.
DI can be abused as a way to get around long function signatures. "I don't take a lot of arguments here" (I'm just inside of an object that has ten other dependencies injected in). Welcome back to testing hell.
> I myself find it easier to read and understand simple functions than large ones with multiple indentation levels
A CRUD app might get away with this, but anything more complex needs a lot of different variables and genuinely complex code.
It's actually a good marketing trick. He can sell something slightly different and more "pure", make promises about it, and then sell books, trainings, and merchandise.
That's what the wellness industry does all the time.
The boolean issue is probably the one that's caused me the most pain. That contradiction with DRY has actually had me go back and forth between repeating myself and using a flag, wasting a ton of time on something incredibly pointless to be thinking that hard about. I feel like the best thing for my career would have been not to read that book right when I started my first professional programming job.
It's been a while since I've read it, but I think that to handle boolean-flag-type logic well, he suggests relying on object subclassing instead. So, for an example that uses a dry-run flag for scary operations: you have your normal object (a) with all of its methods that actually perform those scary operations. Then you subclass (a) to create a dry-run subclass (b). Object (b) overrides only the methods that perform the scary operations you want to dry-run, while still using all of the non-scary methods. That lets you avoid having if dry_run == true; then dry_run_method() else scary_method() scattered across lots of different methods.
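A minimal sketch of that subclassing approach - Deployer and its operations are made-up names:

    #include <iostream>

    // The scary operations are virtual, so a dry-run subclass can
    // override only those, and the flow itself stays flag-free.
    class Deployer {
    public:
        virtual ~Deployer() = default;

        void Deploy() {               // no if (dry_run) checks anywhere
            Plan();
            DropDatabase();
            Migrate();
        }

    protected:
        void Plan() { std::cout << "planning...\n"; }        // not scary
        virtual void DropDatabase() { std::cout << "dropped!\n"; }
        virtual void Migrate()      { std::cout << "migrated!\n"; }
    };

    class DryRunDeployer : public Deployer {
    protected:
        // Override only the scary operations; the rest is inherited.
        void DropDatabase() override { std::cout << "[dry run] would drop\n"; }
        void Migrate() override      { std::cout << "[dry run] would migrate\n"; }
    };

    int main() {
        DryRunDeployer().Deploy();    // same flow, nothing scary happens
    }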
It might make sense to divide your function with a boolean flag into two functions and extract the common code into a third, private function. Or maybe it'll make things ugly.
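For instance, a minimal sketch of that split, with made-up names:

    #include <string>

    struct Doc { std::string title, body; };

    // Before (hypothetical): one function, one flag, two behaviours:
    //   std::string Render(const Doc &doc, bool compact);

    namespace {
    // The common code, extracted into a third, file-local helper.
    std::string RenderBody(const Doc &doc) { return doc.body; }
    }

    // After: two entry points that say what they do, sharing the helper.
    std::string RenderFull(const Doc &doc) {
        return doc.title + "\n\n" + RenderBody(doc);
    }

    std::string RenderCompact(const Doc &doc) {
        return RenderBody(doc);
    }

    int main() {
        Doc d{"Title", "Body"};
        RenderFull(d);
        RenderCompact(d);
    }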
I treat those books as something to show me how other people do things. I learn from it and I add it to my skill book. Then I'll apply it and see if I like it. If I don't like it in this particular case, I'll not apply it. IMO it's all about having insight into every possible solution. If you can implement something 10 different ways, you can choose the best one among them, but you have to learn those 10 different ways first.