← Back to context

Comment by ryguz

6 hours ago

The interesting thing about Rule 1 is that it makes Rules 3-5 follow almost mechanically. If you genuinely accept that you cannot predict where the bottleneck is, then writing straightforward code and measuring becomes the only rational strategy. The problem is most people treat these rules as independent guidelines rather than as consequences of a single premise.

In practice what I see fail most often is not premature optimization but premature abstraction. People build elaborate indirection layers for flexibility they never need, and those layers impose real costs on every future reader of the code. The irony is that abstraction is supposed to manage complexity but prematurely applied it just creates a different kind.

> In practice what I see fail most often is not premature optimization but premature abstraction

This matches my experience as well.

Someone here commented once that abstractions should be emergent, not speculative, and I loved that line so much I use it with my team all the time now when I see the craziness starting.

Premature abstraction is one of the worst pitfalls when writing software. It's paid for up front, it costs developer time to understand and work with, it adds complexity (tech debt), increases likeliness of bugs, and increases refactor time. All for someone to say "if we ever want to do X". If we wanted to do it, we'd do it now.

I truly believe this comes from devs who want to feel smart by "architecting" solutions to future problems before those problems have become well defined.

This comment is fascinating to me, as it indicates an entirely different mindset than mine. I'm much more interested in code readability and maintainabilty (and simplicty and elegance) than performance, unless it's necessary. So I would start by saying everything flows from rule 4 or maybe 5. Rule 1 is a consequence of rule 4 for me.

I was just dealing with this is some home-grown configuration nightmare where every line of the file was pulled from some other namespace to the point where you end up with two dozen folders, each with half a dozen partial configuration files in them, and it takes half a hour to figure out where a single value is coming from.

I'm sure it's super flexible but the exactly same thing could have been achieved with 8 YAML files and 60% of the content between them would be identical.

As someone who believes strongly in type based programming and the importance of good data structure choice I'm not seeing how Rule 5 follows Rule 1. I think it's important to reinforce how impactful good data structure choice is compared to trying to solve everything through procedural logic since a well structured coordination of data interactions can end up greatly simplifying the amount of standalone logic.

  • Data cache issues is one case of something being surprising slow because of how data is organized. That said, Structure of Arrays vs Array of structures is an example where rule 4 and 5 somewhat contradict each other, if one confuses "simple" and "easy" - Structure of Array style is "harder" because we don't see it often; but then if it's harder, it is is likely more bug-prone.

  • But good data structure is not always evident from the get go. And if your types are too specific it would make future development hard if the specs change. This is what I struggle with

    • Professionally I'm a data architect. Modeling data in a way that is functional, performant and forward facing is not an easy problem so it's perfectly fine to struggle with it. We do our best job with what we've got within reasonable constraints - we can't do anything more than that.

      I found that over time my senses have been honed to more quickly identify things that are important to deeply study and plan right now and areas where I can skimp more and fix it later if problems develop. I don't know if there was a short cut to honing those senses that didn't involve a lot of pain as I needed to pick apart and rework oversights.

I only agree if you have a bounded dataset size that you know will never grow. If it can grow in future (and if you're not sure, you should assume it can), not only will many data structures and algorithms scale poorly along the way, but they will grow to dominate the bottleneck as well. By the time it no longer meets requirements and you get a trouble ticket, you're now under time pressure to develop, qualify, and deploy a new solution. You're much more likely to encounter regressions when doing this under time pressure.

If you've been monitoring properly, you buy yourself time before it becomes a problem as such, but in my experience most developers who don't anticipate load scaling also don't monitor properly.

I've seen a "senior software engineer with 20 years of industry experience" put code into production that ended up needing 30 minute timeouts for a HTTP response only 2 years after initial deployment. That is not a typo, 30 minutes. I had to take over and rewrite their "simple" code to stop the VP-level escalations our org received because of this engineering philosophy.

  • > You're much more likely to encounter regressions when doing this under time pressure.

    There is nothing to suggest you should wait to optimize under pressure, only that you should optimize only after you have measured. Benchmark tests are still best written during the development cycle, not while running hot in production.

    Starting with the naive solution helps quickly ensure that your API is sensible and that your testing/benchmarking is in good shape before you start poking at the hard bits where you are much more likely to screw things up, all while offering a baseline score to prove that your optimizations are actually necessary and an improvement.

Really need that [flag bot] button added to HN.

  • It would be easier if we could just block comments from green users. I get that it loses ~.1% of authors who might have made an account to comment on a blogpost of theirs that was posted here. I'd rather have that loss than have to deal with the 99.9% of spam.

You tend to see it a lot in "Enterprise" software (.Net and Java shops in particular). A lot of Enterprise Software Architects will reach for their favored abstractions out of practice as opposed to if they fit. Custom solution providers will build a bunch of the same out of practice.

This is something I tend to consider far, far worse than "AI Slop" in practice. I always hated Microsoft Enterprise Library's Data Access Application Block (DAAB) in practice. I've literally only ever seen one product that supported multiple database backends that necessitated that level of abstraction... but I've seen that library well over a dozen times in practice. Just as a specific example.

IMO, abstractions should generally serve to make the rest of the codebase reasonable more often than not... abstractions that hide complexity are useful... abstractions that add complexity much less so.