
Comment by morcus

1 day ago

> They won’t notice those extra 10 milliseconds you save

They won't notice if this decision happens once, no. But if you make a dozen such decisions over the course of developing a product, then the user will notice. And if the user has e.g. old hardware or slow Internet, they will notice before a dozen such decisions are made.

In my career of writing software, I've found that most developers are fully incapable of measuring things. They instead make completely unfounded assertions that are wrong more than 80% of the time and when they are wrong they tend to be wrong by several orders of magnitude.

And yes, contrary to many comments here, users will notice that 10ms saved if it's on every keystroke and mouse action. Closer to reality, though, are sub-millisecond savings that occur tens of thousands of times per user interaction, which developers disregard as insignificant and users always notice. The only way to tell is to measure things.

  • When I was at Google, our team kept RUM metrics for a bunch of common user actions. We had a zero regression policy and a common part of coding a new feature was running benchmarks to show that performance didn't regress. We also had a small "forbidden list" of JavaScript coding constructs that we measured to be particularly slow in at least one of Chrome/Firefox/Internet Explorer.

    Outside contributors to our team absolutely hated us for it (and honestly some of the people on the team hated it too); everyone likes it when their product is fast, and nobody likes being held to the standard of keeping it that way. When you ask them to rewrite their functional code as a series of `for` loops because the call overhead makes it measurably 30% slower across browsers[0] (roughly the kind of rewrite sketched below), they get so mad.

    [0] This was in 2010, I have no idea what the performance difference is in the Year of Our Lord 2025.
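
    A hypothetical TypeScript sketch of that kind of rewrite (not the actual code from that project, and the 30% figure was our circa-2010 measurement; current engines would need re-benchmarking):

    ```ts
    // Hypothetical example: sum the lengths of the non-empty strings in a list.

    // Functional style: allocates two intermediate arrays and makes
    // several callback invocations per element.
    function totalLengthFunctional(items: string[]): number {
      return items
        .filter((s) => s.length > 0)
        .map((s) => s.length)
        .reduce((sum, n) => sum + n, 0);
    }

    // Plain loop: same result, no intermediate allocations, no per-element calls.
    function totalLengthLoop(items: string[]): number {
      let total = 0;
      for (let i = 0; i < items.length; i++) {
        if (items[i].length > 0) {
          total += items[i].length;
        }
      }
      return total;
    }
    ```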

    • Have you had the chance to interact with any of the web interfaces for their cloud products like GCP Console, Looker Studio, BigQuery, etc.? It's painful; when clicking a button or link you can feel a Cloud Run instance initializing itself in the background before processing your request.


    • Boy, do I wish more teams worked this way. Too many product leaders are tasked with improving a single KPI (for example, reducing service calls) without being required to hold other KPIs, such as user satisfaction, constant. The end result is a worse experience for the customer, but hey, at least the leader's goal was met.

  • The amount of premature optimizations and premature abstractions I’ve seen (and stopped when possible) is staggering. Some people just really want to show how “clever” they are solving a problem that no one asked them to solve and wasn’t even an issue. Often it’s “solved” in a way that cannot be added to or modified without bringing the whole house of cards down (aka, requiring a rewrite).

    I can handle premature optimization better than premature abstraction. The second is more insidious IMHO. I’m not throwing too many stones here because I did this too when I was a junior dev but the desire to try and abstract everything, make everything “reusable” kills me. 99.99999% of the time the developer (myself included) will guess /wrong/ on what future development will need.

    They will make it take 2s to add a new column (just look at that beautiful abstraction!) but if you ask them to add another row then it’s a multi-day process to do because their little abstraction didn’t account for that.

    I think we as developers think we can predict the future way better than we can which is why I try very hard to copy and paste code /first/ and then only once I’ve done that 3-4 times do I even consider an abstraction.

    Most abstractions I've seen end up as what I've taken to calling "hourglass abstractions" (maybe there's a real name for this, but I don't know it). An hourglass abstraction is where you try to abstract some common piece of code, but your use cases are so different that the abstraction becomes a jumbled mess of if/else or switch/case statements to handle all the different cases. Just because 2 code blocks "rhyme" doesn't mean you should abstract them.

    The 2 ends of the hourglass represent your inputs/outputs and they all squeeze down to go through the center of the hourglass (your abstraction). Abstractions should look like funnels, not hourglasses.
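
    A made-up TypeScript sketch of the shape I mean (names and cases invented purely for illustration):

    ```ts
    // Hourglass: two barely related use cases squeezed through one "shared" function,
    // so every new case adds another branch in the middle.
    type ExportKind = "csvReport" | "auditLog";

    function exportData(kind: ExportKind, rows: object[]): string {
      const header = kind === "csvReport" ? "id,name,total" : "timestamp,actor,action";
      const body = rows.map((r) => JSON.stringify(r)).join("\n");
      if (kind === "auditLog") {
        return `${header}\n${body}`; // special case: never truncate audit logs
      }
      return `${header}\n${body}`.slice(0, 10_000);
    }

    // Funnel: callers keep their own shapes and share only the genuinely common part.
    function joinLines(header: string, rows: string[]): string {
      return [header, ...rows].join("\n");
    }
    ```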

  • > They instead make completely unfounded assertions that are wrong more than 80% of the time and when they are wrong they tend to be wrong by several orders of magnitude.

    I completely agree. It blows my mind how fifteen minutes of testing something gets replaced with a guess. The most common situation I see this in (over and over again) is with DB indexes.

    The query is slow? Add a bunch of random indexes. Let's not look at the EXPLAIN and make sure the index improves the situation.
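
    A minimal sketch of what "look at the EXPLAIN first" means in practice, assuming Postgres and the node-postgres client; the table, columns, and query here are made up:

    ```ts
    import { Pool } from "pg";

    const pool = new Pool({ connectionString: process.env.DATABASE_URL });

    // Run the suspect query under EXPLAIN ANALYZE and read the plan before
    // reaching for an index: is it a Seq Scan, and where does the time go?
    async function explainSlowQuery(): Promise<void> {
      const { rows } = await pool.query(
        "EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = $1 AND status = 'open'",
        [42],
      );
      for (const row of rows) {
        console.log(row["QUERY PLAN"]);
      }
    }

    explainSlowQuery().finally(() => pool.end());
    ```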

    I just recently worked with a really strong engineer who kept saying we were going to need to shard our DB soon, but we're way too small of a company for that to be justified. Our DB shouldn't have been working that hard (it was all CPU load); there had to be a bad query in there. He even started drafting plans for sharding because he was adamant that it was needed. Then we checked RDS Performance Insights and saw it was one rogue query (as one should expect). It was about a 45-minute fix, and after downsizing one notch on RDS, we're sitting at about 4% CPU most of the time on the DB.

    But this is a common thing. Some engineers will _think_ there's going to be an issue, or when there is one, completely guess what it is without getting any data.

    Another anecdote, from a past company: they had upsized their RDS instance way beyond what their load should have required because they dealt with really high connection counts. There was no way that many connections should have been open given the request frequency. After a very small amount of digging, I found that they would open a new DB connection per object they created (this was PHP). Sometimes they'd create 20 objects in a loop. All the code was synchronous. You ended up with some very simple HTTP requests that would cause 30 DB connections to be established and then dropped.
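
    The original was PHP, but here's a hypothetical TypeScript analogue of the same anti-pattern and the usual fix (a single shared pool); the table and class names are invented:

    ```ts
    import { Client, Pool } from "pg";

    // Anti-pattern: every object opens (and later drops) its own connection,
    // so creating 20 objects in a loop means 20 TCP + auth handshakes.
    class Order {
      async save(id: number, total: number): Promise<void> {
        const client = new Client({ connectionString: process.env.DATABASE_URL });
        await client.connect();
        await client.query("INSERT INTO orders (id, total) VALUES ($1, $2)", [id, total]);
        await client.end();
      }
    }

    // Fix: one pool for the whole process; requests borrow and return connections.
    const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 10 });

    async function saveOrder(id: number, total: number): Promise<void> {
      await pool.query("INSERT INTO orders (id, total) VALUES ($1, $2)", [id, total]);
    }
    ```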

  • My Plex server was down and my lazy solution was to connect directly to the NAS. I was surprised just how much I noticed the responsiveness after getting used to web players. A week ago I wouldn't have said the web player bothered me at all. Now I can't not notice.

  • Can you show me any longitudinal studies demonstrating a causal connection between incremental latency and churn? It's easy to make such a claim and follow up with "go measure it". That takes work. There are numerous other things a company may choose to measure instead that are stronger predictors of business impact.

    There is probably some connection. Anchoring to 10ms is a bit extreme IMO, because it indirectly implies that latency is incredibly important, which isn't universally true. Each product's metrics that are predictive of success are much more nuanced and may even be something akin to the "polysemantic" neurons in LLMs: the best predictor may be a combination of several metrics expressed via some nontrivial function.

    For SaaS, if we did want to simplify things and pick just one: usage. That's the strongest churn signal.

    Takeaway: don’t just measure. Be deliberate about what you choose to measure. Measuring everything creates noise and can actually be detrimental.

    • Human factors research has a long history of studying this. I'm 30 years out of school and wouldn't know where to find my notes (and thus references), but there are places where users will notice 5ms. There are other places where seconds are not noticed.

      The web forced people to get used to very long latency, so they no longer complain about 10+ second waits, but the old studies prove they notice them and that shorter waits would drive better "feelings". Back in the old days (of 25MHz CPUs!) we had numbers for how long your application could take to do various things before users would become dissatisfied. Most of the time the dissatisfaction is not something they would blame on latency, even though the lab test proved that was the issue; instead it was a general 'feeling' they were unable to explain.

      There are many, many different factors that UI studies used to measure. Lag in the mouse was a big problem, and not just the pointer movement either: if the user clicks, you have only so long before it must be obvious that the application saw the click (my laptop fails at this when I click on a link), but you didn't have to bring up the response nearly as fast, so long as users could tell it was processing.

    • Here is a study on performance that I did for JavaScript in the browser: https://github.com/prettydiff/wisdom/blob/master/performance...

      TL;DR: full state restoration of an OS-style GUI in the browser in under 80ms from page request. I was eventually able to get that exact scenario down to 67ms. Not only is the state restoration complete, it covers all interactions and states of the application in a far more durable and complete way than big JavaScript frameworks can provide.

      Extreme performance showed me two things:

      1. Have good test automation. With a combination of good test automation and types/interfaces on everything you can refactor absolutely massive applications in about 2 hours with almost no risk of breaking anything.

      2. Tiny performance improvements mean massive performance gains overall. The difference in behavior is extreme. Imagine pressing a button and what you want is just there before your brain can process screen flicker. This results in a wildly different set of user behaviors than slow software that causes users to wait between interactions.

      Then there are downstream consequences to massive performance improvements, the second-order consequences. If your software is extremely fast across the board, then your test automation can be extremely fast across the board. Again, there is a wildly different set of expectations around quality when you can run end-to-end testing across 300 scenarios in under 8 seconds, as compared to waiting 30 minutes to fully validate software quality. In the latter case nobody runs the tests until they are forced to as some sort of CI step, and even then people will debate whether a given change is worth the effort. When testing takes less than 8 seconds, everybody and their dog, including the completely non-technical people, runs the tests dozens of times a day.

      I wrote my study of performance just a few months before being laid off from JavaScript land. Now, I will never go back for less than half a million in salary per year. I got tired of people repeating the same mistakes over and over. God forbid you know the answer to curing world hunger and bringing about world peace, because any suggestion to make things better is ALWAYS met with hostility if it challenges a developer's comfort bubble. So now I do something else where I can make just as much money without all the stupidity.

  • Soooooo, my "totally in the works" post about how a direct connection to your RDBMS is the next API may not be so tongue-in-cheek. No REST, no GraphQL, no HTTP overhead. Just plain SQL over the wire.

    Authentication? Already baked-in. Discoverability? Already in. Authorization? You get it almost for free. User throttling? Some offer it.

    Caching is for weak apps.

I find it fascinating that HN comments always assert that 10ms matters in the context of user interactions.

60Hz screens don't even update every 10ms.

What's even more amazing is that the average non-tech person probably won't even notice the difference between a 60Hz and a 120Hz screen. I've held 120Hz and 60Hz phones side by side in front of many people, scrolled on both of them, and had the other person shrug because they don't really see a difference.

The average user does not care about trivial things. As long as the app does what they want in a reasonable amount of time, it's fine. 10ms is nothing.

  • 60Hz screens update every 16.7ms, so if you add 10ms to your frame time you will probably miss that 16.7ms window.

    Almost everyone can see the difference between 60 and 120 fps. Most probably don't care, though.

Multiply that by the number of users and the total hours your software is used, and suddenly it's a lot of wasted energy that people rarely talk about.

But they still don't care about your stack. They care that you made something slow.

Fix that however you like, but don't pretend your non-technical users directly care that you used Go vs Java or whatever. The only time that's relevant to them is if you can use it for marketing.

  • That's fine, but I am responding directly to the article.

    > [Questions like] “Is this framework more performant than the other?” keep coming up. But the truth is: users don’t care about any of that.