← Back to context

Comment by grumpymuppet

3 days ago

This is something I've wished to eliminate too. Maybe we just cast the past 20 years as the "prototyping phase" of modern infrastructure.

It would be interesting to collect a roadmap for optimizing software at scale -- where is there low hanging fruit? What are the prime "offenders"?

Call it a power saving initiative and get environmentally-minded folks involved.

IMO, the prime offender is simply not understanding fundamentals. From simple things like “a network call is orders of magnitude slower than a local disk, which is orders of magnitude slower than RAM…” (and moreover, not understanding that EBS et al. are networked disks, albeit highly specialized and optimized), or doing insertions to a DB by looping over a list and writing each row individually.

I have struggled against this long enough that I don’t think there is an easy fix. My current company is the first I’ve been at that is taking it seriously, and that’s only because we had a spate of SEV0s. It’s still not easy, because a. I and the other technically-minded people have to find the problems, then figure out how to explain them b. At its heart, it’s a culture war. Properly normalizing your data model is harder than chucking everything into JSON, even if the former will save you headaches months down the road. Learning how to profile code (and fix the problems) may not be exactly hard, but it’s certainly harder than just adding more pods to your deployment.

Use of underpowered databases and abstractions that don't eliminate round-trips is a big one. The hardware is fast but apps take seconds to load because on the backend there's a lot of round-trips to the DB and back, and the query mix is unoptimized because there are no DBAs anymore.

It's the sort of thing that can be handled via better libraries, if people use them. Instead of Hibernate use a mapper like Micronaut Data. Turn on roundtrip diagnostics in your JDBC driver, look for places where they can be eliminated by using stored procedures. Have someone whose job is to look out for slow queries and optimize them, or pay for a commercial DB that can do that by itself. Also: use a database that lets you pipeline queries on a connection and receive the results asynchronously, along with server languages that make it easy to exploit that for additional latency wins.