Comment by tkiolp4
10 months ago
Are his books (the ones about Systems Performance and eBPF) relevant for normal software engineers who want to improve performance in normal services? I don’t work at a FAANG company, and our usual performance issues are solved by adding indexes here and there, caching, and simple code analysis. Tools like Datadog help a lot already.
Profiling is a pretty basic technique that is applicable to all software engineering. I'm not sure what a "normal" service is here, but I think we all have an obligation to understand what's happening in the systems we own.
Some people may believe that 100ms latency is acceptable for a CLI tool, but what if it could be 3ms? On some aesthetic level, it also feels good to be able to eliminate excess. Finally, you should learn it because you won't necessarily have that job forever.
The idea that diving into flame graphs is worthwhile for optimization assumes that your workload is CPU-bound. Most business software does not have such workloads and instead (as you yourself have noted) spends most of its time waiting on I/O (database, network, filesystem, etc.).
And so (as you have again noted), your best bet is to just use plain old logging and tracing (like what Datadog provides) to find out where the waiting is happening.
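If you're not sure which case you're in, a rough check is to compare wall-clock time against CPU time for a single call. This is only a minimal sketch in Python: `cpu_fraction`, `simulated_io_wait`, and `simulated_cpu_work` are hypothetical names made up for illustration, and the sleep is just a stand-in for a database or network call.

```python
import time


def cpu_fraction(fn, *args, **kwargs):
    """Run fn once; return (wall_seconds, cpu_seconds, cpu_seconds / wall_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # user + system CPU time of this process
    fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu, (cpu / wall if wall > 0 else 0.0)


def simulated_io_wait():
    # Hypothetical stand-in for a database or network call: the process just waits.
    time.sleep(0.2)


def simulated_cpu_work():
    # Hypothetical stand-in for CPU-bound work: a tight arithmetic loop.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for fn in (simulated_io_wait, simulated_cpu_work):
        wall, cpu, frac = cpu_fraction(fn)
        print(f"{fn.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s cpu/wall={frac:.2f}")
```

A ratio near 1 suggests the time is going to computation, which is where a CPU flame graph would help. A ratio near 0 suggests the call is mostly waiting, which is where tracing is the better tool.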