Comment by dwattttt
2 days ago
> People ... aren’t machines that can run constantly on 100% utilization.
You also can't run machines at 100% utilisation and expect quality results. That's when you see tail latencies blow out, hash maps lose their performance guarantees, physical machines wear supra-linearly... the list goes on.
The standard rule for CPU-bound RPC server utilization is 80%. Any less and you could use fewer machines; any more and latency starts to take a hit. This is when you're optimizing for latency. Throughput is different.
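The latency blow-up near full utilization falls out of basic queueing theory. As a hedged illustration (not something from the thread itself), here is the mean response time of an M/M/1 queue, T = 1 / (μ · (1 − ρ)), which shows why 80% is a reasonable operating point while 95%+ is not:

```python
# Illustrative only: mean response time in an M/M/1 queue,
# T = 1 / (mu * (1 - rho)), where mu is the service rate and
# rho is utilization. Real RPC servers aren't M/M/1, but the
# qualitative shape (latency diverging as rho -> 1) carries over.

def mm1_response_time(service_rate: float, utilization: float) -> float:
    """Mean time in system for an M/M/1 queue at the given utilization."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (service_rate * (1.0 - utilization))

# With a 10 ms mean service time (rate = 100 req/s):
for rho in (0.5, 0.8, 0.95, 0.99):
    latency_ms = mm1_response_time(100.0, rho) * 1000
    print(f"rho={rho:.2f} -> mean latency {latency_ms:.0f} ms")
```

At ρ = 0.8 the mean latency is 5× the bare service time; at ρ = 0.99 it is 100×. The curve is gentle up to roughly 80% and nearly vertical after, which is why the rule of thumb sits where it does.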
Doesn't this depend on the number of servers, crash rates, and recovery times? I wouldn't feel confident running 3 servers at 80% capacity in an ultra-low-latency scenario; a single crash would overwhelm the other two in no time.
Right; this is only for large pools of servers.
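The failover arithmetic behind both comments can be sketched in a few lines (a hypothetical illustration, not code from the thread): total load is fixed, so a crashed server's share is redistributed across the survivors.

```python
# Illustrative failover arithmetic: if total offered load stays constant,
# losing servers pushes the remaining ones to higher utilization.

def post_failure_utilization(servers: int, utilization: float,
                             failures: int) -> float:
    """Per-server utilization after `failures` servers drop out of the pool."""
    if failures >= servers:
        raise ValueError("no survivors left to absorb the load")
    total_load = servers * utilization  # e.g. 3 * 0.8 = 2.4 servers' worth
    return total_load / (servers - failures)

print(post_failure_utilization(3, 0.8, 1))    # survivors at 120%: overloaded
print(post_failure_utilization(100, 0.8, 1))  # survivors at ~80.8%: fine
```

With 3 servers at 80%, one failure puts the survivors at 120% each; with 100 servers, the same failure barely moves the needle, which is why the 80% rule only holds for large pools.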
Difference is machines break and that costs lots of money.
People just quit, and some businesses consider that a better outcome.