Comment by danpalmer

14 hours ago

2ms per RPC is pretty high if you need to make dozens of RPCs to serve a request.

That was the origin of this solution. A client app had to issue millions of small SQL queries, where the first query had to complete before the second could be made. Millions of milliseconds add up.
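A quick back-of-the-envelope sketch of why this adds up (the query count and per-RPC latency here are illustrative assumptions, not measured figures):

```python
# Rough cost of N dependent queries issued strictly sequentially:
# each round trip must complete before the next query can be sent,
# so network latency is paid N times with no overlap.
N_QUERIES = 1_000_000   # "millions of small SQL queries"
RTT_MS = 2.0            # assumed per-RPC round-trip latency in ms

total_seconds = N_QUERIES * RTT_MS / 1000
print(f"~{total_seconds:.0f} s ({total_seconds / 60:.0f} min) spent purely on round trips")
```

At 2 ms per round trip that is over half an hour of pure network wait, which is why co-locating client and server (or batching) matters so much for this workload.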

The lowest possible latency would of course come from running the client code on the same physical box as the SQL server, but that's hard to do.

It’s generally sub that. On average it seems to be about 0.7 ms.

  • In my experience it has been relatively high variance – it does get as low as 0.5 ms, but can be 3–4 ms. That's nearly an order of magnitude of difference, and can be the difference between a great and a terrible UX when you amplify it across many RPCs.

    In general the goal should be to deploy as much of the stack in one zone as possible, and have multiple zones for redundancy.
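    The amplification effect can be sketched with a small probability calculation (the tail fraction here is an assumed illustrative number, not a measurement):

    ```python
    # If some fraction p_slow of RPCs land in the slow 3-4 ms tail,
    # a request that makes k sequential RPCs almost certainly hits
    # at least one slow round trip as k grows.
    p_slow = 0.05  # assume 5% of RPCs are in the slow tail

    for k in (1, 10, 50):
        p_at_least_one = 1 - (1 - p_slow) ** k
        print(f"{k:3d} RPCs -> {p_at_least_one:.0%} chance of hitting the tail")
    ```

    Even a modest per-RPC tail probability compounds quickly, which is why keeping the whole request path inside one zone pays off.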

    • AWS publish their own metrics for cross-AZ and internal-AZ latency: https://eu-central-1.console.aws.amazon.com/nip/ (Network Manager > Infrastructure Performance)

      > In general the goal should be to deploy as much of the stack in one zone as possible

      Agreed. There can be a few downsides to consider if you have to fail over to another zone. Worst case, there isn't sufficient capacity available when you fail over, because everyone else is asking for capacity at the same time. If you use e.g. Karpenter, you can be very diverse in the instance selection process, so that you get at least some capacity, though maybe not your preferred instance types.