
Comment by stackskipton

20 hours ago

It's generally sub-2 ms. Most people accept a slight latency increase in exchange for higher availability, but I guess in this case that was not acceptable.

2 ms per RPC is pretty high if you need to make dozens of RPCs to serve a request.

  • That was the origin of this solution. A client app had to issue millions of small SQL queries, where each query had to complete before the next one could be issued. Millions of milliseconds add up (rough numbers are sketched at the end of this thread).

    The lowest possible latency would of course come from running the client code on the same physical box as the SQL server, but that's hard to do.

  • It's generally below that. On average it seems to be about 0.7 ms.

    • In my experience the variance has been relatively high – it does get as low as 0.5 ms, but it can be 3-4 ms. That's an order of magnitude difference, and it can be the difference between a great and a terrible UX when you amplify it across many RPCs (a quick simulation of this is sketched at the end of the thread).

      In general the goal should be to deploy as much of the stack in one zone as possible, and have multiple zones for redundancy.
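
As a rough back-of-envelope for the "millions of milliseconds" point above: the million-query count comes from the thread, while the per-round-trip latencies below are illustrative assumptions.

```python
# Back-of-envelope: total wire time when queries are strictly sequential,
# i.e. each query must finish before the next one can be issued.
# One million queries comes from the thread; the per-round-trip latencies
# below are illustrative assumptions.
queries = 1_000_000

for rtt_ms in (2.0, 0.7, 0.1):  # worst-case cross-AZ, typical cross-AZ, near-local
    total_s = queries * rtt_ms / 1000  # serial queries: round-trip time is purely additive
    print(f"{rtt_ms} ms/round trip -> {total_s:,.0f} s ({total_s / 60:.1f} min) of pure network wait")
```

Even at a fairly good 0.7 ms per cross-AZ round trip, a million strictly sequential queries spend over ten minutes doing nothing but waiting on the network.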

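
And a quick, illustrative simulation of the variance point: the 0.5 ms floor and the 3-4 ms spikes mirror the numbers above, while the 50-RPC fan-out per request and the 10% spike rate are assumptions.

```python
# Illustrative simulation of how per-RPC variance compounds across a request
# that makes many sequential RPCs. The 0.5 ms floor and 3-4 ms spikes mirror
# the numbers above; the 50-RPC fan-out and the 10% spike rate are assumptions.
import random

def rpc_latency_ms():
    # Mostly fast, but a noticeable fraction of calls hit the slow tail.
    if random.random() < 0.10:
        return random.uniform(3.0, 4.0)
    return random.uniform(0.5, 1.0)

def request_latency_ms(rpc_count=50):
    # Sequential RPCs: per-call latencies simply add up.
    return sum(rpc_latency_ms() for _ in range(rpc_count))

samples = sorted(request_latency_ms() for _ in range(10_000))
p50 = samples[len(samples) // 2]
p99 = samples[int(len(samples) * 0.99)]
print(f"50 sequential RPCs: p50 ~ {p50:.1f} ms, p99 ~ {p99:.1f} ms")
```

The takeaway is that occasional slow round trips stop being occasional once a single request has to pay for dozens of them in sequence.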