slashdave 17 hours ago Disk where? LLM requests are routed dynamically. You might not even land in the same data center.
FuckButtons 15 hours ago But if you have a tiered cache, then waiting several seconds or even minutes is still preferable to a cache miss. I suspect the larger problem is that the amount of tinkering they are doing with the model makes that not viable.
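For what it's worth, here is a rough Python sketch of the tiered-cache idea from the reply above. Everything in it is hypothetical and only for illustration: `TieredCache`, `cache_key`, the dict-backed tiers, and keying on a model version string are all assumptions, not how any provider actually implements prompt caching. It shows two things: a hit in a slow tier still avoids recomputing, and any change to the model version turns every existing entry into a miss.

```python
from typing import Any, Optional


class TieredCache:
    """Look up a key in fast-to-slow tiers and promote hits to faster tiers."""

    def __init__(self, tiers: list[dict]):
        # tiers[0] is the fastest tier (say, accelerator memory);
        # tiers[-1] is the slowest (say, disk). Plain dicts stand in for both.
        self.tiers = tiers

    def get(self, key: tuple) -> Optional[Any]:
        for i, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Copy the entry into every faster tier so the next
                # lookup does not pay the slow-tier latency again.
                for faster in self.tiers[:i]:
                    faster[key] = value
                return value
        return None  # missed in every tier: the caller must recompute

    def put(self, key: tuple, value: Any) -> None:
        # Write-through: store the entry in every tier.
        for tier in self.tiers:
            tier[key] = value


def cache_key(model_version: str, prompt_prefix: str) -> tuple:
    # Including the model version means cached state from an older
    # model build can never be returned for a newer one.
    return (model_version, prompt_prefix)
```

With this layout, an entry that only exists in the slow tier is still found and promoted, while the same prompt under a new model version misses everywhere, which is the "too much tinkering" problem in miniature.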