Comment by marcosdumay

5 months ago

There is probably a non-linear function relating how slow your software is to how many users will put up with it.

Those 10 ms may very well mean the difference between success and failure... or they may be completely irrelevant. I don't know if this is knowable.

2 comments

nottorp 5 months ago

There is. But that's not what the OP is doing; what they're doing is "scaling", which probably makes sense for whatever they're working on*. For the other 99% of projects, it doesn't.

* ... if they're at ClosedAI or Facebook or something. If they're at some startup selling "AI" solutions that has 10 customers, it may be wishful thinking that they'll reach ClosedAI levels of usage.

marcosdumay 5 months ago

It's not really clear to me that the OP is talking about hardware costs. If so, yeah, once you have enough scale, and with a read-only service like an LLM, those costs are perfectly linear.
If it's about saving the users time, it's very non-linear. And if it's not a scalable read-only service, the costs will be very non-linear too.
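
To make the linear vs. non-linear distinction concrete, here is a toy sketch. Everything in it is an assumption for illustration: the logistic shape of the tolerance curve, the 200 ms midpoint, the steepness, and the cost per CPU-second are made-up numbers, not anything measured or anything the OP stated.

```python
import math


def hardware_cost(requests_per_s, ms_per_request, cost_per_cpu_second=0.00001):
    """Linear model (assumed): cost per second of wall time is proportional
    to the total CPU time consumed by all requests in that second."""
    return requests_per_s * (ms_per_request / 1000.0) * cost_per_cpu_second


def users_retained(latency_ms, midpoint_ms=200.0, steepness=0.02):
    """Hypothetical logistic curve: fraction of users who put up with a
    given latency. The shape and parameters are illustrative guesses."""
    return 1.0 / (1.0 + math.exp(steepness * (latency_ms - midpoint_ms)))


def cost_saving(baseline_ms, saved_ms=10.0, requests_per_s=1000.0):
    """Hardware cost saved by shaving `saved_ms` off each request."""
    return (hardware_cost(requests_per_s, baseline_ms)
            - hardware_cost(requests_per_s, baseline_ms - saved_ms))


def retention_gain(baseline_ms, saved_ms=10.0):
    """Extra fraction of users retained by shaving `saved_ms` off the latency."""
    return users_retained(baseline_ms - saved_ms) - users_retained(baseline_ms)


if __name__ == "__main__":
    for baseline in (20.0, 200.0, 1000.0):
        print(baseline, cost_saving(baseline), retention_gain(baseline))
```

Under these assumptions, the same 10 ms saves the same amount of hardware money at any baseline latency, but it only moves the retention number noticeably when the baseline sits near the (unknown) point where users start giving up. Far below or far above that point it barely matters.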