Comment by mrandish

6 months ago

Benchmark tracking of cloud AI performance is going to be crucial going forward. Vendors are selling a service whose quality is, by its nature, very difficult for customers to gauge day to day. How will I know if a code revision is ~2.5% worse today than it would have been yesterday? Or if queries during peak load hours use one fewer 'expert' in their MoE?

Yet vendors' costs to deliver these services are skyrocketing, competition is intense, and their ability to subsidize with investor capital is going away. The pressure on vendors to reduce costs by dialing back performance a few percent or under-resourcing peak loads will be overwhelming. And I'm just a hobbyist now. If I were an org with dozens or hundreds of devs, I'd want credible ways to verify that the QoS and minimum service levels I'm paying for are being fulfilled long after a vendor has won the contract.
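
To make the "how will I know" question concrete, here is a rough sketch of one way to check whether a day-over-day drop in a benchmark pass rate is more than noise. It assumes you run the same fixed task set against the service each day and count passes; the function name, the 80% baseline, and the sample sizes are all made up for illustration, and a one-sided two-proportion z-test is just one reasonable choice, not the only one.

```python
import math

def pass_rate_drop_test(passes_before, n_before, passes_after, n_after):
    """One-sided two-proportion z-test for a drop in pass rate.

    Returns (z, p_value). A small p_value suggests the later pass rate
    is genuinely lower than the earlier one, not just sampling noise.
    """
    p_before = passes_before / n_before
    p_after = passes_after / n_after

    # Pooled pass rate under the null hypothesis of "no change".
    pooled = (passes_before + passes_after) / (n_before + n_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    if std_err == 0:
        # All passes or all failures on both days: no evidence either way.
        return 0.0, 0.5

    z = (p_before - p_after) / std_err
    # Upper-tail probability of a standard normal: P(Z >= z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: 80.0% yesterday vs 77.5% today.
z_small, p_small = pass_rate_drop_test(800, 1000, 775, 1000)
z_large, p_large = pass_rate_drop_test(8000, 10000, 7750, 10000)
print(f"1,000 tasks/day:  z={z_small:.2f}, p={p_small:.3f}")
print(f"10,000 tasks/day: z={z_large:.2f}, p={p_large:.2g}")
```

With these made-up numbers, a 2.5-point drop at 1,000 tasks per day gives z of about 1.37, which does not clear the usual 5% significance bar, while the same drop at 10,000 tasks per day gives z of about 4.3. That is the practical problem: a degradation of a few percent is only visible if you benchmark at a volume most customers won't pay for.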
