
Comment by fieldcny

3 hours ago

You speak so authoritatively about quality and performance of these models, yet there are no quantitative metrics that correlate to real world outcomes that indicate that the quality and performance of these models is anything but subjective noise and classic benchmark nonsense.

A company consumed half a billion dollars worth of tokens in a month and nobody noticed anything until the bill came due.

That $500M is roughly equivalent to 2,000 people working for a year, or 500 people working for four years (at roughly $250,000 per person-year). They could and would accomplish a lot if they worked at companies that add value to the economy by solving real problems.

Indeed, it's irrelevant. Each firm will make its own cost-benefit analysis, especially since the frontier labs are raising prices.

Marketing only takes you so far in creating noise.

It's weird seeing this focus on benchmarks again; PCs did this for quite some time. But in the end it came down to: what does all this additional horsepower let you do? Oh, create interesting apps, multitasking, etc., which was really the value-add.

> You speak so authoritatively about quality and performance of these models, yet there are no quantitative metrics that correlate to real world outcomes that indicate that the quality and performance of these models is anything but subjective noise and classic benchmark nonsense.

I'm responsible for the AI rollout at a small business, and we've had data science reviewing the results we get internally for 12+ months now. It's just my experience, but that is roughly what we've seen using DeepSeek, etc., and comparing cost/results vs. Anthropic/ChatGPT.

> A company consumed half a billion dollars worth of tokens in a month and nobody noticed anything until the bill came due.

That claim came from a single anonymous source. It's highly unlikely to be true in my view, but hey, you do you.