Comment by ModernMech

4 hours ago

> You wouldn’t see entire companies completely retooled and refactored around these tools.

That's exactly what I'd expect people who are driven by hype and FOMO and YOLO and anecdotal evidence to do.

> resulting in systemic collapse.

Many people are noting the system is collapsing. Maybe it's not going as quickly as you expect, but there's definitely evidence of this: increased service outage frequency, billion-dollar notes being passed in a circle between companies, open projects refusing AI contributions entirely because they're overwhelmed by crap, Sam Altman begging governments to force citizens to buy their product through "universal basic compute", etc.

> Look at the enormous body of work on measurement of these systems.

It's certainly possible to measure anything. Benchmarks are a form of evidence but they famously a) don't represent reality and b) can be easily gamed.


aspenmartin 2 hours ago

> That's exactly what I'd expect people who are driven by hype and FOMO and YOLO and anecdotal evidence to do.

Not at this scale.

> Many people are noting the system is collapsing.

On HN? Any piece of evidence to support this? Service outage frequency is not a sign of systemic collapse. The point about billion-dollar notes passed in a circle is brought up a lot and misunderstands how finance works. "open projects refusing AI contributions entirely because they're overwhelmed by crap" is not a systemic collapse; it's a failure to adapt to a new world with new challenges. Btw, "slop" is getting less and less sloppy.

> Sam Altman begging governments to force citizens to buy their product through "universal basic compute", etc.

I'd be very interested in a citation. That sounds like a headline quote of something more complex.

> It's certainly possible to measure anything. Benchmarks are a form of evidence but they famously a) don't represent reality and b) can be easily gamed.

I mean, I work on benchmarks for a living, and I can tell you both of these things are true, but only partially, and in aggregate the benchmarks all tell a consistent story. Not to mention, static OSS benchmarks are not what these companies rely on. They have live traffic, the ability to run A/B tests, and full conversation traces. To ignore this is pretty incredible.