Comment by fsh

6 days ago

Companies are optimizing for all the big benchmarks. This is why there is so little correlation between benchmark performance and real-world performance now.

Isn’t there? I mean, Claude Code has been my biggest use case and it basically one-shots everything now.

  • Yes, LLMs have become extremely good at coding (though not at software engineering). But try using them for anything original that cannot be adapted from GitHub and Stack Overflow. I haven't seen much improvement at all at such tasks.

    • Strongly disagree with this. And I'm going to provide as much evidence as you did.

    • No shot, their classic engineering ability has exploded too.

      The amount of information available online about optics is probably <0.001% of what is available for software, yet they can just breeze through modeling solutions. A year ago they were face-planting immediately.

      The gains are likely coming from exactly where they say they are coming from: scaling compute.