Comment by samuelknight

7 days ago

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

That info is from mid-2025, talking about models released in Oct 2024 and Feb 2025. It predates tools like Claude Code and Codex, Lovable was at 1/3 of its current ARR, etc.

This might still be true but we desperately need new data.

None of those changes address the issue jdlshore is pointing out: self-assessed developer productivity increases from LLMs are not a reliable indication of actual productivity increases. It's true that modern LLMs might have less of a negative impact on productivity, or might even increase it, but you won't be able to tell by asking developers if they feel more productive.

(Also, Anthropic released Claude Code in February of 2025, which was near the start of the period the study ran.)

  • > self-assessed developer productivity increases from LLMs are not a reliable indication of actual productivity increases.

    I believe the other direction makes more sense; if the studies disagree with self-reported information, it's more likely the studies are wrong. At the very least, it's worth heavily questioning whether the studies are wrong.

Yeah, new data would be great, but I feel like these tools are not substantively better and this is becoming the new "it's different this time!"