Comment by xmprt
8 months ago
I think this is the big roadblock that I don't see the current AI models/architectures getting past. Normally, intelligence gets smarter over time as it learns from its mistakes. Most AI models, however, come in with tons of knowledge but degrade after a while, which makes them extremely unreliable on complex tasks. The hardest part of using them is that you don't know when they'll break down: they might work perfectly up to a point and then fail spectacularly immediately past it.
Task length is increasing over time, and many AI labs are working on pushing it out further. That necessitates better attention, better context management, better decomposition and compartmentalization, and more.
I think the commenter's critique still stands. Humans build human capital, so the longer you "run" them in a domain, the more valuable they become. AIs work the other way around: the longer they're run, the worse they tend to become at that specific task. Even in the best-case scenario, they stay exactly as competent at the task throughout its length.
Increasing task length doesn't build an equivalent of human capital; it just pushes out the point at which they degrade. That approach doesn't scale in general, because there will always be a task longer than SOTA capabilities allow.
We really need to work on a low-cost human-capital equivalent for models.
True, that's why I'm beginning to adopt a particular strategy when working with AI coding agents:
I don't babysit them for long periods in one session. I let them one-shot the initial solution. Then I thoroughly review the output, note what should be improved, and either feed those notes back into the original prompt and one-shot it again, or ask the agent to update the solution based on the notes.
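That loop can be sketched in a few lines. This is a hypothetical illustration, not any particular agent's API: `run_agent` is a stub standing in for whatever coding-agent call you actually use, and the prompt format is made up.

```python
def run_agent(prompt: str) -> str:
    """Stub: replace with a real call to your coding agent of choice."""
    return f"solution for: {prompt}"


def one_shot_with_review(task: str, review_notes: list[str]) -> str:
    """Fold accumulated human review notes into the prompt and re-run
    from a fresh session, instead of extending one long, degrading one."""
    prompt = task
    if review_notes:
        prompt += "\n\nAddress these review notes:\n" + "\n".join(
            f"- {note}" for note in review_notes
        )
    return run_agent(prompt)


# First pass: no notes yet, just one-shot the task.
draft = one_shot_with_review("Implement a rate limiter", [])

# After reviewing the draft by hand, bake the notes into a fresh prompt.
revised = one_shot_with_review(
    "Implement a rate limiter",
    ["handle burst traffic", "add unit tests"],
)
```

The point of the design is that state lives in the notes you keep, not in the agent's context window, so each run starts from a clean session.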