Comment by OtherShrezzing

8 months ago

I think the commenter's critique still stands. Humans build human capital, so the longer you "run" them in a domain, the more valuable they become. AIs work the other way round: the longer they run on a specific task, the worse they tend to become at it. Even in the best case, they stay exactly as competent at the task throughout its length.

Increasing task length doesn't build an equivalent of human capital. It just pushes out the point at which they degrade. This approach isn't generalisably scalable, because there will always be a task longer than SOTA capabilities can handle.

We really need to work on a low cost human-capital-equivalent for models.

True, and that's why I'm starting to adopt a particular strategy when working with AI coding agents:

I don't babysit them for long periods in a single session. I let them one-shot the initial solution. Then I thoroughly review the output, note what should be improved, and either feed those notes back into the original prompt and one-shot it again, or ask the agent to update the solution based on the notes.
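The loop above can be sketched in a few lines of Python. This is just an illustration of the workflow, not any real agent API: `run_agent` is a hypothetical stand-in (here a stub, so the sketch runs) for whatever tool you actually call.

```python
# Sketch of the "one-shot, review, fold notes back in, one-shot again" loop.
# run_agent is a hypothetical placeholder for a real coding agent call.

def run_agent(prompt: str) -> str:
    """Stub: pretend the agent returns a solution for the given prompt."""
    return f"solution for: {prompt!r}"

def one_shot_with_review(task: str, review_notes: list[str]) -> str:
    """Fold accumulated human review notes into the original prompt, then one-shot."""
    prompt = task
    if review_notes:
        prompt += "\n\nThings to improve or avoid:\n" + "\n".join(
            f"- {note}" for note in review_notes
        )
    return run_agent(prompt)

notes: list[str] = []
draft = one_shot_with_review("Implement a rate limiter", notes)
# ...human review happens here; notes are written by the reviewer, not the agent...
notes.append("use a token bucket, not a fixed window")
revised = one_shot_with_review("Implement a rate limiter", notes)
```

The key point is that the accumulated notes live outside the agent's session, so each attempt starts from a fresh, short context rather than a long degrading one.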

  • Yes, I do exactly this too, though I do sometimes two- or three-shot some problems. My method: in the Cursor/Copilot interface, I use a Markdown file to chat with the bot. Once I have some solutions after a few turns, I edit the file, add more information, add things to avoid, etc., and restart. It most definitely gives better results, and the agent doesn't have to read the whole file if it isn't necessary, which means context is used more efficiently per problem.
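For anyone who hasn't tried the Markdown-file approach, here's a rough sketch of what such a chat file might look like. The headings and labels are invented for illustration, not any tool's required format:

```markdown
# Task
Refactor the rate limiter to use a token bucket.

## Constraints / things to avoid
- Don't change the public API in limiter.py
- No new dependencies

## Agent attempt 1
(agent's proposed solution goes here)

## My notes after review
- The refill interval is hardcoded; make it configurable.
```

Between restarts you edit the constraints and notes sections by hand, so the agent always starts from the distilled state of the problem rather than the full conversation history.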