Comment by laserlight
6 days ago
I wouldn't consider the proposed workflow agentic. When you review each step, give feedback after each step, it's simply development with LLMs.
Interesting. What would make the workflow "agentic" in your mind? The AI implementing the task fully autonomously, never getting any human feedback?
To me, "agentic" in this context essentially means that the LLM has the ability to operate autonomously: it can execute tools on my behalf, and so on. For example, my coding agents will often run unit tests, run code generation tools, etc. I've even used my agents to fix issues with git pre-commit hooks, in which case they operated in a loop, repeatedly trying to check in code and fixing the errors they saw in the output.
So in that sense they are theoretically capable of one-shot implementing any task I set them; their quality just isn't good enough yet to trust them to do so. But maybe you mean something different?
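The pre-commit loop described above can be sketched roughly as follows. This is a minimal illustration, not any tool's actual implementation: `ask_llm` is a hypothetical callable standing in for the model, and the commit command is injectable so the loop itself is easy to follow.

```python
import subprocess


def try_commit():
    # Attempt a real commit; any pre-commit hooks run here and may reject it.
    return subprocess.run(
        ["git", "commit", "-am", "agent: apply fixes"],
        capture_output=True,
        text=True,
    )


def agentic_commit_loop(ask_llm, attempt_commit=try_commit, max_attempts=5):
    """Try to commit; on hook failure, feed the output back to the
    model (a hypothetical `ask_llm` callable that edits the working
    tree) and retry, up to max_attempts times."""
    for _ in range(max_attempts):
        result = attempt_commit()
        if result.returncode == 0:
            return True  # hooks passed, commit succeeded
        # Hand the hook's error output to the model so it can fix the code.
        ask_llm(result.stdout + result.stderr)
    return False  # gave up after max_attempts
```

The loop terminates either on a clean commit or after a fixed attempt budget, which is what keeps an autonomous agent from spinning forever on a hook it can't satisfy.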
IMHO, an agentic workflow is the autonomous execution of a detailed plan. Back-and-forth between the LLM and the developer is fine in the planning stage; after that, the agent is supposed to overcome any difficulties and devise solutions to unplanned situations on its own. Otherwise, Cursor was already able to develop in a tight loop of writing and running tests, then fixing bugs, before “agentic” became a buzzword.
Perhaps “agentic” initially referred to this simple loop, but the milestone was achieved so quickly that the meaning shifted. Regardless, I could be wrong.
Yeah, I have no idea what the consensus definition of the term is, and I suppose I can't say for sure what OP meant. I haven't used Cursor. My understanding was that it exercises IDE functions but does not execute arbitrary shell commands, though maybe I'm wrong. I've specifically had good experiences with tools that can run arbitrary commands (like the git debugging example I mentioned).
In my experience reading discussions like this, people seem to be saying that they don't believe Claude Code and similar tools provide much of a productivity boost in relatively open-ended domains (i.e., where the AI is driving the writing of the code, not just helping you write your own code faster). And that's certainly not my experience.
I agree with you that success with the initial milestone ("agent operates in a self-contained loop and can execute arbitrary commands") was achieved pretty quickly. But in my experience a lot of people don't believe this. :-)