Comment by esperent
6 days ago
I stopped paying attention for a few days so I'm way out of date. What is the state of the art for agentic coding now?
I've been using Cline and it can do a few of the things suggested as "agentic", but I'd have no idea how to leave it writing and then running tests in a VM and creating a PR for me to review. Or let it roam around in the file tree and create new files as needed. How does that work? Are there better tools for this? Or do I need to configure Cline in some way?
tptacek is using Zed, which I've not tried myself.
I actually do most of my "agentic coding" (not a fan of the term, but whatever) in ChatGPT Code Interpreter, which hasn't changed much in two years other than massive upgrades to the model it uses - I run that mainly via o4-mini-high or o3 these days.
OpenAI's Codex is a leading new thing, but only if you pay $200/month for it. Google's equivalent https://jules.google/ is currently free.
GitHub Copilot gained an "agent mode" recently: https://github.blog/ai-and-ml/github-copilot/agent-mode-101-...
There's also Copilot Coding Agent, which is confusingly an entirely different product: https://github.blog/changelog/2025-05-19-github-copilot-codi...
I'd be quite interested in a more formal post with a detailed analysis of the effectiveness of the different agent impls, including Claude Code and Jetbrains Junie.
Do you use ChatGPT Code Interpreter because it's better, or is it just something you're more familiar with and you're sticking with it for convenience?
Of course, I don't know how one would structure a suitable test, since doing it sequentially would likely bias the later agents with clearer descriptions & feedback on the tasks. I imagine familiarity with how to prompt each particular model is also a factor.
I like Code Interpreter because I'm deeply familiar with it. I don't have to worry about safety at all because it's running in OpenAI's kubernetes container, not on my own laptop. I can control exactly what it can see by selectively uploading files to it. I know it can't make outbound network requests.
The current state of agent in the last batch of project launches (copilot agent, jules, devin...) is to take over and do things in a PR like you want. However the vedict is still out there in terms of whether these implementation prove more useful than agentic code in an IDE.