Comment by tptacek

18 hours ago

This article is knocking down a very expansive claim that most serious (ie: not vibe-coding) developers aren't making. Their point is that LLM agents have not yet reached the point where they can finish a complicated job end-to-end, and that if you want to do a completely hands-off project, where only the LLM generates any code, it takes a lot of prompting effort to accomplish.

This seems true, right now!

But in building out stuff with LLMs, I don't expect (or want) them to do the job end-to-end. I've merged ~25 PRs into a project right now (out of ~40 PRs generated). Most of the merged PRs I pulled into Zed and cleaned something up. At around PR #10 I went in and significantly restructured the code.

The overall process has been much faster and more pleasant than writing from scratch, and, notably, did not involve me honing my LLM communications skills. The restructuring work I did was exactly the same kind of thing I do on all my projects; until you've got something working it's hard to see what the exact right shape is. I expect I'll do that 2-3 more times before the project is done.

I feel like Kenton Varda was trying to make a point in the way they drove their LLM agent; the point of that project was in part to record the 2025 experience of doing something complicated end-to-end with an agent. That took some doing. But you don't have to do that to get a lot of acceleration from LLMs.

15 comments


ofjcihen 17 hours ago

It’s almost like unrealistic expectations of LLMs, driven by those working for companies who have something to gain by labeling any skepticism as “crazy”, do significant damage to our perception of its usefulness.

Believe it or not I agree.

tptacek 13 hours ago
I'm sorry, I read this comment like 3 times and I still don't understand what it's trying to say. Who are the companies you're talking about and are they too positive on LLMs or too negative?
- ofjcihen 10 hours ago
  
  Just that blind fanaticism leads to things like constant goalpost moving when the product doesn’t live up to the hype. This damages people’s perception of the tool and causes them to be burnt out on it when it isn’t in fact magic.
  Instead we should be accepting that people will or won’t find uses for it depending on their competency (CRUD app churn vs. somewhat novel creations) and accept that without telling them they’re nuts, luddites, etc.
  Then again like I said the people doing that usually have something to gain such as a product related to the hype generating product.
  Here’s an example article that hit the front page for HN this week https://fly.io/blog/youre-all-nuts/
  

threeseed 15 hours ago

The plural of anecdote is not data.

Let's repeat this process for 100 coding examples and see how many it can complete "hands-off", especially where (a) it isn't a case of "here is a spec and I need you to implement it" and (b) it isn't for a use for which there is already publicly available code.

Otherwise your claim of "this seems true, right now!" is baseless.

tptacek 14 hours ago

I can't tell if you're saying I'm being too generous towards LLMs or too skeptical.