Comment by tptacek
6 days ago
This is cope. I know my own experience, and I know the backgrounds and problem domains of the friends I'm talking to that do this stuff better than I do. The productivity gains are real. They're also intuitive: if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
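For concreteness, here's a minimal sketch of what that kind of agent scaffolding can look like, assuming the OpenAI Python SDK; the single shell tool, the model name, and the step budget are illustrative assumptions, not anyone's actual setup:

```python
# A minimal sketch of the "couple hundred lines of Python agent code" idea,
# assuming the OpenAI Python SDK. The single run_shell tool, the model name,
# and the step budget are illustrative, not anyone's real setup.
import json
import subprocess

from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command in the repo and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"cmd": {"type": "string"}},
            "required": ["cmd"],
        },
    },
}]

def run_shell(cmd: str) -> str:
    # A real agent sandboxes this; a bare subprocess keeps the sketch short.
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=60)
    return (out.stdout + out.stderr)[-4000:]  # truncate to keep the context small

def agent(task: str, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS
        )
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:  # no more tool use: the model thinks it's done
            return msg.content
        for call in msg.tool_calls:  # execute each requested tool call
            args = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": run_shell(args["cmd"]),
            })
    return "step budget exhausted"  # the cheap way out of a runaway run
```

The `max_steps` cap at the bottom is the "stop a runaway agent" lever discussed in the next paragraph.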
Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.
If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.
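To put rough numbers on "nickel-denominated", a back-of-the-envelope sketch; the per-token prices and token counts here are assumptions for illustration, not current quotes:

```python
# Back-of-the-envelope cost of one failed agent run. The per-token prices
# are assumptions; real rates vary by model and change often.
PROMPT_PRICE = 2.50 / 1_000_000       # dollars per input token (assumed)
COMPLETION_PRICE = 10.00 / 1_000_000  # dollars per output token (assumed)

def run_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of a single run, given its token usage."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# A plausible failed run: ~40k tokens in, ~4k tokens out.
print(f"${run_cost(40_000, 4_000):.2f}")  # ~$0.14: dimes, not intern-hours
```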
We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!
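Sketched under the same OpenAI SDK assumption, the fan-out-and-sift pattern looks something like this; the audit prompt, model name, and keyword filter are placeholders for the real (and much harder) work of writing a good prompt and a good sift:

```python
# Fan out the same prompt N times, cheaply sift the outputs, and leave
# human review for whatever survives. Prompt, model, and filter are
# illustrative placeholders.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()
AUDIT_PROMPT = "Audit the following code for memory-safety bugs: ..."  # target code elided

def one_run(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # sampling variance is the point: every run differs
    )
    return resp.choices[0].message.content

with ThreadPoolExecutor(max_workers=10) as pool:
    outputs = list(pool.map(one_run, [AUDIT_PROMPT] * 100))

# First-pass sift; a human reviews only what survives.
hits = [o for o in outputs if "use-after-free" in o.lower()]
print(f"{len(hits)}/{len(outputs)} runs flagged something worth a look")
```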
Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.
Another problem I think a lot of skeptics are running into: sitting there staring at the chain-of-thought logs. Stop doing that.
Reply:

The thing I don't understand is that you keep bringing up your friends' experience in all your responses and in the blog post itself. What about your own experience: the success rate and productivity gains you've observed with AI agents? It feels like you yourself aren't confident in your own gains and have to bring up secondhand experience from your friends to prop up your argument.
Reply:

> if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.
No senior-level engineer worth their salt, working in any kind of minimally effective organization, is spending a meaningful amount of their time on the rote repetition you're describing here. If that is your experience of work, then let me say to you very clearly: your experience is pathological and non-representative, and you need to seek better employment :)
Reply:

Regardless of what people say to you about this, most (all?) undergraduates in CS programs are using LLMs. It's extremely pervasive. Even people with no formal training are using AI and Vercel and churning out apps over a weekend. Even if people find reasons to dislike AI code writing, culturally it's the future; I don't see that changing. So either a huge percentage of the people writing code are doing it all wrong, or times are changing.