Comment by afavour
6 months ago
Feels somewhat like a self-fulfilling prophecy though. Big tech companies jam “AI” in every product crevice they can find… “see how widely it’s used? It’s inevitable!”
I agree that AI is inevitable. But there’s such a level of groupthink about it at the moment that everything is manifested as an agentic text box. I’m looking forward to discovering what comes after everyone moves on from that.
We have barely extracted the value from the current generation of SOTA models. I would estimate less than 0.1% of the possible economic benefit is currently extracted, even if the tech effectively stood still.
That is what I find so wild about the current conversation and debate. I have Claude Code toiling away building my personal organization software right now that uses LLMs to take unstructured input and create my personal plans/projects/tasks/etc.
I keep hearing this over and over. Some LLM toiling away coding personal side projects and utilities. Source code never shared, usually because it’s “too specific to my needs”. This is the code version of slop.
When someone uses an agent to increase their productivity by 10x in a real, production codebase that people actually get paid to work on, that will start to validate the hype. I don’t think we’ve seen any evidence of it, in fact we’ve seen the opposite.
100% agree. I have so much trouble squaring my experience with the hype and the grandparent post here.
The types of tasks I have been putting Claude Code to work on are iterative changes on a medium complexity code base. I have an extensive Claude.md. I write detailed PRDs. I use planning mode to plan the implementation with Claude. After a bunch of iteration I end up with nicely detailed checklists that take quite a lot of time to develop but look like a decent plan for implementation. I turn Claude (Opus) loose and religiously babysit it as it goes through the implementation.
Less than 50% of the time I end up with something that compiles, despite spending hundreds of thousands of tokens while Claude desperately throws stuff against the wall trying to make it work.
I end up spending as much time as it would have taken just to write it to get through this process AND then do a meticulous line by line review where I typically find quite a lot to fix. I really can't form a strong opinion about the efficiency of this whole thing. It's possible this is faster. It's possible that it's not. It's definitely very high variance.
I am getting better at pattern matching on things AI will do competently. But it's not a long list and it's not much of the work I actually do in a day. Really the biggest benefit is that I end up with better documentation because I generated all of that to try and make the whole thing actually work in the first place.
Either I am doing something wrong, the work that AI excels at looks very different than mine, or people are just lying.
People have much more favorable interactions with coding LLMs when they are using them for greenfield projects that they don't have to maintain (i.e. personal projects). You can get 2 months of work done in a weekend and then you hit a brick wall because the code is such a gigantic ball of mud that neither you nor the LLM is capable of working on it.
Working with production code is basically jumping straight to the ball of mud phase, maybe somewhat less tangled but usually a much, much larger codebase. It's very hard to describe to an LLM what to even do since you have such a complex web of interactions to consider in most mature production code.
:| I'm an engineer of 30+ years. I think I know good and bad quality. You can't "vibe code" good quality, you have to review the code. However it is like having a team of 20 Junior Engineers working. If you know how to steer a group of engineers, then you can create high quality code by reviewing the code. But sure, bury your head in the sand and don't learn how to use this incredibly powerful tool. I don't care. I just find it surprising that some people have such a myopic perspective.
It is really the same kind of thing, but the model is usually "smarter" than a junior engineer. You can say something like "hmm... I think an event bus makes sense here" and the LLM will do it in 5 seconds. The problem is that there are certain behavioral biases that require active reminding (though I think some MCP integration work might resolve most of them, but this is just based on the current Claude Code and Opus/Sonnet 4 models).
Big Tech can jam X everywhere and not get actual adoption though, it's not magic. They can nudge people but can't force them to use it. And yes a lot of AI jammed everywhere is getting the Clippy reaction.
The thing a lot of people haven't yet realized is: all those AI features jammed into your consumer products aren't for you. They're for investors.
We saw the same thing with blockchain. We started seeing the most ridiculous attempts to integrate blockchain, by companies where it didn't even make any sense. But it was all because doing so excited investors and boosted stock prices and valuations, not because consumers wanted it.