Comment by Wowfunhappy
1 day ago
It's impossible to prove in either direction. AI benchmarks suck.
Personally, I like using Claude (for the things I'm able to make it do, and not for the things I can't), and I don't really care whether anyone else does.
I'd just like to see a live coding session from one of these 10x AI devs
Like genuinely. I want to get stuff done 10x as fast too
My wife used to be a professional streamer, so I know how distracting it can be to try to entertain an audience. So when I attempted to become one of these 10x AI devs over my Christmas vacation, I did not live stream. But I did make a bunch of atomic commits and push them up to SourceHut. Perhaps you'll find that helpful?
Just Christmas Vacation (12-18h days): https://git.sr.ht/~kerrick/ratatui_ruby/log/v0.8.0
Latest (slowed down by job & real life): https://git.sr.ht/~kerrick/ratatui_ruby/log/trunk and https://git.sr.ht/~kerrick/ratatui_ruby-wiki/log/wiki and https://git.sr.ht/~kerrick/ratatui_ruby-tea/log/trunk
But the benefit might not be speed, it might be economy of attention.
I can code with Claude when my mind isn't fresh. That adds several hours of time I can schedule, where previously I had to do fiddly things when I was fresh.
What I can attest is that I used to have a backlog of things I wanted to fix, but hadn't gotten around to. That's now gone, and it vanished a lot faster than the half a year I had thought it would take.
Doesn't that mean you're less likely to catch bugs and other issues that the AI spits out?
I'd also like to see how it compares to their coding without AI.
I mean I really need to understand what the "x" is in 10x. If their x is <0.1 then who gives a shit. But if their x is >2 then holy fuck I want to know.
Who doesn't want to be faster? But it's not like x is the same for everybody.
I'm really dubious of such claims. Even if they're true, I think these people aren't seeing the whole picture. Sure, I could churn out code 10x as fast, but I still have to review it. I still have to think of the implementation. I still have to think of the test cases and write them. Now, adding the prerequisites for LLMs, I have to word things in a way the AI can understand, which is extra mental load. I sometimes have to review code multiple times when it gets something wrong, and I have to re-generate, make corrections, or occasionally fix entire sections it generated once I decide it just won't get the task right.

While time is saved on typing and (sometimes) researching dependency docs, I still face the same cognitive load as ever, if not more, due to having extra code to review and prompting to think about. I'm still limited by the same thing at the end of the day: my mental energy. Writing the code myself is, if anything, only a bit slower. I still need to know my dependencies, and I still need to know my codebase and all its quirks, even if the AI generates code correctly. Overall, the net complexity of my codebase is the same, and I don't buy the hype, partly because I've never heard stories about reducing complexity (refactoring), only about generating code and fixing codebases with tests and comments/docs (a bad practice imo; the shallow docs generated are unlikely to say anything more than what the code already makes evident). Anyway, I'm not a believer; I only use LLMs for scaffolding and rote tasks.
I'm a "backend" dev, so you could say I'm very unfamiliar with frontend development, with mostly basic, high-level knowledge. Getting this thing to spit out screens and components and adjust them as I see fit has got to be some sort of super-power, and it definitely 20x'd my frontend development for hobby projects. Before this, my team was giving me wild "1 week" estimates to code simple CRUD screens (plus 1 week for "API integration"), and those estimates always smelled funny to me.
Now that I've seen what the AI/agents can do, those estimates definitely reek, and the frontend "senior" JavaScript dev's days are numbered. Especially for CRUD screens, which, let's face it, make up most screens these days and should absolutely be churned out like on an assembly line instead of being delicate, "hand-crafted" precious works of art that allow 0.1x devs to waste our time because they're supposedly the only ones who know the ancient and arcane "npm install, npm etc, npm angular component create" spells.
Look at the recent Tailwind team layoffs: they're definitely seeing the impact of this, as are many team leads and managers in most companies in our industry. Especially "senior JavaScript dev"-heavy shops in the VC space, where many people are realizing they have an over-abundance of such devs because those devs bullshitted entire teams and companies into thinking simple CRUD screens take weeks to develop. It was like a giant cartel, with them all padding and confirming each other's estimates and essentially slow-devving their own screens to validate the ridiculous padding.
Yeah this is the key point. Part of me wonders if it's just 0.1x devs somehow reaching 1.0x productivity...
I don't think any serious dev has claimed 10x as a general statement. Obviously, no true Scotsman and all that, so even my statement about the makers of anecdotal statements is anecdotal.
Even as a slight fan, I'd never claim more than 10-20% overall. I could maybe see 5x for some specific typing-heavy usages, like adding basic CRUD stuff for a basic entity to an already existing Spring app.
Obviously, there has to be huge variability between people based on initial starting conditions.
It is like if someone says they are losing weight eating 2500 calories a day and someone else says that is impossible because they started eating 2500 calories and gained weight.
Neither are making anything up or being untruthful.
What is strange to me is that smart people can't see something this obvious.
> I want to get stuff done 10x as fast too
I don’t. I mean I like being productive but by doing the right thing rather than churning out ten times as much code.
I’d really like to see a 10x ai dev vs a 10x analog dev
And an added "6 months" later to see which delivered result didn't blow up in their face down the road.
Theo the YouTuber who also runs T3.chat always makes videos about how great coding agents are and he’ll try to do something on stream and it ALWAYS fails massively and he’s always like “well it wasn’t like this when I did it earlier.”
Sure buddy.
Theo is the type of programmer where you don't care when he boos you, because you know what makes him cheer.
> AI benchmarks suck.
Not only do they suck, but it's essentially an impossible task, since there's no frame of reference for what "good code" looks like.