Comment by benreesman
3 days ago
I find the swings to be wild: when you win with it, you win really big, but when you lose with it, it's a real bite out of your week too. And I think 10x to 20x has to be figurative, right? You can do 20x by volume, maybe, but to borrow an expression from Steve Ballmer, that's like measuring an airplane by kilograms.
Someone already operating at the very limit of their abilities, doing stuff that is for them high complexity, high cognitive load, detail-intense, and tactically non-obvious? Even a machine that just handed you the perfect code can't 20x your real output; even if it gave you the source file at 20x your native sophistication, you wouldn't be able to build and deploy it, let alone make changes to it.
But even the last 5-20%, after you're already operating at your very limit and trying to hit that limit every single day, is massive: it makes a bunch of stuff on the bubble go from "not realistic" to "we did that".
There are definitely swings. Last night it took about 2 hours to get Monaco into my webpack-built Bootstrap template; it came down to CSS being mishandled, and Claude couldn't see the light. I just pasted the code into ChatGPT o3 and it fixed it on the first try. I pasted the output of ChatGPT into Claude and voilà, all done.
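For anyone hitting the same wall: Monaco ships its own CSS and font assets that webpack has to be told how to handle. A minimal sketch of the kind of config that usually resolves it, assuming the `monaco-editor-webpack-plugin` package and the standard `style-loader`/`css-loader` pair (the entry path and language list here are illustrative, not from my actual setup):

    // webpack.config.ts - a hypothetical minimal setup for bundling Monaco
    import MonacoWebpackPlugin from "monaco-editor-webpack-plugin";
    import type { Configuration } from "webpack";

    const config: Configuration = {
      entry: "./src/index.ts", // assumption: your real entry point will differ
      module: {
        rules: [
          // Monaco ships plain CSS; without these loaders the editor
          // renders unstyled or the build fails outright.
          { test: /\.css$/, use: ["style-loader", "css-loader"] },
          // Monaco's codicon icon font must be emitted as an asset too.
          { test: /\.ttf$/, type: "asset/resource" },
        ],
      },
      plugins: [
        // Generates the web workers Monaco needs; limiting `languages`
        // keeps the bundle size down.
        new MonacoWebpackPlugin({ languages: ["typescript", "css", "html"] }),
      ],
    };

    export default config;

If the build was failing on a .css file inside node_modules/monaco-editor, the first rule is the relevant one.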
A key skill is to sense when the AI is starting to guess at solutions (no different from human devs) and then either lean on another AI or reset the context and start over.
I'm finding the code quality increases greatly with the addition of the text 'and please follow best practices because you will be pen tested on this!' and wow... it takes it much more seriously.
Doesn't sound like you were writing actual functionality code, just integrating libraries?
That's right for this part of the work.
Most of the coding needed to give people CRUD interfaces to resources is all about copy/pasting and integrating tools together.
Sort of like the old days when we were patching together all those copy/pastes from Stack Overflow.
Too little of full-stack application writing is truly unique.
Is there a way to have two agentic AIs do pair programming?
I did experiment with this, where Claude Code was the 'programmer' and ChatGPT was the software architect. The outcome was really solid; I made it clear to each that it was talking to another AI, and they really seemed to collaborate and respect the key points of each side.
It would be interesting to set up an MCP-style interface, but even me copy/pasting between windows was constructive.
The time this worked best was when I was building a security model for an API that had to be flexible and follow best practices. It was interesting seeing ChatGPT compare and contrast against major API vendors, and Claude Code ask the detailed implementation questions.
The final output was a pragmatic middle ground between simplistic and way too complex.
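For the curious, the copy/paste relay is straightforward to automate with the two vendors' SDKs. A rough sketch, assuming the official `openai` and `@anthropic-ai/sdk` Node packages; the model names, role prompts, and round count are placeholders of mine, not what I actually used:

    // relay.ts - hypothetical two-agent loop: GPT as architect, Claude as programmer
    import OpenAI from "openai";
    import Anthropic from "@anthropic-ai/sdk";

    const openai = new OpenAI();       // reads OPENAI_API_KEY from the environment
    const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

    const ARCHITECT_ROLE =
      "You are a software architect reviewing another AI's implementation. " +
      "Critique the design and ask pointed questions.";
    const PROGRAMMER_ROLE =
      "You are a programmer collaborating with an AI architect. " +
      "Answer its questions and propose concrete implementation details.";

    async function askArchitect(transcript: string): Promise<string> {
      const res = await openai.chat.completions.create({
        model: "gpt-4o", // placeholder model name
        messages: [
          { role: "system", content: ARCHITECT_ROLE },
          { role: "user", content: transcript },
        ],
      });
      return res.choices[0].message.content ?? "";
    }

    async function askProgrammer(transcript: string): Promise<string> {
      const res = await anthropic.messages.create({
        model: "claude-sonnet-4-20250514", // placeholder model name
        max_tokens: 1024,
        system: PROGRAMMER_ROLE,
        messages: [{ role: "user", content: transcript }],
      });
      const block = res.content[0];
      return block.type === "text" ? block.text : "";
    }

    async function main() {
      let transcript =
        "Task: design a flexible, best-practice security model for an API.";
      for (let round = 0; round < 3; round++) {
        const architect = await askArchitect(transcript);
        transcript += `\n\nArchitect: ${architect}`;
        const programmer = await askProgrammer(transcript);
        transcript += `\n\nProgrammer: ${programmer}`;
      }
      console.log(transcript);
    }

    main().catch(console.error);

Nothing here is MCP proper; it just removes the human from the copy/paste loop.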
Yes, definitely. https://github.com/BeehiveInnovations/zen-mcp-server is one example of people going off on this, but I'm sure there are many others.
Let's be serious: what percentage of devs are doing "high complexity, high cognitive load, detail-intense" work?
All of them; some just don't notice, don't care, or don't know that this line of work is like that. Look at how junior devs work vs. really experienced, self-aware engineers. The latter routinely solve problems the former didn't know existed.
What does being experienced in a field of work have to do with self awareness?
Also, I disagree. For web dev at least, most people are just rewriting the same stuff in a different order. Even though the entire project might be complex from a high-level perspective, when you dive into the components, or even just a single route, it ain't "high complexity" at all. And since I believe most jobs are in web/app dev, which just recycles the same code over and over again, that's why there are a lot of people claiming huge boosts to productivity.
> Someone already operating at the very limit of their abilities, doing stuff that is for them high complexity, high cognitive load, detail-intense, and tactically non-obvious?
How much of the code you write is actually like this? I work in the domain of data modeling; for me, once the math is worked out, the majority of the code is "trivial". The kind of code you are talking about is maybe 20% of my time. Honestly, it's also the most enjoyable 20%. I would be very happy if that were all I worked on while the rest is done by AI.
Creatively thinking about what a client needs, how the architecture for that would look, general systems thinking, UX, etc., and seeing that come to life in a clean, maintainable way: that's what lights up my eyes. The minutiae of code implementation, not so much; that's just an implementation detail, a hurdle to overcome. The current crop of tooling helps with that tremendously, and for someone like me it's been a wonderful time, a golden era. The people who like to handcraft every line of code to perfection, who derive their joy from that, I think benefit a lot less.
> Someone already operating at the very limit of their abilities, doing stuff that is for them high complexity, high cognitive load, detail-intense, and tactically non-obvious?
When you zoom in, even this kind of work isn't uniform - a lot of it is still shaving yaks, boring chores, and tasks that are hard dependencies for the work that is truly cognitively demanding, but are themselves easy(ish) annoyances. It's those subtasks - and the extra burden of mentally keeping track of them - that set the limit of what even the most skilled, productive engineer can do. Offloading some of that to AI lets one free up some mental capacity for work that actually benefits from that.
> Even a machine that just handed you the perfect code can't 20x your real output; even if it gave you the source file at 20x your native sophistication, you wouldn't be able to build and deploy it, let alone make changes to it.
Not true if you use it right.
You're probably following the "grug developer" philosophy, which is popular these days (as is "but think of the juniors!", the perceived ideal in the current zeitgeist). By design, this turns coding into boring, low-cognitive-load work. Reviewing such code is, thus, easier (and less demoralizing) than writing it.
20x is probably a bit much across the board, but for the technical part, I can believe it - there's too much unavoidable but trivial bullshit involved in software these days (build scripts, Dockerfiles, IaaS). Preventing deep context switching on those is a big time saver.
> When you zoom in, even this kind of work isn't uniform - a lot of it is still shaving yaks, boring chores, and tasks that are hard dependencies for the work that is truly cognitively demanding, but are themselves easy(ish) annoyances. It's those subtasks - and the extra burden of mentally keeping track of them - that set the limit of what even the most skilled, productive engineer can do. Offloading some of that to AI lets one free up some mental capacity for work that actually benefits from that.
Yeah, I'm not a dev, but I can see why this is true, because it's also the argument I use in my job as an academic. Some people say, "but your work is intellectually complex, how can you trust LLMs to do research, etc.?", and of course I don't. But 80% of the job is not actually intellectually complex; it's routine stuff. These days I'm writing the final report of a project, and half of the text is being generated by Gemini; when I write the data management plan (which is even more useless), probably 90% will be generated by Gemini. This frees up a lot of time that I can devote to the actual research. And the same when I use it to polish a grant proposal, generate some code for a chart in a paper, reformat a LaTeX table, brainstorm some initial ideas, come up with an exercise for an exam, etc.
Yes, things that get resolved very quickly with AI include fixing linting errors, reorganizing CI pipelines, documenting agreed-on requirements, building well-documented commits, cleaning up temporary files used to validate dev work, building README.md files in key locations to describe important code aspects, and implementing difficult but well-known code - e.g., I got a trie security model implemented very quickly.
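To give a flavor of that last item, a "trie security model" amounts to hanging permissions off path segments so the deepest explicit grant wins. A minimal sketch of the idea (the names and shape are my own illustration, not the code the AI produced):

    // A hypothetical trie keyed on resource path segments, e.g. "org/team/repo".
    // The deepest matching node's permission wins; parents act as defaults.
    type Permission = "none" | "read" | "write" | "admin";

    class PermissionTrie {
      private children = new Map<string, PermissionTrie>();
      private permission?: Permission;

      // Attach a permission at a path like "org/team".
      grant(path: string, permission: Permission): void {
        let node: PermissionTrie = this;
        for (const segment of path.split("/").filter(Boolean)) {
          if (!node.children.has(segment)) {
            node.children.set(segment, new PermissionTrie());
          }
          node = node.children.get(segment)!;
        }
        node.permission = permission;
      }

      // Walk as deep as the path allows, remembering the last explicit grant.
      resolve(path: string): Permission {
        let node: PermissionTrie = this;
        let effective: Permission = node.permission ?? "none";
        for (const segment of path.split("/").filter(Boolean)) {
          const next = node.children.get(segment);
          if (!next) break;
          node = next;
          if (node.permission) effective = node.permission;
        }
        return effective;
      }
    }

    // Usage: broad read access, narrowed and widened at deeper paths.
    const acl = new PermissionTrie();
    acl.grant("org", "read");
    acl.grant("org/secrets", "none");
    acl.grant("org/team/repo", "write");
    console.log(acl.resolve("org/docs"));      // "read"  (inherited)
    console.log(acl.resolve("org/secrets/x")); // "none"  (explicit override)
    console.log(acl.resolve("org/team/repo")); // "write"

The appeal is that inheritance and overrides fall out of the data structure for free, which is exactly the kind of well-known code an AI can bang out quickly.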
Tons of dev work is not exciting. I have already launched a solo dev startup that was acquired, and the 'fun' part of that coding was minimal. Too much of it was scaffolding, CRUD endpoints, web forms, build scripts, and endpoint documentation, and the truly innovative stuff was such a small part of the whole project. Of the 14 months of work, only 1 month was truly innovative.
> Offloading some of that to AI lets one free up some mental capacity for work that actually benefits from that.
Maybe, but I don't feel (of course, I could be wrong) that doing boring tasks takes away any mental capacity; they feel more like fidgeting while I think. If a tool could do the boring things, it might free up my time to do other boring work that lets me think - like doing the dishes - provided I don't have to carefully review the code.
Another issue (that I asked about yesterday [1]) is that seemingly boring tasks may end up being more subtle once you start coding them, and while I don't care too much about the quality of the code in the early iterations of the project, I have to be able to trust that whatever does the coding for me will come back and report any difficulties I hadn't anticipated.
> Reviewing such code is, thus, easier (and less demoralizing) than writing it.
That might well be true, but since writing it doesn't cost me much to begin with, the benefit might not be large. Don't get me wrong, I would still take it, but only if I could fully trust the agent to tell me what subtleties it encountered.
> there's too much unavoidable but trivial bullshit involved in software these days (build scripts, Dockerfiles, IaaS). Preventing deep context switching on those is a big time saver.
If work is truly trivial, I'd like it to be automated by something that I can trust to do trivial work well and/or tell me when things aren't as trivial and I should pay attention to some detail I overlooked.
We can generally trust machines to either work reliably or fail with some clear indication. People might not be fully reliable, but we can generally trust them to report back with important questions they have or information they've learnt while doing the job. From the reports I've seen about using coding agents, they work like neither: you can't trust them to succeed or fail reliably, nor can you trust them to come back with pertinent questions or information. Without either kind of trust, I don't think that "offloading" work to them would truly feel like offloading. I'm sure some people can work with that, but I think I'll wait until I can trust the agents.
[1]: https://news.ycombinator.com/item?id=44526048
Yeah, I don't fuck with Docker jank and cloud jank and shit. I don't fuck with dynamic linking. I don't fuck with lagged-ass electron apps. I don't fuck with package managers that need a SAT solver but don't have one. That's all going to be a hard no from me dawg.
When I said that after you've done all the other stuff, I was including cutting all the ridiculous bullshit that's been foisted on an entire generation of hackers to buy yachts for Bezos and shit.
I build clean libraries from source with correct `pkg-config` metadata, and then anything will build against them. I have well-maintained Debian and NixOS configurations that run on non-virtualized hardware. I use an `emacs` configuration that is built to spec, and best-in-class open builds for other important editors.
I don't even know why someone would want a model spewing more of that garbage onto the road in front of them. Once you're running a tight, optimized stack to begin with, the model emulates to some degree the things it sees, and its outputs are also good.
Ok, that's great for you. Most of us don't have the luxury of going full Richard Stallman in our day-to-day and are more than happy to have some of the necessary grunt work automated away.