Comment by pron
6 days ago
I'm not sure that having the patience to work with something with a very inconsistent performance and that frequently lies is an extension of existing development skills. It doesn't work like tools developers use and it doesn't work like people developers work with. Furthermore, techniques of working with agents today may be completely outdated a year from now. The acceleration is also inconsistent: sometimes there's an acceleration, sometimes a deceleration.
Generative AI is at the same time incredibly impressive and completely unreliable. This makes it interesting, but also very uncertain. Maybe it's worth my investment to learn how to master today's agents, and maybe I'd be better off waiting until these things become better.
You wrote:
> Getting good results out of a coding agent feels uncomfortably close to getting good results out of a human collaborator. You need to provide clear instructions, ensure they have the necessary context and provide actionable feedback on what they produce.
That is true (about people) but misses out the most important thing for me: it's not about the information I give them, but about the information they give me. For good results, regardless of their skill level, I need to absolutely trust that they tell me what challenges they've run into and what new knowledge they've gained that I may have missed in my own understanding of the problem. If that doesn't happen, I won't get good results. If that kind of communication only reliably happens through code I have to read, it becomes inefficient. If I can't trust an agent to tell me what I need to know (and what I trust when working with people) then the whole experience breaks down.
> I'm not sure that having the patience to work with something with a very inconsistent performance and that frequently lies is an extension of existing development skills.
If you’ve be been tasked with leadership of an engineering effort involving multiple engineers and stakeholders you know that this is in fact a crucial part of the role the more senior you get. It is much the same with people: know their limitations, show them a path to success, help them overcome their limitations by laying down the right abstractions and giving them the right coaching, make it easier to do the right thing. Most of the same approaches apply. When we do these things with people it’s called leadership or management. With agents, it’s context engineering.
Because I reached that position 15 years ago, I can tell you that this is untrue (in the sense that the experience is completely different from an LLM).
Training is one thing, but training doesn't increase the productivity of the trainer; it's meant to improve the capability of the trainee.
At any level of capability, though - whether we're talking about an intern after one year of university or a senior developer with 20 years of experience - effective management requires that you're able to trust that the person tells you when they've hit a snag or anything else you may need to know. We may not be talking 100% of trust, but not too far from that, either. You can't continue working with someone that doesn't tell you what you need to know even 10% of the time, regardless of their level. LLMs are not at that acceptable level yet, so the experience is not similar to technical leadership.
If you've ever been tasked with leading one or more significant projects you'd know that if you feel you have to review every line of code anyone on the team writes, at every step of the process, that's not the path to success (if you did that, not only would progress be slow, but your team wouldn't like you very much). Code review is a very important part of the process, but it's not an efficient mechanism for day-to-day communication.
> effective management requires that you're able to trust that the person tells you when they've hit a snag or anything else you may need to know
Nope, effective management is on YOU, not them. If everyone you’re managing is completely transparent and immediately tells you stuff, you’re playing in easy mode
11 replies →
> effective management requires that you're able to trust that the person tells you when they've hit a snag or anything else you may need to know
This is what we shoot for, yes, but many of the most interesting war stories involve times when people should have been telling you about snags but weren't-- either because they didn't realize they were spinning their wheels, or because they were hoping they'd somehow magically pull off the win before the due date, or innumerable other variations on the theme. People are most definitely not reliable about telling you things they should have told you.
> if you feel you have to review every line of code anyone on the team writes...
Somebody has to review the code, and step back and think about it. Not necessarily the manager, but someone does.
1 reply →
1000% this. Today LLMs are like enthusiastic, energetic, over-confident, well-read junior engineers.
Does it take effort to work with them and get them to be effective in your code base? Yes. But is there a way to lead them in such a way that your "team" (you in this case) gets more done? Yes.
But it does take effort. That's why I love "vibe engineering" as a term because the engineering (or "senior" or "lead" engineering) is STILL what we are doing.
Inconsistent performance and frequent lies are a crucial part of the role, really? I've only met a couple of people like that on my career. Interviews go both ways: if I can't establish that the team I'll be working with is composed and managed by honest and competent people, I don't accept their offer. Sometimes it has meant missing out on the highest compensation, but at least I don't deal with lies and inconsistent performance.
> incredibly impressive and completely unreliable.
There have been methods of protecting against this since before AI, and they still apply. LLMs work great with test driven development, for example.
I would say that high-level knowledge and good engineering practices more important than ever, but they were always important.
Test-driven development helps protect against wrong code, but it's not code I'm interested in, and it's not wrong code that I'm afraid of (I mean, that's table stakes). What I need is something that would help me generate understanding and do so reliably (even if the performance is poor). I can't exercise high-level knowledge efficiently if my only reliable input is code. Once you have to work at the code level at every step, there's no raising of the level of thought. The problem for me isn't that the agent might generate code that doesn't pass the test suite, but that it cannot reliably tell me what I need to know about the code. There's nothing I can reliably offload to the machine other than typing. That could still be useful, but it's not necessarily a game-changer.
Writing code in Java or Python as opposed to Assembly also raises the level of abstract thought. Not as much as we hope AI will be able to do someday, but at least it does the job reliably enough. Imagine how useful Java or Python would be if 10% of the time they would emit the wrong machine instructions. If there's no trust on anything, then the offloading of effort is drastically diminished.
In my experience with Claude Code and Sonnet, it is absolutely possible to have architectural and design-oriented conversations about the work, at an entirely different and higher level than using a (formerly) high-level programming language. I have been able to learn new systems and frameworks far faster with Claude than with any previous system I have used. It definitely does require close attention to detect mistakes it does not realize it is making, but that is where the skill comes in. I find it being right 80% of the time and wrong 20% of the time to be a hugely acceptable tradeoff, when it allows me to go radically faster because it can do that 80% much quicker than I could. Especially when it comes to learning new code bases and exploring new repos I have cloned -- it can read code superhumanly quickly and explain it to me in depth.
It is certainly a hugely different style of interaction, but it helps to think of it as a conversation, or more precisely, a series of individual small targeted specific conversations, each aimed at researching a specific issue or solving a specific problem.
2 replies →
Without meaning to sound flippant or dismissive, I think you're overthinking it. By the sounds of it, agents aren't offering what you say you need. What are _are_ offering is the boilerplate, the research, the planning etc. All the stuff that's ancillary. You could quite fairly say that it's in the pursuit of this stuff where details and ideas emerge and I would agree, but sometimes you don't need ideas. You need solutions which are run-of-the-mill and boring.
1 reply →
If you're writing your own tests, sure, AI is fast at writing code that passes the tests.
But if you write a comprehensive test suite for a problem, you've effectively done the hard development work to solve the problem in the first place. How did the AI help?
Oh have the AI write unit tests you say? Claude cheats constantly at the tests ime. It frequently tests the mock instead of the UUT and reports a pass. That's worse than useless! I'm sure a huge swath of slop unit tests that all pass is acceptable quality for a lot of businesses out there.
> But if you write a comprehensive test suite for a problem, you've effectively done the hard development work to solve the problem in the first place. How did the AI help?
By making you not write the implementation?
Also, the AI writing anything bad isn’t an excuse. You’re the one piloting that ship, and if not, you’re probably the one reviewing the code. It’s your job to review your own and others’ code with a critical eye, and that goes double in the LLM age.
> doesn't work like people developers work with
I don't know.
This is true for people working in an environment that provides psychological safety, has room for mistakes and rewards hard work.
This might sound cynical, but in all other places I see the "lying to cover your ass" behavior present in one form or another.
> It doesn't work like tools developers use and it doesn't work like people developers work with. Furthermore, techniques of working with agents today may be completely outdated a year from now.
Sounds like big money to be made in improving UX
> I'm not sure that having the patience to work with something with a very inconsistent performance and that frequently lies is an extension of existing development skills.
that's a basic skill you gotta have if you're leading anything or anyone. There'll always be levels of that. So if you're planning to lead anyone in your career, it's a good skillset to develop
That's not the same skill at all https://news.ycombinator.com/item?id=45518204