Comment by serial_dev
18 hours ago
In a professional setting where you still have coding standards, people will review your code, and the code actually reaches hundreds of thousands of real users, handling one agent at a time is plenty for me. The code output is never good enough, and it makes things up even for moderately complicated debugging ("Oh, I can clearly see the issue now"; I've heard that ten times before, and it was always wrong!)
I do use them, though; they help me search, understand, narrow things down, and ideate. They're still a better Google, and the experience is getting better every quarter, but people letting tens or hundreds of agents just rip... I can't imagine doing it.
For personal throwaway projects that you do because you want to reach the end output (as opposed to learning or caring about the code), sure, do it: verify it roughly works and be done with it.
This is my problem with the whole "can LLMs code?" discussion. Obviously, LLMs can produce code, even good code, much like a champion golfer can get a hole in one. But can they code in the sense of "the pilot can fly the plane", i.e. barring a catastrophic mechanical malfunction or a once-in-a-decade weather phenomenon, the pilot will get the plane to its destination safely? I don't think so.
To me, someone who can code means someone who (unless they're in a detectable state of drunkenness, fatigue, illness, or distraction) will successfully complete a coding task commensurate with their level of experience or, at the very least, explain why exactly the task is proving difficult. While I've seen coding agents do things that truly amaze me, they also make mistakes that no one who "can code" ever makes. If you can't trust an LLM to do what anyone who can code will do, either complete the task or explain why they failed, then it can't code, even if it can (in the sense of "a flipped coin can come up heads") sometimes emit impressive code.
That's a funny analogy. You should look into how modern planes are flown. Hint: it's a computer.
> Hint: it's a computer.
Not quite, but in any event none of the avionics is an LLM or a program generated by one.
> people will review your code,
I mean you'd think. But it depends on the motivations.
At Meta, we had league tables for reviewing code. Even then people only really looked at it if a) they were a nitpicking shit, b) they didn't like you and wanted to piss on your chips, or c) it was another team trying to fix our shit.
With the internal Claude rollout and the drive to vibe-code all the things, I'm not sure that situation has got any better. Fortunately it's not my problem anymore.
Well, it certainly depends on the culture of the team and organization.
Where you have shared ownership, meaning once I approve your PR I am just as responsible as you are if something goes wrong and I can be expected to understand it just as well as you do… your code will get reviewed.
If shipping is the team's number one priority, the team is really just a group of individuals working to meet their quotas, everyone simply wants to ship their own stuff, and managers pressure managers to constantly put pressure on the devs, you'll get your PR rubber-stamped after 20 seconds of review. Why would I spend hours trying to understand what you did when I could work on my own stuff?
And yes, these tools make this 100x worse: people don't understand their fixes, code standards are no longer relevant, and you're expected to ship 10x faster, so it's all just slop from here on.
> people will review your code,
People will ask an LLM to review some slop made by an LLM, and they will be absolutely right!
There is no limit to laziness.
Soon you'll be seen as irresponsible and wasteful if you don't let the smarter LLM do it.