
Comment by jasondigitized

5 hours ago

I feel like I am taking crazy pills. I am getting code that works from Opus 4.5. It seems like people are living in two separate worlds.

Working code doesn’t mean the same thing to everyone. My coworker just started vibe coding. Her code works… on happy paths. It falls apart the moment any kind of error happens, and it’s impossible to refactor in any way. She thinks her code works.

The same coworker was asked to update a service to Spring Boot 4. She made a blog post about it, and she used an LLM for the work. So far every point I’ve read has been a lie, and her workarounds make things like the tests unnecessarily hard to read.

So yeah, “it works”, until it doesn’t, and then it hits you that you end up doing more work in total, because there are more obscure bugs, and fixing them is harder because of the terrible readability.

I can't help but think of my earliest days of coding, 20ish years ago, when I would post my code online looking for help on some small thing and be told that my code was garbage and didn't work at all, even though it actually was working.

There are many ways to skin a cat, and in programming the fact that everything happens in a digital space seemingly removes all boundaries, leading to fractal ways to "skin a cat".

A lot of programmers are hard-headed and "know" the right way to do something. These are the same guys who criticized every other senior dev as a bad/weak coder long before LLMs were around.

The parent's profile shows they are an engineer with experience across multiple areas of software development.

Your own profile says you are a PM whose software skills amount to "Script kiddie at best but love hacking things together."

It seems like the "separate worlds" you are describing are just the impressions of a seasoned engineer versus an amateur reviewing the same code base. It shouldn't be even a little surprising that the result looks much better to you than it does to a more experienced developer.

At least in my experience, learning to quickly read a code base is one of the later skills a software engineer develops. Generally, only very experienced engineers can dive into an open-source code base and answer questions about how the library works and is used; most engineers need documentation to aid them in that process.

I mean, I've dabbled in home plumbing quite a bit, but if AI instructed me to repair my pipes and I thought it "looked great!" but an experienced plumber's response was "ugh, this doesn't look good to me, lots of issues here" I wouldn't argue there are "two separate worlds".

  • > It shouldn't be even a little surprising that the result looks much better to you than it does to a more experienced developer.

    This really is it: AI produces bad-to-mediocre code. To someone who produces terrible code, mediocre is an upgrade; to someone who produces good-to-excellent code, mediocre is a downgrade.

That claim is so vague that there is no contradiction.

Getting code to do exactly what, based on using and prompting Opus in what way?

Of course it works well for some things.

That's a significant rub with LLMs, particularly hosted ones: the variability. Add in quantization, speculative decoding, and runtime adjustment of temperature, nucleus sampling, attention head count, and skipped layers, and you can get wildly different behavior from the same prompt and context sent to the same model endpoint a couple of hours apart.

That's all before you even get to all of the other quirks with LLMs.
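
To make the sampling part of that concrete, here's a toy sketch of temperature plus nucleus (top-p) sampling over a single made-up logits vector. The numbers are invented for illustration and aren't tied to any real model or hosted API:

```python
# Toy demo: how temperature and top-p alone reshape which token gets sampled.
# The logits are invented; a real model emits one such vector per token.
import math
import random

def sample(logits, temperature=1.0, top_p=1.0, seed=None):
    """Temperature + nucleus (top-p) sampling over one logits vector."""
    rng = random.Random(seed)
    # Temperature rescales logits before softmax: <1 sharpens, >1 flattens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus sampling: keep the smallest set of tokens whose mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Draw from the renormalized kept set.
    r = rng.random() * sum(probs[i] for i in kept)
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]

logits = [2.0, 1.5, 0.4, 0.1]  # four hypothetical next tokens
for temp, top_p in [(0.2, 1.0), (1.0, 1.0), (1.0, 0.7)]:
    picks = [sample(logits, temp, top_p, seed=s) for s in range(1000)]
    counts = {i: picks.count(i) for i in sorted(set(picks))}
    print(f"temperature={temp}, top_p={top_p}: {counts}")
```

Even in this toy, dropping the temperature to 0.2 makes the top token nearly deterministic, while top_p=0.7 silently cuts the two tail tokens out of the race entirely; a host quietly changing either knob changes what you get back.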

It depends heavily on the scope and type of problem. If you're putting together a standard, isolated TypeScript app from scratch, it can do wonders, but many large systems are spread across multiple services, use abstractions unique to the project, and generally deal with far stricter requirements. I couldn't depend on Claude to do some of the stuff I'd really want, like refactoring the shared code between six massive files without breaking tests. The space where I can have it work productively is still fairly limited.