Comment by koe123

12 days ago

> But now that most code is written by LLMs

Am I in the Truman Show? I don’t think AI has generated even 1% of the code that I run in prod, and the same goes for anyone I respect. Heavily inspired by AI examples, heavily assisted by AI during research, sure. Who are these devs that are seeing such great success vibecoding? Vibecoding in prod seems irresponsible at best

It's all over the place depending on the person or domain. If you are building a brand new frontend, you can generate quite a lot. If you are working on an existing backend where reliability and quality are critical, it's easier to just do it yourself. Maybe have LLMs write the unit tests for code you've already verified works.

> Who are these devs that are seeing such great success vibecoding? Vibecoding in prod seems irresponsible at best

AI written code != vibecoding. I think anyone who believes they are the same is truly in danger of being left behind as AI assisted development continues to take hold. There's plenty of space between "Claude build me Facebook" and "I write all my code by hand".

I was talking to a product manager a couple weeks ago about this. His response: most managers have been vibecoding for a long time. They've just been using engineers instead of LLMs.

  • Having done both, right now I prefer vibe coding with good engineers. Way less handholding. For non-technical managers, outside of prototyping, vibe coding produces terrible results.

FAANG here (service oriented arch, distributed systems) and I'd say probably 20+ percent of code written on my team is by an LLM. It's great for frontends, works well with test generation, or following an existing paradigm.

I think a lot of people wrote it off initially as it was low quality. But Gemini 3 Pro or Sonnet 4.5 saves me a ton of time at work these days.

Perfect? Absolutely not. Good enough for tons of run of the mill boilerplate tasks? Without question.

  • > probably 20+ percent of code written on my team is by an LLM. It's great for frontends

    Frontend has been a shitshow ever since JS dynamic web UIs were invented. With it and CSS, no one cares what runs the page or how many MB it takes to show one button.

    But on the backend, vibecoding is still rare, and we are lucky it is like that and that there has been no train wreck because of it. Yet.

    • Backend has always been easier than frontend. AI has made backend absolutely trivial: the code only has to work on one type of machine in one environment. If you think it's rare or will remain rare, you're just not being exposed to it, because it's on the backend.

    • I think you’re onto something. Frontend tends to not actually solve problems, rather it’s mostly hiding and showing parts of a page. Sometimes frontend makes something possible that wasn’t possible before, and sometimes the frontend is the product, but usually the frontend is an optimization that makes something more efficient, and the problem is being solved on the backend.

      When people rave about AI or want to show you the thing they built, it’s been interesting to stop and notice what’s at stake. More and more, I’m finding that the more manic someone comes across about AI, the lower the stakes of whatever they made.

  • As someone currently outside FAANG, can you point to where that added productivity is going? Is any of it customer visible?

    Looking at the quality crisis at Microsoft, between GitHub reliability and broken Windows updates, I fear LLMs are hurting them.

    I totally see how LLMs make you feel more productive, but I don't think I'm seeing end customer visible benefits.

    • I think much of the rot in FAANG is more organizational than about LLMs. They got a lot bigger, headcount-wise, in 2020-2023.

      Ultimately I doubt LLMs have much of an impact on code quality either way compared to the increased coordination costs, increased politics, and the increase of new commercial objectives (generating ads and services revenue in new places). None of those things are good for product quality.

      That also probably means that LLMs aren't going to make this better, if the problem is organizational and commercial in the first place.

  • Does "great for frontends" mean considerate A11Y? In the projects I've looked over, that's almost never the case, and the A11Y implementation is hardly worthy of being called a prototype, much less production. Mock-up seems to be the best label. I'll bet you think that because the surface looks right, the quality runs down to the roots, so you call it good at frontends. This is the problem with LLMs: they do not do the hard work, and they teach people that the hard work they cannot do is fine left undone or partially done. The more people "program" like this, the worse the situation gets for real human beings trying to live in a world dominated by software.

    • It turns out if you tell a coding agent "make it accessible" you'll get better results than you would from most professional front-end developers.

      I'm not satisfied yet: I want coding agents to be able to actively test on screen readers as part of their iteration loop.

      I've not found a system that can do that well yet out of the box, but GuidePup is very promising: https://github.com/guidepup/guidepup

For the last 2 or 3 months we have made a commitment as a team to go all in on Claude Code. We have been sharing prompts, skills, etc., and have documented all of our projects, and at this point Claude is writing a _large_ percentage of our code. Probably upwards of 70 or 80%. It's also been updating our Jira tickets and GitHub PRs, which is probably even more useful than writing the code.

Our test coverage has improved dramatically, our documentation has gotten better, our pace of development has gone up. There is also a _big_ difference in the quality of the end product between junior and senior devs on the team.

Junior devs tend to be just like "look at this ticket and write the code."

Senior devs are more like: Okay, can you read the ticket, try to explain it to me in your own words, let's refine the description, can you propose a solution -- ugh that's awful, what if we did this instead.

You would think you would not save a lot of time that way, but even spending an _hour_ directing Claude to write the code correctly is less than the 5-6 hours it would take to write it yourself for most issues, and you end up with more tests and better documentation when you are finished.

When you first start using Claude Code, it feels like you are spending more time to get worse work out of it, but once you build up the documentation/skills/tools it needs to be successful, it starts to pay dividends. Last week, I didn't open an IDE _once_ and I committed several thousand lines of code across 2 or 3 different internal projects. A lot of that was a major refactor (smaller files, smaller function sizes, making things more DRY) that I had been putting off for months.

Claude itself made a huge list of suggestions, which I knocked back to about 8 or 10. It opened a tracking issue in Jira with small, tractable subtasks, then started knocking them out one at a time, each a fairly reviewable PR with lots of test coverage (the tests had been built out over the previous several months of coding with Cursor and Claude, which sort of mandated them to keep the tools from breaking functionality).

I had a coworker and ChatGPT estimate how long the issue would take if they had to do it without AI. The coworker looked at the code base and said "two weeks". Both Claude and ChatGPT estimated somewhere in the 6-8 week range (which I thought was a wild overestimate, even without AI). Claude Code knocked the whole thing out in 8 hours.

If you work in highly repetitive areas like web programming, I can clearly see why people are using LLMs. If you're in a more niche area, then it gets harder to use an LLM all the time.

There is a nice medium between full-on vibe coding and doing it yourself by hand. Coding agents can be very effective on established codebases, and nobody is forcing you to push without reviewing.