Comment by d4rkp4ttern

2 days ago

Most of the narrative is about how AI is writing all/most code, but I’d wager that the fraction of human-reviewed code is approaching zero far faster than anyone realizes or is willing to admit.

31 comments


londons_explore 2 days ago

Very true. Last year I at least glanced at every line of AI-generated code. Now if some AI makes a 10k line program for some one-off task, I run the program, glance only over the output, and move on.

noduerme 2 days ago
Especially if you're having an LLM write non-interactive scripts to calculate complex things from large datasets, glancing at the output is not enough to know if the output is remotely accurate (unless the output is so trivial you could literally do it in your head).
Case in point: I recently asked an LLM to write a pile of code to compile historical baseball stats to test betting success against the results of my hand-written code that evolves genetic algorithms. I marveled for a little while at the unbelievable improvement in EV/ROI that this script was showing could have been achieved from certain small tweaks. I only noticed after pushing a total bet that the push registered on the output as a win - and only because I was carefully staying on top of it. A single stupid recursively operating >= instead of > had caused completely nonsensical results that looked plausible.
Imagine, like, trusting a 10k loc script to give you data for something you were going to build in the physical world, and hoping an LLM hadn't made a mistake like that.
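For illustration, a minimal Python sketch of this class of bug. It is not the script described above: the function names, the `margin` convention (result minus the betting line, with zero meaning a push), the stake, and the odds are all assumptions made for the example, and the recursion is left out.

```python
def settle_buggy(margin, stake, decimal_odds):
    """Hypothetical settlement with the bug: a push (margin == 0) pays out as a win."""
    if margin >= 0:  # bug: >= counts a push as a win
        return stake * (decimal_odds - 1)
    return -stake


def settle(margin, stake, decimal_odds):
    """Hypothetical corrected settlement: win, push, and loss handled separately."""
    if margin > 0:
        return stake * (decimal_odds - 1)  # win: profit only
    if margin == 0:
        return 0.0  # push: stake returned, no profit
    return -stake  # loss: stake forfeited


def roi(settle_fn, margins, stake=100.0, decimal_odds=2.0):
    """Total profit divided by total amount staked."""
    profit = sum(settle_fn(m, stake, decimal_odds) for m in margins)
    return profit / (stake * len(margins))


# Made-up sample: one win, one loss, two pushes.
margins = [2, 0, -3, 0]

print(roi(settle_buggy, margins))  # inflated by the two pushes
print(roi(settle, margins))
```

Both numbers look like plausible ROI figures on their own, which is the point: only a check against a case with a known answer (here, a push must return exactly zero profit) separates them.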
- bobjordan 1 day ago
  
  Code needs to be tested. I'm glad that the bar to entry has been lowered, but now we just have a huge number of people who haven't yet learned anything about how to test and verify that the code meets the expected requirements.
  
MaKey 2 days ago
Which one-off tasks need 10k lines of code?
- embedding-shape 2 days ago
  
  Would depend on what AI and prompt you use ultimately. Ask it to add tests (functional, E2E and unit, maybe invent a new type too), packaging, modular code and/or whatever, and you get to 10K relatively quickly with some of the more verbose LLMs out there.
  Personally it's probably the biggest struggle, trying to rein in the "spray and pray" approach LLMs typically like to take, and reducing the "patch on top of patch" syndrome too.
- londons_explore 2 days ago
  
  Calculate the engine power of a 2015 VW Polo when travelling 70 mph on a flat road behind a box truck. Draw a chart of drag vs. follow distance. How significant is humidity on the result?
  
- hgoel 2 days ago
  
  One off web app for scrubbing through some data, that, once done, will never be run again?
- bossyTeacher 2 days ago
  
  Java programs
  
rco8786 1 day ago

This is fine for one-off tools and I do the same. But for building long-lived "professional grade" production software, this fails really quickly.
My team is using AI for most of the code, but the human review layer is crucial and unavoidable if you're interested in things like reliability, uptime, controlled feature rollouts, the integrity of your users' data, etc.
d4rkp4ttern 1 day ago

A huge factor I don’t see mentioned often enough is the rapid increase of AI coding in a language unknown to the dev.
phil21 1 day ago

Pretty much. For my home IT projects I have been playing around with various means of implementing agents.

I’ve looked at the outputs here and there - and holy hell would it never pass review if I were trying to make something robust and anti-fragile. But since I can just have AI spit out a fix for the horrific “code” when it breaks in a totally predictable manner it’s just not worth my time to try to actually sit down and get it done right. Or even fight with AI by providing a good specification and design guidelines.

I imagine this is how things are going in the real world, given 30 years of working with various levels of humans. So long as the output is “good enough” it is the extreme minority of folks who care about much else. And that’s for mid-level to senior folks who have the experience to know better. Juniors wouldn’t even be able to pick out most of even the most obvious anti-patterns AI tends to spit out such as putting configuration within code, etc.

Refactoring is just in a new world too, that us olds probably have a hard time with. It’s no longer examine the code, identify design gaps, find high leverage places to start fixing, etc. It’s now “this is broken, rewrite from scratch” when it eventually turns into too much spaghetti.

In some ways, being entirely focused on the outcomes is freeing. But man, what's under the hood is crazy and a whole new world.

jatora 2 days ago

i admit. agentic coders do not look at the code except by accident. not much point unless you're working on enterprise applications

vasco 2 days ago

People already barely reviewed code, most of it was imported libraries.

seanw444 2 days ago
The assumption used to be that you respected the library enough and believed it was well reviewed and architected by the maintainer(s). But now even that's unreliable because libraries are being slopified at an unreviewable pace too.
- embedding-shape 1 day ago
  
  > The assumption used to be that you respected the library enough and believed it was well reviewed and architected by the maintainer
  I don't know many serious software engineers who'd take that approach. The convention was always to actually open up the code, evaluate the quality, see if the authors seem to know what they're doing, then choose the libraries you know work and could be adjusted to fit whatever you wanted. At least for professional development inside companies, not a single library would be included unless you had at least reviewed that the top-level dependency you pull in actually had code worth pulling in in the first place.
  And this approach works just as well today as it used to. You literally have to spend like 3-5 minutes browsing the code, evaluate the abstractions they've built and then say "Yes, looks good enough to try to use" or "Clearly these people just hacked this together as fast as they could".
- jatora 2 days ago
  
  It's weird that you think humans weren't slopifying code until LLMs came along. At least now they are implementing tests and CI and far more documentation, updating API versions, etc. OOMs above the amount they did before.
  I'd also wager that a far higher percentage of code gets reviewed, via prompting AI to do it, than it did before.
  Most PRs pass as long as they A. pass checks, B. don't introduce regressions, C. fix a bug or implement a feature. People talk about this era of humans reviewing code with nostalgia... but that never existed at scale.
- bossyTeacher 1 day ago
  
  > The assumption used to be that you respected the library enough and believed it was well reviewed and architected by the maintainer(s).
  Let us be honest, for your average dev, the assumption was that the number of GitHub stars or npm/NuGet downloads was a good proxy for quality.
giancarlostoro 2 days ago
People seem to have rose-tinted glasses about how great and vetted code was before AI coding took off the way it has. It was not great.
- king_geedorah 2 days ago
  
  I’d say the increased scrutiny has merely exposed the difference in care between the different groups in the industry. Seems to explain pretty well why both sides are equally confounded by the other’s expectations.
port11 1 day ago

Which people? I’ve never worked at a place where reviews weren’t taken seriously. For small changes a cursory glance, sure, but anything medium-sized meant checkout+local test. If anything we’d spend too much time on code reviews or pair programming?
almostdeadguy 2 days ago
People keep saying this like it’s some meaningful point, but the reality is many people in different projects have a shared need for that code to work correctly, and there is a social proof involved in using open source libraries. That is why people look at downloads and dependent projects as heuristics of stability and correctness. That is not the case with (and cannot be obtained with) code authored by generative AI.
- vasco 2 days ago
  
  Yes it can, the code will be run and you will have the proof that it ran well. Or it won't run well and you'll re-do it. Same as with some imported library.