Comment by usefulposter
5 hours ago
Reminds me of Cloudflare's OAuth library for Workers.
>Claude's output was thoroughly reviewed by Cloudflare engineers with careful attention paid to security
>To emphasize, this is not "vibe coded".
>Every line was thoroughly reviewed and cross-referenced with relevant RFCs, by security experts with previous experience with those RFCs.
...Some time later...
What is the learning here? There were humans involved in every step.
Things built with security in mind are not invulnerable, human written or otherwise.
Taking a best-faith approach here, I think it's indicative of a broader issue: code reviewers can easily get "tunnel vision", where the focus shifts to reviewing each line of code rather than cross-referencing the code against the small details and highly salient "gotchas" of the specification/story/RFC, and making sure none of those details are missing from the code.
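To make that concrete, here's a hypothetical sketch (illustrative TypeScript, not the actual Cloudflare code or the advisory in question) of the kind of spec-level detail that reads fine line by line: OAuth redirect URI validation, where the spec and the OAuth Security BCP call for matching the request's redirect URI against the registered values with an exact string comparison, while a prefix check still looks perfectly plausible in review.

```typescript
// Hypothetical illustration only, assuming a simplified client registry.
// RFC 6749 §3.1.2.3 / the OAuth Security BCP expect the redirect URI in the
// authorization request to match a registered value exactly.

interface RegisteredClient {
  clientId: string;
  redirectUris: string[]; // exact URIs registered by the client
}

// Plausible-looking but non-compliant: a prefix match reads fine line by line,
// but "https://app.example.com" also matches "https://app.example.com.evil.net/cb",
// so the authorization code can be sent to an attacker-controlled host.
function validateRedirectUriLoose(client: RegisteredClient, uri: string): boolean {
  return client.redirectUris.some((registered) => uri.startsWith(registered));
}

// What the spec actually asks for: simple exact comparison.
function validateRedirectUriStrict(client: RegisteredClient, uri: string): boolean {
  return client.redirectUris.includes(uri);
}

const client: RegisteredClient = {
  clientId: "demo",
  redirectUris: ["https://app.example.com"],
};

console.log(validateRedirectUriLoose(client, "https://app.example.com.evil.net/cb"));  // true  (accepted, bad)
console.log(validateRedirectUriStrict(client, "https://app.example.com.evil.net/cb")); // false (rejected)
```

Nothing about the loose version looks wrong in isolation; only a reviewer who goes back to the RFC's matching rules would flag it.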
This applies whether the code is written by a human or AI, and also whether the code is reviewed by a human or AI.
Is a Github Copilot auto-reviewer going to click two levels deep into the Slack links that are provided as a motivating reference in the user story that led to the PR that's being reviewed? Or read relevant RFCs? (And does it even have permission to do all this?)
And would you even do this, as the code reviewer? Or will you just make sure the code makes sense, is maintainable, and doesn't break the architecture?
This all leads to a conclusion that software engineering isn't getting replaced by AI any time soon. Someone needs to be there to figure out what context is relevant when things go wrong, because they inevitably will.
This is especially true if the marketing team claims that humans were validating every step, but the actual humans did not exist or did no such thing.
If a marketer claims something, it is safe to assume the claim is at best 'technically true'. Only if an actual engineer backs the claim can it start to mean something.
the problem with "AI" is that by the very way it was trained: it produces plausible looking code
so the "reviewing" process will be looking for the needles in the haystack
when you have no understanding, or mental model of how it works, because there isn't one
it's a recipe for disaster for anything other than trivial projects
The learning is "they lied". After all, apart from marketing materials making a claim, where is the evidence?
Wait, we think they’re lying because an advisory was eventually found? We think that should be impossible with people involved?