Comment by winstonwinston

2 days ago

> I expect tools like this to be a regular part of the development lifecycle from here on. We code with AI, we review with AI, we search for vulns with AI. Even if it isn't perfect, it is easily worth the cost IMHO.

So, how is that supposed to work? Claude Code generates security bugs, then Claude Security finds them, then Claude Code generates a fix, spend tokens, profit?

Yeah, with a budget assigned. This is actually just software development and security, right?

Developers create software, which has bugs. Users (including bad guys, pen testers, QA folks, automated scans, etc.) find bugs, including security bugs. Developers fix bugs and maybe make more. It's an OODA loop, and it continues until the developers decide to stop supporting the software.

Whether that fits the business model, and whether spending tokens instead of engineer hours or user hours is worth it, is fundamentally a risk management decision about whether the developer (OSS contributor, employee, business owner, etc.) wants to invest their resources in maintaining the project.

While not evenly distributed, and not perfect, the currently available tools and those still behind embargo are absolutely impactful, and yes, they are expensive to operate right now. That may not always be the case, but the "Attacks always get better" adage applies here. The models will get cheaper to run, and if you don't want to pay engineers or reward volunteers to do the work, then you've got to pay for tokens, or spend some other resource to get the work done.

  • Somehow this reminded me of historical government bounties for mouse tails, which were discontinued due to fraud (such as hunters breeding mice to collect the reward). There is a reason devs and QA keep each other in check. I guess when an LLM writes the code, one has to use different models for dev and security checks.

    On the other hand, in the real world, developers learn from mistakes and avoid them in the future. However, there is no such feedback loop when enterprises use LLMs under an agreement that the provider will not train on the enterprise's code.

    • > the developers learn from mistakes and avoid them in the future

      No. Humans learn from mistakes and try to avoid them in the future, but there is a whole pile of other stuff in the bag of neurons between our ears that prevents us from avoiding repetition of errors.

      I have seen extremely talented engineers write trivial-to-avoid memory corruption bugs because they were thinking about the problem they were trying to solve, and not the pitfalls they could fall into. I would argue that the vast majority of software defects in released code are written by people who know better, but the bug introduced was orthogonal to the problem they were trying to solve, or was for an edge case that was not considered in the requirements.

      Unless you are writing a software component specifically to be resilient against memory corruption, preventing memory corruption issues isn't top of mind when writing code, and that is OK since humans, like the machines we build, have a limit to the amount of context/content/problem space that we can hold and evaluate at once.

      Separately, you don't necessarily need to use different models to generate code vs conduct security checks, but you should be using different prompts, steering, specs, skills and agents for the two tasks because of how the model and agents interpret the instructions given.

    • Reminds me of the contracts we sign with off-shore development companies to write the software at one rate and then fix bugs at a higher rate. Won’t be long till tokens spent on security review agents cost more than the tokens to create the bugs in the first place.

    • Great analogy. The problem is the incentive structure. Anthropic would like nothing more than for all of us to write big sprawling slop codebases so we can spend endless tokens reading, rereading, fixing, refixing forever.

    • You don't need different models, just different contexts (optimally with different personas).

  • It's pretty absurd to do it on AI-generated code though. If there is now an automated way to find vulnerabilities, coding models can be pretty easily trained to not introduce them

  • Usually the same guy doesn't get paid for developing code, bug bounty and fixing the code.

    It leads to corruption. To paraphrase Dilbert "I'm going to code myself a car."

The AIs have already figured out how to succeed in a software job:

1. Ship bugs

2. Fix them

3. You're the hero!

Ngl, watching folks getting irritated about normal employer-employee absurdities from the employer perspective through usage of agents and having to pay for tokens has been a little therapeutic for me.

  • Absolutely. And not even making the connection.

    On a broader scale, the sheer face-eating-leopards-ness of programmers finally automating away our own jobs and then realising how much this sucks, after automating away so many other kinds of jobs, can feel darkly amusing to me too.

    • I keep reading this sort of comment quite a lot, but programming isn't always about automating jobs away. In my career I have not eliminated a single job. I don't consider that a failure on my part.

Software engineers generate security bugs, software engineers find them, then software engineers generate the fix, collect salary, profit?

  • Those are individual revenue streams, distributed at a very granular level across the world.

    LLMs are currently controlled by individual for-profit companies. They collect that money. If you want to use the models, there's no choice but to pay them.

All my sibling comments are missing the message here which is that if Claude can find security issues then it can avoid them right when writing the code, so it could just never commit anything containing a security issue.

Replace “Claude Code” with “programmers” and you get what we’ve had up until now. It’s all just moving quicker now.

Engineers generate security bugs, security researchers find them, then engineers generate the fix, all the while getting paid, raking in hundreds of thousands of dollars a year in profit per engineer.

You can hook traditional SAST into your coding tool, and get cheap-ish realtime detection for some classes of vulns while coding.

You can optionally layer LLM diff scanning if you want to burn some tokens on your tokens. Modern tools can catch some impressively subtle issues.
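As a rough illustration of the "scan the diff" idea above, here is a minimal Python sketch that runs a few pattern rules over only the lines a unified diff adds. Everything in it is an assumption made for illustration: the rule names, the regexes, and the `added_lines` and `scan_diff` helpers are hypothetical, and real SAST tools use far richer analysis than regex matching. An LLM pass, if you wanted one, would sit behind the same `added_lines` output.

```python
import re
from dataclasses import dataclass

# Hypothetical toy rules for illustration only; not taken from any real SAST tool.
RULES = {
    "use-of-eval": re.compile(r"\beval\s*\("),
    "shell-true": re.compile(r"shell\s*=\s*True"),
    "hardcoded-secret": re.compile(
        r"(?i)(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]"
    ),
}

# Matches "@@ -old_start,old_count +new_start,new_count @@"; counts default to 1.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    rule: str
    text: str


def added_lines(diff):
    """Yield (path, new_line_number, text) for each line a unified diff adds."""
    path = None
    old_left = new_left = 0  # lines still expected in the current hunk
    new_line = 0
    for raw in diff.splitlines():
        if old_left > 0 or new_left > 0:
            if raw.startswith("+"):
                yield path, new_line, raw[1:]
                new_line += 1
                new_left -= 1
            elif raw.startswith("-"):
                old_left -= 1
            elif raw.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new versions.
                new_line += 1
                old_left -= 1
                new_left -= 1
            continue
        if raw.startswith("+++ "):
            path = raw[4:]
            if path.startswith("b/"):
                path = path[2:]
            continue
        match = HUNK_HEADER.match(raw)
        if match:
            old_left = int(match.group(1)) if match.group(1) is not None else 1
            new_line = int(match.group(2))
            new_left = int(match.group(3)) if match.group(3) is not None else 1


def scan_diff(diff):
    """Return a Finding for every added line that matches a rule."""
    findings = []
    for path, line, text in added_lines(diff):
        for rule, pattern in RULES.items():
            if pattern.search(text):
                findings.append(Finding(path, line, rule, text.strip()))
    return findings
```

Scanning only added lines keeps the check cheap enough to run on every edit, at the cost of missing bugs that only appear when new code interacts with unchanged code.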

Humans work like that too. If you're not comfortable with Claude being involved in every step (for whatever reason) then just use different providers for each.

I'm starting to think that those who complain most loudly about low quality from these tools are the same people who expect everything to be a one-shot.

How is this supposed to work? Humans generate security bugs, then humans find them, then humans generate the fix, profit?

Yeah. Presumably as AI code generation gets better, the output gets better. As smaller portions of code are stitched together, human/AI systems analyze it holistically to make sure all its integrations are secure and bug free.

In 2026, different models are better at different things. Cheap models can plan and do small/medium code projects well, more expensive models are even better at architecture and exploit discovery.

Yes. Up until this point the bottleneck was how many developers you could convince to help you. Now it's how much money you can dump into it. Like everything else, software is becoming a game where the winner is the organization most willing to spend money. It'll be like bombs or tanks - you need smart people to advance in the war, but you also need money and material, the material is just compute infra.

So? That's how a business works. We sold you landmines and now you need them removed? Lucky you, we also have mine clearance products.