Comment by observationist
16 days ago
You'd think at some point it'll be enough to tell the AI "ok, now do a thorough security audit, highlight all the potential issues, come up with a best practices design document, and fix all the vulnerabilities and bugs. Repeat until the codebase is secure and meets all the requisite protocol standards and industry best practices."
We're not there yet, but at some point, AI is gonna be able to blitz through things like that the way they blitz through making haikus or rewriting news articles. At some point AI will just be reliably competent.
Definitely not there yet. The dark factory pattern is terrifying, lol.
That's definitely a pattern people are already starting to have good results from - using multiple "agents" (aka multiple system prompts) where one of them is a security reviewer that audits for problems and files issues for other coding agents to then fix.
I don't think this worked at all well six months ago. GPT-5.2 and Opus 4.5 might just be good enough for this pattern to start being effective.
This is basically what CodeRabbit had built - they just put a ton more time into building the specialized review agents.
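The reviewer-files-issues, coders-fix-them loop can be sketched in plain code. Everything here is hypothetical (the function names, the issue format, the toy "security check"); a real setup would put an LLM with a security-audit system prompt behind each agent function:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    file: str
    description: str

def security_reviewer(codebase: dict[str, str]) -> list[Issue]:
    # Stand-in for an LLM agent with a security-audit system prompt.
    # Toy check: flag any file deserializing untrusted data with pickle.
    return [Issue(f, "unsafe pickle.loads on untrusted input")
            for f, src in codebase.items() if "pickle.loads(" in src]

def coding_agent(codebase: dict[str, str], issue: Issue) -> None:
    # Stand-in for an LLM coding agent that patches one filed issue.
    codebase[issue.file] = codebase[issue.file].replace(
        "pickle.loads(", "json.loads(")

def audit_loop(codebase: dict[str, str], max_rounds: int = 5) -> int:
    """Reviewer files issues; coding agents fix them; repeat until clean.
    Returns the number of fix rounds it took to converge."""
    for round_no in range(max_rounds):
        issues = security_reviewer(codebase)
        if not issues:
            return round_no
        for issue in issues:
            coding_agent(codebase, issue)
    return max_rounds

codebase = {"app.py": "obj = pickle.loads(blob)"}
rounds = audit_loop(codebase)
```

The point of the pattern is that the reviewer and the fixers are separate "agents" (separate system prompts), so the reviewer isn't grading its own work.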
My current dark factory stack is using a Cyber Elon [0] as CEO, with a dev team consisting of Gilfoyle, 2x Mr Robots, and Pickle Rick, and Alan Turing as dev manager. This easily 5x'd my output in raw performance metrics, on top of the 10x over baseline dev performance I'd already achieved using vanilla agents and other mainstream AI techniques. Whenever people say AI is just glorified autocomplete I know they haven't been using the latest model versions.
[0] Basically an immortal version of Elon Musk with his mind fused cybernetically with Grok AI
> My current dark factory stack is using a Cyber Elon as CEO
How picture perfect are its Nazi salutes?
That's so lame dude
Honestly I’m not sure we’re not there yet. Run this prompt as a ralph loop for 2 days on your codebase and see where you're at...