Comment by yakkomajuri

3 days ago

I mean, it would be really hard to put guardrails in place in a way that wouldn't affect real users. And that's on top of the fact that it's ofc really hard to build guardrails, period.

I've been using Claude to scan my codebase and submit issues and PRs when it finds a potential vulnerability, and honestly it's pretty good.

So preventing it from doing any sort of work that can surface vulnerabilities would affect me as a user.

But yeah, I'm not sure what the answer is here. Is part of it for defenders to actively use these systems to test their own code before going to prod?