
Comment by unethical_ban

7 hours ago

Interesting for teams looking to integrate AI into their deployment process.

I don't think guardrails are useful long term. Assuming we don't see the end of open near-frontier models, it is folly to try to keep models from doing exploit generation. The solution needs to be that all software projects assume hackers will be running LLMs against their code in search of exploits, and write secure code accordingly.

Even careful programmers working in unsafe languages will introduce bugs; it's inevitable. In 2026 we should be using safe languages for all new projects, but there's a gargantuan amount of existing C/C++ code handling protocols.

But I agree that guardrails will only help for, like, 3-6 months. We should be screening as much as we can with Mythos; unfortunately, Anthropic is only giving access to the big players.