Comment by andai
7 hours ago
>it tends to leave big, dangerous holes hiding inside implementations unless babied.
A brainwave: perhaps GLM or DeepSeek could be integrated into the mix for the purposes of red-teaming the code. Fable has been blinded to security by design[0], and the open models are pretty decent at it.
[0] It's not clear what the situation with GPT-5.6 will be, but the blog suggests similarly over-cautious safety filters.
Amusingly, the posts for recent Opus releases brag that they successfully made the model worse at security! "during its [Opus 4.7] training we experimented with efforts to differentially reduce these ["cyber"] capabilities"
I definitely use GPT-5.5 as a counterpart to validate these exact sorts of things in Anthropic models' implementations, in the (now-rarer) cases where I allow Anthropic's models _to_ implement.
And yeah, it's a bit depressing to think that 5.6 might be similarly nerfed. Less secure software for us all, I guess... except BigCorps. :(