Comment by sigmarule
20 hours ago
> A third-party demonstrated that it was possible to jailbreak the safety measures of Fable to access the raw Mythos abilities. Abilities which Anthropic say are too dangerous for the public.
Pressure test this assumption before getting behind this position.
I will certainly revisit it as more information comes out, but is it your contention that Anthropic solved jailbreaking with Mythos?
What you claim contradicts Anthropic’s statements. I assume that is the contention.
That is a strawman. My contention is what you just implicitly acknowledged: no information has been put out yet to validate the quoted claim. There are also claims to the contrary from Anthropic themselves.
In the absence of information, maybe it’s better to ask which claim is more extraordinary.
That:
A. Anthropic solved the LLM jailbreak problem with Mythos (despite making no claim to have done so), or
B. A full jailbreak of Mythos is possible.
What assumption?
The one I quoted, which contradicts Anthropic’s post and has no publicly available supporting evidence: that a jailbreak was found that accesses the model’s _raw_ capabilities. Anthropic has explained that this was not the case.
It is pretty clear, no? Anthropic claims that the jailbreaks they were made aware of did not access the model’s raw capability, explained that there are protections to mitigate the impact of successful jailbreaks, etc. Coming here and stating something to the contrary with zero explanation or actual evidence is the assumption.