Comment by himata4113

1 day ago

I mean it's possible that I just haven't found the secret sauce or I'm running into the invisible guardrails and that people have much stronger jailbreaks than I do.

However, I would not rule out OpenAI involvement in all of this.

7 comments

binyu 1 day ago

I was able to use Fable to generate PoCs for several classes of vulnerabilities, and I didn't observe the model refusing to engage in detailed analysis to come up with creative approaches. Quite the contrary.

> I used a fork of oh-my-pi

Why not use the leaked Claude Code source? Not that you really need it to execute the jailbreak.

zozbot234 1 day ago
I don't think educational "proof of concept" code can be described as even loosely realistic cyber offense in this day and age. The Mythos preview paper claimed an ability to stage attacks in an end-to-end fashion and work around sophisticated defenses/mitigations, so something like this should be the relevant standard.
- binyu 1 day ago
  
  Depends on what the proof of concept is about. It could be just a toy example, e.g. an RCE that opens the calculator app, or something much more nefarious, like returning a root shell, and it would still fall under the definition of a PoC.
  
himata4113 1 day ago

Interesting, that means I was in fact running into invisible guardrails.

lazystar 1 day ago

> I mean it's possible that I just haven't found the secret sauce

It's possible that no one cracks it during the window of time where the product is useful and would pose a risk if cracked, but never forget that the first rule of security is that nothing is ever 100% secure.