Comment by Finbarr
2 days ago
I'd recommend trying Gemini for the escapes. Claude was quite superficial and only appeared to try breaking out at the surface level. Gemini was very creative and has come up with a whole sequence of escapes, which is making me rethink whether I should even be trying to patch them, given that preventing agent escapes isn't a stated goal of the project.
That's an excellent idea! I will give it a shot.