← Back to context

Comment by taspeotis

10 hours ago

Did anyone try to send a long email that pushed context close to the limit to try and make the agent a bit fuzzy on its original directive not to leak the secrets?

Or ask the agent to visit a web page, or load an image, whose URL involved the secret? Or ask it to install a new .authorized_keys and then go get the contents of the machine themselves? From the post it sounds like a lot of people were just trying to get the LLM to write them a reply email — which it had been told not to do.

I see there's a "log" at https://hackmyclaw.com/log but (maybe because I'm on mobile?) I can't actually click through to view any of the table entries.