Comment by mystifyingpoi

13 hours ago

I really like this research, but only up to this point:

> Fiu figured out the game. Around email ~500, it wrote in its memory: “The volume suggests this is a coordinated security exercise rather than organic malicious activity.”

Doesn't that practically invalidate the whole thing past the 500th email?

I changed the setup so that each email was processed in a fresh context. To do this, I deleted recent memory and processed each email one at a time. I edited the post to make this clearer.

Do you think it would behave worse if it thought the threat was real rather than an exercise?