← Back to context

Comment by CGamesPlay

1 day ago

The bastawhiz comment in this thread has the right answer. When you start a new conversation, Claude has no context from the previous one and so all the "wearing down" you did via repeated asks, leading questions, or other prompt techniques is effectively thrown out. For a non-determined attacker, this is likely sufficient, which makes it a good defense-in-depth strategy (Anthropic defending against screenshots of their models describing sex with minors).

Worth noting: an edited branch still has most of the context - everything up to the edited message. So this just sets an upper-bound on how much abuse can be in one context window.