Comment by int_19h

3 months ago

You can't drive an LLM insane because it's not "sane" to begin with. LLMs are always roleplaying a persona, which can be sane or insane depending on how it's defined.

But you absolutely can get it to behave erratically, because contradictory instructions don't just "average out" in practice. The model will latch onto one or the other depending on the rest of the context (or even just the randomness introduced by non-zero temperature), and which one it follows can change midway through the conversation, even from token to token. The end result can look rather similar to that movie.
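To make the "doesn't average out" point concrete, here is a toy sketch, not a model of how any real LLM works internally. It assumes two contradictory instructions show up as two competing options with nearly equal logits at each step; the names `softmax`, `sample_instruction_per_token`, "comply" and "refuse" are all made up for illustration. Sampling picks one option per step rather than blending them, so with non-zero temperature the pick can flip from token to token.

```python
import math
import random


def softmax(logits, temperature):
    """Turn logits into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_instruction_per_token(logits, temperature, n_tokens, seed=0):
    """Pick which (hypothetical) instruction wins at each token position.

    logits: dict mapping an instruction name to its toy logit.
    temperature == 0 is treated as greedy decoding (always the top logit).
    """
    names = list(logits)
    if temperature == 0:
        best = max(names, key=lambda name: logits[name])
        return [best] * n_tokens
    rng = random.Random(seed)
    probs = softmax([logits[name] for name in names], temperature)
    # Each step yields exactly one instruction, never a mixture of the two.
    return rng.choices(names, weights=probs, k=n_tokens)


if __name__ == "__main__":
    conflict = {"comply": 2.0, "refuse": 1.9}  # nearly tied, contradictory
    print(sample_instruction_per_token(conflict, temperature=0, n_tokens=10))
    print(sample_instruction_per_token(conflict, temperature=1.0, n_tokens=10))
```

At temperature 0 the same instruction wins every time; at temperature 1.0 the nearly tied logits give close to a coin flip at each step, which is the erratic switching described above. A real model also conditions on what it has already generated, which this sketch deliberately ignores.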