Comment by jaredklewis

6 hours ago

The way LLMs work, the outcomes are probabilistic, not deterministic.

So the guardrails might fail only one time in a thousand.
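To put rough numbers on that, here is a minimal sketch of how a small per-response failure rate adds up over many responses. It assumes every response fails independently with the same probability, which is a simplification and not how any real guardrail is known to behave. The function name and the 1-in-1000 rate are just illustrative, taken from the figure above.

```python
def prob_at_least_one_failure(per_response_failure_rate: float, responses: int) -> float:
    """Chance that a guardrail fails at least once over `responses` attempts.

    Assumption: each response fails independently with the same probability.
    """
    if not 0.0 <= per_response_failure_rate <= 1.0:
        raise ValueError("failure rate must be between 0 and 1")
    if responses < 0:
        raise ValueError("responses must be non-negative")
    # P(at least one failure) = 1 - P(no failures at all)
    return 1.0 - (1.0 - per_response_failure_rate) ** responses


if __name__ == "__main__":
    # Hypothetical rate: one failure in a thousand responses.
    rate = 1 / 1000
    for n in (1, 100, 1000, 10000):
        print(f"{n:>6} responses: {prob_at_least_one_failure(rate, n):.4f}")
```

Under that assumption, a one-in-a-thousand failure rate means roughly a 63% chance of at least one failure somewhere in a thousand responses, so "rare per response" is not the same as "rare overall".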

1 comment

jaredklewis

Also, the longer the context, the more likely the LLM is to become deranged or ignore its safety guardrails. Frequently, those with a questionable dependence on AI stay in the same chat indefinitely, because that's where the LLM has developed the idiosyncrasies the user prefers.