Comment by Jordan-117
2 years ago
I don't understand why they don't just include a dumb filter on output that blocks any repetition of the system prompt. Presumably they don't want these rules publicly known, yet they leak regularly.
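For illustration only, here is a minimal sketch of what such a "dumb" output filter might look like. Every name in it (`leaks_system_prompt`, `filter_output`, the `window` size of 8 words, the refusal string) is a hypothetical chosen for this example, not anything a vendor is known to use. It flags a response if it repeats any run of `window` consecutive words from the system prompt, ignoring case and punctuation.

```python
import re


def _normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def leaks_system_prompt(output: str, system_prompt: str, window: int = 8) -> bool:
    """Return True if `output` repeats `window` consecutive words of the prompt.

    Hypothetical helper: the window size is an arbitrary assumption.
    """
    out_words = _normalize(output)
    prompt_words = _normalize(system_prompt)
    # For very short prompts, fall back to matching the whole prompt.
    window = min(window, len(prompt_words))
    if window == 0:
        return False
    prompt_ngrams = {
        tuple(prompt_words[i:i + window])
        for i in range(len(prompt_words) - window + 1)
    }
    return any(
        tuple(out_words[i:i + window]) in prompt_ngrams
        for i in range(len(out_words) - window + 1)
    )


def filter_output(output: str, system_prompt: str,
                  refusal: str = "[response withheld]") -> str:
    """Replace a leaking response with a placeholder; pass others through."""
    return refusal if leaks_system_prompt(output, system_prompt) else output
```

A filter like this only catches near-verbatim repetition. A paraphrase, a translation, or an encoded copy of the prompt would pass straight through, which may be part of why a simple check would not stop every leak.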
2 years ago
My understanding is that they do, and that these prompt revelations are actually just hallucinations.
That's a great question. They already do some sort of post-model filtering and output-checking, right?