Comment by geor9e
5 months ago
Perhaps it's expensive to self-censor the output, so they don't want to pay to self-censor every intrusive thought in the chain, so they just do it once at output.
5 months ago
Perhaps it's expensive to self-censor the output, so they don't want to pay to self-censor every intrusive thought in the chain, so they just do it once at output.
My thought exactly.
They don't want to have to deal with the model's "thought crimes"