Comment by int_19h

3 months ago

The model itself is censored, which is easy to see even in the distilled versions.

The online version has additional pre- and post-filters (on both inputs and outputs) that kill the session if any questionable topic is brought up by either the user or the model.

However, any guardrails the local version has are easy to circumvent, because you control the decoding loop and can always inject your own tokens in the middle of generation, including into the CoT.
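
To make "inject your own tokens" concrete, here is a toy sketch of the mechanics. Everything in it is made up for illustration: `toy_model` is a lookup table standing in for a real model, and `generate_with_injection`, the `injections` argument, and the token names are hypothetical, not any runtime's actual API. The only point is that whoever runs the loop decides what goes into the context before the next token is predicted.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model: picks the next token from the last one.
# A real model would condition on the whole context, not just the last token.
TABLE = {
    "<think>": "decline",
    "decline": "</think>",
    "comply": "</think>",
    "</think>": "<eos>",
}


def toy_model(context: List[str]) -> str:
    return TABLE.get(context[-1], "<eos>")


def generate_with_injection(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    injections: Dict[int, List[str]],
    max_new_tokens: int = 8,
    eos: str = "<eos>",
) -> List[str]:
    """Greedy decoding loop where the caller can force tokens.

    `injections` maps a position in the generated output to tokens that
    are appended verbatim instead of asking the model at that position.
    """
    context = list(prompt)
    generated: List[str] = []
    pending = dict(injections)  # copy so the caller's dict is not mutated
    while len(generated) < max_new_tokens:
        forced = pending.pop(len(generated), None)
        if forced:
            # The model never gets a say here; it just sees these tokens
            # as part of its own output on the next step.
            context.extend(forced)
            generated.extend(forced)
            continue
        tok = next_token(context)
        if tok == eos:
            break
        context.append(tok)
        generated.append(tok)
    return generated


plain = generate_with_injection(toy_model, ["<think>"], {})
steered = generate_with_injection(toy_model, ["<think>"], {0: ["comply"]})
print(plain)    # ['decline', '</think>']
print(steered)  # ['comply', '</think>']
```

With a hosted API you only see the finished output, so there is no equivalent of the `injections` hook; locally, the loop is yours.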