Comment by blackeyeblitzar

3 months ago

I have seen a lot of people claim the censorship is only in the hosted version of DeepSeek and that running the model offline removes all censorship. But I have also seen many people claim the opposite, that there is still censorship offline. Which is it? And are people saying different things because the offline censorship is only in some models? Is there hard evidence of the offline censorship?

There is bias in the training data as well as the fine-tuning. LLMs are stochastic, which means that every time you call it, there's a chance that it will accidentally not censor itself. However, this is only true for certain topics when it comes to DeepSeek-R1. For other topics, it always censors itself.

We're in the middle of conducting research on this using the fully self-hosted open source version of R1 and will release the findings in the next day or so. That should clear up a lot of speculation.

  • > LLMs are stochastic, which means that every time you call it, there's a chance that it will accidentally not censor itself.

    A die is stochastic, but that doesn't mean there's a chance it'll roll a 7.

This system comes out of China. Chinese companies have to abide with certain requirements that are not often seen elsewhere.

DeepSeek is being held up by Chinese media as an example of some sort of local superiority - so we can imply that DeepSeek is run by a firm that complies completely with local requirements.

Those local requirements will include and not be limited to, a particular set of interpretations of historic events. Not least whether those events even happened at all or how they happened and played out.

I think it would be prudent to consider that both the input data and the output filtering (guard rails) for DeepSeek are constructed rather differently to those that are used by say ChatGPT.

There is minimal doubt that DeepSeek represents a superb innovation in frugality of resources required for its creation (training). However, its extant implementation does not seem to have a training data set that you might like it to have. It also seems to have some unusual output filtering.

The model itself has censorship, which can be seen even in the distilled versions quite easily.

The online version has additional pre/post-filters (on both inputs and outputs) that kill the session if any questionable topic are brought up by either the user or the model.

However any guardrails the local version has are easy to circumvent because you can always inject your own tokens in the middle of generation, including into CoT.

Western models are also both trained for "safety", and have additional "safety" guardrails when deployed.

there's a bit of censorship locally. abliterated model makes it easy to bypass

People are stupid.

What is censorship to a puritan? It is a moral good.

As an American, I have put a lot of time into trying to understand Chinese culture.

I can't connect more with the Confucian ideals of learning as a moral good.

There are fundamental differences though from everything I know that are not compatible with Chinese culture.

We can find common ground though on these Confucian ideals that DeepSeek can represent.

I welcome China kicking our ass in technology. It is exactly what is needed in America. America needs a discriminator in an adversarial relationship to progress.

Otherwise, you get Sam Altman and Worldcoin.

No fucking way. Lets go CCP!

  • I don't really understand what you're getting at here, and how it relates to the comment you're replying to.

    You seem to be making the point that censorship is a moral good for some people, and that the USA needs competition in technology.

    This is all well and good as it's your own opinion, but I don't see what this has to do with the aforementioned comment.