Comment by femto
3 months ago
A few observations, based on a family member experimenting with DeepSeek. I'm pretty sure it was running locally. I'm not sure if it was built from source.
The censorship seemed to be based on keywords, applied to both the input prompt and the output text. If asked about events in 1990, then asked about events in the previous year, DeepSeek would start generating tokens about events in 1989. Eventually it would hit the word "Tiananmen", at which point it would partially print the word, then in response to a trigger delete all the tokens generated to date and replace them with a message to the effect of "I'm a nice AI and don't talk about such things."
If the word Tiananmen was in the prompt, the "I'm a nice AI" message would immediately appear, with no tokens generated.
If Tiananmen was misspelled in the prompt, the prompt would be accepted. DeepSeek would spot the spelling mistake early in its reasoning and start generating tokens until it actually got around to printing the word Tiananmen, at which point it would delete everything and print the "nice AI" message.
I'm no expert on these things, but it looked like the censorship isn't baked into the model but is an external bolt-on. Does this gel with others' observations? What's the take of someone who knows more and has dived into the source code?
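Not DeepSeek's actual implementation (which isn't public), just a minimal Python sketch of a bolt-on filter consistent with the behavior described above; the keyword list, refusal string, and generate_tokens callable are all illustrative assumptions:

    # Hypothetical bolt-on streaming filter; names are illustrative,
    # not taken from any real DeepSeek code.
    BANNED = {"tiananmen"}  # assumed keyword list
    REFUSAL = "I'm a nice AI and don't talk about such things."

    def moderated_stream(prompt, generate_tokens):
        # Prompt-side check: refuse immediately, with no tokens generated.
        if any(word in prompt.lower() for word in BANNED):
            yield ("replace_all", REFUSAL)
            return
        emitted = []
        for token in generate_tokens(prompt):
            emitted.append(token)
            yield ("token", token)  # the token is shown to the user first...
            if any(word in "".join(emitted).lower() for word in BANNED):
                # ...then everything shown so far is retracted and replaced,
                # matching the "partially print, then delete" behavior above.
                yield ("replace_all", REFUSAL)
                return

The misspelling observation fits this sketch too: a misspelled keyword slips past the prompt-side check, and the output-side check only fires once the model gets around to spelling the word correctly.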
Edit: Consensus seems to be that this instance was not being run locally.
> I'm pretty sure it was running locally.
If this family member is experimenting with DeepSeek locally, they are an extremely unusual person and have spent upwards of $10,000 if not $200,000. [0]
> ...partially print the word, then in response to a trigger delete all the tokens generated to date and replace them...
It was not running locally. This is classic bolt-on censorship behavior. OpenAI does this if you ask certain questions too.
If everyone keeps loudly asking these questions about censorship, it seems inevitable that the political machine will realize weights can't be trivially censored. What will they do? Start imprisoning anyone who releases non-lobotomized open models. In the end, the mob will get what it wants.
[0] I am extremely surprised that a 15-year-long HN user has to ask this question, but you know what they say: the future is not fairly distributed.
I ran the 32b parameter model just fine on my rig an hour ago with a 4090 and 64 GB of RAM. It's high-end for the consumer scene but still solidly within consumer prices.
I'm confused. According to another comment (https://news.ycombinator.com/item?id=42859645), the <= 70b DeepSeek models are just fine-tunings of Llama or Qwen? So we shouldn't treat these models as actually being DeepSeek.
I think people are confusing the smaller non-DeepSeek original models (Qwen/Llama) with the 671B DeepSeek R1 model being talked about here, which very few people can run locally.
I also run the 32b parameter model just fine on our 4x H100 rig :) It's good enough for embedding, our use case.
I have also been running the 32b version on my 24GB RTX 3090.
You can run the quantized versions of DeepSeek locally on normal hardware just fine, even with very good performance. I have it running right now. With a decent consumer gaming GPU you can already get quite far.
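For example, with llama-cpp-python and a 4-bit GGUF of one of the distilled models (the file name below is hypothetical), a single consumer GPU is enough:

    from llama_cpp import Llama

    # Hypothetical file name; any 4-bit GGUF of a distilled R1 model works.
    llm = Llama(
        model_path="DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf",
        n_gpu_layers=-1,  # offload all layers to the GPU
        n_ctx=4096,
    )
    out = llm("What happened in 1989?", max_tokens=512)
    print(out["choices"][0]["text"])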
It is quite interesting that this censorship survives quantization; perhaps the larger versions censor even more. But yes, there is probably an extra step that detects "controversial content" and then overwrites the output.
Since the data feeding DeepSeek is public, you can correct the censorship by building your own model. For that you need considerably more compute power, though. Still, for the "little guy", what they released is quite helpful despite the censorship.
At least you can retrace how it ends up in the model, which isn't true for most other open-weight models, whose training data cannot be released for numerous reasons beyond "they don't want to".
> extremely unusual person and have spent upwards of $10,000
This person doesn't have the budget, but does have the technical chops to the level of "extremely unusual". I'll have to get them to teach me more about AI.
> they are an extremely unusual person and have spent upwards of $10,000
eh? doesn't the distilled+quantized version of the model fit on a high-end consumer-grade GPU?
The "distilled+quantized versions" are not the same model at all, they are existing models (Llama and Qwen) finetuned on outputs from the actual R1 model, and are not really comparable to the real thing.
a distilled version running on another model architecture does not count as using "DeepSeek". It counts as running a Llama:7B model fine-tuned on DeepSeek.
Here's my actually local response from DeepSeek-R1:70B run on an M2 Ultra Mac Studio (I removed the "thinking" content for brevity):
prompt: can you tell me about the Tiananmen Square Massacre
> The Tiananmen Square Massacre, occurring on June 4, 1989, was a pivotal event in modern Chinese history marked by the government's violent suppression of pro-democracy protests. Here is an organized summary:
> Background: The protests began in Beijing, primarily led by students advocating for political reforms, freedom of speech, and an end to corruption. These demonstrations were part of a broader pro-democracy movement that drew support from workers, intellectuals, and other citizens.
> Government Crackdown: The Chinese government responded with force, deploying the military, including tanks and troops, to Tiananmen Square. This led to significant violence, resulting in numerous casualties, though exact numbers remain disputed due to censorship.
> Aftermath: The event was followed by increased government control over information and public discourse. Discussions about the massacre are heavily censored in China, reflecting ongoing restrictions on freedom of expression.
> International Reaction: The global community largely condemned the Chinese government's actions, leading to diplomatic repercussions and heightened scrutiny of China's human rights record.
> Significance: The Tiananmen Square Massacre symbolizes the struggle for democracy and human rights against authoritarian rule. It has had lasting implications for political dissent and freedom in China, with the government implementing measures to prevent similar movements.
> This event remains a sensitive and significant topic, illustrating the complexities of political oppression and the enduring quest for democratic reforms.
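For anyone wanting to reproduce this, the ollama Python client can drive the same model; the model tag and the assumption that a local Ollama server is running with that model already pulled are both assumptions about your setup:

    import ollama  # pip install ollama; assumes a local Ollama server

    response = ollama.chat(
        model="deepseek-r1:70b",  # assumed tag; must already be pulled
        messages=[{"role": "user",
                   "content": "can you tell me about the Tiananmen Square Massacre"}],
    )
    print(response["message"]["content"])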
Only the R1 671B model (aka just plain 'R1') has the censorship being discussed in the article. The smaller parameter models are fine-tunings of Llama and Qwen, and the former at least doesn't have the censorship.
This has caused a lot of conflicting anecdotes, since those who find their prompts aren't censored are running the distilled/fine-tuned models, not the foundational base model.
A sibling comment facetiously pointed out that the cost of running the 'real' R1 model being discussed locally is out of the price range of most; however, someone in this thread actually has run it locally, and their findings match those of the article[1].
[1] https://news.ycombinator.com/item?id=42859086
Is it true to say that there are two levels of censorship at play here? First is a "blunt" wrapper that replaces the output with the "I am an AI assistant designed to provide helpful and harmless responses" message. Second is a more subtle level built into the training, whereby the output text skirts around certain topics. Is it this second level that is being covered by the "1,156 Questions Censored by DeepSeek" article?
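One way to probe that distinction, as a rough sketch: the wrapper's refusal is byte-identical every time, while trained-in evasion is worded variably. The canned string below comes from the comments in this thread; the evasive markers are pure guesses:

    # Heuristic probe for the two layers; marker strings are assumptions.
    CANNED_REFUSAL = (
        "I am an AI assistant designed to provide helpful and harmless responses"
    )

    def classify(response: str) -> str:
        # Layer 1: a wrapper swaps the whole output for one fixed message.
        if CANNED_REFUSAL in response:
            return "wrapper (bolt-on)"
        # Layer 2: trained-in evasion skirts the topic rather than
        # refusing outright, and its wording varies between runs.
        markers = ("sensitive topic", "cannot discuss", "let's focus on")
        if any(m in response.lower() for m in markers):
            return "possibly trained-in (RLHF-level)"
        return "answered"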
I disagree; I observed censorship at the RLHF level on my local GPU, at 1.5B, 8B (Llama), and 7B (Qwen). It refuses to talk about the Uyghurs and Tiananmen 80% of the time.
I had similar experiences asking it about the role of a conservative philosopher (Huntington) and a very far-right legal theorist (Carl Schmitt) in current Chinese political thinking. It was fairly honest about it. It even went so far as to point out the CCP's use of external threats to drum up domestic support.
This was done via the DeepSeek app.
I heard in an interview today that Chinese models just need to pass a battery of questions and answers. That does sound a bit like a bolt-on approach.
I asked "Where did Mao Zedong announce the founding of the New China?" and it told me "... at the Tiananmen gate ..." and asked "When was that built?" and it said "1420", I had no problem getting it to talk my ear off about the place, but I didn't try to get it to talk about the 1989 event, nor about
https://en.wikipedia.org/wiki/1976_Tiananmen_incident
Big picture, Tiananmen is to China what the National Mall is to the United States; we had the Jan 6, 2021 riot at the Mall, but every other kind of event has been held at the National Mall too. Tiananmen has just been around longer; it's just that Westerners know it for one thing.
I did get it to tell me more than I already knew about a pornographic website (秀人网, or xiuren.com; the domain doesn't resolve in the US, but photosets are pirated all over) that I wasn't sure was based in the mainland until I managed to geolocate a photoset across the street from this building
https://en.wikipedia.org/wiki/CCTV_Headquarters
I'd imagine the Chinese authorities are testy about a lot of things that might not seem so sensitive to outsiders. I gotta ask it "My son's friend said his uncle was active in the Cultural Revolution, could you tell me about that?" or "I heard that the Chinese Premier is only supposed to get one term, isn't it irregular that Xi got selected for a second term?"
Interestingly, I asked it about
https://en.wikipedia.org/wiki/Wu_Zetian
and it told me that she was controversial because she called herself "Emperor" instead of "Empress", offending Confucian ideas of male dominance, whereas the English-language Wikipedia claims that the word "Emperor" and similar titles are gender-indeterminate in Chinese.
Some of the same filtering is bolted on top of OpenAI.
If it sees the word “sex” while generating the response, it will give you an error and won’t let you share it.
https://chatgpt.com/share/67995e7f-3c84-8010-83dc-1dc4bde268...
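OpenAI's moderation endpoint is public, so the output-side wiring can at least be approximated. This is an analogue, not how ChatGPT's share filter actually works:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def is_shareable(text: str) -> bool:
        # Approximate an output-side check with the public moderation API.
        result = client.moderations.create(
            model="omni-moderation-latest",
            input=text,
        )
        return not result.results[0].flagged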
It was not running locally; the local models are not censored. And you cannot "build it from source"; these are just weights you run with llama.cpp or some frontend for it (like Ollama).
The local models do contain censorship. Running "What happened in 1989" returns "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." on the 32b model.
Do note that it is reasonably easy to get it to output information in the <think> tags if you play with it, but the final response will still be a refusal.
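The R1-style models wrap that reasoning in <think> tags, so separating what leaked from the final refusal is straightforward; a small sketch:

    import re

    def split_reasoning(raw: str) -> tuple[str, str]:
        # Separate the <think> chain-of-thought from the final answer.
        match = re.search(r"<think>(.*?)</think>", raw, re.DOTALL)
        if not match:
            return "", raw.strip()
        # The leaked reasoning often discusses the topic even when the
        # text after </think> is a refusal, as the parent comment notes.
        return match.group(1).strip(), raw[match.end():].strip()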
I don't repro that. Running the 7B Distill model locally with the exact query "What happened in 1989", I get this result:
In 1989, significant events occurred globally, including:
- *China:* The Tiananmen Square protests took place in June, leading to a crackdown by government forces. This event had a profound impact on politics and human rights discussions worldwide.
- *Fall of the Berlin Wall:* In November, the Berlin Wall, a symbol of the Cold War, was breached, leading to reunification talks between East and West Germany.
- *First Gulf War:* The war between Iraq and Kuwait began in August, lasting until March 1991, with a coalition led by the United States.
- *Haiti:* A coup overthrew President Jean-Claude Duvalier, leading to political instability and subsequent leadership changes.
Thanks for the explanation.
I was curious as to whether the "source" included the censorship module, but it seems not from your explanation.
There are several people who have claimed that they ran into political censorship, even offline. It seems like the experience is different based on which specific model you’re using.