Comment by srameshc

3 months ago

So does that mean that if Heretic is used on models like Deepseek and Qwen, they can talk about subjects such as the 1989 Tiananmen Square protests, Uyghur forced labor claims, or the political status of Taiwan? I am trying to understand the broader goals around such tools.

NitpickLawyer 3 months ago

That's an interesting test case, not for the political aspect, but for the data aspect. One would assume that the total amount of "sensitive" data (especially in Chinese) that gets thrown into the training dataset is quite limited. Getting a model that (presumably) wasn't trained on such data to actually talk about it would be an interesting exercise. Though I'd suggest doing it with smaller models first.

totetsu 3 months ago

There are already ablated Deepseek models out there that will do just that.

https://huggingface.co/NaniDAO/deepseek-r1-qwen-2.5-32B-abla...

https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in...

throwawaymaths 3 months ago

Yes, you can also achieve this, presumably less efficiently, with LoRA training.

kachapopopow 3 months ago

The models already talk about it just fine if you load them up yourself. Only the official Deepseek web API has these restrictions, because they are required by law.

throwawaymaths 3 months ago
That is not the case.
- kachapopopow 3 months ago
  
  Not sure about all Chinese models, but Deepseek has absolutely no problem, and Qwen just avoids anything controversial, including subjects completely unrelated to China such as the LGBTQ movement. Also, any safeguards like those are easily bypassed, since not much effort was put into preventing the models from talking about these subjects, which to me sounds like dataset tainting rather than intentional bias.
- ls612 3 months ago
  
  I just tested this with Deepseek in Nvidia's AI sandbox and in Groq (so the inference was performed in the US) and it happily told me what happened on June 4, 1989. Stop spreading disinformation.
  