Comment by Aurornis

1 day ago

> A simple linear combination of every weight did not degrade the performance of the model, but enhanced it.

Enhanced it on a couple benchmarks, supposedly.

The game is to turn knobs until you get a benchmark run that shows an improvement, then ship it. There are a lot of fine tunes and chimera models on HuggingFace that are supposedly better at some specific test, but when you use them for anything else they're usually worse.

This happens with a lot of the models that are modified to remove censorship. They succeed in getting the model to emit previously censored outputs, but the overall output quality decreases.
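To illustrate the "turn knobs until a run looks better" effect, here is a minimal simulation sketch. It assumes every variant has exactly the same true accuracy and that benchmark questions are independent coin flips; the function name and all numbers are made up for illustration and are not measurements of any real model.

```python
import random

def best_of_n_benchmark(n_variants, n_questions, true_accuracy, seed=0):
    """Score n_variants equally good models on one finite benchmark.

    Returns (best observed score, mean observed score). Any gap between
    the two is pure sampling noise, since every variant is identical.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_variants):
        # Each question is answered correctly with the same probability.
        correct = sum(rng.random() < true_accuracy for _ in range(n_questions))
        scores.append(correct / n_questions)
    return max(scores), sum(scores) / len(scores)

# Hypothetical setup: 50 knob settings, a 200-question benchmark.
best, mean = best_of_n_benchmark(n_variants=50, n_questions=200, true_accuracy=0.70)
print(f"best variant: {best:.3f}, average variant: {mean:.3f}")
```

Picking the best of many runs reports a score above the true 0.70 even though nothing improved, which is why a win on one or two benchmarks says little about use elsewhere.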

10 comments


andai 1 day ago

They seem to have deleted most of the README now, but the archived version has benchmarks.

https://web.archive.org/web/20260614082641/https://huggingfa...

And the Nex benchmarks for comparison

https://huggingface.co/nex-agi/Nex-N2-Pro

Rio seems to be about halfway between Qwen 3.5 and Nex, as you'd expect?

monster_truck 1 day ago

I don't think your last point is correct. Ablation, when done correctly, seems to increase the quality and typically the performance too.

Aurornis 21 hours ago
Abliteration is a brute force technique that removes or silences parts of the model. It reduces performance because the abliterated elements aren’t perfectly isolated to censorship, so other aspects suffer.
Many of the “uncensored” model providers also do some fine tuning on the models. Some of them target better benchmarks or other measures, but outside of the benchmarks and metrics they’re fine tuned for they are generally noticeably worse than the original model.
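A toy sketch of the isolation problem, assuming the commonly described form of abliteration: projecting a single "refusal direction" out of a vector. The vectors below are hypothetical three-dimensional stand-ins, not values from any real model, and this is not the implementation used by any particular tool.

```python
import math

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ablate_direction(vec, direction):
    """Remove the component of vec that lies along direction."""
    d = normalize(direction)
    scale = dot(vec, d)
    return [x - scale * y for x, y in zip(vec, d)]

# Hypothetical toy vectors for illustration only.
refusal_dir = [1.0, 0.0, 0.0]
unrelated_feature = [0.0, 1.0, 0.0]  # orthogonal to the refusal direction
entangled_feature = [0.6, 0.8, 0.0]  # partly overlaps with it

print(ablate_direction(unrelated_feature, refusal_dir))  # [0.0, 1.0, 0.0]
print(ablate_direction(entangled_feature, refusal_dir))  # [0.0, 0.8, 0.0]
```

The orthogonal feature passes through untouched, while the overlapping one loses part of its magnitude. Whether real models behave this way depends on how cleanly the refusal direction separates from everything else.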
- yowlingcat 20 hours ago
  
  The kind of abliteration you are mentioning is no longer state of the art, nor the most common way of removing the refusal layer in most models. Your understanding was up to date about a year and a half ago, but it has been out of date since then.
  
tredre3 18 hours ago

That is something often claimed by heretics. My experience couldn't diverge more, however. All heretic (and abliterix) models I've tried are worse than the original. It's not immediately obvious if all you do is ask 2-3 questions and marvel at how it didn't refuse, but try using them for real over longer 8k+ contexts and they fall apart real fast.
They're more prone to getting stuck in loops, becoming unresponsive, and hallucinating (presumably because of the reduced desire to not answer).
I've tried all the popular heretic peddlers, but if you have one that you can vouch for maybe I've simply missed it.
antonvs 15 hours ago

I'm curious about where you got that idea from. Neither the theory nor the available examples support it. If they did, everyone knowledgeable would be using abliterated models.

manquer 1 day ago

> game is to turn knobs until you get a benchmark run that shows an improvement, then ship it

i.e. reinforcement learning against a weak reward function: the benchmark is insufficiently complex and not sufficiently representative of the real world.

The "game", i.e. the decision tree, can be modeled as a multi-armed bandit problem: how to deploy finite resources (compute) between exploitation and exploration.

The main issue is that each training / fine-tune run is very expensive, so the number of chances at the slot machine, so to speak, is pretty limited today.
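A minimal sketch of that framing, assuming a plain epsilon-greedy strategy (one of many bandit strategies, chosen here only for brevity). Each "arm" is a hypothetical training recipe, each pull is one expensive fine-tune scored on a noisy benchmark, and all numbers are invented.

```python
import random

def epsilon_greedy(true_means, budget, epsilon=0.2, noise=0.05, seed=0):
    """Spend a fixed budget of pulls across arms; return pulls per arm."""
    rng = random.Random(seed)
    n = len(true_means)
    pulls = [0] * n
    totals = [0.0] * n
    for _ in range(budget):
        untried = [i for i in range(n) if pulls[i] == 0]
        if untried:
            arm = untried[0]            # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n)      # explore: pick a random recipe
        else:
            # exploit: pick the recipe with the best average score so far
            arm = max(range(n), key=lambda i: totals[i] / pulls[i])
        # One noisy benchmark result for the chosen recipe.
        reward = rng.gauss(true_means[arm], noise)
        pulls[arm] += 1
        totals[arm] += reward
    return pulls

# Hypothetical: four recipes with nearly identical true quality, 12 runs total.
pulls = epsilon_greedy(true_means=[0.70, 0.71, 0.69, 0.72], budget=12)
print(pulls)
```

With so few pulls and benchmark noise larger than the gaps between arms, the budget can easily end up concentrated on a recipe that just got a lucky first score.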