Comment by avadodin
12 hours ago
What OP is describing wasn't called abliteration at all.
Abliteration whilst a neologism implies a surgical ablation of refusal.
Earlier approaches post–trained the model to refuse less and, much like other kinds of fine–tuning, it degraded performance. They were "uncensored".
Abliteration has seen some improvement to this day but it always was close to equivalent performance to the original when compared to those earlier techniques.
No comments yet
Contribute on Hacker News ↗