Comment by 0x7d

3 months ago

Hi HN! This is my article!

It was great to put together a writeup of a fun evening or two of work. It looks like this goes much deeper.

I'm learning a lot from some of the linked articles. One of the base hypotheses of my work was that the filtering was distinct from the model, due to the cost of training with pre-filtered or censored data at scale. See https://news.ycombinator.com/item?id=42858552 on chain-of-thought abandonment when certain topics are discussed.

I'll have to look at served vs. trained censorship in different contexts.

In the HN discussion you link to, I went through exactly the process that you are going through now! I too thought the censorship was just a thin wrapper around the model, as I had not understood the article I had read until it was explained to me.