Comment by orbital-decay
3 days ago
>unaligned LLMs
What makes you think it's unaligned? There's no way it's a base model. The point of alignment is not making LLMs follow spherical humanity values in a vacuum. This is an excuse used by AI ethicists to cover the fact that LLMs are aligned with what creators want to see in the output, because most often they are the same people.
Wait, so, do you believe this was intentional?
Wild if true.
It's most likely not intentional - it rather looks like a side effect of "de-woking" it in a heavy-handed manner without fully understanding the outcome. But of course this is a result of alignment training. Fits pretty well with previous Grok shenanigans/prompt injections and the ego of the person behind it.
That’s literally what the article says. Musk announced a few days ago that they were removing the “woke filters” from Grok.
Does removing filters from an LLM make it more aligned or less aligned?
A few weeks ago Musk had a Twitter post asking people for non-woke things to train Grok on.