It's most likely not intentional - it rather looks like a side effect of "de-woking" it in a heavy-handed manner without fully understanding the outcome. But of course this is a result of alignment training. Fits pretty well with previous Grok shenanigans/prompt injections and the ego of the person behind it.
That’s literally what the article says. Musk announced a few days ago that they were removing the “woke filters” from Grok.
Does removing filters from an LLM make it more aligned or less aligned?
A few weeks ago Musk had a Twitter post asking people for non-woke things to train Grok on.