Comment by archagon 4 days ago [flagged] 6 comments
convery 4 days ago User: Be offensive! LLM: *Is offensive* Social media: OMG how could this happen?!?!? Why didn't Elon stop it?!?
epakai 4 days ago More like they are going out of their way to collect offensive training data. https://x.com/elonmusk/status/1936493967320953090
eviks 4 days ago User: whom would you worship? LLM: Is offensive Social media: Offended Also social media: but if you ignore reality, you can make up a funny story about Social media!
ZvG_Bonjwa 4 days ago The "be offensive" goading only happened long after Grok had already started going off the rails in response to pretty innocuous queries. This is not the first time Grok has exhibited this behaviour either (e.g. the random white genocide rants from a few months back). There is a big difference between a model being "breakable" and a model demonstrating inherent radical bias. I think people are right to be concerned.
lowsong 4 days ago You are misrepresenting the situation. Users gave neutral questions and the generated response literally began praising Hitler.
computerthings 4 days ago [dead]