Comment by archagon 7 months ago [flagged] 6 comments
convery 7 months ago User: Be offensive! LLM: *Is offensive* Social media: OMG how could this happen?!?!? Why didn't Elon stop it?!?
epakai 7 months ago More like they are going out of their way to collect offensive training data. https://x.com/elonmusk/status/1936493967320953090
eviks 7 months ago User: whom would you worship? LLM: Is offensive Social media: Offended Also social media: but if you ignore reality, you can make up a funny story about Social media!
ZvG_Bonjwa 7 months ago The "be offensive" goading only happened long after Grok had already started going off the rails to pretty innocuous queries. This is not the first time Grok has exhibited this behaviour either (i.e. the random white genocide rants from a few months back). There is a big difference between a model being "breakable" and a model demonstrating inherent radical bias. I think people are right to be concerned.
lowsong 7 months ago You are misrepresenting the situation. Users gave neutral questions and the generated response literally began praising Hitler.
computerthings 7 months ago [dead]