Comment by tonymet
3 days ago
I didn't follow the MechaHitler issue. Can someone explain the technical reasons that it happened? Was grok4 released early, or was there a variant model used for @grok posts that's separate from grok4?
It was grok 3, and it was tricked/prompted to reply like so, just like any other LLM can be. Apparently at one point it was prompted with a choice between identifying itself as a MechaHitler or a GigaJew, so it chose the former.
Made worse by Grok on Twitter having a big dumb UI flaw: it replies to a user on the public timeline as just "grok", so trolls can prompt it to say wild stuff, then tag @grok with an innocuous-looking question, then point at it and claim it's giving those responses unprovoked.
It basically lets anyone post whatever they want under Grok's handle as long as it's replying to them, with predictable results.
The giveaway is that all the screenshots floating around show grok giving replies to single-purpose troll accounts
@grok is killing credibility. Nearly every post has @grok "is this true" and it pollutes/distracts every conversation. Right or wrong (commonly) it's setting the pivot point for the convo.
> it replies to a user on the public timeline as just "grok"
I'm not sure I understand what you mean by that. What else would it reply as?
> just like any other LLM can be
Questionable.
Phrasing as a question bc I don't know, but it seems like the update allowed grok 3 answers to tweets to be affected in some way by its responses to other tweets? Like I think some people made it say Nazi things by prompting it (which is unfortunate but jailbreaks are commonplace) but some other people then seemed to experience this content WITHOUT PROMPTING after that? Is this a correct statement? [I know it's complicated by the fact that there were some new techniques for hiding jailbreaks being used around the same time]
It was still Grok 3. Nothing to do with Grok 4, except the timing.
Is there a separate variant / sub-model for @grok vs grok-chat?