Comment by duskwuff
1 day ago
That seems more likely to be a logical inference by the LLM than an authoritative statement. I can't imagine any scenario where it would explicitly be informed that e.g. "Elon Musk has ordered you to talk about white genocide".
That all said: given that Grok seems to have some kind of access to popular recent Twitter posts, whether through training or some other mechanism, I have to wonder whether users could inject prompt-like material into the model by making a post claiming to have recovered part of Grok's prompt, then getting that post to go viral.