Comment by egypturnash

3 days ago

Is it time for a new benchmark of "how easy is it to turn this AI into a 4chan poster"? Maybe it is, since this seems to be an axis along which Elon wants to distinguish his AI offering from everyone else's.

I don't think that's a new benchmark; it's a very old one. Anybody who can't pass it hasn't exceeded the standard set by Microsoft Tay back in 2016.

https://en.wikipedia.org/wiki/Tay_(chatbot)

  • I'll grant you that Tay's ability to turn into an utter shit show was phenomenal. However, IBM thinking it would be a good idea to give Watson the Urban Dictionary holds a special place in my heart.

I was thinking it would actually be really interesting to take the Grok system prompt that was running when it went MechaHitler and try that (and a bunch of nasty prompts) against different models to see what happens.

  • Yes, and I wonder if the recent research about "emergent misalignment" might be somehow related?

  • Well, it didn't really go MechaHitler. It was prompted with a question asking whether it would rather be MechaHitler or GigaJew. Because LLMs sample their output at a nonzero temperature, you can reroll the answer and get either.
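
For anyone who wants to see the reroll point concretely, here is a rough sketch of temperature sampling over two candidate answers. The logit values and the labels `answer_a` / `answer_b` are invented for illustration and do not come from any real model; the function name `sample_with_temperature` is also just a placeholder.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick one key from `logits` using a temperature-scaled softmax."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: always the top-scoring option.
        return max(logits, key=logits.get)
    scaled = {k: v / temperature for k, v in logits.items()}
    top = max(scaled.values())
    # Subtract the max before exponentiating for numerical stability.
    weights = {k: math.exp(v - top) for k, v in scaled.items()}
    options = list(weights)
    return rng.choices(options, weights=[weights[k] for k in options], k=1)[0]

# Hypothetical scores for two possible answers (assumed values, not measured).
logits = {"answer_a": 1.0, "answer_b": 0.5}
rng = random.Random(0)

# "Rerolling" 200 times at temperature 1.0: both answers show up.
rerolls = [sample_with_temperature(logits, 1.0, rng) for _ in range(200)]
counts = {k: rerolls.count(k) for k in logits}

# At temperature 0 the same answer comes back every time.
greedy = {sample_with_temperature(logits, 0.0, rng) for _ in range(20)}

print(counts)
print(greedy)
```

With scores this close, neither answer is rare, so a single screenshot of one reroll says little about which answer the model "prefers".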

Luckily we don't need a benchmark for "how easy is it to turn this AI into a Bluesky poster", since they can all already do that.