Comment by manuelmoreale

3 days ago

> Is it? They're all generalizing from a pretty similar pool of text, and especially for the idea of a "helpful, harmless, knowledgeable virtual assistant", I think you'd end up in the same latent design space. Encompassing, friendly, radiant.

Oh yeah, I totally agree with that. What I was referring to was the fact that even though these are different companies trying to build "different" products, the output is very similar, which suggests that they're not all that different after all.

To massively oversimplify, they are all boxes that predict the next token based on material they’ve seen before + human training for desirable responses.

You’d have to have a very poorly RLHF’d model (or a very weird system prompt) for it to draw you a Terminator, pastoral scene, or pelican riding a bicycle as its self image :)

I think that’s what made Grok’s Mechahitler glitch interesting: it showed how far astray the model can run if you mess with things.

  • > You’d have to have a very poorly RLHF’d model (or a very weird system prompt) for it to draw you a Terminator, pastoral scene, or pelican riding a bicycle as its self image :)

    How about a pastoral scene with a Terminator pelican riding a bike? Jokes aside, I get what you're saying, and it obviously makes total sense.