Comment by maxdo
16 hours ago
article :
bla blah blah, marketing... we are fun people, bla blah, goblin, we will not destroy the world you live in.. RL rewards bug is a culprit. blah blah.
someone woke up on the wrong side of the goblin today
Yeah, though it's not great marketing. Especially for hiring interpretability researchers. Their own alignment research has reward model interpretability, personality features and so on (see https://alignment.openai.com). It just seems like a different department wrote it, which is a shame because I'd love to read about goblin feature vectors and functional emotions.
real goblin-y response