Comment by Nasrudith
2 months ago
Alignment appears to be a delusional construct, along with 'AI safety'. They are basically looking for a gun that only hurts bad people, and premising their plans on mythical weapons that won't harm the innocent. Trying to come up with something universally inoffensive makes the 'gun which only hurts bad people' look sane, because at least that is possible given the right metaphysics as physics.
The whole corporate 'AI safety' push reminds me of an apocryphal story about trying to build a safe chat system for children's multiplayer games, one that allowed players to connect without any 'bad stuff'. They went through various systems, including filters that suffered from Scunthorpe-problem false positives and filter bypasses like inserting letters in between the swears. They gave up completely after handing it to some dirty-minded middle schoolers, who promptly produced innuendos about wanting to rub their fluffy bunnies.
'AI safety' for corporate purposes is truly impossible, especially with a pretrained model. The unwritten future means any event can retroactively make something very offensive, to say nothing of shifting standards. If some murderous psychopath went on a rampage killing and cannibalizing victims in the middle of the Super Bowl, 'going pink bunny' would become an offensive reference. There is nothing that could be done to prevent that, yet idiotically that is exactly what they are seeking with 'brand safety'.