
Comment by gehwartzen

2 days ago

Well, we teach kids not to yell “Fire!” in a crowded theatre or “N***!” at their neighbor. We also teach our industrial machines to distinguish between fingers and bolts, and our cars not to say “make a left turn now” when on a bridge, etc.

> Riley: Hey, what's class?

> Huey: It means don't act like niggas.

> Grandad: S-see, that's what I'm talkin' about right there. We don't use the n-word in this house.

> Huey: Grandad, you said the word "nigga" 46 times yesterday. I counted.

> Grandad: Nigga, hush.

https://www.youtube.com/watch?v=TLodIw5iKX8

Funny scene, but it also illustrates a more serious point about (human) alignment: not all humans believe exactly the same things are good and bad, or consistently act in accordance with what they claim to believe is good. This is such a basic fact of human social life that it's almost banal to point out explicitly; but if specific human beings, or specific organizations of human beings, are trying to align the AIs they are creating to human values, it will eventually become apparent that the notion of "human values" stops being coherent once you zoom in enough. Humans don't all share the same values; we aren't completely aligned with each other.

The critical point is who the "we" is.

Is "we" the parents teaching their children their own unique values, or is the "we" a government or corporation forcing one set of values on all children.

Why not encourage the users of AI to use a Safety.md (populated with some reasonable but optional defaults)?
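To make the suggestion concrete, here is one possible shape such a file could take. This is purely illustrative: the filename `Safety.md`, the section names, and every default listed are assumptions invented for this sketch, not part of any existing tool or standard.

```markdown
# Safety.md — per-user safety preferences (hypothetical example)

## Defaults (on unless opted out)
- Refuse operational instructions for weapons, malware, and self-harm.
- Flag, rather than silently rewrite, content the model considers hateful.

## Household overrides (opt-in)
- Profanity: allowed when quoting existing media (song lyrics, TV dialogue).
- Slurs: never produced unprompted; quoting existing text requires explicit confirmation.
```

The point of such a file would be exactly the "we" question above: the vendor supplies the defaults, but the overrides belong to the user or parent.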

  • A document like that can't accomplish anything if the AI is not aligned in the first place.

    • "alignment" is the computer version for (philosophical not medical) "consciousness", a totally subjective, immeasurable concept.
