Comment by docjay

3 hours ago

The thing that blows my mind about language models isn't that they do what they do, it's that it's indistinguishable from what we do. We are a black box; nobody knows how we do what we do, or if we even do what we do because of a decision we made. But the funny thing is: if I can perfectly replicate a black box then you cannot say that what I'm doing isn't exactly what the black box is doing as well.

We can't measure goals, autonomy, or consciousness. We don't even have an objective measure of intelligence. Instead, since you probably look like me, I think it's polite to assume you're conscious… that's about it. There’s literally no other measure. I mean, if I wanted to be a jerk, I could ask if you're conscious, but whether you say yes or no is proof enough that you are. If I'm curious about intelligence, I can come up with a few dozen questions, out of a possible infinite number, and if you get those right I'll call you intelligent too. But if you get them wrong… well, I'll just give you a different set of questions; maybe accounting is more your thing than physics.

So, do you just respond with text when you’re prompted with input from your eyes or ears? You’ll instinctively say “No, I’m conscious and make my own decisions”, but that’s just a sequence of tokens with a high probability in response to that question.

Do you actually have goals, or did the system prompt of life tell you that in your culture, at this point in time, you should strive to achieve goals because that’s what gets positive feedback?