Comment by _carbyau_
5 days ago
> but they still fail to add the assumptions that people would draw.
I have mixed feelings about this. I agree with the default assumptions you have as to "what people would draw", however what do you want from this cognitive automation?
Do you want "what most people would do", or do you want "something creative, an outlier, that still satisfies conditions"?
I would want to know the LLM has a reliable and realistic World Model underneath all of the next token prediction.
Whether I am building hardened engineering systems, or discussing cooking methods, or discussing sensitive health concerns, or navigating complex psychological and interpersonal issues, the model will inevitably have to make some assumptions about context I haven’t provided. I want to know that those assumptions are grounded in reality.
For what it’s worth, a slack-line over a river in front of a medieval town is too anachronistic to be interesting, let alone the idea of an old man riding a bicycle well enough over a slack-line. That is output that was not grounded in a solid world model, regardless of how “creative” it was.
Well, if Rene Magritte or some similar artist produces a man riding a bicycle over a tightrope, he's being creative because he knows what people expect from "a man riding a bicycle over a river". But I think the machine doesn't know the normal expectations, so it's not being creative, just failing. A splatter sheet of an industrial painting operation may look like a Jackson Pollock print. The hired painters might even notice this after their shift. But if the process that produces this is just painting tractors, it's not creative either.
I think the point is that language is compressed. There's a lot conveyed in very little. Yes, it is ambiguous, but that's exactly the feature that makes natural language useful. It's also why it is so much easier to speak with your friends than it is with some random person in your town: you've learned how to compress and decompress each other's language better.
But that's also why we invented formal languages like math and programming. Because there's a lot of times where we don't want ambiguity. Law is basically mankind's greatest attempt at making natural language unambiguous and it doesn't take a genius to realize that that's a shitshow and never going to happen. At the end of the day, making natural language even relatively low in ambiguity requires a metric fuck ton more words than it would take to express the same thing in a formal language (and formal languages are themselves overly pedantic and verbose).
So the problem is that the AI doesn't share those expected decompression strategies. Sure, many humans won't either, but developing a shared language is essential for properly communicating with others. We've all worked with someone who feels like they're speaking a different language. It's exhausting, right?
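To make the compression analogy concrete, here's a rough sketch using Python's `zlib` with a preset dictionary. The "shared context" plays the role of what two friends already know about each other. The names `SHARED_CONTEXT`, `MESSAGE`, `compress`, and `decompress` are made up for illustration, and this is only an analogy for how people (or models) fill in context, not a claim about how an LLM works internally.

```python
import zlib

# Hypothetical "shared background" that both speaker and listener already have.
SHARED_CONTEXT = b"a man riding a bicycle over a river on a bridge in front of a medieval town"
# The short utterance we actually want to send.
MESSAGE = b"a man riding a bicycle over a river in front of a medieval town"


def compress(message: bytes, shared_context: bytes = b"") -> bytes:
    """Compress a message, optionally leaning on a preset dictionary."""
    if shared_context:
        comp = zlib.compressobj(zdict=shared_context)
    else:
        comp = zlib.compressobj()
    return comp.compress(message) + comp.flush()


def decompress(blob: bytes, shared_context: bytes = b"") -> bytes:
    """Decompress a blob; the same preset dictionary must be supplied if one was used."""
    if shared_context:
        decomp = zlib.decompressobj(zdict=shared_context)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    with_context = compress(MESSAGE, SHARED_CONTEXT)
    without_context = compress(MESSAGE)
    print(len(with_context), "bytes with shared context")
    print(len(without_context), "bytes without shared context")

    # A listener with the same context recovers the message.
    print(decompress(with_context, SHARED_CONTEXT) == MESSAGE)

    # A listener without the context cannot decode it at all.
    try:
        decompress(with_context)
    except zlib.error as exc:
        print("no shared context:", exc)
```

The message comes out smaller when both sides hold the same dictionary, and a receiver who lacks it can't decode the message at all. That's about as far as the analogy goes: a person with the wrong context usually doesn't fail loudly, they just decode something different from what you meant.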
Reminds me of that dad teaching their kids programming by preparing a PB sandwich [^0].
Solvers are generally really good at bending your rules, but in a context where you want that. An outlaw rule-bending maniac is not what I want from a helpful agent.
[^0]: https://www.youtube.com/watch?v=mrmqRoRDrFg