Comment by unoti

3 months ago

> Thing that I don't understand about LLMs at all, is that how it is possible to for it to "understand" and reply in hex (or any other encoding), if it is a statistical "machine"

It develops understanding because that's the best way for it to succeed at what it was trained to do. Yes, it's predicting the next token, but it's using its learned understanding of the world to do it. So this it's not terribly surprising if you acknowledge the possibility of real understanding by the machine.

As an aside, even GPT3 was able to do things like english -> french -> base64. So I'd ask a question, and ask it to translate its answer to french, and then base64 encode that. I figured there's like zero chance that this existed in the training data. I've also base64 encoded a question in spanish and asked it, in the base64 prompt, to respond in base64 encoded french. It's pretty smart and has a reasonable understanding of what it's talking about.