
Comment by pseudosavant

1 year ago

LLMs aren't really language models so much as they are token models. That is how they can also handle input in audio or visual forms because there is an audio or visual tokenizer. If you can make it a token, the model will try to predict the following ones.
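The "everything is tokens" framing can be sketched with a toy byte-level tokenizer and a bigram "model" (hypothetical names, nothing like a real LLM, but the shape is the same: map data to integer IDs, then predict the next ID):

```python
# Toy sketch: any byte stream (text, quantized audio, image patch codes)
# becomes a sequence of integer IDs, and an autoregressive model just
# predicts the next ID. A bigram counter stands in for the transformer.
from collections import Counter, defaultdict

def tokenize(data: bytes) -> list[int]:
    # Byte-level "tokenizer": the vocabulary is simply 0..255.
    return list(data)

def train_bigram(ids: list[int]) -> dict[int, Counter]:
    counts: dict[int, Counter] = defaultdict(Counter)
    for a, b in zip(ids, ids[1:]):
        counts[a][b] += 1
    return counts

def predict_next(counts: dict[int, Counter], token: int) -> int:
    # Greedy decoding: the most frequent successor seen in training.
    return counts[token].most_common(1)[0][0]

ids = tokenize(b"to be or not to be")
model = train_bigram(ids)
print(chr(predict_next(model, ord("t"))))  # 't' is most often followed by 'o'
```

Real LLMs use learned subword vocabularies and a neural network instead of counts, but the interface is the same: token IDs in, a prediction over token IDs out.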

Even though I'm sure chess games were included in some of the LLM training data, I'd bet a model trained just for chess would do far better.

> That is how they can also handle input in audio or visual forms because there is an audio or visual tokenizer.

This is incorrect. Audio and image inputs are typically projected into the model's shared latent space (e.g., ViT-style patch embeddings), not tokenized into a discrete vocabulary the way text is.
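The distinction can be sketched under toy assumptions (random numbers stand in for learned weights, sizes are tiny, and the function names here are illustrative, not from any real library): text goes through a discrete table lookup, while a patch is a continuous vector that gets linearly projected into the same space.

```python
# Text path: discrete ID -> embedding-table row.
# Vision path: continuous patch vector -> linear projection.
# Both land in the same EMBED_DIM space the transformer consumes.
import random

random.seed(0)
VOCAB_SIZE, PATCH_DIM, EMBED_DIM = 8, 6, 4

# Learned embedding table for text tokens (random stand-in weights).
embedding_table = [[random.gauss(0, 1) for _ in range(EMBED_DIM)]
                   for _ in range(VOCAB_SIZE)]

def embed_text_token(token_id: int) -> list[float]:
    # Discrete lookup: the ID indexes a row.
    return embedding_table[token_id]

# Learned projection matrix for image patches (random stand-in weights).
projection = [[random.gauss(0, 1) for _ in range(PATCH_DIM)]
              for _ in range(EMBED_DIM)]

def project_patch(patch: list[float]) -> list[float]:
    # Continuous map: no vocabulary, no discrete IDs.
    return [sum(w * p for w, p in zip(row, patch)) for row in projection]

text_vec = embed_text_token(3)
patch_vec = project_patch([0.1] * PATCH_DIM)
assert len(text_vec) == len(patch_vec) == EMBED_DIM
```

That said, some multimodal systems do quantize audio or images into discrete codebook tokens; the continuous-projection approach sketched above is just one common design.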