Comment by GaggiX

1 year ago

>Another assumption is that it’s because of tokenisation issues. But that can’t be true either.

It's definitely a tokenizer issue. If GPT-4 had been trained on individual characters, I'm pretty sure it would play Wordle much better. Models like GPT-4, as they are trained today, have quite lossy knowledge of the characters inside a given token. A fix would probably be to encode that character-level knowledge in the token embeddings.
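
To illustrate what I mean, here is a rough sketch with a toy tokenizer. The vocabulary, the greedy longest-match rule, and the function names are all made up for the example; this is not GPT-4's actual tokenizer or vocabulary. It only shows that with subword tokens the model receives a couple of opaque IDs per word, while a character-level scheme gives it one ID per letter.

```python
import string

# Hypothetical toy vocabulary: a few multi-character pieces plus single letters.
# This is NOT GPT-4's real vocabulary, just an illustration.
PIECES = ["cr", "ane", "sl", "ate"] + list(string.ascii_lowercase)
TOY_VOCAB = {piece: idx for idx, piece in enumerate(PIECES)}


def subword_tokenize(word, vocab):
    """Greedy longest-match tokenization (a simplification of real BPE)."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"cannot tokenize {word[i]!r}")
    return ids


def char_tokenize(word, vocab):
    """Character-level tokenization: one ID per letter."""
    return [vocab[ch] for ch in word]


if __name__ == "__main__":
    for word in ["crane", "slate"]:
        print(word, subword_tokenize(word, TOY_VOCAB), char_tokenize(word, TOY_VOCAB))
```

With the subword version, "crane" becomes two IDs, and nothing in those IDs says that the second letter is "r" or that the word ends in "e". The model has to memorize that from training data, which is where the lossiness comes from. With the character version, every letter position is directly visible, which is exactly what Wordle needs.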