Comment by dannyw
2 years ago
Source? LLMs have no “hidden tokens” they dedicate.
Or you mean — if the tokenizer was trained differently…
Not hidden tokens, actual tokens. Ask an LLM to guess the letter count like 20 times and often it will converge on the correct count. I suppose all those guesses provide enough "resolution" (for lack of a better term) that it can count the letters.
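Rough sketch of what that looks like in practice. `ask` is a hypothetical stand-in for whatever single-completion call your client exposes (not any particular API), and it assumes you sample at a non-zero temperature so the guesses actually vary:

```python
from collections import Counter
from typing import Callable


def count_letters_by_vote(ask: Callable[[str], str], word: str, letter: str,
                          samples: int = 20) -> int | None:
    """Ask the model repeatedly and return the count most samples agree on."""
    prompt = (f'How many times does the letter "{letter}" appear in the word '
              f'"{word}"? Reply with just a number.')
    answers = []
    for _ in range(samples):
        reply = ask(prompt)
        # Pull whatever digits the model replied with.
        digits = "".join(ch for ch in reply if ch.isdigit())
        if digits:
            answers.append(int(digits))
    # The "converged" answer is just the most common one across samples.
    return Counter(answers).most_common(1)[0][0] if answers else None
```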
> often it will converge on the correct count
That's a pretty low bar for something like counting words.
That reminds me of something I've wondered about for months: can you improve an LLM's performance by including a large number of spaces at the end of your prompt?
Would the LLM "recognize" that these spaces are essentially a blank slate and use them to "store" extra semantic information and stuff?
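If anyone wants to poke at this, here's a minimal way to A/B it. Again, `ask` is a hypothetical stand-in for your own completion call, and the tasks and scoring are just placeholders (substring match on an expected answer):

```python
from typing import Callable, Iterable


def compare_padding(ask: Callable[[str], str],
                    tasks: Iterable[tuple[str, str]],
                    pad: int = 200) -> tuple[float, float]:
    """Run each (prompt, expected) pair with and without trailing spaces
    and return (plain accuracy, padded accuracy)."""
    plain_hits = padded_hits = total = 0
    for prompt, expected in tasks:
        total += 1
        if expected.lower() in ask(prompt).lower():
            plain_hits += 1
        if expected.lower() in ask(prompt + " " * pad).lower():
            padded_hits += 1
    return plain_hits / total, padded_hits / total
```

Whether the padding even turns into extra tokens depends on how the tokenizer handles runs of whitespace, so the effect (if any) would likely vary by model.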
But then it will either overfit, or you'll need to train it on 20 times the amount of data ...
I'm talking about when using an LLM, which doesn't involve training and thus no overfitting.