Comment by floriangoebel

12 hours ago

Wouldn't this increase your token usage, because the tokenizer can't process whole words anymore and instead has to go letter by letter?

Not with current tokenizers, no — they don't fall back to going letter by letter.
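A toy sketch of why that is: BPE-style tokenizers greedily match the longest piece they know, so an unfamiliar spelling falls back to shorter subwords, not necessarily single characters. The vocabulary and function below are invented purely for illustration; a real tokenizer learns its vocabulary from data.

```python
# Toy greedy longest-match tokenizer, loosely in the spirit of BPE.
# The vocabulary is made up for this example, not from any real model.
VOCAB = {"token", "tok", "ken", "to", "ke", "en", "iz", "er",
         "t", "o", "k", "e", "n", "i", "z", "r"}

def tokenize(word: str) -> list[str]:
    """Greedily take the longest vocabulary entry at each position."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try longest match first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocab entry covers {word[i]!r}")
    return pieces

print(tokenize("token"))  # intact word: one piece
print(tokenize("tkoen"))  # scrambled word: more pieces, but "en" stays whole
```

The scrambled spelling does cost more tokens than the intact word, but the tokenizer still grabs multi-character chunks wherever it can, so the blowup is smaller than true letter-by-letter encoding.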

There will likely be some internal reasoning going "I wonder if the user meant spell check, I'm gonna go with that one".

And it'll also bias the reasoning and output toward internet speak instead of what you'd usually want, such as code or scientific jargon, which used to decrease output quality. I'm not sure whether it still does.