
Comment by azeirah

1 year ago

I strongly believe the issue isn't that tokenization isn't the underlying problem; it's that, say, bit-by-bit tokenization is too expensive to run at the scale things are currently being run at (OpenAI, Claude, etc.)
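
To make the cost point concrete, here's a rough sketch (assuming Python with the tiktoken package, using its cl100k_base BPE encoding purely as an example tokenizer, not anything a particular lab actually runs) comparing sequence lengths and the roughly quadratic attention cost of feeding characters in directly versus tokens:

    # Rough comparison: character-level input vs. BPE tokens.
    # Assumes the `tiktoken` package is installed; cl100k_base is just
    # an example BPE encoding.
    import tiktoken

    text = "Tokenization basically lets you fit more text per position. " * 100

    enc = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(enc.encode(text))   # sequence length with BPE tokens
    n_chars = len(text)                # sequence length fed character-by-character

    ratio = n_chars / n_tokens
    print(f"chars: {n_chars}, tokens: {n_tokens}, ~{ratio:.1f}x longer sequence")
    # Self-attention scales roughly with sequence length squared, so the
    # character-level model pays on the order of ratio**2 more per layer.
    print(f"approx attention-cost multiplier: ~{ratio**2:.0f}x")

The exact ratio depends on the text and the tokenizer, but something like 3-4 characters per token is typical for English, so the quadratic term alone is already an order of magnitude.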

It's not just a question of current hardware, either. Tokenization basically lets you fit a larger input context into a model than you'd otherwise get under the same resource constraints, so any gains from feeding the characters in directly have to outweigh that advantage. And for chain-of-thought (CoT) especially, which we know produces significant improvements on most tasks, you want a large context.
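
As a back-of-the-envelope illustration of the context point (pure arithmetic; the 8,192-position budget and ~4 characters per token are assumptions for illustration, not any specific model's numbers):

    # Fixed budget of attention positions: how much raw text fits?
    CONTEXT_POSITIONS = 8192
    AVG_CHARS_PER_TOKEN = 4.0  # rough figure for English BPE, assumed

    chars_via_tokens = CONTEXT_POSITIONS * AVG_CHARS_PER_TOKEN  # ~32k chars of text
    chars_via_chars = CONTEXT_POSITIONS                         # 8192 chars of text

    print(f"token input: ~{chars_via_tokens:.0f} characters of text")
    print(f"char  input:  {chars_via_chars} characters of text")
    # Same position budget, roughly 4x less text visible to the model at once,
    # which is exactly what you don't want for long CoT traces.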