Comment by p-e-w

2 hours ago

> The models now whaste a vast amount of useless neurons memorising the character count the entire English language

No they don’t. They only need to know the character count for each token, and with typical vocabularies having around 250k entries, that’s an insignificant number for all but the tiniest LLMs.