Comment by londons_explore
2 years ago
The embedding method that nearly all LLMs use puts them at a severe disadvantage because they can't 'see' the spelling of common words. That makes it hard to infer rules like 'past-tense verbs end in "ed"'.
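To make the point concrete, here's a minimal sketch (toy vocabulary, hypothetical ids, nothing from a real tokenizer) of why subword tokenization hides spelling: the model receives only opaque integer ids, so related surface forms share no visible characters.

```python
# Toy illustration: a subword tokenizer maps whole words to opaque ids.
# The vocabulary and ids here are hypothetical, not from any real model.
vocab = {"jump": 1001, "jumped": 1002, "walk": 2001, "walked": 2002}

def tokenize(text):
    """Greedy whole-word lookup -- a stand-in for BPE."""
    return [vocab[w] for w in text.split()]

ids = tokenize("jump jumped walk walked")
print(ids)  # [1001, 1002, 2001, 2002]

# From the model's point of view, 1002 and 2002 are unrelated symbols:
# nothing in the input reveals that both surface forms end in "ed", so the
# "-ed = past tense" rule must be memorized per token rather than read off.
```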
With small modifications, the exact characters could be exposed to the model alongside the current tokens, but that would require retraining from scratch, which would cost $$$$$$$$.
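One possible shape such a modification could take, sketched below in PyTorch: sum each token's usual embedding with a pooled embedding of its characters, so spelling becomes visible without discarding the token stream. Every name and dimension here is an assumption for illustration; the comment doesn't specify the mechanism.

```python
import torch
import torch.nn as nn

class CharAugmentedEmbedding(nn.Module):
    """Hypothetical sketch: token embedding plus a pooled character embedding.

    The final representation is the ordinary token embedding summed with a
    mean-pooled embedding of the token's characters, so the model can 'see'
    that "jumped" and "walked" share the trailing 'e', 'd'.
    """

    def __init__(self, vocab_size=50257, n_chars=256, d_model=768):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.char_emb = nn.Embedding(n_chars, d_model)

    def forward(self, token_ids, token_strings):
        # token_ids: LongTensor of shape (seq_len,)
        # token_strings: surface text of each token, e.g. ["jump", "ed"]
        tok = self.tok_emb(token_ids)
        char_means = []
        for s in token_strings:
            # Clamp code points to the byte range of the toy char vocabulary.
            chars = torch.tensor([min(ord(c), 255) for c in s])
            char_means.append(self.char_emb(chars).mean(dim=0))
        return tok + torch.stack(char_means)

emb = CharAugmentedEmbedding()
ids = torch.tensor([1002, 2002])           # hypothetical token ids
vecs = emb(ids, ["jumped", "walked"])      # spelling now contributes
print(vecs.shape)  # torch.Size([2, 768])
```

Because the character branch adds new parameters that interact with every token representation, the base model's weights would no longer match, which is why the comment says a full retrain is needed.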
This reminds me of the ELMo architecture.
https://paperswithcode.com/method/elmo
So, next week on Hugging Face?