Comment by alexchamberlain
3 months ago
Thanks, that's really interesting. Do they correct for spelling mistakes or internationalised spellings? For example, do `colour` and `color` end up in the same token stream?
3 months ago
No, it just looks at exact character sequences, so `colour` and `color` are tokenized differently. Try it out yourself here: https://platform.openai.com/tokenizer
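
If it helps to see the idea without opening the page, here is a rough toy sketch. It is not the actual OpenAI tokenizer: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are made up for illustration, and it uses simple greedy longest-match rather than real BPE merges. It only shows that matching on exact character sequences gives the two spellings different token IDs.

```python
# Hypothetical toy vocabulary; real tokenizers learn theirs from data.
TOY_VOCAB = {"col": 0, "or": 1, "our": 2, "c": 3, "o": 4, "l": 5, "u": 6, "r": 7}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization over exact character sequences."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry covers the character at position i.
            raise ValueError(f"no token for {text[i]!r}")
    return ids


print(toy_tokenize("color"))   # [0, 1]
print(toy_tokenize("colour"))  # [0, 2]
```

There is no spelling normalisation step anywhere in that loop, so the extra `u` simply leads to a different match and a different ID.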