Comment by hgs3

6 months ago

This isn't "odd" behavior. It's a consequence of using a variable-width (multibyte) encoding, where different characters occupy different numbers of bytes. Also, when dealing with case mapping, you can't assume that the character count will remain constant. In Unicode, full case mappings can map a single character to multiple characters (for example, "ß" uppercases to "SS"), so you might end up with more characters than you started with, regardless of the encoding used.
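
To make both points concrete, here is a minimal sketch in Python, assuming Python 3, whose `str.upper()` applies Unicode full case mappings. The `lengths` helper is a name made up for this illustration, not something from the original discussion.

```python
def lengths(s: str) -> tuple:
    """Return (code point count, UTF-8 byte count) for s."""
    return len(s), len(s.encode("utf-8"))

# Full case mapping: one code point becomes two, whatever the encoding.
sharp_s = "\u00df"  # ß, LATIN SMALL LETTER SHARP S
print(lengths(sharp_s), "->", lengths(sharp_s.upper()))  # (1, 2) -> (2, 2)

# One-to-one mapping: the code point count is stable,
# but the UTF-8 byte count still changes.
long_s = "\u017f"  # ſ, LATIN SMALL LETTER LONG S
print(lengths(long_s), "->", lengths(long_s.upper()))  # (1, 2) -> (1, 1)
```

The first case grows in characters even though the UTF-8 byte count happens to stay the same. The second keeps one character but shrinks in bytes, which is the encoding effect on its own.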