Comment by exceptione
5 hours ago
What do you mean by non-English text? I don't think "Ä" will be more efficient in UTF-16 than in UTF-8. Or do you mean UTF-16 wins for non-Latin scripts with variable width? I always had the impression that UTF-8 wins on the vast majority of symbols, and that for very complex variable-width character sets it depends on the width whether UTF-16 can accommodate it. On a tangent, I wonder if emojis would fit that bill too.
Japanese, Chinese, Korean and Indic scripts are mostly 2 bytes per character in UTF-16 and mostly 3 bytes per character in UTF-8.
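A quick way to check those numbers is a small Python sketch like the one below. The sample strings and the helper name `encoded_sizes` are my own picks for illustration, not anything from the thread, and it encodes as `utf-16-le` so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (utf8_bytes, utf16_bytes) for text, with no BOM counted."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical samples chosen for illustration.
samples = {
    "Latin with umlaut": "Ä",
    "Japanese": "日本語",
    "Korean": "한국어",
    "Hindi (Devanagari)": "हिन्दी",
    "Emoji": "😀",
}

for label, text in samples.items():
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{label}: UTF-8 {utf8_bytes} bytes, UTF-16 {utf16_bytes} bytes")
```

With these samples, "Ä" comes out at 2 bytes in both encodings, the CJK and Devanagari strings take 3 bytes per code point in UTF-8 versus 2 in UTF-16, and the emoji takes 4 bytes in both (a surrogate pair in UTF-16). Real documents mix in ASCII markup, spaces and digits, so the overall ratio depends on the text.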
Really, as an East Asian language user, the rest of the comments here make me want to scream.
hn often makes me want to scream