Comment by jonhohle
6 months ago
If you look at the list, it’s primarily (but not completely) about oddities in the characters’ UTF-8 encoding. Most of them appear to sit on the boundary where changing case adds bytes to the encoded form. That’s an encoding detail, not really Unicode’s concern.
There are also some that appear to change from single characters to grapheme clusters, which would be a Unicode quirk.
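A rough sketch of both effects in Python, for anyone who wants to poke at it. The sample characters are my own picks, not taken from the list, and `case_change` is just a hypothetical helper name. It relies on Python's built-in `str.lower()` / `str.upper()`, which apply the full Unicode case mappings.

```python
def case_change(ch: str, op: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes before, UTF-8 bytes after, code points after).

    `op` is "lower" or "upper"; the mapping itself is Python's built-in one.
    """
    changed = getattr(ch, op)()
    return (len(ch.encode("utf-8")), len(changed.encode("utf-8")), len(changed))


# Illustrative samples only (assumed, not drawn from the list under discussion).
SAMPLES = [
    ("\u212a", "lower"),  # KELVIN SIGN -> plain 'k'
    ("\u023a", "lower"),  # LATIN CAPITAL LETTER A WITH STROKE -> U+2C65
    ("\u0130", "lower"),  # LATIN CAPITAL LETTER I WITH DOT ABOVE -> 'i' + U+0307
    ("\u00df", "upper"),  # LATIN SMALL LETTER SHARP S -> 'SS'
]

for ch, op in SAMPLES:
    before, after, codepoints = case_change(ch, op)
    print(f"U+{ord(ch):04X} {op}: {before} -> {after} bytes, {codepoints} code point(s)")
```

The first two samples are the encoding case: still one code point, but the UTF-8 byte length changes. The last two are the Unicode case: one code point turns into two, and for U+0130 the result is a base letter plus a combining mark, i.e. a single grapheme cluster made of multiple code points.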