Comment by jonhohle

1 year ago

If you look at the list, it’s primarily (but not exclusively) about oddities in the characters’ UTF-8 encoding. Most of them appear to sit on a boundary where changing the case changes the number of bytes needed to encode them. That’s an encoding detail, not really Unicode’s concern.
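To make the byte-boundary point concrete, here is a minimal Python sketch. The helper name `utf8_len_change` and the two sample characters are my own picks for illustration, not necessarily entries from the list.

```python
def utf8_len_change(ch, mapped):
    """Return (bytes before, bytes after) for a case mapping, measured in UTF-8."""
    return len(ch.encode("utf-8")), len(mapped.encode("utf-8"))

# U+023A (LATIN CAPITAL LETTER A WITH STROKE) lowercases to U+2C65,
# which sits past the 2-byte/3-byte UTF-8 boundary at U+0800.
grow = utf8_len_change("\u023a", "\u023a".lower())

# U+212A (KELVIN SIGN) lowercases to plain ASCII "k".
shrink = utf8_len_change("\u212a", "\u212a".lower())

print(grow, shrink)
```

In both cases the mapping is still one code point to one code point; only the encoded length moves (2 bytes to 3, and 3 bytes to 1).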

There are also some that appear to change from a single code point to a sequence of several code points (in some cases a grapheme cluster), which would be a Unicode quirk.
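A quick way to see this kind of expansion, sketched in Python, whose `str.upper()` and `str.lower()` apply Unicode's full case mappings. The two samples are assumptions chosen for illustration, not necessarily taken from the list.

```python
# Each sample is a single code point whose full case mapping is a longer sequence.
samples = {
    "sharp s, uppercased": ("\u00df", str.upper),           # U+00DF -> "SS"
    "dotted capital I, lowercased": ("\u0130", str.lower),  # U+0130 -> "i" + U+0307
}

expanded = {name: fn(ch) for name, (ch, fn) in samples.items()}

for name, result in expanded.items():
    # Show the resulting code points rather than relying on font rendering.
    print(name, [f"U+{ord(c):04X}" for c in result])
```

Note the two results differ in kind: "SS" is two separate letters, while "i" followed by U+0307 (combining dot above) is two code points that render as one grapheme cluster.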