Comment by newpavlov

5 hours ago

In most cases the locale is encoded in the character itself, e.g. Latin "a" (U+0061) and Cyrillic "а" (U+0430) are two different characters, despite usually being visually indistinguishable.
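To illustrate, here is a minimal Python sketch using only the standard library. The variable names are mine; it just prints the code points and names of the two letters and their default uppercase forms, which stay within their own scripts.

```python
import unicodedata

latin_a = "\u0061"      # LATIN SMALL LETTER A
cyrillic_a = "\u0430"   # CYRILLIC SMALL LETTER A

for ch in (latin_a, cyrillic_a):
    upper = ch.upper()  # default, locale-independent case mapping
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)} -> "
        f"U+{ord(upper):04X} {unicodedata.name(upper)}"
    )

# The two letters look alike but do not compare equal.
print(latin_a == cyrillic_a)
```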

The "language-sensitive" section of the special casing document [0] is extremely small and contains only cases caused by the stupid reuse of Latin I (the Lithuanian, Turkish, and Azeri rules).
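A rough sketch of why the Latin I case needs locale information, again in Python. `turkish_upper` is a hypothetical helper of mine, not a standard function, and it covers only the Turkish/Azeri dotted-i rule, not the Lithuanian ones.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I

def turkish_upper(s: str) -> str:
    """Hypothetical helper: uppercase with the Turkish/Azeri rule for 'i'."""
    # Map dotted "i" to the dotted capital first; the default mapping
    # already turns dotless "ı" into plain "I".
    return s.replace("i", DOTTED_CAPITAL_I).upper()

# Default mapping: the same Latin "i" is used regardless of language.
default_result = "istanbul".upper()
# Turkish-aware mapping: the dot is preserved on the capital.
turkish_result = turkish_upper("istanbul")

print(default_result)
print(turkish_result)
print(default_result == turkish_result)
```

The default `str.upper()` cannot know which result is wanted, because the text uses the same U+0069 in both languages.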

[0]: https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing....