Comment by duskwuff

1 year ago

Character case is a locale-dependent mess; trying to represent it in the values of code points (which need to be universal) is a terrible idea.

For example: in English, U+0049 and U+0069 ("I" and "i") are an uppercase/lowercase pair. In the Turkish locale, they are two separate letters, each with its own counterpart in the other case: U+0049/U+0131 ("I" / "ı") and U+0130/U+0069 ("İ" / "i").
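
To make that concrete, here is a small Python sketch. It relies on Python's built-in `str.upper()` and `str.lower()`, which apply the untailored Unicode mappings regardless of the process locale; the variable names are just for illustration.

```python
# Python's str.upper()/str.lower() use the untailored (locale-independent)
# Unicode case mappings, so these results do not depend on the system locale.

ascii_upper = "i".upper()          # U+0069 -> U+0049 "I"
ascii_lower = "I".lower()          # U+0049 -> U+0069 "i"

dotless_upper = "\u0131".upper()   # Turkish dotless "ı" -> plain "I"
dotted_lower = "\u0130".lower()    # Turkish dotted "İ" -> "i" + U+0307 (combining dot above)

# Round-tripping a Turkish letter through the default mappings loses it:
# "ı" -> "I" -> "i", which is a different letter in Turkish.
round_trip = "\u0131".upper().lower()

print([f"U+{ord(c):04X}" for c in dotted_lower])   # ['U+0069', 'U+0307']
print(round_trip == "\u0131")                      # False
```

So the default mappings give the English pairing, and Turkish text needs something extra on top.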

panpog 1 year ago

Of course you sometimes need tailoring to a particular language. On the other hand, I don't see how encoding untailored casing would make tailored casing harder.
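
As a rough illustration of tailoring layered on top of an untailored default, here is a minimal Python sketch. The helper names `turkish_upper` and `turkish_lower` are hypothetical, not a library API, and the sketch assumes precomposed input (it does not handle "I" followed by a combining dot above, U+0307).

```python
def turkish_upper(text: str) -> str:
    """Uppercase with the Turkish i/İ and ı/I pairs (hypothetical helper)."""
    # Handle the two Turkish-specific letters first, then fall back to the
    # untailored mapping for everything else.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text: str) -> str:
    """Lowercase with the Turkish İ/i and I/ı pairs (hypothetical helper)."""
    # Order matters: map "İ" to "i" before mapping "I" to "ı".
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


print(turkish_upper("istanbul"))   # İSTANBUL
print(turkish_lower("ISPARTA"))    # ısparta
```

The tailored step here is only a pre-pass over two letters; the default mapping still does the rest of the work.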