Comment by Sesse__
2 days ago
It's why the Unicode Collation Algorithm exists.
If you look in allkeys.txt (the base UCA data, used if you don't have language-specific stuff in your comparisons) for the two code points in question, you'll find:
004B ; [.2514.0020.0008] # LATIN CAPITAL LETTER K
212A ; [.2514.0020.0008] # KELVIN SIGN
The numbers in the brackets are the weights on level 1 (base), level 2 (typically used for accents), and level 3 (typically used for case). So the two compare as identical under the UCA in almost every case, unless you really need a tiebreaker.
Compare e.g.:
1D424 ; [.2514.0020.0005] # MATHEMATICAL BOLD SMALL K
which would compare equal to those under a case-insensitive, accent-sensitive collation, but _not_ a case-sensitive one (case-sensitive collations are always accent-sensitive, too).
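To make the level comparison concrete, here is a minimal Python sketch that parses just the three allkeys.txt lines quoted above and compares their weights up to a given level. The names `parse_weights` and `equal_at_level` are made up for this illustration, and it is not the full UCA: a real implementation builds sort keys for whole strings and handles variable weighting, contractions, and so on. For single code points, though, truncating the weight tuple gives the same answer.

```python
import re

# The three allkeys.txt entries quoted in the comment, copied verbatim.
ALLKEYS_EXCERPT = """\
004B ; [.2514.0020.0008] # LATIN CAPITAL LETTER K
212A ; [.2514.0020.0008] # KELVIN SIGN
1D424 ; [.2514.0020.0005] # MATHEMATICAL BOLD SMALL K
"""

# Matches "<code point> ; [.<L1>.<L2>.<L3>]" for single-element entries only.
ENTRY = re.compile(
    r"^([0-9A-F]+)\s*;\s*\[[.*]([0-9A-F]{4})\.([0-9A-F]{4})\.([0-9A-F]{4})\]"
)


def parse_weights(text):
    """Map each character to its (level 1, level 2, level 3) weights."""
    table = {}
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            char = chr(int(m.group(1), 16))
            table[char] = tuple(int(w, 16) for w in m.groups()[1:])
    return table


def equal_at_level(table, a, b, level):
    """True if a and b have identical weights on levels 1..level."""
    return table[a][:level] == table[b][:level]


WEIGHTS = parse_weights(ALLKEYS_EXCERPT)
K, KELVIN, BOLD_K = "\u004B", "\u212A", "\U0001D424"

# K vs. KELVIN SIGN: identical on all three levels.
print(equal_at_level(WEIGHTS, K, KELVIN, 3))
# K vs. bold small k: equal through level 2 (accent-sensitive, case-insensitive)...
print(equal_at_level(WEIGHTS, K, BOLD_K, 2))
# ...but different once level 3 is taken into account.
print(equal_at_level(WEIGHTS, K, BOLD_K, 3))
```

The `level` argument plays the role of the collation strength here: 1 ignores accents and case, 2 adds accents, 3 adds case.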
Are the meanings of the levels for each code point defined somewhere (accent, case, etc.)?
Typically they are defined by the collation. For the default collation, where all the weights are as in the file, comparing up to level 1, 2, or 3 makes the comparison sensitive to none/accent/accent+case, respectively. But if you go to e.g. Japanese, you can have a fourth level of “kana-sensitive” (which distinguishes between e.g. katakana and hiragana).