Comment by timsneath
8 hours ago
And just for fun, they also support what must be the weirdest encoding system -- UTF-EBCDIC (https://www.ibm.com/docs/en/i/7.5.0?topic=unicode-utf-ebcdic).
Post that stuff with a content warning, would you?
> The base EBCDIC characters and control characters in UTF-EBCDIC are the same single byte codepoint as EBCDIC CCSID 1047 while all other characters are represented by multiple bytes where each byte is not one of the invariant EBCDIC characters. Therefore, legacy applications could simply ignore codepoints that are not recognized.
Dear god.
That says roughly the following when applied to UTF-8:
"The base ASCII characters and control characters in UTF-8 are the same single byte codepoint as ISO-8859-1 while all other characters are represented by multiple bytes where each byte is not one of the invariant ASCII characters. Therefore, legacy applications could simply ignore codepoints that are not recognized."
(I know nothing of EBCDIC, but this seems to mirror UTF-8's design.)
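For anyone who wants to poke at the UTF-8 half of that comparison, here is a rough sketch in Python. It only checks the UTF-8 property (every byte of a multi-byte sequence is outside the ASCII range, so an ASCII-only consumer can skip what it doesn't recognize). The helper names `multibyte_bytes_avoid_ascii` and `ascii_only_view` are made up for illustration, and it says nothing about UTF-EBCDIC itself.

```python
def multibyte_bytes_avoid_ascii(text: str) -> bool:
    """Return True if no multi-byte UTF-8 sequence in `text` contains an ASCII byte."""
    for ch in text:
        encoded = ch.encode("utf-8")
        # Multi-byte sequences should consist only of bytes >= 0x80.
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True


def ascii_only_view(data: bytes) -> bytes:
    """Hypothetical 'legacy application': keep ASCII bytes, ignore everything else."""
    return bytes(b for b in data if b < 0x80)


sample = "naïve café, 日本語"
encoded = sample.encode("utf-8")

print(multibyte_bytes_avoid_ascii(sample))  # True
print(ascii_only_view(encoded))             # b'nave caf, '
```

The ASCII characters survive untouched and the non-ASCII ones simply drop out, which is the "legacy applications could simply ignore" behavior the quoted paragraph describes, transposed to UTF-8.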