Comment by Arnt

1 year ago

When do you think that first mistake happened?

(Pick a year, then think about why it didn't happen in that year.)

5 comments

Arnt

When Unicode was being specced out originally, I guess. There was more interest in unifying characters at that stage (see also the far more controversial Han unification).

Arnt 1 year ago
Uh-huh. At that time, round-trip compatibility with all widely used 8-bit encodings was a major design criterion. Round-trip meaning that you could take an input string in e.g. ISO 8859-9, convert it to Unicode, convert it back, and get the same string, still usable for purposes like database lookups. Would you have argued to break database lookups at the time?
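
For anyone who wants to see that round-trip property concretely, here is a minimal sketch using Python's built-in `iso8859_9` codec. It only checks today's mapping tables, not what was on the table historically, and the variable names are just illustrative.

```python
# Sketch: round-trip ISO 8859-9 -> Unicode -> ISO 8859-9 with Python's built-in codec.
CODEC = "iso8859_9"

# Every possible byte value, so the check covers the whole 8-bit code page.
original = bytes(range(256))

as_unicode = original.decode(CODEC)    # legacy bytes -> Unicode string
back_again = as_unicode.encode(CODEC)  # Unicode string -> legacy bytes

# The round trip is lossless only if each byte maps to its own code point.
distinct_code_points = len(set(as_unicode))
is_lossless = back_again == original

print(distinct_code_points, is_lossless)
```

The point of the sketch is that losslessness only needs the mapping to be one-to-one; it does not by itself say which code point each byte has to land on.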
- Macha 1 year ago
  
  ISO-8859-9 actually does have what I suggest:
  FD/49 are lower/upper dotless ı/I
  DD/69 are upper/lower dotted İ/i.
  Nothing about the ability to round-trip that through Unicode required 49 in ISO-8859-9 to be assigned the same Unicode code point as 49 in ISO-8859-1 just because the two happen to be visually identical.
  
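
The byte values in the comment above can be checked against the mapping that Unicode ended up with. This is a rough sketch using Python's standard codecs; `code_point` is a hypothetical helper written for this example, not part of any library.

```python
# Sketch: which Unicode code points the four ISO 8859-9 "i" bytes decode to.
CODEC_TURKISH = "iso8859_9"
CODEC_LATIN1 = "iso8859_1"


def code_point(byte_value: int, codec: str) -> str:
    """Decode a single byte with the given codec and format it as U+XXXX."""
    char = bytes([byte_value]).decode(codec)
    return f"U+{ord(char):04X}"


# 0x49 = I, 0x69 = i, 0xDD = İ, 0xFD = ı in ISO 8859-9.
turkish = {hex(b): code_point(b, CODEC_TURKISH) for b in (0x49, 0x69, 0xDD, 0xFD)}

# The same byte 0x49 decoded as ISO 8859-1 (Latin-1).
latin1_capital_i = code_point(0x49, CODEC_LATIN1)

print(turkish)
print(latin1_capital_i)
```

In the actual mapping, only İ (U+0130) and ı (U+0131) got code points of their own, while 0x49 and 0x69 share U+0049 and U+0069 with Latin-1. That sharing is the unification being argued about here.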