Comment by DemocracyFTW2

6 months ago

> * TODO what about Ideographic Description Characters?

I've never encountered them rendered other than with the width of any other CJK character, i.e. with (nominally) double width. There may be software that makes an effort to render IDSes (Ideographic Description Sequences) as existing or generated ideographs (or whatever you may call those), but I have yet to see any. What is IMO more likely is a situation where you want to let the user input exactly one, or up to a certain number of, CJK characters, e.g. for searching, and allow them to use IDSes for unencoded or incompletely known characters. But in that case you're clearly leaving the boundaries of Unicode and entering the grammar of your search engine's customized search strings. This means you probably don't need to handle IDCs separately at all; just treat them like any other fullwidth CJK codepoint.
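
To make the "treat them like any other fullwidth CJK codepoint" point concrete, here is a minimal sketch in Python using only the standard `unicodedata` module. The names `IDC_RANGE` and `display_width` are made up for illustration, and the width rule (2 columns for East Asian Width `W`/`F`, 1 otherwise) is a simplifying assumption, not how any particular library does it:

```python
import unicodedata

# Ideographic Description Characters U+2FF0..U+2FFB (the original twelve).
# IDC_RANGE is a hypothetical name used only in this sketch.
IDC_RANGE = range(0x2FF0, 0x2FFC)


def display_width(text: str) -> int:
    """Nominal column count: 2 for East Asian Wide/Fullwidth, 1 otherwise.

    Simplification: combining marks and control characters are not
    special-cased, so this is only a rough estimate for arbitrary text.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Collect the East Asian Width classes of all IDCs in the range above.
idc_classes = {unicodedata.east_asian_width(chr(cp)) for cp in IDC_RANGE}
print(idc_classes)

# An IDS such as U+2FF0 followed by two ideographs is measured code point
# by code point, with no special handling for the leading IDC.
print(display_width("\u2ff0\u6728\u6728"))
```

If the set printed for `idc_classes` contains only `W`, a generic width function already covers IDCs without a separate code path, which is the point made above.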