Comment by jrootabega
2 months ago
Here's a POC that works in Emacs. It doesn't cover all of the relevant characters, but:
  (setq ;;some other invisible or interesting characters
   unicode-zero-width-space ?\u200b
   unicode-zero-width-non-joiner ?\u200c
   unicode-zero-width-joiner ?\u200d
   unicode-zero-width-nbsp ?\ufeff
   unicode-narrow-nbsp ?\u202f
   unicode-word-joiner ?\u2060
   unicode-grapheme-joiner ?\u034f
   unicode-no-break-space ?\u00a0
   unicode-combining-long-stroke ?\u0336
   ;;variation selector examples
   unicode-vs-fe00 ?\ufe00
   unicode-vs-fe0f ?\ufe0f
   unicode-vs-e0100 ?\xe0100)

  (defun show-glyphless-as-hex (char)
    (let ((original (elt glyphless-char-display char)))
      (aset glyphless-char-display char 'hex-code)
      original)) ;;so you can see what you just replaced

  (progn
    (show-glyphless-as-hex unicode-zero-width-space)
    (show-glyphless-as-hex unicode-zero-width-non-joiner)
    (show-glyphless-as-hex unicode-zero-width-joiner)
    (show-glyphless-as-hex unicode-zero-width-nbsp)
    (show-glyphless-as-hex unicode-word-joiner)
    (show-glyphless-as-hex unicode-grapheme-joiner)
    (show-glyphless-as-hex unicode-narrow-nbsp)
    (show-glyphless-as-hex unicode-no-break-space)
    ;;these may already be visible if the current conditions don't support them
    ;;but we'll force them
    (show-glyphless-as-hex unicode-vs-fe00)
    (show-glyphless-as-hex unicode-vs-fe0f)
    (show-glyphless-as-hex unicode-vs-e0100))
And as a higher-level configuration you can set most, maybe even all, of the relevant invisible characters (still not sure how 0x34f grapheme joiner fits in) at once with something like:
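A minimal sketch of that kind of configuration, assuming the built-in `glyphless-char-display-control` option is the higher-level knob in question (that is an assumption, and the choice of groups below is only illustrative):

```elisp
;; Assumption: `glyphless-char-display-control' is the option meant above.
;; `customize-set-variable' runs the option's :set function, which is what
;; pushes the new methods into the `glyphless-char-display' char-table;
;; a plain `setq' on the option would not update the table by itself.
(customize-set-variable
 'glyphless-char-display-control
 '((format-control . hex-code)   ; general category Cf, e.g. U+200B, U+200D, U+FEFF
   (no-font . hex-code)))        ; characters that no available font can display
```

U+034F is a combining mark (category Mn) rather than a format character (Cf), and U+00A0 and U+202F are space separators (Zs), so none of them would be covered by the `format-control` group. Depending on the Emacs version, the option's docstring may list further groups worth checking.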
This will modify values in glyphless-char-display, but it's OK to modify those directly if you need to.
Here is the bare minimum this is built on, which you can type in yourself if you're paranoid or want to start from the bottom up. Swap in the hexadecimal codepoint of the invisible character after the ?\x
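A minimal sketch of that one-liner, assuming it is the same `aset` call used inside `show-glyphless-as-hex` above, with U+200B picked as an example codepoint:

```elisp
;; Display ZERO WIDTH SPACE (U+200B) as its hexadecimal code in a box.
;; Replace 200b with the hexadecimal codepoint of another character.
(aset glyphless-char-display ?\x200b 'hex-code)
```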
I use vim. It seems like `:set binary enc=latin1` works, though I don't understand why the latin1 part is required.