← Back to context

Comment by abujazar

7 hours ago

You're talking about a subset of ASCII then. Unicode is supposed to support different languages and advanced typography, for which those characters are necessary. You can't write e.g. Arabic or Hebrew without those "unnecessary" invisible characters.

Please explain why an invisible zero width "character" is necessary.

  • if you write كلب which is an arabic word written right to left in the middle of an english sentence, you want to preserve the order of the characters in the stream for computer processing purposes. meaning the chararacter ك must come before the ل and after the e and the space with respect to the memory layout. whereas when displayed, it must be inverted to be legible. the solution is to have an invisible character that indicates a switch in text direction. if you were wondering, the situation where you want to write text in a foreign language within your text is very common outside english speaking countries.

    • Look I'm writing sdrawkcab (amazingly, I did it without using Unicode!). Layout is the job of your text formatting program. It's easy to fix a text editor to support right-to-left text entry.

      The switch in text direction has resulted in malicious code injection attacks, as the reversed text becomes invisible. I had to change my compiler to reject those Unicode characters for that reason. It can be used in other cases to have hidden, malicious text.

      Have you checked your SQL code for invisible backwards text that injects malware?