Comment by chrismorgan

2 months ago

> Use an HTML entity, a decimal code, or a hex code.

Please no: just write the character. <, & and (in quoted attributes) " or ' are the only characters that need to be encoded; a few others have arguable benefit to being encoded (most notably NO-BREAK SPACE), but most Unicode characters should just be put in literally. The days when you couldn’t be confident of the file encoding are past: your HTML is being served as UTF-8 (or in the rare case it isn’t, you should fix that instead of avoiding non-ASCII in the source).
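To make that concrete, here is a minimal Python sketch of escaping only the characters listed above and leaving everything else, including "≈", as a literal. The helper names `escape_text` and `escape_attr` are made up for illustration; `html.escape` is the standard-library function, which is a bit more conservative (it also escapes `>` and `'`).

```python
import html


def escape_text(s: str) -> str:
    """Escape for HTML text content: only & and < (hypothetical helper)."""
    return s.replace("&", "&amp;").replace("<", "&lt;")


def escape_attr(s: str) -> str:
    """Escape for a double-quoted attribute value: &, < and " (hypothetical helper)."""
    return escape_text(s).replace('"', "&quot;")


sample = 'a ≈ b & c < d "quoted"'

print(escape_text(sample))   # non-ASCII characters pass through untouched
print(escape_attr(sample))
print(html.escape(sample))   # stdlib version; also escapes > and '
```

In all three cases the "≈" comes out as the literal character, which is fine as long as the page is served as UTF-8.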

Same deal with CSS (inside a string, the enclosing quote character and \ are the only ones you need to escape) and JavaScript (the enclosing " or ' or `, as appropriate, plus \).
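As a rough illustration for the JavaScript side: a JSON string is close enough to a double-quoted JavaScript string literal that Python's `json.dumps` can stand in for "produce a JS string". This is a sketch under that assumption, not a general-purpose JavaScript escaper. With `ensure_ascii=False`, only the quote and the backslash get escaped and "≈" stays literal.

```python
import json

value = 'say "≈" \\ ok'  # the actual characters are: say "≈" \ ok

# ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes
literal = json.dumps(value, ensure_ascii=False)
print(literal)

# The default (ensure_ascii=True) is the "encode everything" style argued against above
print(json.dumps(value))
```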

URLs? Occasionally you may encounter a legacy system where you need to percent-encode it yourself (similarly around punycoding internationalised domain names), but you can almost always (and thus, in my opinion, should) just write it and leave anything that wants it to be ASCII to perform the percent-encoding itself.
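For the legacy cases where you do have to do it yourself, here is a small Python sketch of what the percent-encoding and punycoding look like. The path and the domain `bücher.example` are made-up examples, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat the domain part as illustrative only.

```python
from urllib.parse import quote, unquote

path = "/symbols/≈"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character;
# "/" is left alone by default
encoded = quote(path)
print(encoded)
print(unquote(encoded) == path)  # decoding round-trips to the original

# Internationalised domain name to its ASCII (punycode) form
domain = "bücher.example"
ascii_domain = domain.encode("idna").decode("ascii")
print(ascii_domain)
```

Browsers and most HTTP libraries perform both conversions for you, which is why writing the literal characters is usually enough.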

Excel I can’t comment on, but I presume you can just write "≈" and UNICHAR should almost never be used.

3 comments

yarlinghe 2 months ago

Yep — fully agree.

For modern HTML/CSS/JS, you should just write the character and serve UTF-8. The entities / codes are there purely as reference for legacy cases, debugging, or when you only have a code point and no rendered glyph — not as a recommendation for normal authoring.

ghusbands 2 months ago
Your site still says "HOW TO ADD ALMOST EQUAL TO IN HTML? Use an HTML entity, a decimal code, or a hex code."
That is incorrect. As you say, you should just write the character in your HTML and ensure it's served with the correct encoding. If it's just for legacy cases, debugging or such, say so on the site.
- yarlinghe 2 months ago
  
  Agreed. Updated the HTML section to recommend writing the literal character + UTF-8 by default. Numeric refs/entities are now explicitly framed as legacy / edge-case references.