Comment by simoncion

6 months ago

Let's see if surrounding that URL in the URL-surrounding character pair helps the HN linkifier:

<https://www.discogs.com/artist/207714-!!!>

Edit: It does. So, this would be yet another of the squillion-ish examples to support the advice "Please, for the love of god, always enclose your URLs in '<>'." (And if you're writing a general-purpose URL linkifier, PLEASE just assume that everything between those characters IS part of the URL, rather than assuming you know better than the user.)
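A minimal sketch of that linkifier rule, in Python, for anyone who wants to see it concretely. The names `ANGLE_URL` and `extract_bracketed_urls` are made up for illustration, and restricting matches to http(s) URLs with no whitespace is my own assumption, not something HN's linkifier is known to do:

```python
import re

# Hypothetical pattern: everything between "<" and ">" is taken as the URL,
# as long as it starts with an http(s) scheme and contains no whitespace.
ANGLE_URL = re.compile(r"<(https?://[^<>\s]+)>")


def extract_bracketed_urls(text: str) -> list[str]:
    """Return every URL the author explicitly wrapped in angle brackets."""
    return ANGLE_URL.findall(text)


sample = "See <https://www.discogs.com/artist/207714-!!!> for details."
print(extract_bracketed_urls(sample))
# ['https://www.discogs.com/artist/207714-!!!']
```

The trailing "!!!" survives because the pattern never tries to guess where the URL ends; the closing ">" decides that.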

URLs can contain > too.

  • I don't believe that they can, not unencoded. Check out the grammar in the relevant RFC, RFC 3986 [0], as well as the discussion about URL-unsafe characters in RFC 1738 [1], the RFC that's updated by 3986, from which I'll quote below.

    > Characters can be unsafe for a number of reasons. ... The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text

    Also note the "APPENDIX" section on page 22 of RFC 1738, which provides recommendations for embedding URLs in other contexts (such as an essay, email, or internet forum post).

    Do you have standards documents that disagree with these IETF ones?

    If you're using the observed behavior of your browser's address bar as your proof that ">" is valid in a URL, do note that the URL

      https://news.ycombinator.com/item?id=44826199>hello there
    

    might appear to contain a space and the ">" character, but it is actually represented as

      https://news.ycombinator.com/item?id=44826199%3Ehello%20there
    

    behind the scenes. Your web browser is pretty-printing it for you so it looks nicer and is easier to read.

    [0] <https://datatracker.ietf.org/doc/html/rfc3986#appendix-A>

    [1] <https://datatracker.ietf.org/doc/html/rfc1738#section-2.2>
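The percent-encoded form shown above can be reproduced with Python's standard library. This is only a sketch of the encoding step using `urllib.parse.quote`; it is not necessarily the exact algorithm a browser runs, and the variable names are mine:

```python
from urllib.parse import quote

base = "https://news.ycombinator.com/item?id="
raw_value = "44826199>hello there"

# safe="" means no reserved characters are exempted, so ">" and " "
# are both percent-encoded; digits and letters pass through unchanged.
encoded_value = quote(raw_value, safe="")

print(base + encoded_value)
# https://news.ycombinator.com/item?id=44826199%3Ehello%20there
```

So the ">" and the space only ever exist in what you typed; what goes over the wire is "%3E" and "%20".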