
Comment by Timwi

1 day ago

XHTML failed because the spec required it to be sent with a new MIME type (application/xhtml+xml, I believe), which no web server did out of the box. Everything defaulted to text/html, which all browsers would interpret as HTML and, given the mismatching doctype, as tag soup (quirks mode/lenient).

Meanwhile, local files with the doctype would be treated as XHTML, so people assumed the doctype was all you needed. As a result, everyone who tried to use XHTML didn't realize that it would go back to being read as HTML once they uploaded it to their web server, returned it from PHP, etc. Then, when something went wrong or worked differently than expected, the author would blame XHTML.

Edit: I see that I'm getting downvoted here; if any of this is factually incorrect, I would like to be educated, please.

> XHTML failed because the spec required it to be sent with a new MIME type (application/xhtml+xml, I believe), which no web server did out of the box. Everything defaulted to text/html, which all browsers would interpret as HTML and, given the mismatching doctype, as tag soup (quirks mode/lenient).

None of that is correct.

It was perfectly spec-compliant to label XHTML as text/html. The spec that covers this is RFC 2854, which states:

> The text/html media type is now defined by W3C Recommendations; the latest published version is [HTML401]. In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.

https://datatracker.ietf.org/doc/html/rfc2854

There’s no spec that says XHTML served as text/html must be parsed as HTML rather than XHTML. As the RFC says, text/html covers both HTML and XHTML. Parsing it as HTML is something browsers did, but they had no obligation to.

The mismatched doctype didn’t trigger quirks mode. Browsers don’t care about that. The XML prolog (the <?xml ... ?> declaration placed before the doctype) could, but XHTML 1.0 Appendix C told you not to use that anyway.

Even if it did trigger quirks mode, that makes no difference in terms of tag soup. Tag soup is when you mis-nest tags, for instance <strong><em></strong></em>. Quirks mode was predominantly about how the browser applied CSS layout. There are three different concepts being mixed up here: being parsed as HTML, parsing tag soup, and doctype switching.
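
To make the tag soup point concrete, here is a minimal Python sketch comparing a strict XML parser with a lenient HTML tokenizer on the mis-nested example above. The helper names (is_well_formed_xml, lenient_tags) are made up for illustration, and Python's HTMLParser is only a tokenizer, not a browser's error-recovering tree builder, so this shows the strict-versus-lenient contrast rather than what any particular browser did.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An XML parser must stop at the first well-formedness error.
        return False


class TagCollector(HTMLParser):
    """Lenient HTML tokenizer that records tags in the order they appear."""

    def __init__(self) -> None:
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.events.append(f"</{tag}>")


def lenient_tags(markup: str) -> list:
    """Tokenize markup without enforcing nesting (hypothetical helper)."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


# Mis-nested "tag soup" versus the properly nested equivalent.
SOUP = "<p><strong><em>text</strong></em></p>"
CLEAN = "<p><strong><em>text</em></strong></p>"

if __name__ == "__main__":
    print(is_well_formed_xml(CLEAN))  # strict parser accepts this
    print(is_well_formed_xml(SOUP))   # strict parser rejects this
    print(lenient_tags(SOUP))         # lenient tokenizer carries on
```

The strict parser rejects the soup outright, while the lenient one reports the tags as written and leaves recovery to whatever consumes them. None of this depends on the doctype.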

The problem with serving application/xhtml+xml had nothing to do with web servers. The problem was that Internet Explorer 6 didn’t support it. After Microsoft won the browser wars, they discontinued development, and there was a five-year gap between Internet Explorer 6 and 7. Combined with long upgrade cycles and operating system requirements, this meant that Internet Explorer 6 had to be supported for almost 15 years globally.

Obviously, if you can’t serve XHTML in a way browsers will parse as XML for a decade and a half, this inevitably kills XHTML.
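
For illustration only: one workaround sites could use was to negotiate on the Accept request header and fall back to text/html for clients that did not advertise XML support. The sketch below is an assumption about how such a check might look, not something taken from any spec or server. The function name choose_content_type and the sample headers are hypothetical, and the q-value handling is deliberately simplified.

```python
def choose_content_type(accept_header: str) -> str:
    """Return application/xhtml+xml only if the client lists it explicitly.

    Wildcards such as */* are intentionally ignored: a client that only
    sends */* has not told us it can actually parse XHTML as XML.
    """
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "application/xhtml+xml":
            continue
        # Treat an explicit q=0 as "not acceptable"; default quality is 1.
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Illustrative headers, not captured from real browsers.
    xml_capable = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    wildcard_only = "image/gif, image/jpeg, */*"
    print(choose_content_type(xml_capable))
    print(choose_content_type(wildcard_only))
```

Even with a fallback like this, most visitors during the Internet Explorer 6 era would still have received text/html, so the document was parsed as HTML for them regardless.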

  • Okay, I guess I got a fair bit of the details wrong. However, there's one detail I want to push back on:

    > In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.

    If you read this carefully, you'll see that it's not saying that text/html can be used to label XHTML. It's saying that you can use text/html if you write your XHTML in such a way that it's compatible with HTML 4.01, because the browser will parse and interpret it as HTML.

    You're correct that the doctype wasn't the reason it was treated as tag soup. It was instead because of the parts of XHTML that are not directly compatible with HTML 4.01.

    The mismatch between local files and websites served as text/html was very real and I experienced it myself. It's curious that you'd think I'd make it up. There were differences in behavior, especially when JavaScript was involved (notably: Element.tagName is all-uppercase in HTML but lowercase in XHTML) and it is absolutely the case that developers like myself blamed this on XHTML.

Isn't that what the <!DOCTYPE> tag was supposed to solve?

  • Yes, I covered that; everyone assumed that you only needed to specify the doctype, but in practice browsers only accepted it for local files or HTTP responses with Content-Type: application/xhtml+xml. I've edited the comment to make that more explicit.

    • Ah, I see. Yeah, that's a bit silly. They should've gone for "MUST have doctype, SHOULD have content type".