Comment by Animats
1 day ago
- Photon, the graphical interface for QNX. Oriented more towards real time (the widget set included gauges), but good enough to support two different web browsers. No delays. This was a real-time operating system.
- MacOS 8. Not the Linux thing, but Copland. This was a modernized version of the original MacOS, continuing the tradition of no command line. Not having a command line forces everyone to get their act together about how to install and configure things. Probably would have eased the transition to mobile. A version was actually shipped to developers, but it had to be covered up to justify the bailout of NeXT by Apple to get Steve Jobs.
- Transaction processing operating systems. The first one was IBM's Customer Information Control System. A transaction processor is a kind of OS where everything is like a CGI program - load program, do something, exit program. Unix and Linux are, underneath, terminal-oriented time-sharing systems.
- IBM MicroChannel. Early minicomputer and microcomputer designers thought "bus", where peripherals can talk to memory and peripherals look like memory to the CPU. Mainframes, though, had "channels", simple processors which connected peripherals to the CPU. Channels could run simple channel programs, and managed device access to memory. IBM tried to introduce that with the PS/2, but they made it proprietary and that failed in the marketplace. Today, everything has something like channels, but they're not a unified interface concept that simplifies the OS.
- CPUs that really hypervise properly. That is, virtual execution environments look just like real ones. IBM did that in VM, and it worked well because channels are a good abstraction for both a real machine and a VM. Storing into device registers to make things happen is not. x86 has added several layers below the "real machine" layer, and they're all hacks.
- The Motorola 680x0 series. Should have been the foundation of the microcomputer era, but it took way too long to get the MMU out the door. The original 68000 came out in 1979, but then Motorola fell behind.
- Modula. Modula-2 and Modula-3 were reasonably good languages. Oberon was a flop. DEC was into Modula, but Modula went down with DEC.
- XHTML. Have you ever read the parsing rules for HTML 5, where the semantics for bad HTML were formalized? Browsers should just punt at the first error, display an error message, and render the rest of the page in Times Roman. Would it kill people to have to close their tags properly?
- Word Lens. Look at the world through your phone, and text is translated, standalone, on the device. No Internet connection required. Killed by Google in favor of hosted Google Translate.
> MacOS 8. Not the Linux thing, but Copland. This was a modernized version of the original MacOS, continuing the tradition of no command line. Not having a command line forces everyone to get their act together about how to install and configure things. Probably would have eased the transition to mobile. A version was actually shipped to developers, but it had to be covered up to justify the bailout of NeXT by Apple to get Steve Jobs.
You have things backwards. The Copland project was horribly mismanaged. Anybody at Apple who came up with a new technology got it included in Copland, with no regard to feature creep or stability. There's a leaked build floating around from shortly before the project was cancelled. It's extremely unstable and even using basic desktop functionality causes hangs and crashes. In mid-late 1996, it became clear that Copland would never ship, and Apple decided the best course of action was to license an outside OS. They considered options such as Solaris, Windows NT, and BeOS, but of course ended up buying NeXT. Copland wasn't killed to justify buying NeXT, Apple bought NeXT because Copland was unshippable.
>- XHTML. [...] Would it kill people to have to close their tags properly?
XHTML appeals to the intuition that there should be a Strict Right Way To Do Things ... but you can't use that unforgiving framework for web documents that are widely shared.
The "real world" has 2 types of file formats:
(1) file types where consumers cannot contact/control/punish the authors (open-loop): HTML, pdf, zip, csv, etc. The common theme is that the data itself is more important than the file format. That's why Adobe Reader will read malformed pdf files written by buggy PDF libraries. And both 7-Zip and Winrar can read malformed zip files with broken headers (because some old buggy Java libraries wrote bad zip files). MS Excel can import malformed csv files. E.g. the Citi bank export to csv wrote a malformed file and it was desirable that MS Excel imported it anyway, because the raw data of dollar amounts was more important than the incorrect commas in the csv file -- and -- I have no way of contacting the programmer at Citi to tell them to fix their buggy code that created the bad csv file. (A rough sketch of this kind of lenient import is below.)
(2) file types where the consumer can control the author (closed-loop): programming language source code like .c, .java, etc or business interchange documents like EDI. There's no need to have a "lenient forgiving" gcc/clang compiler to parse ".c" source code because the "consumer-and-author" will be the same person. I.e. the developer sees the compiler stop at a syntax error so they edit and fix it and try to re-compile. For business interchange formats like EDI, a company like Walmart can tell the vendor to fix their broken EDI files.
XHTML wants to be in group (2) but web surfers can't control all the authors of .html so that's why lenient parsing of HTML "wins". XHTML would work better in a "closed-loop" environment such as a company writing internal documentation for its employees. E.g. an employee handbook can be written in strict XHTML because both the consumers and authors work at the same company. E.g. can't see the vacation policy because the XHTML syntax is wrong?!? Get on the Slack channel and tell the programmer or content author to fix it.
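To make the open-loop point concrete, here's a rough sketch of what that kind of forgiving import looks like in practice. The rows, column layout, and the "stray comma" failure mode are invented for illustration; this isn't anyone's actual code:

```python
# A minimal sketch of the "data over format" attitude: a lenient CSV import
# that salvages the dollar amounts from a malformed export.
# The sample rows and column layout are invented for illustration.
import csv, io

malformed = io.StringIO(
    "date,description,amount\n"
    "2024-01-03,Grocery, store,41.27\n"   # stray comma splits the description
    "2024-01-04,Rent,1500.00\n"
)

for row in csv.reader(malformed):
    if not row or row[0] == "date":
        continue  # skip header / blank lines
    # Strict parsing would reject the 4-field row; instead, assume the last
    # field is the amount and glue everything in between back together.
    date, amount = row[0], float(row[-1])
    description = ",".join(row[1:-1])
    print(date, description, amount)
```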
The problem is that group (1) results in a nightmarish race-to-the-bottom. File creators have zero incentive to create spec-compliant files, because there's no penalty for creating corrupted files. In practice this means a large proportion of documents are going to end up corrupt. Does it open in Chrome? Great, ship it! The file format is no longer the specification, but it has now become a wild guess at whatever weird garbage the incumbent is still willing to accept. This makes it virtually impossible to write a new parser, because the file format suddenly has no specification.
On the other hand, imagine a world where Chrome would slowly start to phase out its quirks modes. Something like a yellow address bar and a "Chrome cannot guarantee the safety of your data on this website, as the website is malformed" warning message. Turn it into a red bar and a "click to continue" after 10 years, remove it altogether after 20 years. Suddenly it's no longer that one weird customer who is complaining, but everyone - including your manager. Your mistakes are painfully obvious during development, so you have a pretty good incentive to properly follow the spec. You make a mistake on a prominent page and the CTO sees it? Well, guess you'll be adding an XHTML validator to your CI pipeline next week!
It is very tempting to write a lenient parser when you are just one small fish in a big ecosystem, but over time it will inevitably lead to the degradation of that very ecosystem. You need some kind of standards body to publish a validating reference parser. And like it or not, Chrome is big enough that it can act as one for HTML.
>File creators have zero incentive to create spec-compliant files, because there's no penalty for creating corrupted files
This depends. If you are a small creator with a unique corruption, then you're likely out of luck. The problem with big creators is 'fuck you, I do what I want'.
>"Chrome cannot guarantee the safety of your data on this website, as the website is malformed" warning message.
This would appear on pretty much every website. And it would appear on websites that are no longer updated, so they'd functionally disappear from any updated browser. In addition, the 10-20 year thing just won't work in US companies; simply put, if they get too much pressure on it next quarter, it's gone.
>Your mistakes are painfully obvious during development,
Except this isn't how a huge number of websites work. They get HTML from many sources and possibly libraries. Simply put, no one is going to follow your insanity, which is why XHTML never worked in the first place. They'll drop Chrome before they drop the massive amount of existing and potential bugs out there.
>And like it or not, Chrome is big enough that it can act as one for HTML.
And hopefully in a few years between the EU and US someone will bust parts of them up.
That would break decades of the web with no incentive for Google to do so. Plus, any change of that scale that they make is going to draw antitrust consideration from _somebody_.
You’re right, but even standards bodies aren’t enough. At the end of the day, it’s always about what the dominant market leader will accept. The standard just gives your bitching about the corrupted files some abstract moral authority, but that’s about it.
I’d argue a good comparison here is HTTPS. Everyone decided it would be good for sites to move over to serving via HTTPS so browsers incentivised people to move by gating newer features to HTTPS only. They could have easily done the same with XHTML had they wanted.
The opportunities to fix this were pretty abundant. For instance, it would take exactly five words from Google to magically make a vast proportion of web pages valid XHTML:
> We rank valid XHTML higher
It doesn’t even have to be true!
> That's why Adobe Reader will read malformed pdf files written by buggy PDF libraries.
No, the reason is that Adobe’s implementation never bothered to perform much validation, and then couldn’t add strict validation retroactively because it would break too many existing documents.
And it’s really the same for HTML.
This is an argument for a repair function that transforms a broken document into a well-formed one without loss, while keeping the spec small, simple and consistent. It's not an argument for baking malformations into a complex, messy spec.
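For what it's worth, that split already exists in library form: you can use a lenient HTML5 parser purely as a repair pass and keep everything downstream strict. A minimal sketch, assuming the third-party html5lib package is installed:

```python
# Rough sketch of a "repair pass": parse tag soup with HTML5 error recovery,
# then re-serialize it as well-formed markup that a strict parser can consume.
# Assumes the third-party html5lib package (pip install html5lib).
import html5lib
from xml.etree import ElementTree

soup = '<p>unclosed paragraph <b>bold <i>mis-nested</b></i> text'
tree = html5lib.parse(soup)  # lenient: applies the spec's recovery rules
print(ElementTree.tostring(tree, encoding="unicode"))  # well-formed XML out
```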
We could've made the same arguments for supporting Adobe Flash on the iPhone.
And yet Apple decided that no, this time we do it the "right" way[1], stuck with plain HTML/CSS/JS and frankly we're all better for it.
[1] I'm aware this is a massive oversimplification and there were more cynical reasons behind dropping the flash runtime from iOS, but they're not strictly relevant to this discussion.
> - XHTML. Have you ever read the parsing rules for HTML 5, where the semantics for bad HTML were formalized? Browsers should just punt at the first error, display an error message, and render the rest of the page in Times Roman. Would it kill people to have to close their tags properly?
Amen. Postel’s Law was wrong:
https://datatracker.ietf.org/doc/html/rfc9413
We stop at the first sign of trouble for almost every other format, we do not need lax parsing for HTML. This has caused a multitude of security vulnerabilities and only makes it more difficult for pretty much everybody.
The attitude towards HTML5 parsing seemed to grow out of this weird contrarianism that everybody who wanted to do better than whatever Internet Explorer did had their head in the clouds and that the role of a standard was just to write down all the bugs.
Just a reminder that <bold> <italic> text </bold> </italic> [0], which has worked for ages in every browser ever, is NOT valid XHTML, and would be rejected under GP's proposal.
I, for one, am kinda happy that XHTML is dead.
[0]: By <bold> I mean <b> and by <italic> I mean <i>, and the reason it's not valid HTML is that the order of closing is not the reverse of the order of opening, as it should properly be.
That caused plenty of incompatibilities in the past. At one point, Internet Explorer would parse that and end up with something that wasn’t even a tree.
HTML is not a set of instructions that you follow. It’s a terrible format if you treat it that way.
It’s totally valid XHTML, just not recognized.
XHTML allows you to use XML and <bold> <italic> are just XML nodes with no schema. The correct form has been and will always be <b> and <i>. Since the beginning.
I was all gung ho on XHTML back in the day until I realized that a single unclosed tag in an ad or another portion of our app that I had no control over would cause the entire page to fail. The user would see nothing except a giant ugly error. And your solution of rendering the rest of the page in Times New Roman isn’t an option. Do you try to maintain any of the HTML semantics or just render plain text? If it’s plain text, that’s useless. If you’re rendering anything with any semantics, then you need to know how to parse it. You’re back where you started.
Granted, I could ensure that my code was valid XHTML, but I’m a hypermeticulous autistic weirdo, and most other people aren’t. As much as XHTML “made sense”, it was completely unworkable in reality, because most people are slobs. Sometimes, worse really is better.
If the world were all XHTML, then you wouldn't put an ad on your site that wasn't valid XHTML, the same way you wouldn't import a Python library that's not valid Python.
>, then you wouldn't put an ad on your site that wasn't valid XHTML,
You're overlooking how incentives and motivations work. The gp (and their employer) wants to integrate the advertisement snippet -- even with broken XHTML -- because they receive money for it.
The semantic data ("advertiser's message") is more important than the format ("purity of perfect XHTML").
Same incentives would happen with a jobs listing website like Monster.com. Consider that it currently has lots of red errors with incorrect HTML: https://validator.w3.org/nu/?doc=https%3A%2F%2Fwww.monster.c...
If there was a hypothetical browser that refused to load that Monster.com webpage full of errors because it's for the users' own good and the "good of the ecosystem"... the websurfers would perceive that web browser as user-hostile and would choose another browser that would be forgiving of those errors and just load the page. Job hunters care more about the raw data of the actual job listings so they can get a paycheck rather than invalid <style> tags nested inside <div> tags.
Those situations above are a different category (semantic_content-overrides-fileformatsyntax) than a developer trying to import a Python library with invalid syntax (fileformatsyntax-Is-The-Semantic_Content).
EDIT reply to: >Make the advertisement block an iframe [...] If the advertiser delivers invalid XHTML code, only the advertisement won't render.
You're proposing a "technical solution" to avoid errors instead of a "business solution" to achieve a desired monetary objective. To re-iterate, they want to render the invalid XHTML code so your idea to just not render it is the opposite of the goal.
In other words, if rendering imperfect-HTML helps the business goal more than blanking out invalid XHTML in an iframe, that means HTML "wins" in the marketplace of ideas.
But all it takes in that world is for a single browser vendor to decide - hey, we will even render broken XHTML, because we would rather show something than nothing - and you’re back to square one.
I know which I, as a user, would prefer. I want to use a browser which lets me see the website, not just a parse error. I don’t care if the code is correct.
In practice things like that did happen, though. e.g. this story of someone's website displaying user-generated content with a character outside their declared character set: https://web.archive.org/web/20060420051806/http://diveintoma...
Yes, you would be able to put an ad on your site that wasn't XHTML, because XHTML is just text parsed in the browser at runtime. And yes, that would fail, either silently or with a cryptic error.
The most sensible option would be to just show the error for the ad part of the website.
Also, the whole argument falls apart the moment the banner has a JavaScript error too. Should we attempt to run malformed code just in case? Or should a browser start shipping shims and compatibility fixes for known broken websites, like Microsoft does for Windows apps?
> Would it kill people to have to close their tags properly
It would kill the approachability of the language.
One of the joys of learning HTML when it tended to be hand-written was that if you made a mistake, you'd still see something just with distorted output.
That was a lot more approachable for a lot of people who were put off "real" programming languages because they were overwhelmed by terrible error messages any time they missed a bracket or misspelled something.
If you've learned to program in the last decade or two, you might not even realise just how bad compiler errors tended to be in most languages.
The kind of thing where you could miss a bracket on line 47 but end up with a compiler error complaining about something 20 lines away.
Rust (in particular) got everyone to up their game with respect to meaningful compiler errors.
But in the days of XHTML? Error messages were arcane; you had to dive in to see what the problem actually was.
If you forget a closing quote on an attribute in HTML, all content until the next quote is ignored and not rendered - even if it is the rest of the page. I don't think this is more helpful than an error message. It was just simpler to implement.
Let's say you forget to close a <b></b> element.
What happens?
Even today, after years of better error messages, the strict validator at https://validator.w3.org/check just points you at a bare line number - something like "line 22".
What is line 22?
It's up to you to go hunting back through the document, to find the un-closed 'b' tag.
Back in the day, the error messages were even more misleading than this, often talking about "Extra content at end of document" or similar.
Compare that to the very visual feedback of putting that same document into a browser.
You get more bold text than you were expecting, the bold just runs into the next text.
That's a world of difference, especially for people who prefer visual feedback to reading and understanding errors in text form.
Try it for yourself: save a small page with an unclosed <b> to a .html file and put it through the XHTML validator.
I can "handwrite" C, Python, etc. just fine and they don't assign fallback meanings to syntax errors.
> Rust ( in particular ) got everyone to bring up their game with respect to meaningful compiler errors.
This was also part of the initial draw of `clang`.
Nice list. Some thoughts:
- I think without the move to NeXT, even if Jobs had come back to Apple, they would never have been able to get to the iPhone. iOS was - and still is - a unix-like OS, using unix-like philosophy, and I think that philosophy allowed them to build something game-changing compared to the SOTA in mobile OS technology at the time. So much so, Android follows suit. It doesn't have a command line, and installation is fine, so I'm not sure your line of reasoning holds strongly. One thing I think you might be hinting at though that is a missed trick: macOS today could learn a little from the way iOS and iPadOS is forced to do things and centralise configuration in a single place.
- I think transaction processing operating systems have been reinvented today as "serverless". The load/execute/quit cycle you describe is how you build in AWS Lambdas, GCP Cloud Run Functions or Azure Functions (a rough sketch of that shape follows after this list).
- Most of your other ideas (with an exception; see below) died because of people trying to grab money rather than build cool tech, and arguably the free market decided to vote with its feet - I do wonder when we might next get a major change in hardware architectures, though; it does feel like we've now got "x86" and "ARM" and that's that for the next generation.
- XHTML died because it was too hard for people to get stuff done. The forgiving nature of the HTML specs is a feature, not a bug. We shouldn't expect people to be experts at reading specs to publish on the web, nor should it need special software that gatekeeps the web. It needs to be scrappy, and messy and evolutionary, because it is a technology that serves people - we don't want people to serve the technology.
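To make the parallel concrete, here's a rough sketch of that load/execute/quit shape as a Python AWS Lambda-style handler. The event fields and business logic are invented for illustration; it isn't tied to any particular framework:

```python
# Minimal sketch of the "transaction processor" shape in serverless form:
# the platform loads the code, invokes the handler once per request/event,
# and may tear the whole environment down afterwards. Event fields invented.
import json

def handler(event, context):
    # "load program, do something, exit program" -- all per-invocation state
    # lives inside this function; nothing persists between transactions.
    account = event.get("account_id", "unknown")
    amount = float(event.get("amount", 0))
    result = {"account": account, "charged": amount, "status": "ok"}
    return {"statusCode": 200, "body": json.dumps(result)}

if __name__ == "__main__":
    # Local smoke test standing in for the platform's invocation.
    print(handler({"account_id": "42", "amount": 19.99}, context=None))
```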
> XHTML died because it was too hard for people to get stuff done.
This is not true. The reason it died was because Internet Explorer 6 didn’t support it, and that hung around for about a decade and a half. There was no way for XHTML to succeed given that situation.
The syntax errors that cause XHTML to stop parsing also cause JSX to stop parsing. If this kind of thing really were a problem, it would have killed React.
People can deal with strict syntax. They can manage it with JSX, they can manage it with JSON, they can manage it with JavaScript, they can manage it with every back-end language like Python, PHP, Ruby, etc. The idea that people see XHTML being parsed strictly and give up has never had any truth to it.
> The syntax errors that cause XHTML to stop parsing also cause JSX to stop parsing. If this kind of thing really were a problem, it would have killed React.
JSX is processed during the build step, XHTML is processed at runtime, by the browser.
They would have gotten another modern OS instead of NeXT's as the base for Mac OS X (then iOS).
Another possibility they were exploring was buying BeOS, which would have been pretty interesting because it was an OS built from scratch in the 90's without any of the cruft from the 70's.
Also, the only things specific to NeXT that survived in Mac OS X and iOS were Objective-C and the NeXTSTEP APIs, which honestly I don't think is a great thing. They were pretty cool in the 90's, but by the time the iPhone was released they were already kinda obsolete. For the kernel, Linux or FreeBSD would have worked just the same.
> without any of the cruft from the 70's
By "cruft" you mean "lessons learned", right?
There is hardly any UNIX stuff for iOS and Android applications sold via the respective app stores.
You won't get far with POSIX on any of the platforms.
Didn't Google already own Android when iOS was announced?
Yes, and they were going to position it against Windows Mobile.
When iOS was announced, Google scrambled to redo the entire concept.
On XHTML, I think there was room for both HTML and a proper XHTML that barks on errors. If you're a human typing HTML, or using a language where you build your HTML by concatenation like early PHP, sure, it makes sense to allow loosey-goosey HTML. But if you're using any sort of simple DOM builder, which should preclude the possibility of outputting invalid HTML, strict XHTML makes a lot more sense.
Honestly I'm disappointed the promised XHTML5 never materialized alongside HTML5. I guess it just lost steam.
But an HTML5 parser will obviously parse "strict" HTML5 just fine too, so what value is there in special-casing the "this was generated by a DOM builder" path client-side?
> Honestly I'm disappointed the promised XHTML5 never materialized along side HTML5. I guess it just lost steam.
The HTML Standard supports two syntaxes, HTML and XML. All browsers support XML syntax just fine—always have, and probably always will. Serve your file as application/xhtml+xml, and go ham.
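If you want to see the strict behaviour for yourself, one quick way is to serve a file with that media type locally. A minimal sketch using Python's standard http.server; the file name and page content are just placeholders:

```python
# Minimal sketch: serve a document as application/xhtml+xml so the browser
# uses its XML parser (and its strict error handling). File name and content
# are placeholders for illustration.
from http.server import HTTPServer, SimpleHTTPRequestHandler
from pathlib import Path

Path("page.xhtml").write_text(
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">\n'
    '  <head><title>XML syntax</title></head>\n'
    '  <body><p>Unclosed tags here are a hard error.</p></body>\n'
    '</html>\n'
)

class Handler(SimpleHTTPRequestHandler):
    # Make sure .xhtml files go out with the XML media type.
    extensions_map = {**SimpleHTTPRequestHandler.extensions_map,
                      ".xhtml": "application/xhtml+xml"}

HTTPServer(("localhost", 8000), Handler).serve_forever()
```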
> Modula. Modula 2 and 3 were reasonably good languages. Oberon was a flop. DEC was into Modula, but Modula went down with DEC.
If you appreciate Modula's design, take a look at Nim[1].
I remember reading the Wikipedia page for Modula-3[2] and thinking "huh, that's just like Nim" in every other section.
[1] https://nim-lang.org
[2] https://en.wikipedia.org/wiki/Modula-3
Swift, D and C# are also quite close to Modula-3 in spirit and features.
> Would it kill people to have to close their tags properly?
Probably not, but what would be the benefit of having more pages fail to render? If XHTML had been coupled with some cool features which only worked in XHTML mode, it might have become successful, but on its own it does not provide much value.
> but what would be the benefit of having more pages fail to render?
I think those benefits are quite similar to having more programs failing to run (due to static and strong typing, other static analysis, and/or elimination of undefined behavior, for instance), or more data failing to be read (due to integrity checks and simply strict parsing): as a user, you get documents closer to valid ones (at least in the rough format), if anything at all, and additionally that discourages developers from shipping a mess. Then parsers (not just those in viewers, but anything that does processing) have a better chance to read and interpret those documents consistently, so even more things work predictably.
Sure, authoring tools should help authors avoid mistakes and produce valid content. But the browser is a tool for the consumer of content, and there is no benefit for the user if it fails to render some existing pages.
It is like Windows jumping through hoops to support backwards compatibility even with buggy software. The interest of the customer is that the software runs.
HTML5 was the answer for the consistency part: where before browsers did different things to recover from "invalid" HTML, HTML5 standardizes it because it doesn't care about valid/invalid as much; it just describes behavior anyway.
I used to run an RSS feed consolidator, badly formed XML was the bane of my life for a very long time.
If devs couldn't even get RSS right, a web built on XHTML was a nonstarter.
XHTML is XML. XML-based markup for content can be typeset into PDF, suitable for print media. I invite you to check out the PDFs listed in the intro to my feature matrix comparison page, all being sourced from XHTML:
https://keenwrite.com/blog/2025/09/08/feature-matrix/
> XHTML. Have you ever read the parsing rules for HTML 5, where the semantics for bad HTML were formalized?
I actually have, and it's not that bad.
If anything, the worst part is foreign content (SVG, MathML), which has different rules - more similar to XML, but also not quite the same as XML.
Just as an aside, browsers still support XHTML: just serve it with the application/xhtml+xml MIME type and it all works, including aggressive error checking. This is very much a situation where consumers are voting with their feet, not browser vendors forcing a choice.
> - XHTML. Have you ever read the parsing rules for HTML 5, where the semantics for bad HTML were formalized? Browsers should just punt at the first error, display an error message, and render the rest of the page in Times Roman. Would it kill people to have to close their tags properly?
IMO there's a place for XHTML as a generated output format, but I think HTML itself should stay easy to author and lightweight as a markup format. Specifically when it comes to tag omission, if I'm writing text I don't want to see a bunch of `</li>` or `</p>` everywhere. It's visual noise, and I just want a lightweight markup.
+1 Copland
BeOS. I like to daydream about an alternate reality where it was acquired by Sony, and used as the foundation for PlayStation, Sony smartphones, and eventually a viable alternative to Windows on their Vaio line.
Neal Stephenson, https://web.stanford.edu/class/cs81n/command.txt :
> Imagine a crossroads where four competing auto dealerships are situated… (Apple) sold motorized vehicles--expensive but attractively styled cars with their innards hermetically sealed, so that how they worked was something of a mystery.
> (Microsoft) is much, much bigger… the big dealership came out with a full-fledged car: a colossal station wagon (Windows 95). It had all the aesthetic appeal of a Soviet worker housing block, it leaked oil and blew gaskets, and it was an enormous success.
> On the other side of the road… (Be, Inc.) is selling fully operational Batmobiles (the BeOS). They are more beautiful and stylish even than the Euro-sedans, better designed, more technologically advanced, and at least as reliable as anything else on the market--and yet cheaper than the others.
> … and Linux, which is right next door, and which is not a business at all. It's a bunch of RVs, yurts, tepees, and geodesic domes set up in a field and organized by consensus. The people who live there are making tanks.
It would be years before OS X could handle things that wouldn't cause BeOS to break a sweat, and BeOS still has a bit of a responsiveness edge that OS X can't seem to match (probably due to the PDF rendering layer).
Somehow I feel C# has become the right successor to Modula-3 ideas, even if it has taken 25 years to get there.
GCC nowadays has Modula-2 as official frontend, not sure how much it will get used though.
XHTML, yep I miss it, was quite into it back then.
In addition to Photon, I would say QNX itself (the desktop OS). I ran QNX 6 Neutrino on my PIII 450 back in the day, and the experience was so much better than every other mainstream OS on the market. The thing that blew me away was how responsive the desktop was while multitasking, something Linux struggled with even decades later.
Similarly, I'm also gutted that the QNX 1.44MB demo floppy didn't survive past the floppy era - they had some really good tech there. Imagine if they'd pitched it as a rescue/recovery OS for PCs; you could've run it entirely from the UEFI. Or, say, as an OS for smart TVs and other consumer smart devices.
> IBM MicroChannel. Early minicomputer and microcomputer designers thought "bus", where peripherals can talk to memory and peripherals look like memory to the CPU. Mainframes, though, had "channels", simple processors which connected peripherals to the CPU.
TIL: what microchannel meant by micro and channel.
Also it had OS independent device-class drivers.
And you could stuff a new CPU on a card and pop it right in. Went from a 286+2MB to a 486dx2+32MB.
The Word Lens team was bought by Google; it's far better in Google Translate than the local app ever was. You could recreate the old app with a local LLM now pretty easily, but it still wouldn't match the quality of using Google Translate.
> word lens
I don't know if you know it, but that's a feature of Google Lens.
CICS is still going strong as part of z/OS. There are industries where green-screen, mainframe terminal apps still rule, and CICS is driving them.
CICS seems perfectly fine in problem spaces where requirements change slowly enough that one can trade development time for reliability (read: finance and insurance).
I love this mismatched list of grievances and I find myself agreeing with most of them. XHTML and proper CPU hypervisors in particular.
People being too lazy to close the <br /> tag was apparently a gateway drug into absolute mayhem. Modern HTML is a cesspool. I would hate to have to write a parser that's tolerant enough to deal with all the garbage people throw at it. Is that part of the reason why we have so few browsers?
> People being too lazy to close the <br /> tag was apparently a gateway drug into absolute mayhem.
Your chronology is waaaaaaaaaaaay off.
<BR> came years before XML was invented. It was a tag that didn’t permit children, so writing it <BR></BR> would have been crazy, and inventing a new syntax like <BR// or <BR/> would have been crazy too. Spelling it <BR> was the obvious and reasonable choice.
The <br /> or <br/> spelling was added to HTML after XHTML had already basically lost, as a compatibility measure for porting back to HTML, since those enthusiastic about XHTML had taken to writing it and it was nice having a compatible spelling that did the same in both. (In XHTML you could also write <br></br>, but that was incorrect in HTML; and if you wrote <br /> in HTML it was equivalent to <br /="">, giving you one attribute with name "/" and value "". There were a few growing pains there, such as how <input checked> used to mean <input checked="checked">—it was actually the attribute name that was being omitted, not the value!—except… oh why am I even writing this, messy messy history stuff, engines doing their own thing blah blah blah, these days it’s <input checked="">.)
Really, the whole <… /> thing is more an artefact of an arguably-misguided idea after a failed reform. The absolute mayhem came first, not last.
> I would hate to have to write a parser that's tolerant enough to deal with all the garbage people throw at it.
The HTML parser spec is magnificent, by far the best spec for something reasonably-sized that I know of. It’s exhaustively defined in terms of state machines. It’s huge, far larger than one would like it to be because of all this compatibility stuff, but genuinely easy to implement if you have the patience. Seriously, go read it some time, it’s really quite approachable.
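To give a flavour of what that looks like: the tokenizer is written as named states that consume one character at a time and spell out exactly what to emit, including on errors. Here's a toy sketch of that structure in Python - it covers only a tiny fragment and is nowhere near conforming, just an illustration of the shape:

```python
# Toy sketch of the state-machine style the HTML parsing spec uses.
# Covers only a tiny fragment (data / tag-open / tag-name states); not
# remotely conforming -- just an illustration of the structure.
def tokenize(html):
    DATA, TAG_OPEN, TAG_NAME = "data", "tag open", "tag name"
    state, buf, tokens = DATA, "", []
    for ch in html:
        if state == DATA:
            if ch == "<":
                if buf:
                    tokens.append(("text", buf))
                buf, state = "", TAG_OPEN
            else:
                buf += ch
        elif state == TAG_OPEN:
            if ch.isalpha():
                buf, state = ch.lower(), TAG_NAME
            else:
                # recovery rule: a lone "<" is just text, keep going
                tokens.append(("text", "<" + ch))
                buf, state = "", DATA
        elif state == TAG_NAME:
            if ch == ">":
                tokens.append(("tag", buf))
                buf, state = "", DATA
            else:
                buf += ch.lower()
    if state == DATA and buf:
        tokens.append(("text", buf))
    return tokens

# "a < b" survives as text; "<B>" becomes a lowercase tag token.
print(tokenize("a < b, <B>bold"))
```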
> The <br /> or <br/> spelling was added to HTML after XHTML had already basically lost
This is untrue. This is the first public draft of XHTML from 1998:
> Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />.
— https://www.w3.org/TR/1998/WD-html-in-xml-19981205/#guidelin...
Not really. HTML5 parsing is very well documented and quite easy compared to all the other things a browser needs.
CICS and HATS are perhaps the most annoying pieces of technology I’ve ever encountered.
The reason XHTML failed is that the spec required it to be sent with a new MIME type (application/xhtml+xml, I believe) which no webserver did out of the box. Everything defaulted to text/html, which all browsers would interpret as HTML, and given the mismatching doctype, would interpret as tag soup (quirks mode/lenient).
Meanwhile, local files with the doctype would be treated as XHTML, so people assumed the doctype was all you needed. So everyone who tried to use XHTML didn't realize that it would go back to being read as HTML when they upload it to their webserver/return it from PHP/etc. Then, when something went wrong/worked differently than expected, the author would blame XHTML.
Edit: I see that I'm getting downvoted here; if any of this is factually incorrect I would like to be educated please.
> The reason XHTML failed is that the spec required it to be sent with a new MIME type (application/xhtml+xml, I believe) which no webserver did out of the box. Everything defaulted to text/html, which all browsers would interpret as HTML, and given the mismatching doctype, would interpret as tag soup (quirks mode/lenient).
None of that is correct.
It was perfectly spec. compliant to label XHTML as text/html. The spec. that covers this is RFC 2854 and it states:
> The text/html media type is now defined by W3C Recommendations; the latest published version is [HTML401]. In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.
— https://datatracker.ietf.org/doc/html/rfc2854
There’s no spec. that says you need to parse XHTML served as text/html as HTML not XHTML. As the spec. says, text/html covers both HTML and XHTML. That’s something that browsers did but had no obligation to.
The mismatched doctype didn’t trigger quirks mode. Browsers don’t care about that. The prologue could, but XHTML 1.0 Appendix C told you not to use that anyway.
Even if it did trigger quirks mode, that makes no difference in terms of tag soup. Tag soup is when you mis-nest tags, for instance <strong><em></strong></em>. Quirks mode was predominantly about how it applied CSS layout. There are three different concepts being mixed up here: being parsed as HTML, parsing tag soup, and doctype switching.
The problem with serving application/xhtml+xml wasn’t anything to do with web servers. The problem was that Internet Explorer 6 didn’t support it. After Microsoft won the browser wars, they discontinued development and there was a five year gap between Internet Explorer 6 and 7. Combined with long upgrade cycles and operating system requirements, this meant that Internet Explorer 6 had to be supported for almost 15 years globally.
Obviously, if you can’t serve XHTML in a way browsers will parse as XML for a decade and a half, this inevitably kills XHTML.
Okay, I guess I got a fair bit of the details wrong. However, there's one detail I want to push back on:
> In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.
If you read this carefully, you'll see that it's not saying that text/html can be used to label XHTML. It's saying that you can use text/html if you write your XHTML in such a way that it's compatible with HTML 4.01, because the browser will parse and interpret it as HTML.
You're correct that the doctype wasn't the reason it was treated as tag soup. It was instead because of the parts of XHTML that are not directly compatible with HTML 4.01.
The mismatch between local files and websites served as text/html was very real and I experienced it myself. It's curious that you'd think I'd make it up. There were differences in behavior, especially when JavaScript was involved (notably: Element.tagName is all-uppercase in HTML but lowercase in XHTML) and it is absolutely the case that developers like myself blamed this on XHTML.
Isn't that what the <!DOCTYPE> tag was supposed to solve?
Yes, I covered that; everyone assumed that you only needed to specify the doctype, but in practice browsers only accepted it for local files or HTTP responses with Content-Type: application/xhtml+xml. I've edited the comment to make that more explicit.