Serious question: why would you ever want to not close tags? It saves a couple of key strokes, but we have snippets in our editors, so the amount of typing is the same. Closed tags allow editors like Vim or automated tools to handle the source code easier; e.g. I can type `dit` in Vim to delete the contents of a tag, something that's only possible because the tag's content is clearly delimited. It makes parsing HTML easier because there are fewer syntax rules.
I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have. HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML. Even though I don't have to, I strive to adhere to the stricter rules of closing all tags, closing self-closing tags and only using lower-case tag names.
> I never understood why the more strict rules of XML for HTML never took off
Internet Explorer failing to support XHTML at all (which also forced everyone to serve XHTML with the HTML media type and avoid incompatible syntaxes like self-closing <script />), Firefox at first failing to support progressive rendering of XHTML, a dearth of tooling to emit well-formed XHTML (remember, those were the days of PHP emitting markup by string concatenation) and the resulting fear of pages entirely failing to render (the so-called Yellow Screen of Death), and a side helping of the WHATWG cartel^W organization declaring XHTML "obsolete". It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
I think most of those are actually no longer relevant, so I still kind of hope that XHTML could have a resurgence, and that the tag-soup syntax could be finally discarded. It's long overdue.
What I never understood was why, for HTML specifically, syntax errors are such a fundamental unsolvable problem that it's essential that browsers accept bad content.
Meanwhile, in any other formal language (including JS and CSS!), the standard assumption is that syntax errors are fatal, the responsibility for fixing lies with the page author, but also that fixing those errors is not a difficult problem.
I was there, Gandalf. I was there 30 years ago. I was there when the strength of men failed.
Netscape started this. NCSA was in favor of XML style rules over SGML, but Netscape embraced SGML leniency fully and several tools of that era generated web pages that only rendered properly in Netscape. So people voted with their feet and went to the panderers. If I had a dollar for every time someone told me, “well it works in Netscape” I’d be retired by now.
> It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
Well, this is not entirely true: XML namespaces enabled attaching arbitrary data to XHTML elements in a much more elegant, orthogonal way than the half-assed solution HTML5 ended up with (the data-* attribute set), and embedding other XML applications like XForms, SVG and MathML (though I am not sure how widely supported this was at the time; some of this was backported into HTML5 anyway, in a way that later led to CVEs). But this is rather niche.
Emitting correct XHTML was not that hard. The biggest problem was that browsers supported plugins that could corrupt the whole page. If you created an XHTML webpage, you had to handle bug reports caused by poorly written plugins.
Why did markdown become popular when we already have html? Because markdown is much easier to write by hand in a simple text editor.
Original SGML was actually closer to markdown. It had various options to shorten and simplify the syntax, making it easy to write and edit by hand, while still having an unambiguous structure.
The verbose and explicit structure of xhtml makes it easier to process by tools, but more tedious for humans.
Personally I think Markdown got _really_ popular not because it is easier to write but because it is easier to read.
It’s kind of a huge deal that I can give a Markdown file of plain text content to somebody non-technical and they aren’t overwhelmed by it in raw form.
Imho the real strength of markdown is that it forces people to stick to classes instead of styling. "I want to write in red Comic Sans." "I don't care, you can't."
And markdown tables are harder to write than HTML tables. However, they are generally easier to read, unless a cell spans multiple lines.
User input data is always to be treated as suspect when it reaches the server and needs to be scanned and sanitised (if necessary) before accepting it for processing. Markdown makes this a lot easier to do and this is why it became popular.
A lot of HTML tags never have a body, so it makes no sense to close them. XML has self-closing tag syntax but it wasn't always handled well by browsers.
A p or li tag, at least when used and nested properly, logically ends where either the next one begins or the enclosing block ends. Closing li also creates the opportunity for nonsensical content inside of a list but not in any list item. Of course all of these corner cases are now well specified because people did close their tags sometimes.
> A p or li tag, at least when used and nested properly, logically ends where either the next one begins or the enclosing block ends
While this is true I’ve never liked it.
<p>blah<p>blah2</p>
Implies a closing </p> in the middle. But
<p>blah<span>blah2</p>
Does not. Obviously with the knowledge of the difference between what span and p represent I understand why but in terms of pure markup it’s always left a bad taste in my mouth. I’ll always close tags whenever relevant even if it’s not necessary.
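To spell it out, here's roughly what the parser produces for each snippet (a sketch of the resulting trees, shown in comments):

```html
<!-- A <p> cannot contain another <p>, so the second start tag closes the first: -->
<p>blah<p>blah2</p>
<!-- parses as: <p>blah</p><p>blah2</p> -->

<!-- A <span> may sit inside a <p>; the single </p> closes both open elements: -->
<p>blah<span>blah2</p>
<!-- parses as: <p>blah<span>blah2</span></p> -->
```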
> I never understood why the more strict rules of XML for HTML never took off.
Because of the vast quantity of legacy HTML content, largely.
> HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML.
WHATWG and its living standard (which the W3C took snapshots of, made changes to, and called HTML 5, 5.1, etc., to pretend it was still relevant in HTML, before finally giving up on that entirely) were a direct result of the failure of XHTML and of the idea of a clear cut between legacy HTML and the future of HTML. It was a direct reaction against the "clear cut" approach based on experience, not an opportunity to repeat its mistakes. (Instead of a clear break, HTML incorporated the "more strict rules of XML" via the XML serialization of HTML; for the applications where that approach offers value, it is available and supported, it has an object model 100% compatible with the more common form, and the two are maintained together rather than competing.)
Because I want my hand-written HTML to look more like markdown-style languages. If I close those tags it adds visual noise and makes the text harder to read.
Besides, at this point technologies like tree-sitter make editor integration a moot point: once tree-sitter knows how to parse it, the editor does too.
For the same reason css still works if you make a typo and javascript super dynamic: its a friendly interface.
Html, css and js got used so much because you could mess around and still get something to work. While other languages that people use to write “serious” applications just screamed at you for not being smart enough to know how to allocate memory correctly.
Html and css is not a competitor to C. Its more like an alternative to file formats like txt or rtf. Meant to be written by hand in a text editor to get styled pages. So easy and forgiving your mom could do it! (And did, just like everyone else in the myspace days)
I built a testing framework, and I wanted it to generate HTML reports during testing, with no post-processing report-compilation step. I wanted the HTML in real time so that if a test was cut short for any reason, from a killed job to a power failure, you'd have a readable HTML report showing where things stopped.
I could do this by just appending divs as rows without closing any of the parent divs, body or html tags.
So the more general answer, anytime you want to continuously stream html and not want to wait until the end of the document to begin rendering.
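A sketch of what such an append-only report can look like (file contents made up); the page is renderable at every point during the run:

```html
<!doctype html>
<meta charset="utf-8">
<title>Test report</title>
<!-- <html>, <head>, and <body> are all implied; nothing below ever needs closing -->
<div class=row>test_login ..... PASS</div>
<div class=row>test_checkout .. FAIL</div>
<!-- if the job dies here, everything already written is still a valid, readable page -->
```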
I would argue the stricter rules did take off, most people always close <p>, it's pretty common to see <img/> over <img>—especially from people who write a lot of React.
But.
The future of HTML will forever contain content that was first handtyped in Notepad++ in 2001 or created in Wordpress in 2008. It's the right move for the browser to stay forgiving, even if you have rules in your personal styleguide.
> I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have.
XHTML came out at a time when Internet Explorer, the most popular browser, was essentially frozen apart from security fixes, because Microsoft knew that if the web took off as a viable application platform it would threaten Windows' dominance. XHTML 1.0 Transitional was essentially HTML 4.01, except that if a page wasn't also valid XML, the spec required the browser to display a yellow "parsing error" page rather than display the content. This meant that any "working" XHTML site might not display because the page author didn't test in your browser. It also meant that any XHTML site might break at any time because a content writer used a noncompliant browser like IE 6 to write an article, or because the developers missed an edge case that produced invalid syntax.
XHTML 2.0 was a far more radical design. Because IE 6 was frozen, XHTML 2.0 was written with the expectation that no current web browser would implement it, and instead was a ground-up redesign of the web written "the right way" that would eventually entirely replace all existing web browsers. For example, forms were gone, frames were gone, and all presentational elements like <b> and <i> were gone in favor of semantic elements like <strong> and <samp> that made it possible for a page to be reasoned about automatically by a program. This required different processing from existing HTML and XHTML documents, but there was no way to differentiate between "old" and "new" documents, meaning no thought was given to adding XHTML 2.0 support to browsers that supported existing web technologies. Even by the mid-2000s, asking everyone to restart the web from scratch was obviously unrealistic compared to incrementally improving it. See here for a good overview of XHTML 2.0's failure from a web browser implementor's perspective: https://dbaron.org/log/20090707-ex-html
Imagine if you were authoring and/or editing prose directly in html, as opposed to using some CMS. You're using your writing brain, not your coding brain. You don't want to think about code.
It's still a little annoying to put <p> before each paragraph, but not by that much. By contrast, once you start adding closing tags, you're much closer to computer code.
I'm not sure if that makes sense but it's the way I think about it.
In the case of <br/> and <img/> browsers will never use the content inside of the tag, so using a closing tag doesn't make sense. The slash makes it much clearer though, so missing it out is silly.
"Self-closing tags" are not a thing in HTML5. From the HTML standard:
> On void elements, [the trailing slash] does not mark the start tag as self-closing but instead is unnecessary and has no effect of any kind. For such void elements, it should be used only with caution — especially since, if directly preceded by an unquoted attribute value, it becomes part of the attribute value rather than being discarded by the parser.
It was mainly added to HTML5 to make it easier to convert XHTML pages to HTML5. IMO using the trailing slash in new pages is a mistake. It makes it appear as though the slash is what closes the element when in reality it does nothing and the element is self-closing because it's part of a hardcoded set of void elements. See here for more information: https://github.com/validator/validator/wiki/Markup-%C2%BB-Vo...
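The unquoted-attribute hazard the standard warns about looks like this (a sketch; filenames made up):

```html
<!-- Intended: src="icon.png" plus a decorative trailing slash. -->
<img src=icon.png/>
<!-- The value is unquoted, so the slash joins it: src becomes "icon.png/" (a broken URL). -->

<img src="icon.png" />
<!-- With a quoted value the slash stands alone, and is silently ignored. -->
```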
Self-closing tags do nothing in HTML though. They are ignored. And in some cases, adding them obfuscates how browsers will actually interpret the markup, or introduces subtle differences between HTML and JSX, for example.
How does the slash make it clearer? It's totally inert, so if you try to do the same thing with a non-void tag the results will not be what you expect!
Because browsers close some tags automatically. And if your closing tag is wrong, it'll generate an empty element instead of being ignored, without even emitting a warning in the developer console. So by closing tags you're risking introducing very subtle DOM bugs.
If you want to close tags, make sure that your building or testing pipeline ensures strict validation of produced HTML.
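One concrete example of the empty-element trap, as I understand the HTML parsing rules:

```html
<div>
  </p>  <!-- no <p> is open, so the parser inserts an empty <p></p> right here -->
  </br> <!-- likewise, a stray </br> is treated as a <br> start tag -->
</div>
```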
Guess what: you're not required to open <html>, <head>, or <body> either. It all follows from SGML tag inference rules, and the rules aren't that difficult to understand. What makes them appear magical is WHATWG's verbose ad-hoc presentation of the parsing algorithm, which explicitly lists e.g. the elements that close their parents; those lists were originally captured from SGML but became unmaintained as new elements were added. This already started to happen in the very first revision after Ian Hickson's initial procedural HTML parsing description ([1]).
I'd also wish people would stop calling every element-specific behavior HTML parsers do "liberal and tag-soup"-like. Yes, WHATWG HTML does define error recovery rules, and HTML did introduce historic blunders to accommodate inline CSS and inline JS, but almost always what's being complained about are just SGML empty elements (aka HTML void elements) or tag omission (as described above) by folks not doing their homework.
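For illustration, this is a complete, conforming HTML document; <html>, <head>, and <body> are all inferred by the parser:

```html
<!doctype html>
<title>Hello</title>
<p>This whole file is a valid document, with no error recovery involved.
```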
HTML becomes pretty delightful for prototyping when you embrace this. You can open up an empty file and start typing tags with zero boilerplate. Drop in a script tag and forget about getElementById(); every id attribute already defines a JavaScript variable name directly, so go to town. Today the specs guarantee consistent behavior, so this doesn't introduce compatibility issues like it did in the bad old days of IE6. You can make surprisingly powerful stuff in a single-file application with no fluff.
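A minimal sketch of that style (all names made up); the id attributes become global variables via the window's named-access behavior:

```html
<!doctype html>
<title>Scratchpad</title>
<button id=go>Go</button>
<output id=out></output>
<script>
  // No getElementById(): elements with an id are exposed as window properties.
  go.onclick = () => { out.textContent = 'clicked'; };
</script>
```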
I just wish browsers weren't so anal about making you load things from http://localhost instead of file:// directly. Someone ought to look into fixing the security issues of file:// URLs so browsers can relax about that.
Welcome, kids, to how all web development was done 25-30 years ago. You typed up html, threw in some scripts (once JavaScript became a thing) and off you went. No CMS, no frameworks. I know a guy who wrote a fully functional client-side banking back office app in IE4 JS by posting into different frames and observing the DOM returned by the server. In 1999. Worked a treat on network speeds and workstation capabilities you literally can’t imagine today.
Things do not have to be complicated. That abstraction layer you are adding sure is elegant, but is it also necessary? Does it add more value than it consumes not just at the time of coding but throughout the entire lifecycle of the system? People have piled abstraction on top of hardware from day one, but one has to ask, if and when did we get past the point of diminishing returns? Kubernetes was supposed to be the thing that makes managing vms simple. Now there are things supposedly making managing Kubernetes simple. Maybe, just maybe, this computer-stuff is inherently complicated and we’re just adding to it by hoping all of it can eventually be made “simple”? Just look at the messages around vibe coding…
I liked learning this so much that I created a VSCode extension to enable go-to-definition, autocomplete, error checking, and type hover for single-page HTML files, so I can properly use it when I am prototyping.
> Someone ought to look into fixing the security issues of file:// URLs
If you mean full sandboxing of applications with a usable capability system, then yeah, someone ought to do that. But I wouldn't hold my breath, there's a reason why nobody did yet.
Yes, I love quickly creating tools in a single file; if the tool gets really complex I'll switch to a SvelteKit static site. I have a default CSS file I use for all of them to make it even quicker and not look so much like AI slop.
I think every dev should have a tools.TheirDomain.zzz where they put different tools they create. You can make so many static tools, and I feel like everyone creates these from time to time when they are prototyping things. There are so many free options for static hosting, and you can write bash deploy scripts so quickly with AI, so it's literally just ./deploy.sh to deploy. (I also recommend writing some reusable logic for saving to localStorage/IndexedDB so it's even nicer.)
I guess you're replying to my comment because you were triggered by my last sentence. I wasn't criticizing you specifically, but yeah, in another comment you're writing
> It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
which unfortunately reeks of exactly the kind of roundabout HTML criticism that is not so helpful IMO. We have to face the possibility that most HTML documents have already been written at this point, at least if you value text by humans.
The CVEs you're referencing are due to said historic blunders allowing inline JS or otherwise tunneling foreign syntax in markup constructs (mutation XSSs are only triggered by serialising and reparsing HTML as part of bogus sanitizer libs anyway).
If you look at past comments of mine, you'll notice I'm staunchly criticizing inline JS and CSS (should always be placed in external "resources") and go as far as saying CSS or other ad-hoc item-value syntax should not even exist when attributes already serve this purpose.
The remaining CVE is made possible by Hickson's overly liberal rules for what's allowed or needs escaping in attributes vs SGML's much stricter rules.
Netscape Navigator did, in fact, reject invalid HTML. Then along came Internet Explorer and chose “render invalid HTML dwim” as a strategy. People, my young naive self included, moaned about NN being too strict.
NN eventually switched to the tag soup approach.
XHTML 1.0 arrived in 2000, attempting to reform HTML by recasting it as an XML application. The idea was to impose XML’s strict parsing rules: well-formed documents only, close all your tags, lowercase element names, quote all attributes, and if the document is malformed, the parser must stop and display an error rather than guess. XHTML was abandoned in 2009.
When HTML5 was being drafted in 2004-onwards, the WHATWG actually had to formally specify how browsers should handle malformed markup, essentially codifying IE’s error-recovery heuristics as the standard.
The article itself falsifies this explanation; IE wasn't released until August 1995. The HTML draft specs published prior to this already specified that these tags didn't need closing; these simply weren't invalid HTML in the first place.
The oldest public HTML documentation there is, from 1991, demonstrates that <li>, <dt>, and <dd> tags don't need to be closed! And the oldest HTML DTD, from 1992, explicitly specifies that these, as well as <p>, don't need closing. Remember, HTML is derived from SGML, not XML; and SGML, unlike XML, allows for the possibility of tags with optional close. The attempt to make HTML more XML-like didn't come until later.
But not closing <p> etc has always been valid HTML. Back from SGML it was possible for closing tags to be optional (depending on the DTD), and Netscape supported this from the beginning.
Leaving out closing tags is possible when the parsing is unambiguous. E.g. <p>foo<p>bar is unambiguous because p elements do not nest, so each one is closed automatically by the next p.
The question about invalid HTML is a separate issue. E.g. you can't nest a p inside an i according to the spec, so how does a browser render that? Or a lexical error, like illegal characters in an unquoted attribute value.
This is where it gets tricky. Render anyway, skip the invalid HTML, or stop rendering with an error message? HTML did not specify what to do with invalid input, so either is legal. Browsers chose to go with the "render anyway" approach, but this led to different outputs in different browsers, since it wasn't agreed upon how to render invalid HTML.
The difference between Netscape and IE was that Netscape in more cases would skip rendering invalid HTML, where IE would always render the content.
Optional tags have always been allowed in HTML, for the simple if debatable reason (hence xhtml) that some humans still author documents by hand, knowingly skip md et al _and_ want to write as few characters as possible (I do!).
This is clear in Tim Berners-Lee's seminal, pre-Netscape "HTML Tags" document [0], through HTML 4 [4] and (as you point out) through the current living standard [5].
I didn't know that Navigator was ever strict, and here's a bit of a funny story about when I complained that they hadn't been strict...
Around 2000, I was meeting with Tim Berners-Lee, and I mentioned I'd been writing a bunch of Web utility code. He wanted to see, so I handed him some printed API docs I had with me. (He talked and read fast.)
Then I realized he was reading the editorializing in my permissive parser docs, about how browser vendors should've put a big error/warning message on the window for invalid HTML.
Which suddenly felt presumptuous of me, to be having opinions about Web standards, right in front of Tim Berners-Lee at the time.
(My thinking with the prominent warning message that every visitor would see, in mid/late-'90s, was that it would've been compelling social pressure at the time. It would imply that this gold rush dotcom or aspiring developer wasn't good at Web. Everyone was getting money in the belief that they knew anything at all about Web, with little way to evaluate how much they knew.)
Former NCSA employee here. The fuck they did. Netscape caught us out time and again for accepting SGML garbage that we didn’t handle properly. It’s a big part of why Netscape won that round of the browser wars. Such recovery then wound up in tools that generated web pages for you and it was all over but the crying. JavaScript was just the last straw. Which I tried to talk them into adopting but got no traction.
I have bad memories of Netscape 4 and IE4 (I think those were the versions) which both allowed invalid HTML but had different rules for doing it. Accidentally missed off a closing table tag once, and one browser displayed the remainder of the page, but the other didn't.
The “loose” standards of HTML led to some really awful things happening in the early web. I remember seeing, e.g.,
<large><li></large> item text
to get a bigger bullet on a list item which worked fine in Netscape but broke other browsers (and since I was on OS/2 at the time, it was an issue for me).
Really, in 2025 people should just write XHTML and better yet, shouldn’t be generating HTML by hand at all except for borderline cases not handled by their tools.
Unfortunately XHTML5 doesn't exist, and if you try to force the issue, you have to re-declare all of the non-numeric HTML entities in your own DTD (I abandoned the idea here). I'd love to use XHTML; it's just not viable anymore.
As for generating all HTML: that's simply not possible given the current state of WYSIWYG HTML editors (of open-source ones, at least).
I stopped using entities once we had UTF-8. I suppose there’s a case for the occasional < > but beyond that, I have no problem typing “‘—’” or üçě when I need to.
Some tags do require ending tags, others do not. Personally I find it hard to remember which ones, so I just close things out of caution. That way you’re always spec-correct.
The author has a point, but I object to this mischaracterization:
> XHTML, being based on XML as opposed to SGML, is notorious for being author-unfriendly due to its strictness
This strictness is a moot point. Most editors will autocomplete the closing tag for you, so it's hardly "unfriendly". Besides, if anything, closing tags are reader-friendly (which includes the author), since they make it clear when an element ends. In languages that don't have this, authors often add a comment like `// end of ...` to clarify this. The article author even acknowledges this in some of their examples ("explicit end tags added for clarity").
But there were other potential benefits of XHTML that never came to pass. A strict markup language would make documents easier to parse, and we wouldn't have ended up with the insanity of parsing modern HTML, which became standardized. This, in turn, would have made it easier to expand the language, and integrate different processors into the pipeline. Technologies like XSLT would have been adopted and improved, and perhaps we would have already had proper HTML modules, instead of the half-baked Web Components we have today. All because browser authors were reluctant to force website authors to fix their broken markup. It was a terrible tradeoff, if you ask me.
So, sure, feel free to not close HTML tags if you prefer not to, and to "educate" everyone that they shouldn't either. Just keep it away from any codebases I maintain, thank you very much.
To be fair, I don't mind not closing empty elements, such as `<img>` or `<br>`. But not closing `<p>` or `<div>` is hostile behavior, for no actual gain.
For ebook production, you need to use xhtml, the epub standard is defined that way. And it is indeed useful to be able to treat them as xml files and use xslt and xquery, etc. with them.
<p>some sentence here <img src="img.jpeg"/> <p> some other sentence.
In that example, the image could be part of the first paragraph, as it is there, or if I moved the second <p> before the <img> it would be part of the second. But if I want neither, do I not have to close the first paragraph?
I don't know what "not required" means, but it makes a difference with <p> at least in my opinion. I think the author meant that if the succeeding element is of the same type, you don't need to close the previous one.
But even then, this is not a good feature; browsers aren't the only things processing HTML content, and any number of tools, or even human readers, can get confused.
I didn’t have any rigorous training or anything, but my understanding since learning HTML way back in high school was, like the author mentions in TFA, tags like <br> and <p> can simply be used “as-is” as markup and don’t need the concept of being closed.
I write virtually zero HTML anymore, but the one time this sort of thing comes up is in writing PR descriptions in GitHub using Markdown. Sometimes I want to add a <br> or two for space. I guess I’ve never stopped to notice that I never close those tags after adding them, or wondered why in my head it makes sense not to!
Yeah, but it's better for your mental sanity. It's not just a habit: the closure reduces the mental load and helps you keep track of structure in the messy world of HTML documents. So it is actually more efficient.
It depends completely on how nested your HTML tags are.
I hand write my HTML sometimes, and in those cases it’s often very basic documents consisting of maybe an outer container div, a header and a nav with a ul of li for the navigation items and then an inner container div and maybe an article element, and then the contents are mostly p and figure elements and various level headings.
In this case, there is no mental overhead of omitting closing li and closing p at the end of the line, and I omit them because I am allowed to and it’s still readable and fine.
How does all this interact with HTML minification? IIRC you can't nest p tags but you can nest li, so does the browser just make assumptions about your intent? Is it not better to state your intent as clearly as possible, if not for the browser, then for the next developer?
Perhaps the most distinguishing characteristic of HTML5 is that it specifies exactly what to do with tag soup. The rules are worth a glance at some time, just to see how rather absurdly complicated they are to do the job of picking up the pieces of who knows how many terabytes and petabytes of garbage HTML were generated before they were codified in an attempt to remain backwards compatible with the various browsers prior to that. And then you'll understand why I'm not going to even begin to attempt to answer your question about how browsers handle various tag combinations. Instead my point is only that, with HTML5, there is in fact a very concrete answer that is no longer up to the browsers trying to each individually spackle over the various and sundry gaps in the standards.
But honestly no answer to "what does the browser do with this sort of thing" fits into an HN comment anymore. I'm glad there's a standard, but there's a better branch of the multiverse where the specification of what to do with bad HTML was written from the beginning and is much, much simpler.
I get why you may not close <img> or <br> since they don't contain anything inside, but <p> and <li> should be closed to indicate the end of the content; otherwise it shows you are mentally lazy, relying on some magic to do the work and guess what you wanted.
To each their own. In simple lists for navigation menus I always omit closing li. There is no ambiguity about what I am intending even with those closing li omitted in such simple cases.
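For example, something like this (link targets made up):

```html
<nav>
  <ul>
    <li><a href="/">Home</a>
    <li><a href="/projects">Projects</a>
    <li><a href="/contact">Contact</a>
  </ul>
</nav>
```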
<p> indicates a new paragraph. <li> indicates a new list item. Unless otherwise specified, the existing paragraph/list item continues. There's nothing magic about any of this, it's part of the HTML spec.
Laziness doesn't play a role. This isn't XML, where you need to repeat yourself over and over again, nor is it abusing a bug in the rendering logic; it's following the definitions of the markup language you're writing content in.
If you're not too familiar with the HTML language then it's always a safe bet to close your tags, of course.
There are ways for not closing HTML tags to backfire in some scenarios.
Some rules of thumb, perhaps:
— Do not omit if it is a template and another piece of HTML is included in or after this tag. (The key fact, as always, is that we all make errors sometimes—and omitting a closing tag can make an otherwise small markup error turn your tree into an unrecognisable mess.)
— Remember, the goal in the first place is readability and improved SNR. Use it only if you already respect legibility in other ways, especially the lower-hanging fruit like consistent use of indentation.
— Do not omit if it takes more than a split-second to get it. (Going off the HTML spec, as an example, you could have <a> and <p> as siblings in one container, and in that case if you don't close some <p> it may be non-obvious whether an <a> is phrasing or flow content.)
The last thing you want is to require the reader of your code to be more of an HTML parser than they already have to be.
For me personally this makes omitting closing tags OK only in simpler hand-coded cases with a lot of repetition, like tables, lists, definition lists (often forgotten), and obviously void elements.
The article actually argues/states that a lot of the time, not closing elements is more readable. It mentions tables without a concrete example, but I think e.g.
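a table like this (my own sketch):

```html
<table>
  <tr><td>1<td>2<td>3
  <tr><td>4<td>5<td>6
  <tr><td>7<td>8<td>9
</table>
```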
is valid and reads better than if the row and data elements were closed (and placed on separate lines, because it would be too much noise otherwise). Of course the whitespace is different, if that matters for some reason. For a 3x3 table: 5 lines vs ~15 lines.
I don’t think tables are human readable in any machine readable format, not even markdown.
The problem is when you have long cells that you'd normally word-wrap inside the cell; everything else ends up misaligned in your markup language. Or when you need to add styling to text in a cell, and suddenly it's unreadable again. Or when there are more than a few columns, causing each row to word-wrap inside your IDE, etc.
I think it makes far more sense to just acknowledge that tables are going to be ugly, compose them elsewhere, and then export them to your markup language following that language’s specification strictly.
I know some (or even the official?) JavaDoc style guidelines require <p> without closing counterparts. But to me this feels the same as omitting semicolons in JS: yes, you can get away with it, but it's bad style in my opinion.
You do you in terms of taste, but the article explains very clearly that there's nothing wrong about not closing them, as they are reliably auto-closed by a well-defined standard.
I tried using XHTML when we were told, loudly and repeatedly, that it was the inevitable future. Thank god it wasn’t.
You should close your tags. It’s good hygiene. It helps IDEs help you. But trust me, you do not want the browser enforcing it at runtime, unless your idea of fun is end users getting helpful error messages like an otherwise blank screen saying “Invalid syntax”.
For fun, imagine that various browsers are not 100.00% compatible (“Inconceivable!”), so that it wasn’t possible to write HTML that every browser agreed was completely valid. Now it’s guaranteed that some of your users will get the error page, even when you’re sure your page is valid.
Conceptually, XHTML and its analogs are better. In practice, they’re much, much worse.
My experience with parsing html gathered from the wild is you're pretty much not required to do anything. "Go wild" and "have fun" seem to be the mottoes.
Works fine on every browser I've thrown at it. Even Gnome Web (WebKit) allows scrolling just fine. Sounds like a Safari bug? Maybe a content blocker interfering with the page?
Whether you can or can't omit a closing element is one thing, but it seems like a useful thing to be able to quickly determine if some content is inside or outside a tag, so why complicate things?
(This is especially relevant with "void" tags. E.g. if someone wrote "<img> hello </img>" then the "hello" is not contained in the tag. You could use the self closing syntax to make this more obvious -- Edit: That's bad advice, see below.)
The inert self closing syntax is misleading, though, because if you use it for a non-void element then whatever follows will be contained within the tag.
e.g. how do you think a browser will interpret this markup?
<div />
<img />
A lot of people (especially because JSX works this way) expect two sibling elements, but since div is not a void element its trailing slash is ignored: the div stays open and the img ends up nested inside it.
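A sketch of that rule with Python's html.parser. Note that html.parser itself would treat `<div />` as self-closing, so the `handle_startendtag` override below routes it back through `handle_starttag`, matching the HTML5 behavior; the VOID set is the spec's list of void elements:

```python
from html.parser import HTMLParser

# Void elements per the HTML standard -- the only tags that
# actually self-close. On anything else the slash is inert.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}

class TreeSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # A non-void start tag opens an element; a void tag is complete.
        self.events.append(("void" if tag in VOID else "open", tag))

    def handle_startendtag(self, tag, attrs):
        # html.parser would treat <div /> as self-closing; per the
        # HTML5 spec the slash is ignored, so treat it as a start tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

p = TreeSketch()
p.feed("<div />\n<img />")
print(p.events)  # [('open', 'div'), ('void', 'img')]
# The div is never closed, so the img sits *inside* it --
# not the two siblings a JSX reader would expect.
```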
> "Did you forget to close your p tags or is that on purpose?"
> [...]
> These are all adapted from real comments;
If that's a comment you get, write better code. It does not matter to me whether closing p-tags is mandatory or optional. If you don't do it, I don't want you working on the same code base as me.
This kind of knowledge makes for fun blog posts, but if people direct these kinds of comments at me, they're obviously just using their knowledge to patronize and lecture people.
> Browsers do not treat missing optional end tags as errors that need to be recovered from
Just because it worked on the one browser you tested it on, doesn't mean it's always worked that way, or that it will always work that way in the future...
Every browser treats html/etc differently... I've run into css issues before on Chrome for android, because I was writing using Chrome for desktop as a reference.
You'd think they should be the same because they come from the same heritage, but no...
> Just because it worked on the one browser you tested it on, doesn't mean it's always worked that way, or that it will always work that way in the future...
All browsers have worked this way for decades. It’s standard HTML that has been in widespread use since the beginning of the web. The further back you go, the more normal it was to write HTML in this style. You can see in this specification from 1992 that <p> and <li> don’t have closing tags at all:
Maybe there were obscure browsers that had bugs relating to this back in the mid 90s, but I don’t recall any from the late 90s onwards. Can you name a browser released this millennium that doesn’t understand optional closing tags?
Maybe you are not required to close; browsers are ridiculously tolerant of HTML mistakes. But do you think it would be easy to maintain that HTML file without accidentally including something in that unclosed tag? I often format with indentation in order to work with the file comfortably.
Also it annoys me when people are still closing tags with '/>'.
Did you read the article? An unclosed <p> is not a mistake that the browser is tolerating, it's 100% well-defined html5 according to spec.
If your tooling can't handle the language as defined, maybe the problem is with the tooling.
This is a very verbose and confusing article. Mixing P/LI and IMG/BR is wrong. I think the situation could be explained with two points:
1. The self-closing syntax does not exist in HTML5, and a trailing slash in a start tag is ignored. It's therefore recommended to avoid this syntax, i.e. write <br> instead of <br />. For details and a list of void elements, see https://developer.mozilla.org/en-US/docs/Glossary/Void_eleme...
2. It's not mandatory to close tags when the parser can infer where they end. E.g. a paragraph cannot contain any block-level element, so <p>a<div>b</div> is the same as <p>a</p><div>b</div>. It depends on the context, but putting an explicit end tag is usually less error-prone.
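A minimal sketch of that inference rule. The real HTML5 tree-construction algorithm is far more involved; CLOSES_P below is only a partial, assumed list of start tags that imply an end tag for an open <p>:

```python
from html.parser import HTMLParser

# Partial list of start tags that implicitly close an open <p>
# (the spec's full list in the "in body" insertion mode is longer).
CLOSES_P = {"p", "div", "ul", "ol", "table", "section", "blockquote"}

class AutoCloser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and self.stack and self.stack[-1] == "p":
            self.handle_endtag("p")          # the implied </p>
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.stack:
            self.stack.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

p = AutoCloser()
p.feed("<p>a<div>b</div>")
print("".join(p.out))  # <p>a</p><div>b</div>
```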
Putting an explicit end tag is more error-prone. It won't do anything for valid HTML, but it'll add an empty element for invalid HTML. If you want to improve human readability, put the end tag inside an HTML comment. At least that won't add empty elements.
Serious question: why would you ever want to not close tags? It saves a couple of key strokes, but we have snippets in our editors, so the amount of typing is the same. Closed tags allow editors like Vim or automated tools to handle the source code easier; e.g. I can type `dit` in Vim to delete the contents of a tag, something that's only possible because the tag's content is clearly delimited. It makes parsing HTML easier because there are fewer syntax rules.
I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have. HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML. Even though I don't have to, I strive to adhere to the stricter rules of closing all tags, closing self-closing tags and only using lower-case tag names.
> I never understood why the more strict rules of XML for HTML never took off
Internet Explorer failing to support XHTML at all (which also forced everyone to serve XHTML with the HTML media type and avoid incompatible syntaxes like self-closing <script />), Firefox at first failing to support progressive rendering of XHTML, a dearth of tooling to emit well-formed XHTML (remember, those were the days of PHP emitting markup by string concatenation) and the resulting fear of pages entirely failing to render (the so-called Yellow Screen of Death), and a side helping of the WHATWG cartel^W organization declaring XHTML "obsolete". It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
I think most of those are actually no longer relevant, so I still kind of hope that XHTML could have a resurgence, and that the tag-soup syntax could be finally discarded. It's long overdue.
What I never understood was why, for HTML specifically, syntax errors are such a fundamental unsolvable problem that it's essential that browsers accept bad content.
Meanwhile, in any other formal language (including JS and CSS!), the standard assumption is that syntax errors are fatal, the responsibility for fixing lies with the page author, but also that fixing those errors is not a difficult problem.
Why is this a problem for HTML - and only HTML?
I was there, Gandalf. I was there 30 years ago. I was there when the strength of men failed.
Netscape started this. NCSA was in favor of XML style rules over SGML, but Netscape embraced SGML leniency fully and several tools of that era generated web pages that only rendered properly in Netscape. So people voted with their feet and went to the panderers. If I had a dollar for every time someone told me, “well it works in Netscape” I’d be retired by now.
> It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
Well, this is not entirely true: XML namespaces enabled attaching arbitrary data to XHTML elements in a much more elegant, orthogonal way than the half-assed solution HTML5 ended up with (the data-* attribute set), and embedding other XML applications like XForms, SVG and MathML (though I am not sure how widely supported this was at the time; some of this was backported into HTML5 anyway, in a way that later led to CVEs). But this is rather niche.
Emitting correct XHTML was not that hard. The biggest problem was that browsers supported plugins that could corrupt whole page. If you created XHTML webpage you had to handle bug reports caused by poorly written plugins.
Why did markdown become popular when we already have html? Because markdown is much easier to write by hand in a simple text editor.
Original SGML was actually closer to markdown. It had various options to shorten and simplify the syntax, making it easy to write and edit by hand, while still having an unambiguous structure.
The verbose and explicit structure of xhtml makes it easier to process by tools, but more tedious for humans.
Personally I think Markdown got _really_ popular not because it is easier to write but because it is easier to read.
It’s kind of a huge deal that I can give a Markdown file of plain text content to somebody non-technical and they aren’t overwhelmed by it in raw form.
HTML fails that same test.
Imho the real strength of markdown is it forces people to stick to classes instead of styling. "I want to write in red Comic Sans." "I don't care, you can't."
And markdown tables are harder to write than HTML tables. However, they are generally easier to read. Unless multi line cell.
Is it really that much easier to write `<br>` and know that it isn't a problem, than just write `<br />`?
User input data is always to be treated as suspect when it reaches the server and needs to be scanned and sanitised (if necessary) before accepting it for processing. Markdown makes this a lot easier to do and this is why it became popular.
A lot of HTML tags never have a body, so it makes no sense to close them. XML has self-closing tag syntax but it wasn't always handled well by browsers.
A p or li tag, at least when used and nested properly, logically ends where either the next one begins or the enclosing block ends. Closing li also creates the opportunity for nonsensical content inside of a list but not in any list item. Of course all of these corner cases are now well specified because people did close their tags sometimes.
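The list rule can be sketched the same way. This illustrates the spec's behavior, not html.parser's own — html.parser only tokenizes, so the implied end tags are reinserted by hand here:

```python
from html.parser import HTMLParser

class ListSketch(HTMLParser):
    """Sketch of the rule that an open <li> is implicitly closed
    by the next <li> or by the end of the enclosing list."""
    def __init__(self):
        super().__init__()
        self.out = []
        self.open_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and self.open_li:
            self.out.append("</li>")         # implied by the next <li>
        if tag in ("ul", "li"):
            self.open_li = tag == "li"
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag == "ul" and self.open_li:
            self.out.append("</li>")         # implied by the list's end
            self.open_li = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

p = ListSketch()
p.feed("<ul><li>a<li>b</ul>")
print("".join(p.out))  # <ul><li>a</li><li>b</li></ul>
```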
> A p or li tag, at least when used and nested properly, logically ends where either the next one begins or the enclosing block ends
While this is true I’ve never liked it.
<p>a<p>b implies a closing </p> in the middle. But <span>a<span>b
does not. Obviously, with the knowledge of the difference between what span and p represent, I understand why, but in terms of pure markup it’s always left a bad taste in my mouth. I’ll always close tags whenever relevant even if it’s not necessary.
> XML has self-closing tag syntax but it wasn't always handled well by browsers.
So we'll add another syntax for browsers to handle.
https://xkcd.com/927/
> I never understood why the more strict rules of XML for HTML never took off.
Because of the vast quantity of legacy HTML content, largely.
> HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML.
WHATWG and its living standard that W3C took various versions of and made changes to and called it HTML 5, 5.1, etc., to pretend that they were still relevant in HTML, before finally giving up on that entirely, was a direct result of the failure of XHTML and the idea of a clear cut between legacy HTML and the future of HTML. It was a direct reaction against the “clear cut” approach based on experience, not an opportunity to repeat its mistakes. (Instead of a clear break, HTML incorporated the “more strict rules of XML” via the XML serialization for HTML; for the applications where that approach offers value, it is available and supported and has an object model 100% compatible with the more common form, and they are maintained together rather than competing.)
I'd argue XHTML did take off and was very widely adopted for the first 5-10 years of the century.
Because I want my hand-written HTML to look more like markdown-style languages. If I close those tags it adds visual noise and makes the text harder to read.
Besides, at this point technologies like tree-sitter make editor integration a moot point: once tree-sitter knows how to parse it, the editor does too.
For the same reason CSS still works if you make a typo and JavaScript is super dynamic: it's a friendly interface.
Html, css and js got used so much because you could mess around and still get something to work. While other languages that people use to write “serious” applications just screamed at you for not being smart enough to know how to allocate memory correctly.
HTML and CSS are not competitors to C. They're more like an alternative to file formats like txt or rtf. Meant to be written by hand in a text editor to get styled pages. So easy and forgiving your mom could do it! (And did, just like everyone else in the MySpace days)
I built a testing framework, and I wanted it to generate HTML reports during testing with no post-processing report-compilation step. I wanted the HTML in real time so that if a test was cut short for any reason, from a killed job to a power failure, you'd have a readable HTML report showing where things stopped. I could do this by just appending divs as rows without closing any of the parent divs, body, or html tags. So the more general answer: any time you want to continuously stream HTML and not wait until the end of the document to begin rendering.
I would argue the stricter rules did take off, most people always close <p>, it's pretty common to see <img/> over <img>—especially from people who write a lot of React.
But.
The future of HTML will forever contain content that was first handtyped in Notepad++ in 2001 or created in Wordpress in 2008. It's the right move for the browser to stay forgiving, even if you have rules in your personal styleguide.
> I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have.
XHTML came out at a time when Internet Explorer, the most popular browser, was essentially frozen apart from security fixes because Microsoft knew that if the web took off as a viable application platform it would threaten Windows' dominance. XHTML 1.0 Transitional was essentially HTML 4.01 except that if it wasn't also well-formed XML, the spec required the browser to display a yellow "parsing error" page rather than display the content. This meant that any "working" XHTML site might not display because the page author didn't test in your browser. It also meant that any XHTML site might break at any time because a content writer used a noncompliant browser like IE 6 to write an article, or because the developers missed an edge case that causes invalid syntax.
XHTML 2.0 was a far more radical design. Because IE 6 was frozen, XHTML 2.0 was written with the expectation that no current web browser would implement it, and instead was a ground-up redesign of the web written "the right way" that would eventually entirely replace all existing web browsers. For example, forms were gone, frames were gone, and all presentational elements like <b> and <i> were gone in favor of semantic elements like <strong> and <samp> that made it possible for a page to be reasoned about automatically by a program. This required different processing from existing HTML and XHTML documents, but there was no way to differentiate between "old" and "new" documents, meaning no thought was given to adding XHTML 2.0 support to browsers that supported existing web technologies. Even by the mid-2000s, asking everyone to restart the web from scratch was obviously unrealistic compared to incrementally improving it. See here for a good overview of XHTML 2.0's failure from a web browser implementor's perspective: https://dbaron.org/log/20090707-ex-html
This really does feel like a job for auto-complete slash generative-AI tools.
Imagine if you were authoring and/or editing prose directly in html, as opposed to using some CMS. You're using your writing brain, not your coding brain. You don't want to think about code.
It's still a little annoying to put <p> before each paragraph, but not by that much. By contrast, once you start adding closing tags, you're much closer to computer code.
I'm not sure if that makes sense but it's the way I think about it.
It's honestly no worse than Markdown, reST, or any of the other text-based "formats." It's just another format.
Any time I have to write Markdown I have to open a cheat sheet for reference. With HTML, which I have used for years, I just write it.
In the case of <br/> and <img/> browsers will never use the content inside of the tag, so using a closing tag doesn't make sense. The slash makes it much clearer though, so missing it out is silly.
"Self-closing tags" are not a thing in HTML5. From the HTML standard:
> On void elements, [the trailing slash] does not mark the start tag as self-closing but instead is unnecessary and has no effect of any kind. For such void elements, it should be used only with caution — especially since, if directly preceded by an unquoted attribute value, it becomes part of the attribute value rather than being discarded by the parser.
It was mainly added to HTML5 to make it easier to convert XHTML pages to HTML5. IMO using the trailing slash in new pages is a mistake. It makes it appear as though the slash is what closes the element when in reality it does nothing and the element is self-closing because it's part of a hardcoded set of void elements. See here for more information: https://github.com/validator/validator/wiki/Markup-%C2%BB-Vo...
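The attribute-value gotcha the quote mentions can be seen even with Python's standard html.parser tokenizer, whose unquoted-value handling happens to match the spec here:

```python
from html.parser import HTMLParser

class AttrDump(HTMLParser):
    """Records the attributes of the last start tag seen."""
    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        self.attrs = attrs

p = AttrDump()
p.feed("<img src=foo/>")   # slash directly after an unquoted value
print(p.attrs)             # the slash joins the value: src is "foo/"
```

The "self-closing" slash silently becomes part of the URL, which is exactly why the standard warns against it on void elements.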
Self-closing tags do nothing in HTML though. They are ignored. And in some cases, adding them obfuscates how browsers will actually interpret the markup, or introduces subtle differences between HTML and JSX, for example.
How does the slash make it clearer? It's totally inert, so if you try to do the same thing with a non-void tag the results will not be what you expect!
> why would you ever want to not close tags?
Because browsers close some tags automatically. And if your closing tag is wrong, it'll generate an empty element instead of being ignored, without even emitting a warning in the developer console. So by closing tags you're risking introducing very subtle DOM bugs.
If you want to close tags, make sure that your building or testing pipeline ensures strict validation of produced HTML.
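The rule behind this claim can be sketched as a tiny function. The names stray_end_p, open_elements, and out are made up for illustration; the real spec's "in body" rule for a stray </p> operates on a stack of open elements:

```python
def stray_end_p(open_elements, out):
    """Sketch of the HTML5 'in body' rule for </p>: if no <p> is
    open, the parser acts as if <p> had just been seen, silently
    producing an empty paragraph element."""
    if "p" not in open_elements:
        out.append("<p></p>")                # implied empty element
    else:
        while open_elements[-1] != "p":      # pop back to the open <p>
            open_elements.pop()
        open_elements.pop()
        out.append("</p>")
    return out

# A mistyped </p> in otherwise-valid markup quietly adds a node:
print(stray_end_p(open_elements=[], out=["<div>x</div>"]))
# ['<div>x</div>', '<p></p>']
```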
Guess what: you're not required to open <html>, <head>, or <body> either. It all follows from SGML tag inference rules, and the rules aren't that difficult to understand. What makes them appear magical is WHATWG's verbose ad-hoc parsing-algorithm presentation, explicitly listing e.g. elements that close their parents, originally captured from SGML but left unmaintained as new elements were added. This already started to happen in the very first revision after Ian Hickson's initial procedural HTML parsing description ([1]).
I'd also wish people would stop calling every element-specific behavior HTML parsers do "liberal and tag-soup"-like. Yes, WHATWG HTML does define error recovery rules, and HTML has introduced historic blunders to accommodate inline CSS and inline JS, but almost always what's being complained about are just SGML empty elements (aka HTML void elements) or tag omission (as described above) by folks not doing their homework.
[1]: https://sgmljs.sgml.net/docs/html5.html#tag-omission (see also XML Prague 2017 proceedings pp. 101ff)
HTML becomes pretty delightful for prototyping when you embrace this. You can open up an empty file and start typing tags with zero boilerplate. Drop in a script tag and forget about getElementById(); every id attribute already defines a JavaScript variable name directly, so go to town. Today the specs guarantee consistent behavior, so this doesn't introduce compatibility issues like it did in the bad old days of IE6. You can make surprisingly powerful stuff in a single-file application with no fluff.
I just wish browsers weren't so anal about making you load things from http://localhost instead of file:// directly. Someone ought to look into fixing the security issues of file:// URLs so browsers can relax about that.
Welcome, kids, to how all web development was done 25-30 years ago. You typed up html, threw in some scripts (once JavaScript became a thing) and off you went. No CMS, no frameworks. I know a guy who wrote a fully functional client-side banking back office app in IE4 JS by posting into different frames and observing the DOM returned by the server. In 1999. Worked a treat on network speeds and workstation capabilities you literally can’t imagine today.
Things do not have to be complicated. That abstraction layer you are adding sure is elegant, but is it also necessary? Does it add more value than it consumes not just at the time of coding but throughout the entire lifecycle of the system? People have piled abstraction on top of hardware from day one, but one has to ask, if and when did we get past the point of diminishing returns? Kubernetes was supposed to be the thing that makes managing vms simple. Now there are things supposedly making managing Kubernetes simple. Maybe, just maybe, this computer-stuff is inherently complicated and we’re just adding to it by hoping all of it can eventually be made “simple”? Just look at the messages around vibe coding…
Love the single file html tool paradigm! See https://simonwillison.net/2025/Dec/10/html-tools/
Opus and I have made a couple of really cool internal tools for work. It's really great.
A workaround for the file:// security deny is to use a JavaScript file for data (initialized array) rather than something more natural like JSON.
Apparently JavaScript got grandfathered in as ok for direct access!
Wow, I had never heard of that ID -> variable feature
I liked learning this so much that I created a VSCode extension to enable go-to clicking, autocomplete, error checking, and type hover for single-page HTML files, so I can properly use it when I am prototyping.
https://marketplace.visualstudio.com/items?itemName=carsho.h...
> Someone ought to look into fixing the security issues of file:// URLs
If you mean full sandboxing of applications with a usable capability system, then yeah, someone ought to do that. But I wouldn't hold my breath, there's a reason why nobody did yet.
Yes, I love quickly creating tools in a single file; if the tool gets really complex I'll switch to a SvelteKit static site. I have a default CSS file I use for all of them to make it even quicker and not look so much like AI slop.
I think every dev should have a tools.TheirDomain.zzz where they put different tools they create. You can make so many static tools, and I feel like everyone creates these from time to time when they are prototyping things. There are so many free options for static hosting, and you can write bash deploy scripts so quickly with AI that it's literally just ./deploy.sh to deploy. (I also recommend writing some reusable logic for saving to localStorage/IndexedDB so it's even nicer.)
Mine for example is https://tools.carsho.dev (100% offline/static tools, no monetization)
What are the security issues of file:// URLs?
This is what I complain about:
https://nvd.nist.gov/vuln/detail/CVE-2020-26870
https://sirre.al/2025/08/06/safe-json-in-script-tags-how-not...
https://bughunters.google.com/blog/5038742869770240/escaping...
None of those problems exist in XHTML.
I guess you're replying to my comment because you were triggered by my last sentence. I wasn't criticizing you specifically, but yeah, in another comment you're writing
> It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
which unfortunately reeks of exactly the kind of roundabout HTML criticism that is not so helpful IMO. We have to face the possibility that most HTML documents have already been written at this point, at least if you value text by humans.
The CVEs you're referencing are due to said historic blunders allowing inline JS or otherwise tunneling foreign syntax in markup constructs (mutation XSSs are only triggered by serialising and reparsing HTML as part of bogus sanitizer libs anyway).
If you look at past comments of mine, you'll notice I'm staunchly criticizing inline JS and CSS (should always be placed in external "resources") and go as far as saying CSS or other ad-hoc item-value syntax should not even exist when attributes already serve this purpose.
The remaining CVE is made possible by Hickson's overly liberal rules for what's allowed or needs escaping in attributes vs SGML's much stricter rules.
Omitting <body> can lead to weird surprises. I once had some JavaScript mysteriously breaking because document.body was null during inline execution.
Since then I always write <body> explicitly even though it is optional.
Go back a bit further for why.
Netscape Navigator did, in fact, reject invalid HTML. Then along came Internet Explorer, which chose “render invalid HTML, do what I mean” as a strategy. People, my young naive self included, moaned about NN being too strict. NN eventually switched to the tag soup approach. XHTML 1.0 arrived in 2000, attempting to reform HTML by recasting it as an XML application. The idea was to impose XML’s strict parsing rules: well-formed documents only, close all your tags, lowercase element names, quote all attributes, and if the document is malformed, the parser must stop and display an error rather than guess. XHTML was abandoned in 2009. When HTML5 was being drafted in 2004-onwards, the WHATWG actually had to formally specify how browsers should handle malformed markup, essentially codifying IE’s error-recovery heuristics as the standard.
The article itself falsifies this explanation; IE wasn't released until August 1995. The HTML draft specs published prior to this already specified that these tags didn't need closing; these simply weren't invalid HTML in the first place.
The oldest public HTML documentation there is, from 1991, demonstrates that <li>, <dt>, and <dd> tags don't need to be closed! And the oldest HTML DTD, from 1992, explicitly specifies that these, as well as <p>, don't need closing. Remember, HTML is derived from SGML, not XML; and SGML, unlike XML, allows for the possibility of tags with optional close. The attempt to make HTML more XML-like didn't come until later.
But not closing <p> etc has always been valid HTML. Back from SGML it was possible for closing tags to be optional (depending on the DTD), and Netscape supported this from the beginning.
Leaving out closing tags is possible when the parsing is unambiguous. E.g. <p>foo<p>bar is unambiguous because p elements do not nest, so they are closed automatically by the next p.
The question of invalid HTML is a separate issue. E.g. you can’t nest a p inside an i according to the spec, so how does a browser render that? Or a lexical error like illegal characters in a non-quoted attribute value?
This is where it gets tricky. Render anyway, skip the invalid HTML, or stop rendering with an error message? HTML did not specify what to do with invalid input, so either is legal. Browsers chose to go with the “render anyway” approach, but this led to different outputs in different browsers, since it wasn’t agreed upon how to render invalid HTML.
The difference between Netscape and IE was that Netscape in more cases would skip rendering invalid HTML, where IE would always render the content.
Optional tags have always been allowed in HTML, for the simple if debatable reason (hence XHTML) that some humans still author documents by hand, knowingly skip Markdown et al _and_ want to write as few characters as possible (I do!).
This is clear in Tim Berners-Lee's seminal, pre-Netscape "HTML Tags" document [0], through HTML 4 [4] and (as you point out) through the current living standard [5].
[0] https://www.w3.org/History/19921103-hypertext/hypertext/WWW/...
[4] https://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.1
[5] https://html.spec.whatwg.org/multipage/syntax.html#optional-...
NN did not reject invalid HTML. It could not incrementally render tables, while IE could. That's all.
Because table layout was common, a missing </table> was a common error that resulted in a blank page in NN. That was a completely unintentional bug.
Optional closing tags were inherited from SGML, and were always part of HTML. They're not even an error.
I didn't know that Navigator was ever strict. And a bit of a funny story about when I complained that they hadn't been strict...
Around 2000, I was meeting with Tim Berners-Lee, and I mentioned I'd been writing a bunch of Web utility code. He wanted to see, so I handed him some printed API docs I had with me. (He talked and read fast.)
Then I realized he was reading the editorializing in my permissive parser docs, about how browser vendors should've put a big error/warning message on the window for invalid HTML.
Which suddenly felt presumptuous of me, to be having opinions about Web standards, right in front of Tim Berners-Lee at the time.
(My thinking with the prominent warning message that every visitor would see, in mid/late-'90s, was that it would've been compelling social pressure at the time. It would imply that this gold rush dotcom or aspiring developer wasn't good at Web. Everyone was getting money in the belief that they knew anything at all about Web, with little way to evaluate how much they knew.)
Former NCSA employee here. The fuck they did. Netscape caught us out time and again for accepting SGML garbage that we didn’t handle properly. It’s a big part of why Netscape won that round of the browser wars. Such recovery then wound up in tools that generated web pages for you and it was all over but the crying. JavaScript was just the last straw. Which I tried to talk them into adopting but got no traction.
I have bad memories of Netscape 4 and IE4 (I think those were the versions) which both allowed invalid HTML but had different rules for doing it. Accidentally missed off a closing table tag once, and one browser displayed the remainder of the page, but the other didn't.
You are also not required to indent code (in most languages); please do if you want me to read it though.
You can also indent with spaces and tabs at the same time, who's judging?
Not in python, which is how I always discover someone is using tabs ..
You monster.
Closing optional HTML tags just adds more ambiguity. How many HTMLParagraphElements here, what do you think?
2. And there’s no ambiguity there, just invalid HTML because paragraphs aren’t nestable.
Why would you nest paragraph tags?
That is invalid syntax. Only phrasing content is allowed in the p element (https://developer.mozilla.org/en-US/docs/Web/HTML/Guides/Con...)
This is invalid HTML; a p tag can't be nested in a p tag.
HTML is more content than programming logic, so it shouldn't be indented.
The “loose” standards of HTML led to some really awful things happening in the early web. I remember seeing, e.g.,
to get a bigger bullet on a list item which worked fine in Netscape but broke other browsers (and since I was on OS/2 at the time, it was an issue for me).
Really, in 2025 people should just write XHTML and better yet, shouldn’t be generating HTML by hand at all except for borderline cases not handled by their tools.
Unfortunately XHTML5 doesn't exist, and if you try to force the issue, you have to re-declare all of the non-numeric HTML entities in your own DTD (I abandoned the idea here). I'd love to use XHTML, it's just not viable anymore.
As for generating all HTML, that's simply not possible given the current state of (open-source, at least) WYSIWYG HTML editors.
> Unfortunately XHTML5 doesn't exist
This is a mirage, apparently: https://html.spec.whatwg.org/multipage/xhtml.html
I stopped using entities once we had UTF-8. I suppose there’s a case for the occasional < > but beyond that, I have no problem typing “‘—’” or üçě when I need to.
I wish HTML 6 would actually be XHTML 6.
Early Google did not close their tags, I think it was for the sake of payload size?
That said, your linter is going to drive you crazy if you don't close tags, no?
How the mighty have fallen. A search on YouTube now pulls in 2.14 MB of HTML alone.
This is more of a problem with linters.
Yea but it feels gross when I don't.
You just confessed to having taste.
To see the actual errors, just paste your HTML here and see: https://emilstenstrom.github.io/justhtml/playground/ - any parsing errors show up below the input box.
Some tags do require ending tags, others do not. Personally I find it hard to remember which ones, so I just close things out of caution. That way you’re always spec-correct.
The author has a point, but I object to this mischaracterization:
> XHTML, being based on XML as opposed to SGML, is notorious for being author-unfriendly due to its strictness
This strictness is a moot point. Most editors will autocomplete the closing tag for you, so it's hardly "unfriendly". Besides, if anything, closing tags are reader-friendly (which includes the author), since they make it clear when an element ends. In languages that don't have this, authors often add a comment like `// end of ...` to clarify this. The article author even acknowledges this in some of their examples ("explicit end tags added for clarity").
But there were other potential benefits of XHTML that never came to pass. A strict markup language would make documents easier to parse, and we wouldn't have ended up with the insanity of parsing modern HTML, which became standardized. This, in turn, would have made it easier to expand the language, and integrate different processors into the pipeline. Technologies like XSLT would have been adopted and improved, and perhaps we would have already had proper HTML modules, instead of the half-baked Web Components we have today. All because browser authors were reluctant to force website authors to fix their broken markup. It was a terrible tradeoff, if you ask me.
So, sure, feel free to not close HTML tags if you prefer not to, and to "educate" everyone that they shouldn't either. Just keep it away from any codebases I maintain, thank you very much.
To be fair, I don't mind not closing empty elements, such as `<img>` or `<br>`. But not closing `<p>` or `<div>` is hostile behavior, for no actual gain.
Img and br are not allowed to be closed.
Worse, due to the aforementioned permissive error handling in HTML parsers, a closing </br> tag will end up inserting a second line break.
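A minimal fragment illustrating the claim (an assumed example, not from the comment):

```html
<!-- Per HTML5 error handling, a stray </br> end tag is treated
     as if it were a <br> start tag, so this renders TWO line breaks: -->
first line<br></br>second line
```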
You close them in the same tag:
Wrong.[1]
> if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/)
If you're going to be pedantic, at least be correct about it.
[1]: https://html.spec.whatwg.org/multipage/syntax.html#start-tag...
For ebook production, you need to use xhtml, the epub standard is defined that way. And it is indeed useful to be able to treat them as xml files and use xslt and xquery, etc. with them.
Just because you can, doesn't mean you should.
<p>some sentence here <img src="img.jpeg"/> <p> some other sentence.
In that example, the image could be part of the first paragraph, as it is there, or if I moved the second <p> before the <img> it would be part of the second. But if I want neither, do I not have to close the first paragraph?
Here is a demo of what i mean on this random html paste site: https://htmlbin.online/closetagdemo
I don't know what "not required" means, but it makes a difference with <p> at least in my opinion. I think the author meant that if the succeeding element is of the same type, you don't need to close the previous one.
But even then, this is not a good feature: browsers aren't the only things processing HTML content; any number of tools, or even human readers, can get confused.
<p> was the only thing I knew when I constructed Finland's First Ever WEB-page in 1992.
Correction: there was also the issue of Ä and Ö. Those were &AUML; and &OUML; I think.
https://timonoko.github.io/alaska/index.htm
On 640x480 VGA-display this page looks quite alright.
I didn’t have any rigorous training or anything, but my understanding since learning HTML way back in high school was, like the author mentions in TFA, tags like <br> and <p> can simply be used “as-is” as markup and don’t need the concept of being closed.
I write virtually zero HTML anymore, but the one time this sort of thing comes up is in writing PR descriptions in GitHub using Markdown. Sometimes I want to add a <br> or two for space. I guess I’ve never stopped to notice that I never close those tags after adding them, or wondered why in my head it makes sense not to!
Yeah but it's better for your mental sanity. It's not just a habit, the closure reduces the mental load and helps to keep track of structure in the messy world of html documents. So it is actually more efficient
It depends completely on how nested your HTML tags are.
I hand write my HTML sometimes, and in those cases it’s often very basic documents consisting of maybe an outer container div, a header and a nav with a ul of li for the navigation items and then an inner container div and maybe an article element, and then the contents are mostly p and figure elements and various level headings.
In this case, there is no mental overhead of omitting closing li and closing p at the end of the line, and I omit them because I am allowed to and it’s still readable and fine.
Please just close them though.
I always thought this was funny. Practical but not the usual open/close order.
I just tried this!
Which works and is much cleaner than the usual table tag soup. It makes sense as on their own <td> and <tr> tags have no meaning.
You don't need to close these tags. But if you value your sanity more than saving a few bytes, you as well may close them, it's not an error.
I use this daily with local HTML files, specifically not closing <li> tags
For example, I generate numbered lists of URLs something like
This is for text-only browser
If I am viewing in graphical browser I wrap the lists in <pre> tags
I don't think I've ever closed <br> tags
Solid.js and Vue use these sorts of tricks to shave bytes off the HTML templates they generate
https://github.com/vuejs/core/commit/30dbdc101a48d747b56bcdd...
How does all this interact with HTML minification? IIRC you can't nest p tags but you can nest li, so does the browser just make assumptions about your intent? Is it not better to state your intent as clearly as possible, if not for a browser, then for the next developer?
Payload size is a moot point given gzip.
Perhaps the most distinguishing characteristic of HTML5 is that it specifies exactly what to do with tag soup. The rules are worth a glance at some time, just to see how rather absurdly complicated they are to do the job of picking up the pieces of who knows how many terabytes and petabytes of garbage HTML were generated before they were codified in an attempt to remain backwards compatible with the various browsers prior to that. And then you'll understand why I'm not going to even begin to attempt to answer your question about how browsers handle various tag combinations. Instead my point is only that, with HTML5, there is in fact a very concrete answer that is no longer up to the browsers trying to each individually spackle over the various and sundry gaps in the standards.
But honestly no answer to "what does the browser do with this sort of thing" fits into an HN comment anymore. I'm glad there's a standard, but there's a better branch of the multiverse where the specification of what to do with bad HTML was written from the beginning and is much, much simpler.
You cannot nest <li>. An <li> may only be a child of <ol>, <ul> or <menu>.
https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...
I get why you may not close <img> or <br> since they don't contain anything inside, but <p> and <li> should be closed to indicate the end of the content; otherwise it shows you are mentally lazy and relying on some magic to do the work and guess what you wanted
To each their own. In simple lists for navigation menus I always omit closing li. There is no ambiguity on what I am intending even with those closing li omitted in such simple cases:
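For instance, a menu along these lines (a hypothetical sketch with made-up hrefs) stays unambiguous with every closing </li> omitted:

```html
<nav>
  <ul>
    <!-- Each new <li> auto-closes the previous list item,
         and </ul> closes the last one. -->
    <li><a href="/">Home</a>
    <li><a href="/about">About</a>
    <li><a href="/contact">Contact</a>
  </ul>
</nav>
```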
<p> indicates a new paragraph. <li> indicates a new list item. Unless otherwise specified, the existing paragraph/list item continues. There's nothing magic about any of this, it's part of the HTML spec.
Laziness doesn't play a role. This isn't XML, where you need to repeat yourself over and over again; nor is it abusing a bug in the rendering logic. It's following the definition of the markup language you're writing content in.
If you're not too familiar with the HTML language then it's always a safe bet to close your tags, of course.
The personal judgement isn't really helpful here?
If you don't close your <p> and <li> tags, you risk accidentally having content in the wrong place.
It's something to avoid because it can have bad consequences, not because it (somehow?) makes you a bad person.
There are ways for not closing HTML tags to backfire in some scenarios.
Some rules of thumb, perhaps:
— Do not omit if it is a template and another piece of HTML is included in or after this tag. (The key fact, as always, is that we all make errors sometimes—and omitting a closing tag can make an otherwise small markup error turn your tree into an unrecognisable mess.)
— Remember, the goal in the first place is readability and improved SNR. Use it only if you already respect legibility in other ways, especially the lower-hanging fruit like consistent use of indentation.
— Do not omit if it takes more than a split-second to get it. (Going off the HTML spec, as an example, you could have <a> and <p> as siblings in one container, and in that case if you don’t close some <p> it may be non-obvious if an <a> is phrasing or flow content.)
The last thing you want is to require the reader of your code to be more of an HTML parser than they already have to be.
For me personally this makes omitting closing tags OK only in simpler hand-coded cases with a lot of repetition, like tables, lists, definition lists (often forgotten), and obviously void elements.
It would be nice though, to close 'em ... Makes it more readable and less prone to mistakes.
Article actually argues/states that a lot of the times not closing elements is more readable. It mentions tables without a concrete example, but I think e.g.
is valid and reads better than if the row and data elements were closed (and on separate rows, because it would be too much noise otherwise) (of course the whitespace is different, if it matters for some reason). For a 3x3 table, that's 5 lines vs ~15 lines.
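The elided example was presumably something like this 3x3 table (reconstructed), which is valid HTML5 with every </td> and </tr> omitted:

```html
<table>
  <!-- Each <td> auto-closes the previous cell; each <tr> auto-closes
       the previous row; </table> closes whatever is still open. -->
  <tr><td>1<td>2<td>3
  <tr><td>4<td>5<td>6
  <tr><td>7<td>8<td>9
</table>
```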
I don’t think tables are human readable in any machine readable format, not even markdown.
The problem is when you have long cells that you’d normally word wrap inside the cell, everything else ends up misaligned in your markup language. Or when you need to add styling to text in a cell, suddenly it’s unreadable again. Or when there’s more than a small few number of columns thus causing each row to word wrap inside your IDE, etc
I think it makes far more sense to just acknowledge that tables are going to be ugly, compose them elsewhere, and then export them to your markup language following that language’s specification strictly.
I know some (or even the official?) Javadoc style guidelines require <p> without closing counterparts. But to me this feels the same as omitting semicolons in JS - yes, you can get away with it, but it's bad style in my opinion.
In the context of javadoc, you can think of <p> as a paragraph break, just like <br> is a line break.
I cannot fathom writing <p> without</p> to wrap a text.
You do you in terms of taste, but the article explains very clearly that there's nothing wrong about not closing them, as they are reliably auto-closed by a well-defined standard.
I feel like this is in the same vein as semicolons being "optional" in JavaScript.
It's wrong, but the engines know how to make it work anyways.
I tried using XHTML when we were told, loudly and repeatedly, that it was the inevitable future. Thank god it wasn’t.
You should close your tags. It’s good hygiene. It helps IDEs help you. But. Trust me, you do not want the browser enforcing it at runtime, lest your idea of fun is end users getting helpful error messages like an otherwise blank screen saying “Invalid syntax”.
For fun, imagine that various browsers are not 100.00% compatible (“Inconceivable!”), so that it wasn’t possible to write HTML that every browser agreed was completely valid. Now it’s guaranteed that some of your users will get the error page, even when you’re sure your page is valid.
Conceptually, XHTML and its analogs are better. In practice, they’re much, much worse.
My experience with parsing html gathered from the wild is you're pretty much not required to do anything. "Go wild" and "have fun" seem to be the mottoes.
Considering I can’t scroll down on ios safari, maybe you should…
Works fine on every browser I've thrown at it. Even Gnome Web (WebKit) allows scrolling just fine. Sounds like a Safari bug? Maybe a content blocker interfering with the page?
I find it very sad that XHTML didn't win over HTML5.
More strictness I think would've helped remove a lot of ambiguity.
Whether you can or can't omit a closing element is one thing, but it seems like a useful thing to be able to quickly determine if some content is inside or outside a tag, so why complicate things?
(This is especially relevant with "void" tags. E.g. if someone wrote "<img> hello </img>" then the "hello" is not contained in the tag. You could use the self closing syntax to make this more obvious -- Edit: That's bad advice, see below.)
The inert self closing syntax is misleading, though, because if you use it for a non-void element then whatever follows will be contained within the tag.
e.g. how do you think a browser will interpret this markup?
A lot of people think it ends up like this (especially because JSX works this way):
but it's actually equal to this:
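The elided markup was presumably along these lines (reconstructed; the exact snippet isn't preserved):

```html
<!-- Source: -->
<div/>hello

<!-- What many expect (JSX semantics): -->
<div></div>hello

<!-- What an HTML5 parser actually produces (the slash is ignored,
     so the div stays open and swallows the text): -->
<div>hello</div>
```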
Ah, that's true. I think the WHATWG discouraged the syntax, so this might be why.
This is really easy to detect though, unlike arbitrary rules on what belongs on the inside of an unclosed tag.
I absolutely agree. Also, if you want to parse HTML as XML, it's a lot more reliable having it in a known 'good' format to begin with.
This is where I always end up as well. Just because you can do something doesn’t mean you should or shouldn’t.
> "Did you forget to close your p tags or is that on purpose?"
> [...]
> These are all adapted from real comments;
If that's a comment you get, write better code. It does not matter to me whether closing p-tags is mandatory or optional. If you don't do it, I don't want you working on the same code base as me.
This kind of knowledge makes for fun blog posts, but if people direct these kinds of comments at me, they're obviously using their knowledge to just patronize and lecture people.
> Browsers do not treat missing optional end tags as errors that need to be recovered from
Just because it worked on the one browser you tested it on, doesn't mean it's always worked that way, or that it will always work that way in the future...
Every browser treats html/etc differently... I've run into css issues before on Chrome for android, because I was writing using Chrome for desktop as a reference.
You'd think they should be the same because they come from the same heritage, but no...
> Just because it worked on the one browser you tested it on, doesn't mean it's always worked that way, or that it will always work that way in the future...
All browsers have worked this way for decades. It’s standard HTML that has been in widespread use since the beginning of the web. The further back you go, the more normal it was to write HTML in this style. You can see in this specification from 1992 that <p> and <li> don’t have closing tags at all:
https://info.cern.ch/hypertext/WWW/MarkUp/Tags.html
Maybe there were obscure browsers that had bugs relating to this back in the mid 90s, but I don’t recall any from the late 90s onwards. Can you name a browser released this millennium that doesn’t understand optional closing tags?
You’re not really required to with anything in HTML or to even use HTML, you could do everything plain text if you wanted
If I enter the following:-
Should the second <p> be nested or not?
No. P elements do not nest, so this is parsed as: <p></p><p></p>
I never knew that. I don't know why. I've never seen this mentioned in a book, tutorial, or anything.
This is like saying you do not have to use semicolons in JavaScript
If I remember correctly Google's front page did this in the early days to save a few bytes.
You're technically right, it seems like they had one single unclosed <p> right at the end of the page, for the copyright footer: https://web.archive.org/web/20010510221642/http://www.google...
Literally saving four bytes.
Also true: You are not required to bathe IRL.
(But it might be better if you make a habit of doing so.)
This is ridiculous. You're genuinely applying social pressure to close tags. I remember when this began, drove me nuts.
An anonymous account on HN is applying social pressure?
Or am I pointing out that closing tags is a human social issue, with aspects ranging from practical & reasonable, to ridiculous & widely exploited?
Same with <svg> but Firefox's XML parser will not greenlight you.
I'd think handling these cases would be a nightmare for HTML parsers.
Maybe you are not required to close, browsers are ridiculously tolerant to html mistakes. But do you think it would be easy to maintain that HTML file and not accidentally include something in that unclosed tag? I often format with indentation in order to work with the file comfortably.
Also it annoys me when people are still closing tags with '/>'.
Did you read the article? An unclosed <p> is not a mistake that the browser is tolerating, it's 100% well-defined html5 according to spec. If your tooling can't handle the language as defined, maybe the problem is with the tooling.
I mean even many mistakes.
Needs a (2017) in the title.
This may have been relevant 9 years ago, but today, just pick an auto-formatter like Prettier and have it close these tags for you.
You are if I’m doing your code review.
…but civilized people do close them.
Please do, though :)
But I want to!
:)
This is a very verbose and confusing article. Mixing P/LI and IMG/BR is wrong. I think the situation could be explained with two points:
1. The self-closing syntax does not exist in HTML5, and a trailing slash after a tag name is always ignored. It's therefore recommended to avoid this syntax, i.e. write <br> instead of <br />. For details and a list of void elements, see https://developer.mozilla.org/en-US/docs/Glossary/Void_eleme...
2. It's not mandatory to close tags when the parser can infer where they end. E.g. a paragraph cannot contain any block-level element, so <p>a<div>b</div> is the same as <p>a</p><div>b</div>. It depends on the context, but putting an explicit end tag is usually less error-prone.
Putting an explicit end tag is more error-prone. It won't do anything for valid HTML, but it'll add an empty element for invalid HTML. If you want to improve human readability, put the end tag enclosed in an HTML comment. At least it won't add empty elements.