Comment by jraph

7 hours ago

> there is no such thing as invalid HTML

There is. There are things that are still considered invalid, like nesting form elements for instance.

(This doesn't take away from your argument, though, and you were focusing on the parsing aspect.)

The things that are invalid should all have defined behaviour. For example, a <label> is not allowed to contain two form controls, but is defined as applying to the first such control.
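To make the <label> case concrete, here is a rough sketch of the "first such control" rule. It assumes Python's html.parser, which is a plain tokenizer and not a spec-conformant HTML parser, and it ignores the for attribute and nested labels. The names FirstLabeledControl and labeled_controls are made up for illustration.

```python
from html.parser import HTMLParser

# Labelable elements per the HTML Standard (form-associated custom
# elements are left out of this sketch).
LABELABLE = {"button", "input", "meter", "output", "progress", "select", "textarea"}


class FirstLabeledControl(HTMLParser):
    """Records the first labelable descendant of each <label> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_label = False
        self.labeled = []  # one entry per <label>: attrs of its control, or None

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            self.in_label = True
            self.labeled.append(None)
        elif self.in_label and tag in LABELABLE and self.labeled[-1] is None:
            attr_map = dict(attrs)
            # <input type="hidden"> is not labelable, so skip it.
            if tag == "input" and (attr_map.get("type") or "").lower() == "hidden":
                return
            self.labeled[-1] = attr_map

    def handle_endtag(self, tag):
        if tag == "label":
            self.in_label = False


def labeled_controls(html):
    """Return the name attribute of the control each <label> applies to."""
    parser = FirstLabeledControl()
    parser.feed(html)
    parser.close()
    return [c.get("name") if c is not None else None for c in parser.labeled]


print(labeled_controls('<label>Name <input name="first"> <input name="second"></label>'))
```

The markup is non-conforming, but the second control is simply not the labeled one; nothing about the outcome is left undefined.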

As far as parse errors are concerned, https://html.spec.whatwg.org/multipage/parsing.html#parse-er... says:

> This specification defines the parsing rules for HTML documents, whether they are syntactically correct or not. Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.
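The nested form case mentioned earlier is one place where that error handling is spelled out. As I read the tree-construction rules, a <form> start tag seen while a form is already open is a parse error and the token is ignored, and a </form> with no open form is likewise ignored. A loose sketch of just that rule, again on top of Python's html.parser (not a conforming parser), with a boolean standing in for the spec's form element pointer and FormPointerSketch / parse_forms being hypothetical names:

```python
from html.parser import HTMLParser


class FormPointerSketch(HTMLParser):
    """Mimics only the form-nesting rule; everything else is ignored."""

    def __init__(self):
        super().__init__()
        self.form_open = False   # stand-in for the "form element pointer"
        self.forms_created = 0
        self.parse_errors = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        if self.form_open:
            # Parse error: the nested start tag is dropped, no element is created.
            self.parse_errors.append("nested <form> start tag ignored")
            return
        self.form_open = True
        self.forms_created += 1

    def handle_endtag(self, tag):
        if tag != "form":
            return
        if not self.form_open:
            # Parse error: there is no open form to close.
            self.parse_errors.append("stray </form> end tag ignored")
            return
        self.form_open = False


def parse_forms(html):
    """Return (number of form elements created, list of parse errors)."""
    parser = FormPointerSketch()
    parser.feed(html)
    parser.close()
    return parser.forms_created, parser.parse_errors


print(parse_forms("<form><form><input name=q></form></form>"))
```

So the document is invalid, yet the resulting tree is fully determined: one form, with the inner start tag and the trailing end tag discarded.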

  • > The things that are invalid should all have defined behaviour

    100% agree.

    And then I guess the philosophical question is "What's invalid when everything is defined?"

    • The idea of almost all of HTML’s errors (parsing and conformance) is that they indicate likely errors (though it’s definitely quite possible to deliberately skirt the edges, e.g. content=width=device-width,initial-scale=1).