Comment by jokoon
11 years ago
People can read text and understand it. Computers need parsers. Parsers are hard to write, and the time needed to parse a text is at least proportional to its length. Parsing is also hard to parallelize.
I'm sure the browser industry could benefit from an open, compiled HTML format; it would be so fast. I still wonder why there is no such format.
It's not about file size, though: gzip does a really great job at compressing text. It's just about making a page load faster. It's no surprise to see web browsers use so much memory: HTML is very flexible (there's nothing better), but it's fat.
That is a problem somewhat similar to RISC vs. x86. RISC has a simpler instruction set and a faster processor, but executables are much bigger and need more cache. x86 has a more complex instruction set, so it's slower, but the executables are much smaller. It's a balance to find.
I wonder if you could extend battery life by using compiled HTML. I would love to test that kind of tech on "normal" cellphones and see how it performs.
> I'm sure the browser industry could benefit from an open, compiled HTML format; it would be so fast. I still wonder why there is no such format.
Has anybody even tried making one and it just hasn't been adopted, or is this a new idea?
Microsoft already has the .chm format, so there might be a patent, but I'm no expert. I don't really know how their format works though. It might be compiled as in "obfuscated".
But I don't think there's any existing, open format like that. Plain-text HTML has the advantage of being easy to diagnose and immediately readable, but you could easily make a binary format decompilable. I guess most programmers prefer plain text because it's right before their eyes. It also sort of feels "open", but that's not what open source really means.
It's not a new idea, but when I think about it, compiled HTML is a good solution to speed up web browsers. Now, the .chm format is not what you would want, as it's more targeted towards documentation and is not extensible with CSS like HTML is now. It's an abandoned format, I guess. Lighter than PDF, I think.
By the way, when I say compiled HTML, I mean a binary version of a webpage that is already parsed. It would be a tree-structured file. The goal is to remove the parsing phase.
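To make the "already parsed, tree-structured file" idea concrete, here is a rough Python sketch. The byte layout is purely hypothetical (it is not .chm or any real format), the names `TreeBuilder`, `compile_html`, `encode` and `decode` are made up for illustration, and it ignores attributes and assumes every tag is closed. Note that loading the binary is still a linear pass over the bytes; it only skips tokenizing the markup.

```python
import struct
from html.parser import HTMLParser

# Hypothetical node layout, invented for this sketch:
#   element: b"E", u16 tag length, tag bytes, u16 child count, then children
#   text:    b"T", u32 text length, UTF-8 bytes


class TreeBuilder(HTMLParser):
    """Builds a nested (tag, children) tree; text nodes are plain str."""

    def __init__(self):
        super().__init__()
        self.root = ("#root", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = (tag, [])  # attributes are dropped in this sketch
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text
            self.stack[-1][1].append(data)


def encode(node):
    """Serialize a tree node (and its subtree) to bytes."""
    if isinstance(node, str):
        raw = node.encode("utf-8")
        return b"T" + struct.pack("<I", len(raw)) + raw
    tag, children = node
    raw = tag.encode("utf-8")
    head = b"E" + struct.pack("<H", len(raw)) + raw
    head += struct.pack("<H", len(children))
    return head + b"".join(encode(child) for child in children)


def decode(buf, pos=0):
    """Rebuild a node from bytes; returns (node, next_position)."""
    kind = buf[pos:pos + 1]
    pos += 1
    if kind == b"T":
        (size,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        return buf[pos:pos + size].decode("utf-8"), pos + size
    (size,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    tag = buf[pos:pos + size].decode("utf-8")
    pos += size
    (count,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    children = []
    for _ in range(count):
        child, pos = decode(buf, pos)
        children.append(child)
    return (tag, children), pos


def compile_html(text):
    """Parse HTML text once and return the binary tree."""
    builder = TreeBuilder()
    builder.feed(text)
    builder.close()
    return encode(builder.root)
```

The server (or a build step) would call `compile_html(page)` once, and the client would only ever call `decode(blob)`, which reads length-prefixed nodes straight into a tree. Whether that actually beats a tuned HTML parser is something you would have to measure.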
But indeed, that could be a good opportunity for big tech firms to push that format. As long as it's open it might be a big success.