Comment by watermelon0

5 years ago

Have you actually read the article? SQLite is unmodified, and thinks it runs on a virtual file system, which fetches file chunks via HTTP range headers.

It's REALLY impressive that you only need to read 54 KB out of 700 MB, to fetch the records.
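For anyone wondering what "fetches file chunks via HTTP range headers" looks like in practice, here is a rough sketch of turning a SQLite page number into a `Range` header. It is not the article's actual code: the function names, the URL, and the 4096-byte page size are assumptions for illustration. It only relies on SQLite pages being numbered from 1 and on HTTP byte ranges being inclusive.

```python
import urllib.request


def page_range_header(page_number: int, page_size: int = 4096, page_count: int = 1) -> str:
    """Build the Range header value covering `page_count` pages starting at `page_number`.

    SQLite numbers pages from 1, so page n starts at byte (n - 1) * page_size.
    HTTP byte ranges are inclusive on both ends.
    """
    if page_number < 1 or page_count < 1 or page_size < 1:
        raise ValueError("page_number, page_count and page_size must be >= 1")
    start = (page_number - 1) * page_size
    end = start + page_count * page_size - 1
    return f"bytes={start}-{end}"


def build_page_request(url: str, page_number: int, page_size: int = 4096) -> urllib.request.Request:
    """Prepare (but do not send) a request for a single database page."""
    return urllib.request.Request(url, headers={"Range": page_range_header(page_number, page_size)})


# Hypothetical URL, for illustration only; nothing is sent over the network here.
request = build_page_request("https://example.com/data.sqlite3", page_number=2)
print(request.get_header("Range"))
```

A virtual file system along these lines would issue one such request whenever SQLite asks for a page that is not cached yet, which is why only the pages a query touches get downloaded.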

> It's REALLY impressive that you only need to read 54 KB out of 700 MB, to fetch the records.

The harsh reality is that doing sensible queries that only reference and return the data actually needed always makes things faster, even with a server DBMS. Oh, how many times have I lamented the naive "select *" for forcing a read of all the row contents even when there was index coverage for the data actually needed.
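To make the index coverage point concrete, here is a small sketch using Python's built-in `sqlite3` module. The `people` table and the index name are made up for illustration. It asks SQLite for the query plan of a narrow query and of a "select *" on the same filter.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, country TEXT, bio TEXT);
    CREATE INDEX idx_people_country_name ON people (country, name);
    """
)

# Only touches columns stored in the index, so the table rows are never read.
covered_plan = query_plan(conn, "SELECT name FROM people WHERE country = 'SI'")

# Needs every column, so each index hit is followed by a lookup of the full row.
star_plan = query_plan(conn, "SELECT * FROM people WHERE country = 'SI'")

print(covered_plan)
print(star_plan)
```

The first plan should mention a covering index and the second a plain index search. Over HTTP range requests that difference translates into extra pages fetched for every matching row.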

Do most static site hosters support range requests?

  • Most web servers do out of the box, so I would assume most hosts do too. Basically all of them, unless they have some reason to turn range processing off or are running a custom or experimental server (or both) that hasn't implemented the feature (yet).

    Not supporting range requests would be a disadvantage for any service hosting large files. Resuming failed long downloads wouldn't work, so users might not be happy, and there would be more load on your bandwidth and other resources as the UA falls back to performing a full download.

  • Generally yes, because not having range support means you can't resume file downloads, which is a pretty essential feature for a static file host.

  • More interestingly, do reverse-proxies like Varnish / CDNs like Cloudflare support range requests? If so, do they fetch the whole content on the back, and then allow arbitrary range requests within the cached content on the front?

  • I was wondering that too. Support was spotty in general ~20 years ago but I assume things have improved since then.
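If you want to check a particular host or CDN yourself, send a request with a `Range` header and look at what comes back. The helper below is only a sketch of how to read the answer; the function name and the labels it returns are made up. It uses the standard HTTP semantics: `206 Partial Content` with a `Content-Range` header means the range was honored, a plain `200` means the server ignored it and sent the whole file, and `Accept-Ranges: none` means the server explicitly declines.

```python
def classify_range_support(status: int, headers: dict) -> str:
    """Interpret the response to a request that carried a Range header."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    if status == 206 and "content-range" in normalized:
        return "honored"
    if normalized.get("accept-ranges", "").strip().lower() == "none":
        return "refused"
    if status == 200:
        return "ignored"
    return "unknown"


print(classify_range_support(206, {"Content-Range": "bytes 0-4095/734003200"}))
print(classify_range_support(200, {"Content-Type": "application/octet-stream"}))
```

This says nothing about what a proxy does on the back end; it only tells you whether the front end answered with a partial response.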

It's impressive on one hand.

On the other it's still a lot of overhead.

  • I would say it's less overhead than downloading the entire db to query it locally...? What is your suggestion for accessing a static database with less overhead?

    • I would bet that if you compare it to a traditional client-server database (which functionally does essentially the same thing: you send it a query over the network and get a result back), the overhead is probably massive. This is a very clever way to cram that kind of functionality into a static hosting site, and you can imagine some uses for it, but it's clearly not particularly efficient compared to doing it the "right" way.

    • The idea is that you're weighing the pros and cons against an actual live database. This is basically only a good idea if you're having someone else pay the hosting fees.

    • Since it's a static database and the queries against it are most likely going to be static, just pre-run the queries and store the results statically in a more space-efficient format. When you're on a dogshit internet connection in a 3rd world country, 50 KB can actually be pretty unpleasant. Try rate limiting your internet to EDGE speeds to see what I mean.

      I'm not saying the whole thing isn't impressive, just that the concept itself is one of those "because I can" rather than "because I should" things, which kinda devalues it a whole lot.
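As a sketch of what "pre-run the queries and store the results statically" could look like, assuming the set of queries is known up front: run each one at build time and emit one compact JSON payload per answer. The `people` table, the sample rows, and the `by-country/...` paths are all invented for illustration.

```python
import json
import sqlite3


def precompute_results(conn: sqlite3.Connection) -> dict:
    """Map a static file path to the compact JSON payload it should contain."""
    pages = {}
    countries = [
        row[0]
        for row in conn.execute("SELECT DISTINCT country FROM people ORDER BY country")
    ]
    for country in countries:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM people WHERE country = ? ORDER BY name", (country,)
            )
        ]
        # Compact separators drop the optional whitespace to keep payloads small.
        pages[f"by-country/{country}.json"] = json.dumps(names, separators=(",", ":"))
    return pages


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ana", "SI"), ("Bo", "SE"), ("Cene", "SI")],
)

static_pages = precompute_results(conn)
for path, payload in static_pages.items():
    print(path, payload)
```

The client then makes a single small request per lookup. The trade-off is that this only works for queries you anticipated, whereas the range-request approach can answer ad hoc ones.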

  • For a casual or personal use case though, the alternative of running a client-server database on something like a VPS is probably more overhead than this. It's unlikely to be a very scalable option, but for use cases as described by the author it seems like a good fit.

    • I know of the drawbacks of the approach and wouldn't choose it for a lot of my projects, but I would say it is very scalable in those cases where I would. Put the DB on GitHub Pages for free along with your HTML/JS code and you can scale to whatever GitHub is willing and able to deliver. Yes, your users might transfer way more data than needed, but you pay nothing for it and do not have to maintain servers.

      In the standard scenario for personal projects (not enterprise) I would have a small VPS or dedicated server with a REST service running, and that would be hugged to death immediately if a link made it to a site like HN. I also completely share the author's experience that after a couple of years you have moved on, the VPS is dead, etc., and you don't want to invest the time.

      Again, before considering this solution, be sure to understand how it works and the resulting limitations, or you will likely choose wrong.