Comment by ori_b

5 years ago

Please no. This is an incredibly complicated file format that is not particularly well suited for file saving. It's also very slow (edit, corrected: ~10 transactions per second) if you aim for integrity, or unsafe if you tune for performance (~50k transactions per second, but if your program or computer dies halfway, your file is hosed). So keeping files up to date in place makes your UI janky.

You can work around it by working in memory, writing out a whole new database from the in-memory structures on save, and then doing an atomic rename. But if you do that, you are probably better off with JSON, protobuf, or similar. The libraries around those formats are similarly battle-tested, and they fit the need better: they support working fully in RAM and then saving cleanly and easily.
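
For concreteness, here is a minimal sketch of that save pattern (serialize everything in memory, write it to a temporary file, then atomically rename it over the old one). It assumes Python and a JSON document model; the function name and file handling are illustrative, not taken from any particular application:

    import json
    import os
    import tempfile

    def atomic_save(path, document):
        # Write the whole document to a temporary file in the same directory,
        # so the final rename stays on one filesystem and remains atomic.
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(document, f)
                f.flush()
                os.fsync(f.fileno())  # ensure the bytes are on disk before renaming
            os.replace(tmp_path, path)  # atomically replace the old file
        except BaseException:
            os.unlink(tmp_path)
            raise

Readers always see either the old file or the complete new one; a crash mid-save at worst leaves a stray temporary file behind.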

> Please no. This is an incredibly complicated file format that is not particularly well suited for file saving. It's also very slow (~100 transactions per second) if you aim for integrity, or unsafe if you don't.

The kind of application files they are talking about (things like word processor documents, spreadsheets, drawings, source code control system data) would only be written sporadically. During one of those sporadic writes the application might need to update thousands of rows, but those could all be done in one transaction.
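
To make that concrete, a rough sketch of a sporadic save that updates thousands of rows inside a single transaction, assuming Python's built-in sqlite3 module; the table and column names are invented:

    import sqlite3

    def save_rows(db_path, rows):
        # rows: a batch of (cell_id, value) pairs to persist in one go.
        con = sqlite3.connect(db_path)
        con.execute(
            "CREATE TABLE IF NOT EXISTS cells (cell_id TEXT PRIMARY KEY, value TEXT)"
        )
        try:
            with con:  # one BEGIN ... COMMIT around the whole batch
                con.executemany(
                    "INSERT OR REPLACE INTO cells (cell_id, value) VALUES (?, ?)",
                    rows,
                )
        finally:
            con.close()

    save_rows("document.db", [(f"A{i}", str(i)) for i in range(10_000)])

Since the per-transaction fsync cost is paid once rather than once per row, the commit-rate ceiling being debated here barely matters for this write pattern.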

  • I made a mistake -- it's an order of magnitude slower than I said. At ~10 transactions per second, a single transaction takes on the order of 100 ms, which is within the threshold of human perception. If you block the UI thread on a transaction, you're now dropping 6 frames at 60 fps. If you block other operations on the transaction completing, you end up with lag.

    Reference: https://www.sqlite.org/faq.html, question 19. (I've seen similar numbers when testing locally on SSDs.)

    • If you have an FPS or interactivity target that must reliably stay under about 0.1-1 second (depending on the platform), you can't have any disk I/O at all on the UI thread, or on any thread that's supposed to react at interactive rates at all times.

      It has generally become much better in the last decade or two, but one should still expect most OSes to sometimes pause for excessive amounts of time on disk I/O unless the API is specifically guaranteed to never pause. Even then, one would be wise to measure and log deviations if it's critical for the application. OS guarantees might also be contingent on driver/subsystem guarantees, and bad drivers can sometimes affect seemingly unrelated upstream systems.
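
      A rough sketch of one way to do that, assuming Python's built-in sqlite3: a dedicated writer thread drains a queue, so the UI thread only enqueues work and never touches the disk. The table, file, and queue names are invented for illustration.

          import queue
          import sqlite3
          import threading

          write_queue = queue.Queue()  # the UI thread puts (sql, params) tuples here

          def writer_loop(db_path):
              # The connection lives entirely on this writer thread, so SQLite's
              # default same-thread check is satisfied.
              con = sqlite3.connect(db_path)
              con.execute(
                  "CREATE TABLE IF NOT EXISTS prefs (key TEXT PRIMARY KEY, value TEXT)"
              )
              while True:
                  item = write_queue.get()
                  if item is None:  # sentinel: shut the writer down
                      break
                  sql, params = item
                  with con:  # one transaction per queued write
                      con.execute(sql, params)
              con.close()

          writer = threading.Thread(target=writer_loop, args=("app.db",))
          writer.start()

          # On the UI thread: enqueue and return immediately instead of blocking on I/O.
          write_queue.put((
              "INSERT OR REPLACE INTO prefs (key, value) VALUES (?, ?)",
              ("theme", "dark"),
          ))
          write_queue.put(None)  # tell the writer to finish
          writer.join()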


> if your program or computer dies halfway, your file is hosed

Doesn't this apply to writing files in general? It's not unique to SQLite...

Edit: Nope, SQLite is designed to be protected against crashes, even OS crashes.

> It's also very slow (~100 transactions per second)
> But if you do that, you are probably better off with JSON, protobuf, or similar.

Are you talking about a server application?

> or unsafe if you tune for performance (you can get it up to, IIRC, ~50k transactions per second, but if your program or computer dies halfway, your file is hosed)

Never heard that SQLite has unsafe operations. Any source?

  • > Are you talking about a server application?

    I'm talking about doing a handful of transactions -- even a single one, now that I've looked at the actual numbers SQLite documents -- being enough to introduce user-visible jank.

    > Never heard that SQLite has unsafe operations. Any source?

    The sqlite docs. You can improve the performance of sqlite by multiple orders of magnitude by messing with things like https://www.sqlite.org/pragma.html#pragma_synchronous, at the cost of safe atomic updates.
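
    For reference, a hedged sketch of what that trade-off looks like, assuming Python's built-in sqlite3 (the pragmas are real SQLite pragmas; the specific values are illustrative, not a recommendation):

        import sqlite3

        con = sqlite3.connect("app.db")

        # Durable (the usual default is FULL): every commit waits for the data
        # to actually reach stable storage.
        con.execute("PRAGMA synchronous = FULL")

        # Fast but less safe: SQLite hands writes to the OS and never waits for
        # fsync. An application crash alone still leaves the file consistent,
        # but an OS crash or power loss can corrupt the database.
        con.execute("PRAGMA synchronous = OFF")

        # WAL mode is the usual middle ground for getting more commits per
        # second without giving up crash safety.
        con.execute("PRAGMA journal_mode = WAL")

        con.close()

    Which end of the "~10 vs ~50k transactions per second" range you land on depends largely on settings like these and on the underlying disk.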

    • > The sqlite docs. You can improve the performance of sqlite by multiple orders of magnitude by messing with things like https://www.sqlite.org/pragma.html#pragma_synchronous, at the cost of safe atomic updates.

      Thanks for the link, but that unsafety only applies when the OS or the computer crashes, not when the application itself crashes. Could you name any application file format that is safe in that case? Obviously your JSON file format is much less safe than that, not to mention the huge overhead of JSON for binary data.


> It's also very slow (edit, corrected: ~10 transactions per second ...

Huh?

I was pushing updates to a SQLite file on my laptop a few weeks ago with code that's not at all optimised, and it still managed ~400 transactions a second, sustained for a minute or so.

What were you doing that only gave 10 transactions a second?
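
For anyone who wants to put a number on their own machine, here is a minimal, unscientific benchmark sketch assuming Python's built-in sqlite3; the file and table names are made up, and the result depends heavily on the disk, the filesystem, and the journal/synchronous settings discussed above:

    import sqlite3
    import time

    con = sqlite3.connect("bench.db")
    con.execute("CREATE TABLE IF NOT EXISTS kv (k INTEGER PRIMARY KEY, v TEXT)")

    n = 1000
    start = time.perf_counter()
    for i in range(n):
        with con:  # each loop iteration is its own transaction (BEGIN ... COMMIT)
            con.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (i, "x"))
    elapsed = time.perf_counter() - start
    print(f"{n / elapsed:.0f} transactions/second")

    con.close()

Batching the same inserts into one transaction, or changing the journal/synchronous settings, typically moves the result by orders of magnitude, which is probably why the figures quoted in this thread are so far apart.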