Comment by daitangio

6 hours ago

I am using SQLite with paperless-ngx (an app to manage PDFs [4]). It is quite difficult to beat SQLite unless you have a very high degree of write parallelism.

SQLite is an embedded database: there is no socket to open; you access it directly via the file system.
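If you have never used it, the whole thing in Python looks like this (`app.db` is a made-up path; the file is created on first connect):

    import sqlite3

    # No server and no socket: the connection is a handle to a local file.
    con = sqlite3.connect("app.db")  # illustrative path; created if missing
    con.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, title TEXT)")
    con.execute("INSERT INTO docs (title) VALUES (?)", ("hello",))
    con.commit()
    print(con.execute("SELECT id, title FROM docs").fetchall())
    con.close()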

If you are not planning big-data workloads with a high number of writers, you will have a hard time beating SQLite on modern hardware for average use cases.

I have written a super simple search engine [1] using Python asyncio, and SQLite is not the bottleneck so far.

If you are hitting SQLite's limits, I have happy news: upgrading to PostgreSQL will be enough for a lot of use cases [2]. You can use it as a schemaless, Mongo-like database, a simple queue system [3], or a search engine with stemming. After a while you can decide whether you need a specialized component (e.g. Kafka, Elasticsearch) for one of your services.
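To make [2] concrete, here is a hedged sketch of all three uses through psycopg2 (the DSN, tables, and columns are all invented for the example):

    import psycopg2

    con = psycopg2.connect("dbname=app")  # assumed DSN
    cur = con.cursor()

    # 1. Schemaless, Mongo-like documents: a JSONB column plus containment queries.
    cur.execute("CREATE TABLE IF NOT EXISTS docs (id serial PRIMARY KEY, body jsonb)")
    cur.execute("INSERT INTO docs (body) VALUES (%s::jsonb)", ['{"kind": "note", "tags": ["a"]}'])
    cur.execute("SELECT body FROM docs WHERE body @> %s::jsonb", ['{"kind": "note"}'])

    # 2. A simple queue: SKIP LOCKED lets competing workers claim jobs without blocking.
    cur.execute("CREATE TABLE IF NOT EXISTS jobs (id serial PRIMARY KEY, payload text, done boolean DEFAULT false)")
    cur.execute("SELECT id, payload FROM jobs WHERE NOT done ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1")

    # 3. Search with stemming: 'managing' matches the query 'manage'.
    cur.execute("SELECT to_tsvector('english', 'managing documents') @@ to_tsquery('english', 'manage')")
    print(cur.fetchone())  # (True,)

    con.commit()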

[1]: https://github.com/daitangio/find

[2]: https://gioorgi.com/2025/postgres-all/

[3]: https://github.com/daitangio/pque

[4]: https://docs.paperless-ngx.com

The pattern I like to advocate for now is per-customer sharding with SQLite. Cloudflare makes this easy with D1; you can tie Durable Objects to a user as an afterthought.
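Outside Cloudflare you can sketch the same idea by hand: one SQLite file per customer plus a tiny router (everything below is illustrative; D1 and Durable Objects essentially give you this per-tenant isolation as a managed service):

    import sqlite3
    from pathlib import Path

    DATA_DIR = Path("tenants")  # made-up layout: one database file per customer

    def tenant_db(customer_id: str) -> sqlite3.Connection:
        """Open (or create) the SQLite database belonging to one customer."""
        DATA_DIR.mkdir(exist_ok=True)
        con = sqlite3.connect(DATA_DIR / f"{customer_id}.db")
        con.execute("PRAGMA journal_mode=WAL")  # one tenant's writes never contend with another's
        con.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)")
        return con

    con = tenant_db("acme")  # each customer's traffic lands in its own file
    con.execute("INSERT INTO events (body) VALUES (?)", ("signup",))
    con.commit()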

The nice thing about this pattern is that you can create foreign data wrappers for your customer SQLite databases and query them as if they were in Postgres. Cross-customer aggregations are slow, but individual-customer analytics are quite fast, and this gives you near-infinite scalability.
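The wrapper setup looks roughly like this. It assumes the third-party sqlite_fdw extension, and the server/table names and paths are invented, so check the extension's docs for the exact options:

    import psycopg2

    con = psycopg2.connect("dbname=app")  # assumed DSN
    cur = con.cursor()

    # Expose one customer's SQLite file as a set of foreign tables in Postgres.
    cur.execute("CREATE EXTENSION IF NOT EXISTS sqlite_fdw")
    cur.execute("""
        CREATE SERVER acme_srv FOREIGN DATA WRAPPER sqlite_fdw
        OPTIONS (database '/data/tenants/acme.db')
    """)
    cur.execute("""
        CREATE FOREIGN TABLE acme_events (id bigint, body text)
        SERVER acme_srv OPTIONS (table 'events')
    """)

    # Per-customer analytics now run as ordinary Postgres queries.
    cur.execute("SELECT count(*) FROM acme_events")
    print(cur.fetchone())
    con.commit()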

You hit those write limits surprisingly early if you use background workers, though. I had a project with very little user traffic that choked on SQLite simply because a few Celery workers were updating job statuses concurrently. It wasn't the volume of data, just the contention from the workers, that forced the switch to Postgres.

  • Are you sure it choked on writes, and not on reads and writes together? SQLite's default setup is inefficient in many ways (as are its default compilation options), and that often causes issues.

    (I am just asking: are you sure WAL is on?)
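    Checking (and persistently switching) the journal mode is a one-liner; a Python sketch with an illustrative path:

      import sqlite3

      con = sqlite3.connect("app.db")  # illustrative path
      print(con.execute("PRAGMA journal_mode").fetchone())     # ('delete',) is the default
      print(con.execute("PRAGMA journal_mode=WAL").fetchone()) # ('wal',); the mode persists in the file
      con.close()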

  • SQLite emphatically warns against concurrent writes. It is not designed for that.

    I'm seeing these numbers on my current scratch benchmark:

    - Events are appended to a 10M+ record table (4+ GiB database).

    - Reads are fetched from a separate computed table, which is trigger-updated from the append-only table (a minimal sketch of this pattern appears at the end of this comment).

    - WAL mode ON, auto-vacuum ON:

      ;; HikariCP-style pool config; the SQLite pragmas live under :dataSourceProperties.
      {:dbtype "sqlite",
       :auto_vacuum "INCREMENTAL",
       :connectionTestQuery "PRAGMA journal_mode;",
       :preferredTestQuery "PRAGMA journal_mode;",
       :dataSourceProperties
       {:journal_mode "WAL",            ; readers no longer block the writer
        :limit_worker_threads 4,
        :page_size 4096,
        :busy_timeout 5000,             ; wait up to 5 s on a locked database before erroring
        :enable_load_extension true,
        :foreign_keys "ON",
        :journal_size_limit 0,
        :cache_size 15625,              ; in 4 KiB pages: ~64 MB of page cache
        :maximumPoolSize 1,             ; a single pooled connection, so one writer at a time
        :synchronous "NORMAL"}},
    

    - 1,600 sequential read-after-write transactions (in a single process), append-only, no batching.

    - With a separate (sequential) writer process and, concurrently, two reader processes: 400+ append transactions/second into the append-only table (no batching), and a total of 41,000 reads per second doing `select *` on the trigger-updated table.

    My schema is (deliberately) poor: most of it is TEXT.

    (edit: add clarifying text)
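    The trigger-updated pattern mentioned above boils down to this minimal sketch (my real schema differs; this only shows the shape):

      import sqlite3

      con = sqlite3.connect(":memory:")
      con.executescript("""
          -- Append-only event log: writers only ever INSERT here.
          CREATE TABLE events (id INTEGER PRIMARY KEY, key TEXT, val TEXT);

          -- Computed read-side table, kept up to date by a trigger.
          CREATE TABLE latest (key TEXT PRIMARY KEY, val TEXT);

          CREATE TRIGGER events_ai AFTER INSERT ON events BEGIN
              INSERT OR REPLACE INTO latest (key, val) VALUES (NEW.key, NEW.val);
          END;
      """)
      con.execute("INSERT INTO events (key, val) VALUES ('a', '1')")
      con.execute("INSERT INTO events (key, val) VALUES ('a', '2')")
      print(con.execute("SELECT key, val FROM latest").fetchall())  # [('a', '2')]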