Comment by TekMol
3 days ago
From my perspective on databases, two trends continued in 2025:
1: Moving everything to SQLite
2: Using mostly JSON fields
Both started a few years back and accelerated further in 2025.
SQLite is just so nice and easy to deal with, thanks to its no-daemon, one-file-per-db, one-type-per-value approach.
And the JSON arrow functions make it a pleasure to work with flexible JSON data.
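For anyone who hasn't tried them, here is a minimal sketch of the -> and ->> operators (table and data are made up):

  -- hypothetical table: a TEXT column holding JSON
  CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT);
  INSERT INTO events (payload)
    VALUES ('{"user": {"name": "ada"}, "tags": ["sql", "json"]}');

  -- -> returns JSON, ->> returns a plain SQL value
  SELECT payload -> 'user'          AS user_json,  -- '{"name":"ada"}'
         payload ->> '$.user.name'  AS user_name,  -- ada
         payload ->> '$.tags[0]'    AS first_tag   -- sql
  FROM events;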
From my perspective, everything's DuckDB.
Single file per database, multiple ingestion formats, full-text search, S3 support, Parquet file support, columnar storage, fully typed.
WASM version for full SQL in JavaScript.
This is a funny thread to me because my frustration is at the intersection of your comments: I keep wanting sqlite for writes (and lookups) and duckdb for reads. Are you aware of anything that works like this?
DuckDB can read/write SQLite files via extension. So you can do that now with DuckDB as is.
https://duckdb.org/docs/stable/core_extensions/sqlite
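Roughly how that looks (file and table names are placeholders):

  INSTALL sqlite;
  LOAD sqlite;
  ATTACH 'app.db' (TYPE sqlite);    -- open an existing SQLite file
  SELECT count(*) FROM app.users;   -- query it through DuckDB's engine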
I think you could build an ETL-ish workflow where you use SQLite for OLTP and DuckDB for OLAP, but I suppose it's very workload dependent; there are several tradeoffs here.
Very interesting. What's the vector indexing story like in DuckDB these days?
Also, are there SQLite-DuckDB sync engines, or is that an oxymoron?
https://duckdb.org/docs/stable/core_extensions/vss
It's not bad if you need something quick. I haven't had much need for ANN in DuckDB since I use it for more analytical/exploratory work, but it's definitely there if you need it.
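For reference, a minimal sketch with the vss extension (table, dimensions, and values are made up):

  INSTALL vss;
  LOAD vss;
  CREATE TABLE docs (id INTEGER, vec FLOAT[3]);
  CREATE INDEX docs_hnsw ON docs USING HNSW (vec);  -- ANN index from vss
  SELECT id
  FROM docs
  ORDER BY array_distance(vec, [0.1, 0.2, 0.3]::FLOAT[3])
  LIMIT 5;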
From my perspective - do you even need a database?
SQLite is kind of the middle ground between a full-fat database and 'writing your own object storage'. To put it another way, it provides a 'regularised' object-access API rather than, say, a variant of types in a vector that you filter or map over.
If I were to write my own data storage, I would just end up re-implementing SQLite. Why would I want to do that?
Not sure if this is quite what you are getting at, but the SQLite folks even mention this as a great use-case: https://www.sqlite.org/appfileformat.html
As a backend database that's not multi-user, how many web connections that do writes can it realistically handle? Assuming writes are small, say 100+ rows each?
Any mitigation strategy for larger use cases?
Thanks in advance!
After 2 years in production with a small (but write-heavy) web service... it's a mixed bag. It definitely does the job, but not having a DB server has drawbacks as well as benefits. The biggest is the lack of caching of the file/DB in RAM. As a result I have to do my own read caching, which is fine in Rust using the moka caching library, but it's still something you have to do yourself, which would otherwise come for free with Postgres. This of course also makes it impossible to share the cache between instances; doing so would require employing redis/memcached, at which point it would be better to use Postgres.
It has been OK so far, but I will definitely have to migrate to Postgres at some point, sooner rather than later.
How would caching on the db layer help with your web service?
In my experience, caching makes the most sense at the CDN layer, which caches not only the DB requests but the result of the rendering and everything else. So most requests do not even hit your server, and those that do need fresh data anyhow.
I am no expert, but SQLite does have an in-memory store, at least for tables that need it. Of course, syncing the writes to this store may need more work.
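Something like this, as a rough sketch (the table is made up):

  ATTACH DATABASE ':memory:' AS mem;  -- in-memory DB next to the file DB
  CREATE TABLE mem.hot_counts (key TEXT PRIMARY KEY, n INTEGER);
  -- refresh from the durable tables as needed; contents vanish on close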
A couple thousand simultaneous writes should be fine, depending on total system load, whether you're running on spinning disks or SSDs, and your p50/p99 latency demands; and of course you'd need to enable the WAL pragma so reads don't block writes in the first place. Run an experiment to be sure about your specific situation.
You also need BEGIN CONCURRENT to allow simultaneous write transactions.
https://www.sqlite.org/src/doc/begin-concurrent/doc/begin_co...
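A sketch of the two together; note that BEGIN CONCURRENT currently lives on a separate SQLite branch, not in the stock releases:

  PRAGMA journal_mode = WAL;  -- readers no longer block the writer
  BEGIN CONCURRENT;           -- begin-concurrent branch only
  UPDATE counters SET n = n + 1 WHERE key = 'hits';  -- made-up table
  COMMIT;  -- may fail with SQLITE_BUSY_SNAPSHOT on page conflicts; retry then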
Why have multiple connections in the first place?
If your writes are fast, doing them serially does not cause anyone to wait.
How often does the typical user write to the DB? Often it is like once per day or so (for example on Hacker News). Say the write takes 1/1000s. Then you can serve a thousand serialized writes per second, which works out to tens of millions of once-a-day users.
And nobody has to wait longer than a second when they hit the "reply" button, as I do now ...
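The back-of-the-envelope math, for what it's worth:

  1 write = 1/1000 s            ->  1,000 writes/s
  1,000 writes/s * 86,400 s/day ~=  86 million writes/day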
> If your writes are fast, doing them serially does not cause anyone to wait.
Why impose such a limitation on your system when you don't have to by using some other database actually designed for multi user systems (Postgres, MySQL, etc)?
That depends on the use case. HN is not a good example. I am referring to business applications where users submit data. Of course, in these cases we are looking at hundreds of users, not millions. For that, the answer is: good enough.
> How often does the typical user write to the DB
Turns out a lot when you have things like "last accessed" timestamps on your models.
Really depends on the app.
I also don't think that calculation is valid. Your users aren't going to access the app uniformly over the course of a day. Invariably you'll have queuing delays at a significantly smaller user count (but maybe the delays are acceptable).
Pardon my ignorance, but wasn't the prevailing thought a few years ago that you would never use SQLite in production? Has that school of thought changed?
SQLite as a database for web services has had a bit of a boom due to:
1. People gaining newfound appreciation of having the database on the same machine as the web server itself. The latency gains can be substantial and obviously there are some small cost savings too as you don't need a separate database server anymore. This does obviously limit you to a single web server, but single machines can have tons of cores and serve tens of thousands of requests per second, so that is not as limiting as you'd think.
2. Tools like litestream will continuously back up all writes to object storage, so that one web server having a hardware failure is not a problem as long as your SLA allows downtimes of a few minutes every few years (and let's be real, most small companies for which this would be a good architecture don't have any SLA at all).
3. SQLite has concurrent writes now, so it's gotten much more performant in situations with multiple users at the same time.
So for specific use cases it can be a nice setup because you don't feel the downsides (yet) but you do get better latency and simpler architecture. That said, there's a reason the standard became the standard, so unless you have a very specific reason to choose this I'd recommend the "normal" multitier architectures in like 99% of cases.
> SQLite has concurrent writes now
Just to clarify: unless I've missed something, this is only WAL mode allowing concurrent reads at the same time as writes; I don't think it can handle multiple concurrent writes at the same time?
I’m a fan of SQLite but just want to point out there’s no reason you can’t have Postgres or some other rdbms on the same machine as the webserver too. It’s just another program running in the background bound to a port similar to the web server itself.
SQLite is likely the most widely used production database due to its widespread usage in desktop and mobile software, and SQLite databases being a Library of Congress "sustainable format".
Most of the usage was/is as a local ACID-compliant replacement for txt/ini/custom local/bundled files though.
"Production" can mean many different things to different people. It's very widely used as a backend strutured file format in Android and iOS/macOS (e.g. for appls like Notes, Photos). Is that "production"? It's not widely used and largely inappropriate for applications with many concurrent writes.
The SQLite docs have a good overview of appropriate and inappropriate uses: https://sqlite.org/whentouse.html It's best to start with Section 2, "Situations Where A Client/Server RDBMS May Work Better".
The reason you heard that was probably because they were talking about a more specific circumstance. For example SQLite is often used as a database during development in Django projects but not usually in production (there are exceptions of course!). So you may have read when setting up Django, or a similar thing, that the SQLite option wasn't meant for production because usually you'd use a database like Postgres for that. Absolutely doesn't mean that SQLite isn't used in production, it's just used for different things.
You are right. Thanks!
Only for large-scale multi-user applications. It's more than reasonable as a data store in local applications or at smaller scales where having the application and data layer on the same machine is acceptable.
If you're at a point where the application needs to talk over a network to your database, then that's a reasonable heuristic that you should use a different DB. I personally wouldn't trust my data to NFS.
What is a "local application"?
FWIW (and this is IMHO of course) DuckDB makes working with random JSON much nicer than SQLite, not least because I can extract JSON fields to dense columnar representations and do it in a deterministic, repeatable way.
The only thing I want out of DuckDB core at this point is support for overriding the columnar storage representation for certain structs. Right now, DuckDB decomposes structs into fields and stores each field in a column. I'd like to be able to say "no, please, pre-materialize this tuple subset and store this struct in an internal BLOB or something".
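A rough sketch of the JSON-to-columns part (data and names are invented):

  INSTALL json;
  LOAD json;
  -- a raw table with one JSON document per row
  CREATE TABLE raw AS
    SELECT '{"user": {"name": "ada"}, "amount": "9.5"}'::JSON AS j;
  -- extract fields into dense, typed columns, deterministically
  CREATE TABLE slim AS
  SELECT j ->> '$.user.name'              AS user_name,
         CAST(j ->> '$.amount' AS DOUBLE) AS amount
  FROM raw;
  -- read_json_auto('events.json') does the same schema inference from files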
I would say SQLite when possible, PostgreSQL (incl. extensions) when necessary, DuckDB for local/hobbyist data analysis and BigQuery (often TB or PB range) for enterprise business intelligence.
For as much talk as I see about SQLite, are people actually using it or does it just have good marketers?
Do you have a specific use case you're curious about? It's the most widely deployed database software of all time. https://sqlite.org/mostdeployed.html
It's the standard for mobile. That said, in server-side enterprise computing, I know no one who uses it. I'm sure there are applications, but in this domain you'd need a good justification for not following standard patterns.
I have used DuckDB on an application server because it computes aggregations lightning fast which saved this app from needing caching, background services and all the invalidation and failure modes that come with those two.
Among people who can actually code (as opposed to just stitching together services), I see it used all around.
For someone who openly describes his stack and revenue, look up Pieter Levels, how he serves hundreds of thousands of users and makes millions of dollars per year, using SQLite as the storage layer.
> are people actually using it or does it just have good marketers?
_You_ are using it right this second. It's storing your browser's bookmarks (at a minimum, and possibly other browser-internal data).
If you use desktops, laptops, or mobile phones, there is a very good chance you have at least ten SQLite databases in your possession right now.
It is fantastic software, have you ever used it?
I don't have a use case for it. I've used it a tiny bit for mocking databases in memory, but because it's not fully Postgres, I've switched entirely to Testcontainers.
I think the right pattern here is edge sharding of user data. Cloudflare makes this pretty easy with D1/Hyperdrive.
Man, I hope so. Bailing people out of horribly slow NoSQL databases is good business.