Comment by refulgentis
1 day ago
> Isn't concurrency also limited by your machines disk speed for writes
Yes, in theory: given a large enough database, and a disk that can only do one operation at a time, and a large enough operation that touches enough of the database. In practice, in a SQLite single tenant scenario? No, not at all.
> what difference does it make if you write sequentially vs concurrently. Why does concurrency even matter for databases?
As soon as your codebase involves reacting to events independently of a user taking action it becomes a practical concern. Generally, this is a broad question and has 1,000,000 answers.
EDIT: Originally I had "I think you understand generally, no?" appended but realized that's not helpful at all, if you did, you wouldn't be asking.
Something that may help is imagining what'd happen if a DB wasn't thread safe / didn't allow multiple writers. Ex. in SQLite's case, it allows multiple write operations to take place but there's a one-at-a-time queue. If we didn't have databases that were able to execute multiple writes simultaneously, you'd need a separate database for each concurrent writer you expect, and you'd effectively have a global lock. Orderly scaling would be ~impossible unless you did something crazy like have a single server per user
I guess I need to dive deeper into this as I do not understand the implications you gave me, but I appreciate the attempt. Generally I understand why concurrency is good in many cases, I just dont get why its important for database stuff too.
Edit: thanks for clarifying in the edit, makes a lot more sense.
Imagine if every tweet had to go through a one-at-a-time queue before being persisted. There's about 6000 tweets per second, so you would have to be able to save them at <0.17ms per tweet or else you would become backlogged. If you are getting backlogged, you have to buffer those incoming tweets somewhere until they can be writted, and eventually that buffer gets full and you start losing tweets.
Maybe that too is a native question, but there's a large scale between single user and 6000 tweets per second - most of our apps will never reach anything approaching even one save a second. So where to draw the line? I do far have gone the sqlite route for my hobby apps as it's so easy to handle and doesn't require setting up two docker containers for a single app. Am I drawing myself in a corner in case my apps ever do become relevant?
2 replies →
If we imagine 1 tweet = 1 transaction, that's only 6k tps. 6k tps is completely achievable, dare I say even pedestrian for an optimized database. And most systems are operating far below the scale of Twitter/X.
Sqlite can quite easily do 5000+ insert+commits per second on typical NVMe drives.
Speed is rarely the constraint that makes it unsuitable for an application.
1 reply →
While I understand your point and like the explanation, I gotta make the joke that some Tweets should be lost