Comment by benmmurphy
7 hours ago
databases also solve this problem by allowing a read-only transaction to see a consistent snapshot of the data at some point in time, so the reader can do its work without needing a logical mutex.
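Roughly what that looks like in Postgres via psycopg2 (a minimal sketch; the `accounts` table here is hypothetical):

```python
# Inside a REPEATABLE READ (snapshot) transaction, every read sees the same
# consistent snapshot, even while other sessions commit changes concurrently.
import psycopg2

conn = psycopg2.connect("dbname=bank")
conn.set_session(isolation_level="REPEATABLE READ", readonly=True)

with conn.cursor() as cur:
    cur.execute("SELECT sum(balance) FROM accounts")
    total_1 = cur.fetchone()[0]
    # ... other sessions may commit transfers here ...
    cur.execute("SELECT sum(balance) FROM accounts")
    total_2 = cur.fetchone()[0]
    assert total_1 == total_2  # both reads came from the same snapshot

conn.commit()  # end the read-only transaction, releasing the snapshot
```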
> databases also solve this problem
They could, but are typically configured not to.
With default settings they let other processes' commits cut into the middle of your read (see READ COMMITTED). I was flabbergasted when I read this.
When you turn MSSQL up to a higher isolation level, you can get it to grind to a halt and throw deadlock exceptions on toy code. (Moving money between accounts will do it.)
I used to advertise STM as being like ACID databases for shared-memory systems, but I guess it's even more consistent than that.
Postgres's `set transaction isolation level serializable` doesn't result in deadlocks (unless you explicitly grab locks) but does require your application to retry transactions that fail due to a conflict.
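A minimal retry loop for that, assuming psycopg2 and a hypothetical `accounts` table (Postgres reports the conflict as SQLSTATE 40001, which psycopg2 surfaces as `SerializationFailure`):

```python
import psycopg2
from psycopg2 import errors
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

def transfer(conn, src, dst, amount, max_retries=5):
    conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)
    for _ in range(max_retries):
        try:
            with conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                            (amount, src))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                            (amount, dst))
            conn.commit()
            return
        except errors.SerializationFailure:
            conn.rollback()  # conflicted with a concurrent transaction: retry
    raise RuntimeError("transfer still conflicting after %d retries" % max_retries)
```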
Yeah, STMs can use MVCC in the same way, but that doesn't solve the long-transaction problem for bulk update transactions. The first example in Gray and Reuter is adding monthly interest to every savings account, exactly once.
long-running transactions are a fundamentally bad idea if the length of the transaction scales with the number of accounts/users. these things need to be batched up in a way where code with at-least-once semantics still solves the problem correctly. (in one TX, handle just a bunch of accounts: calculate the interest, record that these accounts already got the interest, and move on to the next batch; see the sketch after this comment.)
the problem with STM (or any other concurrency-control mechanism) is that it's not its job to solve these issues (they usually require conscious product design), so they are not going to be solved by "just use STM".
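A minimal sketch of that batched approach, assuming hypothetical `accounts(id, balance)` and `interest_applied(account_id, period)` tables, the latter recording which accounts have already been credited so a re-run can't pay the same account twice:

```python
import psycopg2

def apply_monthly_interest(conn, period, rate, batch_size=1000):
    while True:
        with conn.cursor() as cur:
            # Grab a batch of accounts not yet credited for this period.
            cur.execute("""
                SELECT id FROM accounts
                WHERE id NOT IN (SELECT account_id FROM interest_applied
                                 WHERE period = %s)
                ORDER BY id
                LIMIT %s
            """, (period, batch_size))
            ids = [row[0] for row in cur.fetchall()]
            if not ids:
                conn.commit()
                return  # every account has been credited exactly once

            cur.execute("UPDATE accounts SET balance = balance * (1 + %s) WHERE id = ANY(%s)",
                        (rate, ids))
            # Record the credited accounts in the same transaction, so a crash
            # and re-run of this batch can't apply the interest twice.
            cur.execute("""
                INSERT INTO interest_applied (account_id, period)
                SELECT unnest(%s::int[]), %s
            """, (ids, period))
        conn.commit()  # each small batch commits independently
```

Each batch is its own short transaction, which is exactly the tradeoff the next reply points out: an error partway through leaves the earlier batches committed.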
There are some different approaches, but that's definitely one of them. However, it's not without its tradeoffs; you're sacrificing atomicity, in the sense that if updating some account throws an error, the already-updated accounts stay updated.