
Comment by andriy_koval

4 hours ago

Thank you for the clarification; I was wrong in my previous comment.

> - [1] An interactive transaction is a transaction where you intermingle database queries and application logic (running on the application).

Could you give a specific example of why you think SQLite can do batching and PG cannot?

Not the person you are responding to, but SQLite is single-writer: even across multiple processes, you get one write transaction at a time.

So, if you have a network server that does BEGIN TRANSACTION (process 1000 requests) COMMIT (send 1000 acks to clients), then with SQLite your rollback rate from conflicts will be zero.

For PG with multiple clients, it'll tend toward 100% rollbacks if the transactions can conflict at all.

You could configure PG to only allow one network connection at a time, and get a similar effect, but then you’re paying for MVCC, and a bunch of other stuff that you don’t need.
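As a concrete sketch of that batching pattern, here it is with Python's stdlib sqlite3 module; the table name and request payloads are made-up stand-ins for the 1000 client requests:

```python
import sqlite3

# Batch 1000 "client requests" into one SQLite write transaction:
# a single fsync at COMMIT, and, with only one writer allowed,
# zero conflict-driven rollbacks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

requests = [("req-%d" % i,) for i in range(1000)]

with conn:  # BEGIN ... COMMIT (rolls back instead on exception)
    conn.executemany("INSERT INTO events (payload) VALUES (?)", requests)

print(conn.execute("SELECT COUNT(*) FROM events").fetchone()[0])  # 1000
```

Only after COMMIT returns would the server send the 1000 acks, since that is the point where all of the batched writes are durable.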

  • In your example, clients can't have their own transactions? You commit/rollback all requests for all 1000 clients together?

    • Sqlite supports nested transactions with SAVEPOINT, so each client can have their own logical transaction that can be rolled back. The outer transaction effectively just batches the fsync, so an individual client failing a transaction doesn't cause the batch to fail (a crash, however, would fail the whole batch). Because it's a single writer, there are no rollbacks/retries from contention/MVCC.

      You could try to imitate this in PostgreSQL, but the problem is that the outer transaction does not eliminate the network hops for each inner/client transaction, so you don't gain anything by doing it, and you still have the contention problem, which will cause rollbacks/retries. You could reduce your number of connections to one to eliminate contention. But then you are just playing SQLite's game.

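A minimal sketch of the per-client SAVEPOINT pattern described above, again with stdlib sqlite3; the accounts table, the client names, and the "negative delta fails validation" rule are all invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE accounts (client TEXT PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")  # outer transaction: one fsync for the whole batch
for client, delta in [("alice", 10), ("bob", -999), ("carol", 5)]:
    conn.execute("SAVEPOINT client_tx")  # inner, per-client "transaction"
    conn.execute("INSERT INTO accounts VALUES (?, ?)", (client, delta))
    if delta < 0:  # this client's validation fails
        conn.execute("ROLLBACK TO client_tx")  # undo only this client's work
    conn.execute("RELEASE client_tx")
conn.execute("COMMIT")

rows = conn.execute("SELECT client FROM accounts ORDER BY client").fetchall()
print([r[0] for r in rows])  # ['alice', 'carol']
```

Bob's inner transaction rolls back, but the batch still commits with Alice's and Carol's writes, which is the behavior the comment describes.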

An interactive transaction works like this in pseudocode:

beginTx
  // query to get some data (network hop)
  result = exec(query1)

  // application code that needs to run in the application
  safeResult = transformAndValidate(result)

  // query to write the data (network hop)
  exec(query2, safeResult)
endTx
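The pseudocode above maps to Python's DB-API roughly like this (stdlib sqlite3 stands in for the database here; transform_and_validate, the tables, and the sample data are hypothetical). With a networked database, each exec inside the transaction would be a round trip:

```python
import sqlite3

def transform_and_validate(rows):
    # application logic that must run in the application process
    return [(name.strip().lower(),) for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw (name TEXT)")
conn.execute("CREATE TABLE clean (name TEXT)")
conn.execute("INSERT INTO raw VALUES ('  Alice '), ('BOB')")
conn.commit()

with conn:  # beginTx ... endTx
    result = conn.execute("SELECT name FROM raw").fetchall()   # network hop 1
    safe = transform_and_validate(result)                      # app-side code
    conn.executemany("INSERT INTO clean VALUES (?)", safe)     # network hop 2

print(conn.execute("SELECT name FROM clean ORDER BY name").fetchall())
# [('alice',), ('bob',)]
```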

How would you batch this in Postgres and get any value? You can nest them all in a single transaction, but because they are interactive transactions, that doesn't reduce your number of network hops.

The only thing you can batch in postgres to avoid network hops is bulk inserts/updates.

But the minute you have interactive transactions, you cannot batch and gain anything when there is a network in between.

Your best bet is to not have an interactive transaction and port all of that application code to a stored procedure.

  • > How would you batch this in postgres and get any value? You can nest them all in a single transaction. But, because they are interactive transactions that doesn't reduce your number of network hops.

    You can write it as a stored procedure in your favorite language, or use a domain socket, where communication happens through shared-memory buffers with no network involved.

    In your post, I think the big performance hit for Postgres potentially comes from the focus on update-only statements: SQLite likely updates records in place, while Postgres creates a separate record on disk for each updated row, or maybe some other internal mechanism is at play.

    Your benchmark is very simplistic; it is hard to tell how SQLite would behave if you switched to inserts, for example, or had many writers competing for the same record, or longer transactions. The industry has built various benchmarks for this (TPC, for example).

    Also, if you want readers to understand your posts better, consider using less exotic language in the future. It is hard to tell what is batched there and how.