
Comment by dverite

4 months ago

The blog post says at the top:

"The network is better utilized because successive queries can be grouped in the same network packets, resulting in less packets overall"

For some reason, you don't believe it. OK, let's look at the Wireshark network statistics when the test script runs, inserting 100k rows, capturing the traffic on the Postgres port.

- case without pipelining (result of "tshark -r capture-file -q -z io,stat,0"):

  ==================================== 
  | IO Statistics                    | 
  |                                  | 
  | Duration: 53.1 secs              | 
  | Interval: 53.1 secs              | 
  |                                  | 
  | Col 1: Frames and bytes          | 
  |----------------------------------| 
  |              |1                  | 
  | Interval     | Frames |   Bytes  | 
  |----------------------------------| 
  |  0.0 <> 53.1 | 200054 | 20304504 | 
  ====================================

- case with pipelining:

  ======================================
  | IO Statistics                      |
  |                                    |
  | Duration: 2.209 secs               |
  | Interval: 2.209 secs               |
  |                                    |
  | Col 1: Frames and bytes            |
  |------------------------------------|
  |                |1                  |
  | Interval       | Frames |   Bytes  |
  |------------------------------------|
  | 0.000 <> 2.209 |  10885 | 12219449 |
  ======================================

So where the non-pipelining case needs 2 packets per query (200,054 frames for 100k inserts), the pipelining case needs roughly 18 times fewer frames overall (10,885).

Again, it's because the client buffers the queries to send them in the same batch (this buffer currently seems to be 64 kB), in addition to not waiting for the results of previous queries.
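The effect of that buffering can be sketched with a back-of-the-envelope model. All per-message sizes below are assumptions for illustration, not values taken from the capture; only the query count and the 64 kB buffer come from the discussion above:

```python
import math

# Assumed parameters (illustrative, not measured):
N_QUERIES = 100_000       # inserts performed by the test script
QUERY_BYTES = 100         # assumed wire size of one INSERT message
RESULT_BYTES = 20         # assumed wire size of one per-query reply
BUFFER_BYTES = 64 * 1024  # client-side send buffer mentioned above
MTU = 1500                # typical Ethernet payload limit

# Without pipelining: one request packet out, one response packet back,
# for every single query.
frames_sync = N_QUERIES * 2

# With pipelining: queries are coalesced into 64 kB buffers before being
# sent, and each full buffer is split into MTU-sized packets on the wire.
queries_per_buffer = BUFFER_BYTES // QUERY_BYTES
n_buffers = math.ceil(N_QUERIES / queries_per_buffer)
frames_per_buffer = math.ceil(BUFFER_BYTES / MTU)
request_frames = n_buffers * frames_per_buffer

# Responses also arrive grouped rather than one packet per query.
response_frames = math.ceil(N_QUERIES * RESULT_BYTES / MTU)

frames_pipelined = request_frames + response_frames

print(frames_sync)       # 200000, matching the order of the capture
print(frames_pipelined)  # a few thousand frames
```

With these assumed sizes the model lands in the same order of magnitude as the captured statistics (roughly 200k frames without pipelining versus roughly 10k with it); the exact numbers depend on the real message sizes, TCP coalescing, and acks, which the model ignores.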