How to use Linux vsock for fast VM communication

11 hours ago (popovicu.com)

Says it is fast, but presents zero benchmarks to demonstrate it is actually fast or even “faster”. It is shameful to make up adjectives just to sound cool.

  • It's probably faster than, say, an emulated UART port.

    But likely no faster than a TCP socket across a virtio-net device.

  • vsock is pretty widely used, and if you're using virtio-vsock it should be reasonably fast. Anyway, if you want to run some quick benchmarks and have an existing Linux VM on a libvirt host:

    (1) 'virsh edit' the guest and check it has '<vsock/>' in the <devices> section of the XML.

    (2) On the host:

      $ nbdkit memory 1G --vsock -f
    

    (3) Inside the guest:

      $ nbdinfo 'nbd+vsock://2'
    

    (You should see a reported size of 1G.)
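Step (1) can also be checked without opening an editor. The following is a minimal sketch, assuming you feed it the output of `virsh dumpxml` for your guest; the helper name `has_vsock_device` and the trimmed example XML are made up for illustration, not taken from any real guest.

```python
import xml.etree.ElementTree as ET


def has_vsock_device(domain_xml: str) -> bool:
    """Return True if the domain XML has a <vsock> element under <devices>."""
    root = ET.fromstring(domain_xml)
    return root.find("./devices/vsock") is not None


# Hypothetical, heavily trimmed domain XML, for illustration only.
EXAMPLE_XML = """
<domain type='kvm'>
  <name>example-guest</name>
  <devices>
    <vsock model='virtio'>
      <cid auto='yes'/>
    </vsock>
  </devices>
</domain>
"""

if __name__ == "__main__":
    # Reports whether the example guest definition carries a vsock device.
    print(has_vsock_device(EXAMPLE_XML))
```

In practice you would replace `EXAMPLE_XML` with the text produced by `virsh dumpxml` for the guest you want to test.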

    And then you can try using commands like nbdcopy to copy data into and out of the host RAM disk over vsock, e.g.:

      $ time nbdcopy /dev/urandom 'nbd+vsock://2' -p
      $ time nbdcopy 'nbd+vsock://2' null: -p
    

    On my machine that's copying at a fairly consistent 20 Gbps, but it's going to depend on your hardware.

    To compare it to regular TCP:

      host $ nbdkit memory 1G -f -p 10809
      vm $ time nbdcopy /dev/urandom 'nbd://host' -p
      vm $ time nbdcopy 'nbd://host' null: -p
    

    TCP is about 2.5x faster for me.
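To put the two figures side by side, here is a small arithmetic sketch. It assumes "Gbps" means gigabits per second and that the "1G" disk is 2**30 bytes; the helper names `gbps` and `seconds_for` are made up for illustration. It converts a `time nbdcopy` wall-clock result into a rate and back.

```python
GIB = 1 << 30  # assumed size in bytes of the "1G" RAM disk


def gbps(num_bytes: int, seconds: float) -> float:
    """Throughput in gigabits per second (10**9 bits/s)."""
    return num_bytes * 8 / seconds / 1e9


def seconds_for(num_bytes: int, rate_gbps: float) -> float:
    """Wall-clock time needed to move num_bytes at rate_gbps."""
    return num_bytes * 8 / (rate_gbps * 1e9)


vsock_gbps = 20.0            # figure reported in the comment above
tcp_gbps = vsock_gbps * 2.5  # "about 2.5x faster" works out to about 50 Gbps

if __name__ == "__main__":
    # Roughly 0.43 s for vsock and 0.17 s for TCP per 1 GiB copy.
    print(f"vsock: {seconds_for(GIB, vsock_gbps):.2f} s per GiB")
    print(f"tcp:   {seconds_for(GIB, tcp_gbps):.2f} s per GiB")
```

At these rates a single 1 GiB copy finishes in well under a second, so repeating the run a few times is probably worthwhile before trusting the ratio.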

    I had to kill the firewall on my host to do the TCP test (as trying to reconfigure nft/firewalld was beyond me), which actually points to one advantage of vsock: it bypasses the firewall. It's therefore convenient for things like guest agents where you want them to "just work" without reconfiguration hassle.

    • > It's therefore convenient for things like guest agents where you want them to "just work" without reconfiguration hassle.

      This. The point of vsock is not performance; it's the zero-configuration aspect. No IP address plan. No firewall. No DHCP. No nothing. Just a network-like API for guest-host communication for guest agents and configuration agents. Especially useful to fetch a configuration without having a configuration.

      IMHO the "fast" in the original article should be read as "quick to set up", not as "high bandwidth".
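A minimal sketch of what that zero-configuration, network-like API can look like from inside a guest, using Python's standard `socket` module on Linux. The port number, the `GET config` request, and the newline-terminated framing are all assumptions made up for illustration; only CID 2 (the host, as in the `nbd+vsock://2` URIs above) comes from the thread.

```python
import socket

HOST_CID = 2        # well-known CID of the host, as in 'nbd+vsock://2'
CONFIG_PORT = 9999  # hypothetical port; vsock ports are plain integers


def request(sock: socket.socket, message: bytes) -> bytes:
    """Send one newline-terminated message, read one newline-terminated reply."""
    sock.sendall(message + b"\n")
    reply = bytearray()
    while not reply.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection early
            break
        reply += chunk
    return bytes(reply).rstrip(b"\n")


def fetch_config_from_host(port: int = CONFIG_PORT) -> bytes:
    """Connect from the guest to the host over vsock and ask for configuration."""
    # No IP address, route, or DHCP lease is needed: the address is (CID, port).
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as sock:
        sock.connect((HOST_CID, port))
        return request(sock, b"GET config")


if __name__ == "__main__":
    # Only meaningful inside a guest with a vsock device and a host listener.
    print(fetch_config_from_host().decode())
```

Apart from the address family and the `(CID, port)` address tuple, this is the same code you would write for a TCP client, which is the convenience being described.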

Given how slow protobufs and grpc are, I wonder if the socket transport would ever be the bottleneck to throughput here.

Changing transports means if you want to move your grpc server process to a different box you now have new runtime configuration to implement/support and new performance characteristics to test.

I can see some of the security benefits if you are running on one host, but I also don't buy the advantages highlighted at the end of the article about using many different OSes and language environments on a single host. Seems like enabling and micro-optimising chaos instead of trying to tame it.

Particularly in the ops demo: statically linking a C++ grpc binary, and standardising on host OS and gcc-toolset, doesn't seem that hard. On the other hand, if you're using e.g. a Python RPC server, are you even going to be able to feel the impact of switching to vsock?

  • > Given how slow protobufs and grpc are, I wonder if the socket transport would ever be the bottleneck to throughput here.

    I think this is supposed to be an option for when you want to pass stuff to the host quickly without writing another device driver or using some other interface, rather than a replacement for any RPC between VMs. "Being fast" is just a bonus.

    For example, at our job we use a serial port for communication with the VM agent (it just passes some host info about where the VM is running, so our automation system can pick it up). vsock would be an ideal replacement for that.

    And as it is "just a socket", stuff like this is pretty easy to set up: https://libvirt.org/ssh-proxy.html
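As a rough sketch of the serial-port replacement described above, the host side could be a small vsock listener that hands each connecting guest a line of JSON. The port number, the `host_info` fields, and the one-JSON-line-per-connection format are all hypothetical; the commenter's actual agent protocol is not described in the thread.

```python
import json
import socket

INFO_PORT = 5000  # hypothetical port the VM agent would connect to


def handle_connection(conn: socket.socket, host_info: dict) -> None:
    """Send host_info as a single JSON line, then close the connection."""
    with conn:
        conn.sendall(json.dumps(host_info).encode() + b"\n")


def serve_host_info(host_info: dict, port: int = INFO_PORT) -> None:
    """Run on the host: answer every guest that connects over vsock."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as listener:
        # VMADDR_CID_ANY accepts connections from any guest CID.
        listener.bind((socket.VMADDR_CID_ANY, port))
        listener.listen()
        while True:
            conn, (peer_cid, _peer_port) = listener.accept()
            print(f"guest with CID {peer_cid} connected")
            handle_connection(conn, host_info)


if __name__ == "__main__":
    # Made-up example payload; requires Linux with vsock support on the host.
    serve_host_info({"hypervisor": "node-17", "rack": "r4"})
```

Because the peer address includes the guest's CID, the host can tell which VM is asking without any per-guest network configuration.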