Comment by CoolGuySteve

5 years ago

Protobuf's abysmal performance, questionable integration into the C++ type system, append-only expandability, and annoying naming conventions and default values are why I usually try to steer away from it.

As a lingua franca between interpreted languages it's about par for the course, but you'd think the fast languages would get the fast path (i.e., zero parsing/marshalling overhead in Rust/C/C++, no allocations), since you're usually not writing in those languages for fun but because you need the thing to be fast.

It's also the kind of choice that comes back to bite you years into a project if you started with something like Python and then need to rewrite a component in a systems language to make it faster. Now you not only have to rewrite your component but change the serialization format too.

Unfortunately Protobuf gets a ton of mindshare because nobody ever got fired for using a Google library. IMO it's just not that good and you're inheriting a good chunk of Google's technical debt when adopting it.

Zero-parse wire formats definitely have benefits, but they also have downsides, such as significantly larger payloads, more constrained APIs, and typically more constraints on how the schema can evolve. They also have a wire size proportional to the size of the schema (declared fields) rather than to the size of the data (present fields), which makes them unsuitable for some of the cases where protobuf is used.

With the techniques described in this article, protobuf parsing speed is reasonably competitive, though if your yardstick is zero-parse, it will never match up.

  • Situations where wire/disk bandwidth is constrained are usually better served by compressing the entire stream rather than by trying to integrate some run-length encoding into the message format itself.

    You only need to pay for decompression once, to load the message into RAM, rather than being forced to either make a copy or pay for decoding throughout the program whenever fields are accessed. And if the link is bandwidth-constrained, the added latency of decompression is probably negligible.

    The separation of concerns between compression format and encoding also allows specifically tuned compression algorithms to be used, for example by switching between zstd's many compression levels. Separating compression from encoding also lets you compress/decompress on another processor core for higher throughput.

    Meanwhile you can also do a one-shot decompression, or skip compression of a stream entirely: for replay/analysis, when talking over a low-latency, high-bandwidth link/IPC, or when serializing to/from an already-compressed filesystem like btrfs+zstd/lzo.

    It's just more flexible this way with negligible drawbacks.
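
    A rough sketch of that split, assuming a hypothetical MyMessage protobuf type and libzstd (error handling kept minimal):

        // Encode with protobuf, then compress the whole stream with zstd.
        // The compression level is tunable independently of the wire format.
        #include <string>
        #include <vector>
        #include <zstd.h>

        std::vector<char> serialize_compressed(const MyMessage& msg, int level) {
            std::string raw;
            msg.SerializeToString(&raw);                       // encoding concern

            std::vector<char> out(ZSTD_compressBound(raw.size()));
            size_t n = ZSTD_compress(out.data(), out.size(),   // compression concern,
                                     raw.data(), raw.size(),   // e.g. level 3 vs 19
                                     level);
            if (ZSTD_isError(n)) out.clear();                  // minimal error handling
            else out.resize(n);                                // pay for this once per
            return out;                                        // message, not per field
        }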

    • Recently I've been looking at CapnProto, which is a fixed-offset/size field encoding that allows zero-copy, zero-allocation decoding, and arena allocation during message construction.

      One nice design choice it makes is to render default values as zero on the wire by XOR'ing every integral field with that field's default value.

      This composes well with another nice feature: an optional run-length-style packed encoding that compresses those zero bytes down. Overall, not quite msgpack efficiency, but still very good.

      One even more awesome feature is that you can unpack the packed encoding without access to the original schema.

      Overall I think it's a well designed and balanced feature set.
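
      A toy illustration of the XOR-with-default idea (just the concept, not Cap'n Proto's actual code):

          #include <cstdint>
          #include <cstdio>

          // A field whose value equals its schema default encodes as all-zero
          // bytes, which the packed encoding then squeezes down to almost nothing.
          uint32_t encode_field(uint32_t value, uint32_t schema_default) {
              return value ^ schema_default;   // default -> 0x00000000 on the wire
          }

          uint32_t decode_field(uint32_t wire, uint32_t schema_default) {
              return wire ^ schema_default;    // XOR is its own inverse
          }

          int main() {
              const uint32_t kDefault = 42;
              printf("%08x\n", encode_field(42, kDefault));   // 00000000
              printf("%08x\n", encode_field(43, kDefault));   // 00000001
              printf("%u\n",   decode_field(0x1u, kDefault)); // 43
          }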

We jumped from protobuf -> Arrow at the very beginning of Arrow (e.g., we wrote on the main language implementations), and haven't looked back :)

If you're figuring out serialization from scratch nowadays, for most apps, I'd definitely start by evaluating Arrow. A lot of the benefits of protobuf, and then some.

Protobuf itself as a format isn't that bad; it's the default implementations that are bad: slow compile times, code bloat, and clunky APIs/conventions. Nanopb is a much better implementation and gives you more control over code generation too. Protobuf makes sense for large data, but for small data, fixed-length serialization with compression applied on top would probably be better.
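
For reference, a minimal nanopb encode sketch - this is nanopb's plain-C API (usable from C or C++), with SimpleMessage standing in for a hypothetical generated type from a one-field .proto:

    #include <pb_encode.h>
    #include "simple.pb.h"   // generated by nanopb's protoc plugin

    // message SimpleMessage { int32 lucky_number = 1; }
    bool encode_simple(uint8_t *buffer, size_t buflen, size_t *written) {
        SimpleMessage msg = SimpleMessage_init_zero;   // plain struct, no heap allocation
        msg.lucky_number = 13;

        pb_ostream_t stream = pb_ostream_from_buffer(buffer, buflen);
        if (!pb_encode(&stream, SimpleMessage_fields, &msg))
            return false;                              // details in PB_GET_ERROR(&stream)
        *written = stream.bytes_written;
        return true;
    }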

It's obviously possible to do protobuf with zero parsing/marshalling if you stick to fixed-length messages and 4/8-byte fields. Not saying that's a good idea, since there are simpler binary encodings out there when you need that kind of performance.

  • This is incompatible with protobuf. Protobuf has variable length encodings for all its integers, including field tags.

    https://developers.google.com/protocol-buffers/docs/encoding

    • Actually you have both 32 and 64 bit wire types:

          - wire_type=1 64 bit: fixed64, sfixed64, double
          - wire_type=5 32 bit: fixed32, sfixed32, float
      

      Consider a valid protobuf message with such a field. If you can locate the field's value bytes, you can write a new value to the same location without breaking the message. It's obviously possible to do the same with the varint type too, as long as you don't change the number of bytes - not so practical, but useful for an enum field with a limited set of useful values (usually fewer than 128).

      Pregenerating the protobuf messages you want to send and then modifying the bytes in place before sending will give you a nice performance boost over "normal" protobuf serialization. It can be useful if you need to be protobuf-compatible, but it's obviously better to use something like SBE - https://github.com/real-logic/simple-binary-encoding
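
      Rough sketch of that pregenerate-and-patch trick, assuming a hypothetical Quote message with a fixed64 price field and a little-endian host (so the in-memory bytes match the wire bytes):

          #include <cstdint>
          #include <cstring>
          #include <string>

          std::string template_bytes;   // serialized once up front
          size_t price_offset;          // byte offset of the fixed64 value

          void init_template() {
              Quote q;                                // hypothetical generated type
              q.set_symbol("AAPL");
              q.set_price(0xDEADBEEFCAFEF00Dull);     // distinctive sentinel value
              q.SerializeToString(&template_bytes);

              // Assumes the sentinel bytes appear exactly once in the message.
              uint64_t sentinel = 0xDEADBEEFCAFEF00Dull;
              price_offset = template_bytes.find(std::string(
                  reinterpret_cast<const char*>(&sentinel), sizeof sentinel));
          }

          void send_quote(uint64_t price) {
              // fixed64 is little-endian and fixed-width on the wire, so an 8-byte
              // overwrite never changes the message length or breaks framing.
              std::memcpy(&template_bytes[price_offset], &price, sizeof price);
              // write(fd, template_bytes.data(), template_bytes.size()); ...
          }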

FWIW, the Python protobuf library defaults to using the C++ implementation via bindings. So even though this is a blog post about implementing protobuf in C, it can also help implementations in other languages.

But yes, once you want really high performance, protobuf will disappoint you when you benchmark and find it responsible for all the CPU use. What are the options for reducing parsing overhead? flatbuffers? xdr?