My experience is that the practical performance achievable with Go is higher, because C++ lifetime issues are too difficult to reason about and the developer ends up copying for safety. In Go you can fairly easily alias everything from the physical buffer into your parsed object. The official C++ protobuf library refuses to acknowledge even the possibility of aliasing. Even if you declare your string fields as "view" types, there is an owned buffer inside the generated class into which your data is copied. This is exasperating because inside Google they have several different ways to avoid copying a string into a protobuf, and they're all patched out of the open source edition. You can read about them and cry by looking through the git logs for "internal change" commits with baffling whitespace-only changes, which are symptomatic of where they are patching out the good stuff.
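To make the aliasing point concrete, here is a minimal Go sketch, not taken from any protobuf library. `parseBytesField` is a hypothetical helper that reads a single length-delimited field by hand and returns a subslice of the input rather than a copy. It assumes the caller keeps the input buffer alive and unmodified for as long as the parsed value is in use.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseBytesField is a hypothetical helper for illustration only. It reads one
// length-delimited field (wire type 2) from the front of buf and returns the
// payload as a subslice of buf, so no bytes are copied.
func parseBytesField(buf []byte) (fieldNum uint64, payload []byte, err error) {
	tag, n := binary.Uvarint(buf)
	if n <= 0 {
		return 0, nil, errors.New("bad tag")
	}
	if tag&7 != 2 {
		return 0, nil, errors.New("not a length-delimited field")
	}
	buf = buf[n:]
	size, n := binary.Uvarint(buf)
	if n <= 0 || size > uint64(len(buf)-n) {
		return 0, nil, errors.New("bad length")
	}
	end := n + int(size)
	// The three-index slice caps capacity so an append on the result
	// cannot overwrite the bytes that follow it in the input buffer.
	return tag >> 3, buf[n:end:end], nil
}

func main() {
	// Field 1, wire type 2, length 5, payload "hello".
	wire := []byte{0x0a, 0x05, 'h', 'e', 'l', 'l', 'o'}

	_, aliased, err := parseBytesField(wire)
	if err != nil {
		panic(err)
	}
	copied := append([]byte(nil), aliased...) // the defensive copy

	wire[2] = 'j' // mutate the input buffer after parsing

	// The aliased view sees the change; the copy does not.
	fmt.Println(string(aliased), string(copied)) // prints "jello hello"
}
```

The garbage collector keeps the backing array alive while any subslice refers to it, which is why this is easy to get right in Go. The remaining hazard is someone reusing or mutating the buffer, as the last lines of `main` show.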
Oh it’s worse, it’s a full-on marshal of the whole data. What we need is a no-allocation protobuf that binds to existing memory, knows about aliases, and can deal with a pointer. I love protobuf, but I’ve moved to other messaging implementations that provide a faster marshal/unmarshal. Maybe I’ll give this a try.
Even before Hyperpb, Go was already very competitive, e.g. this article from last year: https://www.greptime.com/blogs/2024-04-09-rust-protobuf-perf...
I think you can alias the input data using Cord fields? As long as the input is Cord.