Comment by GuB-42
3 hours ago
What I find particularly ironic is that the title make it feel like Rust gives a 5x performance improvement when it actually slows thing down.
The problem they have software written in Rust, and they need to use the libpg_query library, that is written in C. Because they can't use the C library directly, they had to use a Rust-to-C binding library, that uses Protobuf for portability reasons. Problem is that it is slow.
So what they did is that they wrote their own non-portable but much more optimized Rust-to-C bindings, with the help of a LLM.
But had they written their software in C, they wouldn't have needed to do any conversion at all. It means they could have titled the article "How we lowered the performance penalty of using Rust".
I don't know much about Rust or libpg_query, but they probably could have gone even faster by getting rid of the conversion entirely. It would most likely have involved major adaptations and some unsafe Rust though. Writing a converter has many advantages: portability, convenience, security, etc... but it has a cost, and ultimately, I think it is a big reason why computers are so fast and apps are so slow. Our machines keep copying, converting, serializing and deserializing things.
Note: I have nothing against what they did, quite the opposite, I always appreciate those who care about performance, and what they did is reasonable and effective, good job!
>> But had they written their software in C, they wouldn't have needed to do any conversion at all. It means they could have titled the article "How we lowered the performance penalty of using Rust".
That's not really fair. The library was doing serialization/deserialization which was poor design choice from a performance perspective. They just made a more sane API that doesn't do all that extra work. It might best be titles "replacing protobuf with a normal API to go 5 times faster."
BTW what makes you think writing their end in C would yield even higher performance?
> BTW what makes you think writing their end in C would yield even higher performance?
C is not inherently faster, you are right about that.
But what I understand is that the library they use works with data structures that are designed to be used in a C-like language, and are presumably full of raw pointers. These are not ideal for working in Rust, instead, presumably, they wrote their own data model in Rust fashion, which means that now, they need to make a conversion, which is obviously slower than doing nothing.
They probably could have worked with the C structures directly, resulting in code that could be as fast as C, but that wouldn't make for great Rust code. In the end, they chose the compromise of speeding up conversion.
Also, the use of Protobuf may be a poor choice from a performance perspective, but it is a good choice for portability, it allows them to support plenty of languages for cheaper, and Rust was just one among others. The PgDog team gave Rust and their specific application special treatment.
Because then they never would have needed the poorly-designed intermediate library.
I wonder why they didn't immediately FFI it: C is the easiest lang to write rust binding for. It can get tedious if using many parts of a large API, but otherwise is straightforward.
I write most of my applications and libraries in Rust, and lament that most of the libraries I wish I would FFI are in C++ or Python, which are more difficult.
Protobuf sounds like the wrong tool. It has applications for wire serialization and similar, but is still kind of a mess there. I would not apply it to something that stays in memory.
It’s trivial to expose the raw C bindings (eg a -sys crate) because you just run bindgen on the header. The difficult part can be creating safe, high-performance abstractions.
>Protobuf sounds like the wrong too This sort of use for proto is quite common at google
No it’s not common for two pieces of code within a single process to communicate by serializing the protobuf into the wire format and deserializing it.
It’s however somewhat common to pass in-memory protobuf objects between code, because the author didn’t want to define a custom struct but preferred to use an existing protobuf definition.
1 reply →
Given they heavily used LLMs for this optimization, makes you wonder why they didn’t use them to just port the C library to rust entirely. I think the volume of library ports to more languages/the most performant languages is going to explode, especially given it’s a relatively deterministic effort so long as you have good tests and api contracts, etc
The underlying C library interacts directly with the postgres query parser (therefore, Postgres source). So unless you rewrite postgres in Rust, you wouldn't be able to do that.
> they had to use a Rust-to-C binding library, that uses Protobuf for portability reasons.
That sounds like a performance nightmare, putting Protobuf of all things between the language and Postgres, I'm surprised such a library ever got popular.
> I'm surprised such a library ever got popular.
Because it is not popular.
pg_query (TFA) has ~1 million downloads, the postgres crate has 11 million downloads and the related tokio-postgres crate has over 33 million downloads. The two postgres crates currently see around 50x as much traffic as the (special-purpose) crate from the article.
edit: There is also pq-sys with over 12 million downloads, used by diesel, and sqlx-postgres with over 16 million downloads, used by sqlx.