Comment by pengaru

1 day ago

There's a mountain of gRPC-centric Python code at $dayjob and it's been miserable to live with. Maybe it's less awful in C/C++, or at least confers some decent performance there. In Python it's hot garbage.

Strongly agree, it has loads of problems, my least favourite being that the schema is not checked in the way you might think: there’s not even a checksum to say this message and this version of the schema match. So when there are old services/clients around and people haven’t versioned their schemas safely (there was no mechanism for this apart from manually checking in PRs) you can get gibberish back for fields that should contain data. It’s basically just a binary blob with whatever schema the client has overlaid, so debugging is an absolute pain. Unless you are Google scale, use a text-based format like JSON and save yourself a lot of hassle.
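To make the "binary blob with whatever schema the client has overlaid" point concrete, here is a minimal sketch in plain Python. It hand-rolls only the varint and length-delimited parts of the protobuf wire format rather than using the real protobuf library. The schemas, the field names (`name`, `price_cents`, `quantity`) and the helpers `encode`/`decode` are made up for illustration.

```python
from __future__ import annotations


def encode_varint(n: int) -> bytes:
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode(fields: dict[int, int | str]) -> bytes:
    """Encode {field_number: value}. Only field numbers go on the wire."""
    out = bytearray()
    for number, value in fields.items():
        if isinstance(value, int):
            out += encode_varint((number << 3) | 0)  # wire type 0: varint
            out += encode_varint(value)
        else:
            data = value.encode("utf-8")
            out += encode_varint((number << 3) | 2)  # wire type 2: length-delimited
            out += encode_varint(len(data))
            out += data
    return bytes(out)


def decode(buf: bytes, schema: dict[int, str]) -> dict[str, object]:
    """Overlay the reader's own {field_number: name} schema on the bytes."""
    pos, out = 0, {}
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            # Simplification: treat every length-delimited field as UTF-8 text.
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in schema:  # fields the reader does not know are dropped
            out[schema[number]] = value
    return out


# The new writer reuses field 2 for a price; the old reader still calls it a quantity.
NEW_SCHEMA = {1: "name", 2: "price_cents"}
OLD_SCHEMA = {1: "name", 2: "quantity"}

payload = encode({1: "widget", 2: 1999})
seen_by_old_reader = decode(payload, OLD_SCHEMA)
```

The old reader decodes the payload without any error and reports a quantity of 1999, because nothing in the bytes says which schema version produced them.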

  • You can trivially make breaking changes in a JSON blob too. gRPC has well-documented ways to make non-breaking changes. If you're working somewhere where breaking schema changes go in with little fanfare and much debugging then I'm not sure JSON will save you.

    The only way to know is to dig through CLs? Write a test.

    There's also automated tooling to compare protobuf schemas for breaking changes.

    • JSON contains a description of the structure of the data that is readable by both machines and humans. JSON can certainly go wrong, but it’s much simpler to see when it has because of this. gRPC is usually a binary black box that adds loads of developer time to upskill, debug and figure out error cases, and it introduces whole new classes of potential bugs.

      If you are building something that needs the binary performance that gRPC provides, go for it, but it is not true that there is no extra cost over doing the obvious thing.

  • - JSON doesn't have any schema checking either.

    - You can encode the protocol buffers as JSON if you want a text based format.

  • There is an art to having forwards and backwards compatible RPC schemas. It is easy, but it is surprisingly difficult to get people to follow easy rules. The rules are as follows:

      1) Never change the type of a field
      2) Never change the semantic meaning of a field
      3) If you need a different type or semantics, add a new field
    

    Pretty simple if you ask me.
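As a rough illustration of how rules like these could be checked mechanically, here is a small sketch that compares two versions of a schema. The schema representation (`{field_number: (name, type)}`) and the function `breaking_changes` are invented for this example and are not the API of any real protobuf tool. Rule 2 cannot really be verified by a machine, so a rename is only flagged as a hint to check the semantics by hand.

```python
from __future__ import annotations

# Hypothetical schema representation: {field_number: (field_name, type_name)}
Schema = dict[int, tuple[str, str]]


def breaking_changes(old: Schema, new: Schema) -> list[str]:
    """List fields whose type or name changed between two schema versions."""
    problems = []
    for number, (old_name, old_type) in sorted(old.items()):
        if number not in new:
            # Removed fields are outside the three rules above; ignored here.
            continue
        new_name, new_type = new[number]
        if new_type != old_type:
            # Rule 1: never change the type of a field.
            problems.append(
                f"field {number} ({old_name}): type changed "
                f"from {old_type} to {new_type}"
            )
        elif new_name != old_name:
            # Rule 2 (semantics) is not machine-checkable; a rename is a hint.
            problems.append(
                f"field {number}: renamed from {old_name} to {new_name}, "
                f"check that the meaning is unchanged"
            )
    return problems


v1: Schema = {1: ("name", "string"), 2: ("quantity", "int32")}

# Rule 3: a new type or meaning goes into a new field number.
v2_ok: Schema = {**v1, 3: ("price_cents", "int64")}

# Violates rule 1: field 2 keeps its number but changes type.
v2_bad: Schema = {1: ("name", "string"), 2: ("quantity", "string")}

for problem in breaking_changes(v1, v2_bad):
    print(problem)
```

Run against `v2_ok` the function returns an empty list, since adding field 3 leaves the existing fields untouched.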

    • If I got to choose my colleagues this would be fine, unfortunately I had people who couldn’t understand eventual consistency. One of the guys writing Go admitted he didn’t understand what a pointer was etc. etc.

I'm using it for a small-to-medium sized project, and the generated files aren't too bad to work with at that scale. The actual generation of the files is very awful for Python specifically, though, and I've had to write a script to bandaid fix them after they're generated. An issue has been open for this for years on the protobuf compiler repo, and it's basically a "wontfix" as Google doesn't need it fixed for their internal use. Which is... fine I guess.

The Go part I'm building has been much more solid in contrast.

C++ generated code from protobuf/grpc is pretty awful in my experience.

  • Do you need to look at that generated code though? I haven't used gRPC yet (some poor historical decisions mean I can't use it in my production code so I'm not in a hurry - architecture is rethinking those decisions in hopes that we can start using it so ask me in 5 years what I think). My experience with other generated code is that it is not readable but you never read it so who cares - instead you just trust the interface which is easy enough (or is terrible and not fixable)

    • I meant the interfaces are horrible. As you said, as long as it has a good interface and good performance, I wouldn't mind.

      For example, here's the official tutorial for using the async callback interfaces in gRPC: https://grpc.io/docs/languages/cpp/callback/

      It encourages you to write code with practices that are quite universally considered bad in modern C++ due to a very high chance of introducing memory bugs, such as allocating objects with new and expecting them to clean themselves up via delete this;. Idiomatic modern C++ would be using smart pointers, or going a completely different route with coroutines and no heap-allocated objects.