
Comment by est

10 days ago

> The contract-first philosophy

gRPC/protobuf is largely a Google cult. I've seen too many projects with complex business logic simply give up and embed JSON strings inside pb. Like WTF...?

Everything is good in the beginning, as long as everyone submits their .proto files to a centralized repo. Once one team starts to host their own, things break quickly.

It occurred to me that gRPC could optionally just serve those .proto files in the initial h2 handshake on the wire. It adds just a few kilobytes but solves a big problem.

I personally really like gRPC and protobufs. I think they strike a good balance between a number of indirectly competing objectives. However, I completely agree with your observation that as soon as you move beyond a single source of truth for the .proto files it all goes to shit.

I've seen some horrible things: generated code committed to version control and copied between repos, .proto files duplicated and manually kept up to date (or not). Both had hilarious failure modes.

There is no viable synchronization mechanism except to ensure that each .proto file is defined in exactly one place, that each time someone touches a .proto file all the downstream dependents on that file are updated (everyone who consumes any code generated from that .proto), and that for every such change clients are deployed before servers. Usually these invariants are maintained by meatspace protocols, which invariably fail.

  • I don't see why any of that would be necessary. There are simple rules for protobuf compatibility and people only need to follow them. Never re-use a field number to mean something else. Never change the type of a field. That's it. Those are the only rules. If you follow them you don't have to think about any of that stuff that you mentioned.

    • Absolutely! Forward and backward compatibility are among the wonderful things about protobufs. And that all goes wrong when you try to define the interface in more than one place.

      EDIT: also, although the wire protocol may tolerate unknown or missing data, almost always the application doesn't.

      EDIT AGAIN: I'm not saying this is how it should be, just that this is the low-energy state the socio-technical system seems to arrive at over time. So ideally it should be simple, but due to imperfect decisions it gets horribly complicated over time.

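The two compatibility rules above can be illustrated at the wire level. Here is a minimal, hand-rolled sketch (not real protobuf code; it models only varint-typed fields, and the names `encode_field` and `decode_message` are illustrative): every field on the wire is keyed by its field number, and a reader simply keeps fields it doesn't recognize, which is why a new writer and an old reader can coexist as long as nobody reuses a number or changes a type.

```python
def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint: 7 data bits per byte, high bit = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def decode_varint(data: bytes, i: int) -> tuple[int, int]:
    """Decode a varint starting at offset i; return (value, next offset)."""
    result, shift = 0, 0
    while True:
        b = data[i]
        i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def encode_field(field_number: int, value: int) -> bytes:
    # Field key = (field_number << 3) | wire_type; wire type 0 (varint) here.
    return encode_varint(field_number << 3) + encode_varint(value)

def decode_message(data: bytes) -> dict[int, int]:
    """Return {field_number: value}, silently keeping unknown fields."""
    fields, i = {}, 0
    while i < len(data):
        key, i = decode_varint(data, i)
        value, i = decode_varint(data, i)
        fields[key >> 3] = value
    return fields

# A "new" writer adds field 2; an "old" reader that only knows field 1
# still decodes it fine and carries field 2 along as unknown data.
msg = encode_field(1, 42) + encode_field(2, 7)
assert decode_message(msg)[1] == 42
```

This is also why reusing a field number is so dangerous: the wire carries only the number, so an old reader will happily decode the new field's bytes as whatever the number used to mean.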

> Everything is good in the beginning, as long as everyone submits their .proto files to a centralized repo. Once one team starts to host their own, things break quickly.

Is this an issue with protobufs per se, though? It's a data schema. How are people supposed to develop against a shared schema if a team doesn't - you know - share their schema? That could happen with any other way of defining schemas.

  • It's a problem with PB because it requires everything to be typed (unless you use Any), which requires all middleware to eagerly type check all data passing through. With JSON, validation will be typically done only by the endpoints, which allows for much faster development.

    There was a blog post a few years ago where an engineer working on the Google Cloud console complained that simply adding a checkbox to one of the pages required modifying ~20 internal protos and 6 months of rollout. That's an obvious downside that I wish I knew how to fix.

> It occurred to me that gRPC could optionally just serve those .proto files in the initial h2 handshake on the wire

Do you mean the reflection protocol, or some other .proto files?

It does have discovery built in. Is that what you want?

  • You mean grpc.reflection.v1alpha.ServerReflection? Close enough; sadly it's not generally enabled.
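For context, enabling that reflection service is a small amount of wiring on the server side. A hedged sketch, assuming the grpcio and grpcio-reflection packages; the commented-out `my_pb2` / `MyService` names are placeholders for your generated code, not real modules:

```python
# Sketch: turning on the gRPC server reflection service in Python.
# Call serve() to run; imports are local so the sketch loads without grpcio.

def serve() -> None:
    from concurrent import futures

    import grpc
    from grpc_reflection.v1alpha import reflection

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    # Register your generated servicer(s) here, e.g.:
    # my_pb2_grpc.add_MyServiceServicer_to_server(MyService(), server)
    service_names = (
        # my_pb2.DESCRIPTOR.services_by_name["MyService"].full_name,
        reflection.SERVICE_NAME,  # grpc.reflection.v1alpha.ServerReflection
    )
    # Clients can now discover services and fetch descriptors over the wire,
    # without having the .proto files locally.
    reflection.enable_server_reflection(service_names, server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()
```

With this enabled, a client like grpcurl can do `grpcurl -plaintext localhost:50051 list` to enumerate services. The catch the comment above points at: it's opt-in per server, so in practice many deployments never turn it on.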