Comment by augusteo
6 hours ago
Building on sluongng's point about schema evolution: we ended up in a weird middle ground on a project where we used gRPC for metadata and presigned S3 URLs for the actual bytes.
The metadata schema changed constantly (new compression formats, different checksum algorithms, retry policies). Having protobufs for that was genuinely useful. But trying to pipe multi-gigabyte files through gRPC streams was painful: memory pressure, connection timeouts on slow clients, and terrible debugging visibility.
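To give a flavor of what I mean by metadata churn, here's a rough Python sketch (field names are hypothetical, not from the actual project) of the kind of message that kept growing. The win with protobuf-style schemas is that new optional fields land with defaults, so older clients keep working:

```python
# Hypothetical sketch of the transfer metadata -- field names are made up,
# but it shows why schema evolution mattered: new optional fields
# (checksum algorithm, retry policy) get added without breaking old callers.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TransferMetadata:
    object_key: str
    size_bytes: int
    compression: str = "none"          # later additions: "zstd", "lz4", ...
    checksum_algorithm: str = "md5"    # some clients later moved to "sha256"
    checksum: Optional[str] = None
    max_retries: int = 3               # retry policy bolted on after the fact
    retry_backoff_seconds: float = 1.0

# Old callers that only know object_key and size_bytes keep working;
# the newer fields simply take their defaults.
meta = TransferMetadata(object_key="datasets/2024/batch-17.bin", size_bytes=4_700_000_000)
print(meta)
```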
S3 presigned URLs are the boring answer, but they work. Your object storage handles the hard parts (resumability, CDN integration, the actual bytes), and your gRPC service handles the interesting parts (authentication, metadata, business logic).
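For anyone who hasn't done this split before, a minimal sketch of the service side looks something like the following (bucket/key names and the 15-minute expiry are illustrative; the real version sat behind a gRPC handler that also wrote the metadata record):

```python
# The gRPC service authenticates the caller and records metadata, then hands
# back a presigned S3 URL; the client moves the actual bytes directly
# against S3, so resumability and CDN integration are S3's problem.
import boto3

s3 = boto3.client("s3")

def create_download_url(bucket: str, key: str, expires_in: int = 900) -> str:
    """Time-limited URL the client can GET directly, bypassing the gRPC service."""
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )

def create_upload_url(bucket: str, key: str, expires_in: int = 900) -> str:
    """Time-limited URL the client can PUT the raw bytes to."""
    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )

if __name__ == "__main__":
    url = create_download_url("example-bulk-data", "datasets/2024/batch-17.bin")
    print(url)  # client fetches the bytes from S3 with this URL
```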
Sending bulk data by reference is a common pattern. Even inside Google, when I was there, bulk data was sometimes placed on ephemeral storage and sent by reference, and 100MB was considered a "dangerously large" protobuf that would log a warning during decode.