Posted by b-man 9/5/2025

Protobuffers Are Wrong (2018) (reasonablypolymorphic.com)
244 points | 307 comments | page 5
jeffbee 9/5/2025|
Type system fans are so irritating. The author doesn't engage with the point of protocol buffers, which is that they are thin adapters between the union of things that common languages can represent with their type systems and a reasonably efficient marshaling scheme that can be compact on the wire.
cryptonector 9/6/2025||
I've written several screeds in the comments here on HN about protobufs being terrible over the past few years. Basically the creators of PB ignored ASN.1 and built a bad version of mid-1980s ASN.1 and DER.

Tag-length-value (TLV) encodings are just overly verbose for no good reason. They are _NOT_ "self-describing", and one does not need everything tagged to support extensibility. Even where one does need tags, tag assignments can be fully automatic and need not be exposed to the module designer. Anyone with a modicum of time spent researching how ASN.1 handles extensibility with non-TLV encoding rules knows these things. The entire arc of ASN.1's evolution over two plus decades was all about extensibility and non-TLV encoding rules!
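
To make the overhead concrete, here's a rough hand-rolled sketch (not from the comment; all names are illustrative): protobuf's wire format prefixes every field with a tag encoding the field number and wire type (plus a length for length-delimited types), while a schema-driven positional encoding can skip the tags entirely.

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    // encodeTagged encodes each value the way protobuf's wire format does for
    // varint fields: a tag (field number << 3 | wire type) followed by the value.
    func encodeTagged(values []uint64) []byte {
        var out []byte
        for i, v := range values {
            tag := uint64(i+1)<<3 | 0 // wire type 0 = varint
            out = binary.AppendUvarint(out, tag)
            out = binary.AppendUvarint(out, v)
        }
        return out
    }

    // encodeUntagged encodes the same values positionally, trusting the schema
    // (known field order) instead of per-field tags, which is roughly the idea
    // behind ASN.1's non-TLV encoding rules such as PER.
    func encodeUntagged(values []uint64) []byte {
        var out []byte
        for _, v := range values {
            out = binary.AppendUvarint(out, v)
        }
        return out
    }

    func main() {
        values := []uint64{1, 2, 3, 4, 5}
        fmt.Println(len(encodeTagged(values)), "bytes with tags")  // 10
        fmt.Println(len(encodeUntagged(values)), "bytes without")  // 5
    }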

And yes, ASN.1 started with the same premise as PB, but 40 years ago. Thus it's terribly egregious that PB's designers did not learn any lessons at all from ASN.1!

Near as I can tell, PB's designers thought they knew about encodings but didn't, and they refused to look at ASN.1 and the like because of its lack of tooling, even though there was even less tooling for PB, since PB didn't exist yet.

It's all exasperating.

dinobones 9/5/2025||
lol, the weird protobuf initialization semantics have caused so many OMGs. Even on my team they led to various hard-to-debug bugs.

It's a lesson most people learn the hard way after using PBs for a few months.
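
A minimal sketch of the failure mode, assuming a generated package for a proto3 `message Stats { int32 count = 1; }` (the package and message names here are made up):

    package main

    import (
        "fmt"

        "google.golang.org/protobuf/proto"

        pb "example.com/gen/statspb" // hypothetical generated package
    )

    func main() {
        // proto3 omits scalar fields at their zero value from the wire, so an
        // explicit 0 and a field that was never touched encode identically.
        explicit, _ := proto.Marshal(&pb.Stats{Count: 0})
        unset, _ := proto.Marshal(&pb.Stats{})
        fmt.Println(len(explicit), len(unset)) // both 0 bytes

        // The receiver can't distinguish "count is 0" from "count was never
        // set", which is exactly the class of bug described above. proto3's
        // `optional` keyword (explicit field presence) or wrapper types are
        // the usual workarounds.
    }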

sylware 9/6/2025||
I don't recall exactly (I've shelved my mapping projects for the moment), but isn't OpenStreetMap's core data distribution format based on protobuffers?
mkl95 9/5/2025||
If you mostly write software in Go you'll likely enjoy working with protocol buffers. If you use the Python or Ruby wrappers you'll wish you had picked another tech.
jonathrg 9/5/2025|
The generated types in go are horrible to work with. You can't store instances of them anywhere, or pass them by value, because they contain a bunch of state and pointers (including a [0]sync.Mutex just to explicitly prohibit copying). So you have to pass around pointers at all times, making ownership and lifetime much more complicated than it needs to be. A message definition like this

    message Example {
        sint32 Value1 = 1;
        double Value2 = 2;
    }
becomes

    type Example struct {
        state                    protoimpl.MessageState
        xxx_hidden_Value1        int32
        xxx_hidden_Value2        float64
        xxx_hidden_unknownFields protoimpl.UnknownFields
        sizeCache                protoimpl.SizeCache
    }
For [place of work], where we use protobuf, I ended up making a plugin that generates structs without any of that nonsense (essentially automating Option 1 from the article):

    type ExamplePOD struct {
        Value1 int32
        Value2 float64
    }
with converters between the two versions.
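
The converters themselves are boring glue; a minimal sketch, assuming `pb` is the generated package and the generated type exposes GetX/SetX accessors (as the newer opaque protoc-gen-go API does):

    // ToPOD copies a generated message into the plain struct. Getters on the
    // generated type are nil-safe, so a nil message yields the zero-valued POD.
    func ToPOD(m *pb.Example) ExamplePOD {
        return ExamplePOD{
            Value1: m.GetValue1(),
            Value2: m.GetValue2(),
        }
    }

    // FromPOD builds a fresh generated message from the plain struct.
    func FromPOD(p ExamplePOD) *pb.Example {
        m := &pb.Example{}
        m.SetValue1(p.Value1)
        m.SetValue2(p.Value2)
        return m
    }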
shdh 9/5/2025||
I just wish protobuf had proper delta compression out of the box
cenamus 9/6/2025||
I really liked the typography/layout of the page, reminds me of gwern.net. But people will probably complain about serif fonts regardless
BobbyTables2 9/6/2025||
Even the low-level implementation of protobuffers is pretty uninspiring.

It adds a lot of space overhead, especially for structs that are only used once, yet it isn't self-describing either.

It doesn't solve many of the problems around schema changes either.

Quite frankly, too many people buy into it because it came from Google and is supposed to be some sort of divinely inspired thing.

JSON, ASN.1, and even rigid C structs start to look a lot better.

fsmv 9/5/2025|
I actually really strongly prefer 0 being identical to unset. If you have an unset state then you have to check if the field is unset every time you use it. Using 0 allows you to make all of your code "just work" when you pass 0 to it so you don't need to check at all.

It's like how in go most structs don't have a constructor, they just use the 0 value.
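
For illustration, a tiny sketch of that pattern (names made up): because the zero value is already meaningful, callers never have to branch on "is this set?".

    package main

    import "fmt"

    // Counter's zero value is a valid, ready-to-use state, so there is no
    // constructor and no "unset" case for callers to handle.
    type Counter struct {
        n int
    }

    func (c *Counter) Add(delta int) { c.n += delta }

    func main() {
        var c Counter // no initialization needed
        c.Add(3)
        fmt.Println(c.n) // 3
    }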

Also oneof is made that way so that it is backwards compatible to add a new field and make it a oneof with an existing field. Not everything needs to be pure functional programming.
