
Posted by b-man 9/5/2025

Protobuffers Are Wrong (2018)(reasonablypolymorphic.com)
244 points | 307 comments
imtringued 9/5/2025|
>The solution is as follows:

> * Make all fields in a message required. This makes messages product types.

Meanwhile in the capnproto FAQ:

>How do I make a field “required”, like in Protocol Buffers?

>You don’t. You may find this surprising, but the “required” keyword in Protocol Buffers turned out to be a horrible mistake.

I recommend reading the rest of the FAQ [0], but if you are in a hurry: fixed-schema protocols like protobufs do not let you remove fields the way self-describing formats such as JSON do. Removing fields or switching them from required to optional is an ABI-breaking change. Nobody wants to update all servers and all clients simultaneously. At that point, you would be better off defining a new API endpoint and deprecating the old one.

The capnproto FAQ also points out that validation should be handled at the application level rather than the ABI level.

[0] https://capnproto.org/faq.html

mountainriver 9/5/2025||
> Protobuffers correspond to the data you want to send over the wire, which is often related but not identical to the actual data the application would like to work with

This sums up a lot of the issues I’ve seen with protobuf as well. It’s not an expressive enough language to be the core data model, yet people use it that way.

In general, if you don’t have extreme network needs, protobuf seems to cause more harm than good. I’ve watched Go teams spend months implementing proto-based systems with little to no gain over plain REST.

recursive 9/5/2025||
Protobuf is independent from REST. You can have either one. Or both. Or neither. One has nothing to do with the other.
mountainriver 9/6/2025|||
Yes, I fully understand that; the point is that a lot of teams focus on protobuf for their network layer.
sieabahlpark 9/5/2025|||
[dead]
nicce 9/5/2025||
On the other hand, ASN.1 is very expressive and can cover pretty much anything, but Protobuf was created because people thought ASN.1 was too complex. I guess we can't have both.
cryptonector 9/6/2025|||
If people thought ASN.1 was too big, all they had to do was create a small profile of it large enough for the task at hand.

X.680 is fairly small. Require AUTOMATIC TAGs, remove manual tagging, remove REAL and EMBEDDED PDV and such things, and what you're left with is pretty small.

jandrese 9/5/2025|||
"Those who cannot remember the past are condemned to repeat it" -- George Santayana
theamk 9/5/2025||
Oh, I remember ASN.1 very well, and I would not want to repeat it again.

Protobufs have lots of problems, but at least they are better than ASN.1!

cryptonector 9/6/2025||
Details please.

Things people say who know very little about ASN.1:

- it's bloated! (it's not)

- it's had lots of vulnerabilities! (mainly in hand-coded codecs)

- it's expensive (it's not -- it's free and has been for two decades)

- it's ugly (well, sure, but so is PB's IDL)

- the language is context-dependent, making it harder to write a parser for (this is quite true, but so what, it's not that big a deal)

The vulnerabilities were only ever in implementations, and almost entirely in cases of hand-coded codecs, and the thing that made many of these vulnerabilities possible was the use of tag-length-value encoding rules (BER/DER/CER) which, ironically, Protocol Buffers bloody is too.

If you have different objections to ASN.1, please list them.

theamk 6 days ago||
None of those; the main problems are:

- There is no backward or forward compatibility by default.

(Sure, you could have every SEQUENCE make all fields OPTIONAL and put ... at the end, but how many real-life schemas like that have you seen? Almost every ASN.1 schema you can find on the internet is a static SEQUENCE, with no extensibility whatsoever.)

- Tools are bad.

Yes, protoc can be a PITA to integrate into a build system, but at least it (1) exists, (2) is well tested, and (3) supports many languages. Compare that to ASN.1, where good tooling is so rare that people routinely parse/generate the files by hand!

- Honorable mention: using "tag" in TLV to describe only the type and not field name - that SEQUENCE(30) tag will be all over the place, and the contents will be wildly different. Compare to protobuf, where the "tag" is field index, and that's exactly what allows such a great forward/backward compatibility.

(Could ASN.1 fix those problems? Not sure. Yes, maybe one could write better tooling, but all the existing users know that extensibility is for the weak, and non-optional SEQUENCEs are the way to go. It is easier to write all-new format than try to change existing conventions.)

cryptonector 6 days ago||
> - There is no backward or forward compatibility by default.

ASN.1 in 1984 had it. Later ASN.1 evolved to have a) explicit extensibility markers, and b) the `EXTENSIBILITY IMPLIED` module option that implies every SEQUENCE, SET, ENUM, and other things are extensible by default, as if they ended in `, ...`.

There are good reasons for this change:

- not all implementors had understood the intent, so not all had implemented "ignore unexpected new fields"

- sometimes you want non-extensible things

- you may actually want to record in the syntax all the sets of extensions

> - Tools are bad.

But there were zero -ZERO!- tools for PB when Google created PB. Don't you see that "the tools that existed were shit" is not a good argument for creating tools for a completely new thing instead?

> - Honorable mention: using "tag" in TLV to describe only the type and not field name - that SEQUENCE(30) tag will be all over the place, and the contents will be wildly different. Compare to protobuf, where the "tag" is field index, and that's exactly what allows such a great forward/backward compatibility.

In a TLV encoding you can very much use the "type" as the tag for every field sometimes, namely when there would be no ambiguity due to OPTIONAL fields being present or absent, and when you do have such ambiguities you can resort to manual tagging with field numbers or whatever you want. For example:

  Thing ::= SEQUENCE {
    a UTF8String,
    b UTF8String
  }
works even though both fields get the same tag (when using a TLV encoding) because both fields are required, while this is broken:

  Broken ::= SEQUENCE {
    a UTF8String OPTIONAL,
    b UTF8String
  }
and you would have to fix it with something like:

  Fixed ::= SEQUENCE {
    a [0] UTF8String OPTIONAL,
    b UTF8String
  }
What PB does is require the equivalent of manually applying what ASN.1 calls IMPLICIT tags to every field, which is silly and makes it harder to decode data w/o reference to the module that defines its schema (this last is sketchy anyways, and I don't think it is a huge advantage for the ASN.1 BER/DER way of doing things, though others will disagree).

> (Could ASN.1 fix those problems? Not sure. Yes, maybe one could write better tooling, but all the existing users know that extensibility is for the weak, and non-optional SEQUENCEs are the way to go. It is easier to write all-new format than try to change existing conventions.)

ASN.1 does not have these problems.

Better tooling does exist and can exist -- it's no different than writing PB tooling, at least for a subset of ASN.1, because ASN.1 does have many advanced features that PB lacks, and obviously implementing all of ASN.1 is more work than implementing all of PB.

> It is easier to write all-new format than try to change existing conventions.

Maybe, but only if you have a good handle on what came before.

I strongly recommend that you actually read X.680.

ryukoposting 9/5/2025||
Protobuf's original sin was failing to distinguish zero/false from undefined/unset/nil. Confusion around the semantics of a zero value is the root of most proto-related bugs I've come across. At the same time, that very characteristic of protobuf makes its on-wire form really efficient in a lot of cases.

Nearly every other complaint is solved by wrapping things in messages (sorry, product types). I don't get the enum limitation on map keys, though; that complaint is fair.

Protobuf eliminates truckloads of stupid serialization/deserialization code that, in my embedded world, almost always has to be hand-written otherwise. If there were a tool that automatically spat out matching C, Kotlin, and Swift parsers from CDDL, I'd certainly give it a shot.

mdhb 9/6/2025||
Agreed, the CDDL-to-codegen pipeline/tooling is the biggest thing holding back CBOR at the moment.

Some solutions do exist; here’s a C one [1], which you could maybe run through some WASI/WASM compilation to get “somewhat” idiomatic bindings in a bunch of languages.

Here’s another for Rust [2], but I’m sure I’ve seen a bunch of others around. I think what’s missing is a unified protoc-style binary with language-specific plugins.

[1] https://github.com/NordicSemiconductor/zcbor

[2] https://github.com/dcSpark/cddl-codegen

jsnell 9/6/2025||
> Protobuf's original sin was failing to distinguish zero/false from undefined/unset/nil.

It's only proto3 that doesn't distinguish between zero and unset by default. Both the earlier and later versions support it.

Proto3 was a giant pile of poop in most respects, including removing support for field presence. They eventually put it back in as a per-field opt-in property, but by then the damage was done.

A huge unforced mistake, but I don't think a change that was made after the library had existed for 15 years, and later reverted, qualifies as an "original sin".

spectraldrift 9/6/2025||
I'm not sure why this post gets boosted every few years, and unfortunately (as many have pointed out) the author demonstrates here that they do not understand distributed system design, nor how to use protocol buffers. I have found them to be one of the most useful tools in modern software development when used correctly. Not only are they much faster than JSON, they prevent the inevitable redefinition of nearly identical code across a large number of repos (which is what I've seen in 95% of corporate codebases that eschew tooling such as this). Sure, there are alternatives to protocol buffers, but I have not seen them gain widespread adoption yet.
ericpauley 9/5/2025||
I lost the plot here when the author argued that repeated fields should be implemented as in the pure lambda calculus...

Most of the other issues in the article can be solved by wrapping things in more messages. Not great, not terrible.

As with the tightly-coupled issues with Go, I'll keep waiting for a better approach any decade now. In the meantime, both tools (for all their glaring imperfections) work well enough, solve real business use cases, and have a massive ecosystem moat that makes them easy to work with.

wnoise 9/6/2025|
They didn't. Pure lambda calculus would have been "a function that when applied to a number encoded as a function, extracts that value".

They did it essentially as a linked list, C-strings, or UTF-8 characters: "current data, and is there more (next pointer, non-null byte, continuation bit set)?" They also noted that it could have this semantics without necessarily following this implementation encoding, though that seems like a dodge to me; length-prefixed array is a perfectly fine primitive to have, and shouldn't be inferred from something that can map to it.

BugsJustFindMe 9/5/2025||
I went into this article expecting to agree with part of it. I came away agreeing with all of it. And I want to point out that Go also shares some of these catastrophic data decisions (automatic struct zero values that silently do the wrong thing by default).
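
For illustration, a minimal sketch of that footgun in plain Go (standard library only, made-up type and field names; encoding/json stands in for any decoder): a bare int field cannot distinguish "never set" from "explicitly zero" after decoding, while a pointer field can.

  package main

  import (
      "encoding/json"
      "fmt"
  )

  // Ambiguous uses a plain int: after decoding, 0 could mean "sent as 0"
  // or "not sent at all" -- the same conflation proto3 makes by default.
  type Ambiguous struct {
      Count int `json:"count"`
  }

  // Explicit uses a pointer: nil means "not sent"; a non-nil pointer to 0
  // means "explicitly zero".
  type Explicit struct {
      Count *int `json:"count"`
  }

  func main() {
      var a Ambiguous
      _ = json.Unmarshal([]byte(`{}`), &a)
      fmt.Println(a.Count) // 0 -- indistinguishable from {"count": 0}

      var e Explicit
      _ = json.Unmarshal([]byte(`{}`), &e)
      fmt.Println(e.Count == nil) // true -- the field was never set
  }
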
sethammons 9/6/2025|
We got bit by a default value in a DMS task: the target column didn't exist, so the data wasn't replicated, and the default value was "this work needs to be done."

This is not pb nor go. A sensible default of invalid state would have caught this. So would an error and crash. Either would have been better than corrupt data.

OrangeDelonge 9/7/2025||
You mean AWS DMS inserted the string literal “this work needs to be done” into your db?
sethammons 7 days ago||
So, that target column had the wrong name, meaning data intended for the column never arrived, causing the default value in the database to be used. That default was an integer that mapped to "this work item needs to be processed still", which led to double-processing the record after the DMS migration.
dano 9/5/2025||
It is a 7-year-old article that doesn't specify alternatives to an "already solved problem."

So HN, what are the best alternatives available today and why?

thinkharderdev 9/5/2025||
Support across languages etc. is much less mature, but I find the Thrift serialization format much nicer than protobuf. The codegen somehow manages to produce types that look like types I would actually write, compared to the monstrosities that protoc generates.
gsliepen 9/5/2025|||
Something like MessagePack or CBOR, and if you want versioning, just have a version field at the start. You don't require a schema to pack/unpack, which I personally think is a good thing.
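
A rough sketch of that pattern in Go, assuming the third-party fxamacker/cbor/v2 package (its Marshal/Unmarshal mirror encoding/json, and its default decoder ignores unknown map keys); the envelope and field names here are made up:

  package main

  import (
      "fmt"

      "github.com/fxamacker/cbor/v2" // assumed third-party CBOR codec
  )

  // EnvelopeV1 carries a version field alongside the payload; no schema is
  // needed to pack or unpack, and readers can branch on V before touching
  // the rest.
  type EnvelopeV1 struct {
      V    int    `cbor:"v"`
      Name string `cbor:"name"`
  }

  func main() {
      data, err := cbor.Marshal(EnvelopeV1{V: 1, Name: "example"})
      if err != nil {
          panic(err)
      }

      // Peek at the version first, ignoring fields we don't know about,
      // then dispatch to the decoder for that version.
      var probe struct {
          V int `cbor:"v"`
      }
      if err := cbor.Unmarshal(data, &probe); err != nil {
          panic(err)
      }
      fmt.Println("version:", probe.V)
  }
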
fmbb 9/5/2025|||
> You don't require a schema to pack/unpack

Then it hardly solves the same problem Protobuf solves.

mgaunard 9/5/2025|||
Arrow is also becoming a good contender, with the extra benefit that it is better optimized for data batches.
mdhb 9/6/2025|||
CBOR is probably the best and most standards compliant thing out there that I’m aware of.

It’s the new default in a lot of IoT specs, and it’s the backbone for deep-space communication networks, etc.

It maintains interoperability with JSON and is very much battle-tested in very challenging environments.

rapsey 9/5/2025|||
There are none, protobufs are great.
nicce 9/5/2025||
Depends. ASN.1 is a beast and another industry standard, but unfortunately the best tooling is closed source.
cryptonector 9/6/2025||
There was ZERO PB tooling in 2000. Just write it for ASN.1 instead.
akavi 9/5/2025|||
Mentioned above: https://github.com/stepchowfun/typical
allanrbo 9/5/2025||
Sometimes you are integrating with systems that already use proto, though. I recently wrote a tiny, dependency-free, practical protobuf (proto3) encoder/decoder for those situations where you need just a little bit of protobuf in your project and don't want to bother with the whole proto ecosystem of codegen and deps: https://github.com/allanrbo/pb.py
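
In the same spirit, the happy path of the protobuf wire format is small enough to hand-roll. Here is a rough Go sketch (standard library only, not taken from pb.py) that encodes one varint field and one string field for a hypothetical message { int64 id = 1; string name = 2; }:

  package main

  import (
      "encoding/binary"
      "fmt"
  )

  // appendVarintField encodes a wire-type-0 (varint) field:
  // key = field_number<<3 | 0, followed by the varint-encoded value.
  func appendVarintField(b []byte, fieldNum int, v uint64) []byte {
      b = binary.AppendUvarint(b, uint64(fieldNum)<<3|0)
      return binary.AppendUvarint(b, v)
  }

  // appendStringField encodes a wire-type-2 (length-delimited) field:
  // key = field_number<<3 | 2, then the length, then the raw bytes.
  func appendStringField(b []byte, fieldNum int, s string) []byte {
      b = binary.AppendUvarint(b, uint64(fieldNum)<<3|2)
      b = binary.AppendUvarint(b, uint64(len(s)))
      return append(b, s...)
  }

  func main() {
      var buf []byte
      buf = appendVarintField(buf, 1, 42)      // id = 42
      buf = appendStringField(buf, 2, "hello") // name = "hello"
      fmt.Printf("% x\n", buf) // 08 2a 12 05 68 65 6c 6c 6f
  }

Negative integers, nested messages, and repeated fields need more care (zigzag encoding, recursion), which is where even a tiny library pays off.
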
ants_everywhere 9/5/2025||
> Maintain a separate type that describes the data you actually want, and ensure that the two evolve simultaneously.

I don't actually want to do this, because then you have N + 1 implementations of each data type, where N = number of programming languages touching the data, and + 1 for the proto implementation.

What I personally want to do is use a language-agnostic IDL to describe the types that my programs use. Within Google you can even do things like just store them in the database.

The practical alternative is to use JSON everywhere, possibly with some additional tooling to generate code from a JSON schema. JSON is IMO not as nice to work with. The fact that it's also slower probably doesn't matter to most codebases.

thinkharderdev 9/5/2025||
> I don't actually want to do this, because then you have N + 1 implementations of each data type, where N = number of programming languages touching the data, and + 1 for the proto implementation.

I think this is exactly what you end up with using protobuf. You have an IDL that describes the interface types but then protoc generates language-specific types that are horrible so you end up converting the generated types to some internal type that is easier to use.

Ideally if you have an IDL that is more expressive then the code generator can create more "natural" data structures in the target language. I haven't used it a ton, but when I have used thrift the generated code has been 100x better than what protoc generates. I've been able to actually model my domain in the thrift IDL and end up with types that look like what I would have written by hand so I don't need to create a parallel set of types as a separate domain model.

danans 9/5/2025||
> The practical alternative is to use JSON everywhere, possibly with some additional tooling to generate code from a JSON schema.

Protobuf has a bidirectional JSON mapping that works reasonably well for a lot of use cases.

I have used it to skip the protobuf wire format altogether and just use protobuf for the IDL and multi-language binding, both of which IMO are far better than JSON-Schema.

JSON-Schema is definitely more powerful though, letting you do things like field-level constraints. I'd love to see something that paired the best of both.
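
For what it's worth, in Go that mapping is exposed through the protojson package; a rough sketch, where the pb import path and the User message are hypothetical stand-ins for whatever generated code you have:

  package main

  import (
      "fmt"

      "google.golang.org/protobuf/encoding/protojson"

      pb "example.com/gen/userpb" // hypothetical generated package
  )

  func main() {
      msg := &pb.User{Name: "Ada"} // hypothetical generated message type

      // Canonical proto3 JSON mapping: field names become lowerCamelCase keys.
      data, err := protojson.Marshal(msg)
      if err != nil {
          panic(err)
      }
      fmt.Println(string(data)) // e.g. {"name":"Ada"} (whitespace may vary)

      // And back again, so JSON stays on the wire while protobuf serves as
      // the IDL and multi-language binding.
      var round pb.User
      if err := protojson.Unmarshal(data, &round); err != nil {
          panic(err)
      }
  }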

vander_elst 9/5/2025|
Always initializing with a default value and having no algebraic types is an always-loaded footgun. I wonder if the people behind golang took inspiration from this.
wrsh07 9/5/2025|
The simplest way to understand Go is that it is a language that integrates some of Google's best C++ features (their lightweight threads and other multithreading primitives are the highlights).

Beyond that it is a very simple language. But yes, 100%, for better and for worse, it is deeply inspired by Google's codebase and needs.
