Posted by b-man 9/5/2025
I haven't used these very seriously, but a problem I had a while back was that the wire format was not what the applications wanted to use, but a good application format was too space-inefficient for the wire.
As far as I could see there was not a great way to do this. You could rewrite the wire<->app converter in every app; or have a converter program, in which case you essentially have two wire formats and need to fit this extra program and data movement into workflows; or write a library and maintain bindings for all your languages.
This is what Google does. We joke that our entire jobs are "convert protobuf A into protobuf B".
Instead, make codegen a function of BOTH a data schema object and a code template (eg expressed in Jinja2 template language - or ZeroMQ GSL where I first saw this approach). The codegen stage is then simply the application of the template to the data schema to produce a code artifact.
The templates are written assuming the data schema is provided following a meta-schema (eg JSON Schema for a data schema in JSON). One can develop, eg per-language templates to produce serialization code or intra-language converters between serialization forms (on wire) and application friendly forms. The extra effort to develop a template for a particular target is amortized as it will work across all data schemas that adhere to a common meta-schema.
The "codegen" stage can of course be given non "code" templates to produce, eg, reference documentation about the data schema in different formats like HTML, text, nroff/man, etc.
- https://github.com/zeromq/gsl/blob/v4.1.5/examples/fsm_c.gsl
whew, this readme has everything
- XML in, text out: https://github.com/zeromq/gsl#:~:text=feed%20it%20some%20dat...
- a whole section on software engineering https://github.com/zeromq/gsl#model-oriented-programming
- they support COBOL https://github.com/zeromq/gsl#cobol
- and then a project 11 years old with "we're going to document these functions one day" https://github.com/zeromq/gsl#global-functions
What a journey that was
It's probably not like most web applications; it's hardware data loggers that produce hundreds of millions to billions of events per second (each with a minimum of about 4 bytes of wire format and a maximum of roughly 500 bytes).
One day I got annoyed enough to dig for the original proposal and, like 99.9% of initiatives like this, it was predicated on:
- building a list of existing solutions
- building an overly exhaustive list of every facet of the problem to be solved
- declaring that no existing solution hits every point on your inflated list
- concluding that "we must build it ourselves."
It's such a tired playbook, but it works so often unfortunately.
The person who architects and sells it gets points for "impact", then eventually moves on to the next company.
In the meantime the problem being solved evolves and grows (as products and businesses tend to), the homegrown solution no longer solves anything perfectly, and everyone is still stuck dragging along said solution, seemingly forever.
-
Usually eventually someone will get tired enough of the homegrown solution and rightfully question why they're dragging it along, and if you're lucky it gets replaced with something sane.
If you're unlucky, that person also uses it as justification to build a new in-house solution (we built the old one, after all), and you replay the loop.
In the case of serialization though, that's not always doable. This company was storing petabytes (if not exabytes) of data in the format for example.
Despite issues, protobufs solve real problems and (imo) bring more value than cost to a project. In particular, I'd much rather work with protobufs and their generated ser/de than untyped json
funnily enough, this line alone reveals the author to be an amateur in the problem space they are writing so confidently about.
fundamentally, the author refuses to contend with the fact that the context in which Protobufs are used -- millions of messages strewn around random databases and files, read and written by software using different versions of libraries -- is NOT the same scenario where you get to design your types once and then EVERYTHING that ever touches those types is forced through a type checker.
again, this betrays a certain degree of amateurishness on the author's part.
Kenton has already provided a good explanation here: https://news.ycombinator.com/item?id=45140590
the author never claimed the types had to be designed only once; he claimed that the schema evolution model chosen by protobuf is inadequate for the purpose of lossless evolution.
> Kenton has already provided a good explanation here: https://news.ycombinator.com/item?id=45140590
TLDR: yada-yada [...] protobuf is practical, type algebra either doesn't exist or is impractical because only PL theorists know about it, not Kenton.
Hi I'm Kenton. I, too, was enamored with advanced PL theory in college. Designed and implemented my own purely-functional programming language. Still wish someone would figure out a working version of dependent types for real-world use, mainly so we could prove array bounds-safety without runtime checks.
In two decades building real-world complex systems, though, I've found that getting PL theory right is rarely the highest-leverage way to address the real problems of software engineering.
I maintain a React app on the side, and a few other projects, and would still recommend it just due to developer availability, but there’s a saying among some of the Elm folks I know: “Good React code in 2025 looks like good Elm code from 2015.”
(To be fair: teams, and devs new to FP [myself included] will create complexity monstrosities in any paradigm, but Elm’s strong FP setup means huge subsets of those monstrosities won’t ever compile, and usually offer a clearer path for later cleanup.)
Hi Kenton, I'm not sure what kind of PL theory you studied in college, but "array bounds-safety without runtime checks" doesn't require dependent types. It's being proven with several available SMT solvers as of right now; just ask the LLVM folks with their "LLVM_ENABLE_Z3_SOLVER" compiler flag, the one that people build their real-world solutions on.
By the way, you don't have to say "real-world" in every comment to appeal to your google years as a token of "real-world vs the rest of you". "But my team at google wouldn't use it", or something along those lines, right?
https://ats-lang.sourceforge.net/DOCUMENT/INT2PROGINATS/HTML...
Please, Kenton, don't move your goalpost. Who said anything about "unaided"? Annotations, whether they come directly from a developer or from IR metadata, don't suddenly make a provided SAT constraint a "dependent type" component of your type system; it needs a bit more than that. Let's not miss the "types" in "dependent types". You don't modify the type systems of your languages to run SAT solvers in large codebases.
Truly, if you believe that annotations for the purpose of static bounds checking "is not realistic in a large codebase" (or is it because you assume it's unaided?), I've got "google/pytype" and the entire Python community to put before you.
What compels you to do this? Posting just to make people angry? Do you not have anything better to do with all that PL theory expertise?
It does static type checking from _annotations_ that live _outside_ the type system of the language. Have you forgotten that you began to argue that SMT solvers need constraint annotations to be realistic for static bounds checking in large codebases, and that the constraint annotations somehow become dependent types from that fact alone?
> What compels you to do this? Posting just to make people angry? Do you not have anything better to do with all that PL theory expertise?
You're all over the place. It's frustrating that instead of fairly addressing the points about inferior aspects of the protobuf protocol design that are unnecessary for the purpose of backward-compatible distributed systems, you keep saying (or at least assuming) that it's the only realistic solution, because "I worked at google" and "reports at google prove me right".
This is just asking for trouble when the API inevitably breaks, as all APIs eventually do. In our projects I mandated and pushed really hard that we create intermediary data classes that correspond one-to-one to the protobufs (at first).
I got a lot of angry faces and reactions in PRs due to the seemingly useless boilerplate code required, but it saved our butts so many times when the API changed just before a release that it became the de facto standard.
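For a flavor of that boilerplate, a minimal sketch assuming Python; `user_pb2.User` and its fields are hypothetical:
```
# Intermediary data class mirroring a protobuf message one-to-one.
# `user_pb2.User` and its fields are hypothetical; only this conversion layer
# touches the generated type, so wire-format changes stay contained here.
from dataclasses import dataclass

from myservice import user_pb2  # hypothetical generated module


@dataclass(frozen=True)
class User:
    id: str
    display_name: str

    @classmethod
    def from_proto(cls, msg: user_pb2.User) -> "User":
        # Enforce what the application considers required, at the boundary.
        if not msg.id:
            raise ValueError("User.id is required by the application")
        return cls(id=msg.id, display_name=msg.display_name)

    def to_proto(self) -> user_pb2.User:
        return user_pb2.User(id=self.id, display_name=self.display_name)
```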
Also, protobuf and gRPC are de facto standards. Are there better alternatives? Yes. Should you use those? Most likely not, because the point of serialization frameworks is to be used by many people in various tech stacks.
Recently, however, I had the displeasure of working with FlatBuffers. It's worse.
I filed an issue requesting this and it was denied with an explanation:
https://github.com/protocolbuffers/protobuf/issues/7791#issu...
The reason messages are initialized is that you can easily set a deep property path:
```
message SomeY { string example = 1; }
message SomeX { SomeY y = 1; }
```
later, in java:
```
SomeX.Builder some = SomeX.newBuilder();
some.getYBuilder().setExample("hello"); // getYBuilder() is never null, so no NPE
```
in kotlin this syntax makes even more sense:
```
someX {
    y = someY { example = "hello" } // no NPE; the nested message gets its own DSL builder
}
```
This is purportedly fixed in proto3 and latest SDK copies (IIRC)
I do tend to agree that they are bad. I also agree that people put a little too much credence in "came from Google." I can't bring myself to have this much anger towards it. Had to have been something that sparked this.
A few years ago I moved to a large company where protobufs were the standard way APIs were defined. When I first started working with the generated TypeScript code, I was confused as to why almost all fields on generated object types were marked as optional. I assumed it was due to the way people were choosing to define the API at first, but then I learned this was an intentional design choice on the part of protobufs.
We ended up having to write our own code to parse the responses from the "helpfully" generated TypeScript client. This meant we had to also handle rejecting nonsensical responses where an actually required field wasn't present, which is exactly the sort of thing I'd want generated clients to do. I would expect having to do some transformation myself, but not to that degree. The generated client was essentially useless to us, and the protocol's looseness offered no discernible benefit over any other API format I've used.
I imagine some of my other complaints could be solved with better codegen tools, but I think fundamentally the looseness of the type system is a fatal issue for me.
A couple of years ago Connect released a very good generator for TypeScript; we use it in production and it's great:
Philosophically, checking that a field is required or not is data validation and doesn't have anything to do with serialization. You can't specify that an integer falls into a certain valid range or that a string has a valid number of characters or is the correct format (e.g. if it's supposed to be an email or a phone number). The application code needs to do that kind of validation anyway. If something really is required then that should be the application's responsibility to deal with it appropriately if it's missing.
The Cap'n Proto docs also describe why being able to declare required fields is a bad idea: https://capnproto.org/faq.html#how-do-i-make-a-field-require...
But protocol buffers is not just a serialization format it is an interface definition language. And not being able to communicate that a field is required or not is very limiting. Sometimes things are required to process a message. If you need to add a new field but be able to process older versions of the message where the field wasn't required (or didn't exist) then you can just add it as optional.
I understand that in some situations you have very hard compatibility requirements and it makes sense to make everything optional and deal with it in application code, but adding a required attribute to fields doesn't stop you from doing that. You can still just make everything optional. You can even add a CI lint that prevents people from merging code with required fields. But making required fields illegal at the interface definition level just strikes me as killing a fly with a bazooka.
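A hedged sketch of that kind of CI lint, assuming Python and a made-up `proto/` directory layout:
```
# CI lint sketch: fail the build if any .proto file declares a proto2
# `required` field. The proto/ path and the policy itself are assumptions.
import pathlib
import re
import sys

REQUIRED_FIELD = re.compile(r"^\s*required\s+\w", re.MULTILINE)

offenders = [
    path
    for path in pathlib.Path("proto").rglob("*.proto")
    if REQUIRED_FIELD.search(path.read_text())
]

if offenders:
    print("required fields are not allowed in:")
    for path in offenders:
        print(f"  {path}")
    sys.exit(1)
```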
My issue is that people seem to like to use protobuf to describe the shape of APIs rather than just something to handle serialization. I think it's very bad at describing API shapes.
It is amusing, in many ways. This is specifically part of what WSDL aspired to, but people were betrayed by the big companies not having a common ground for what shapes they would support in a description.
A parser doesn't inherently have to fail (compatibility mode), nor lose the new field (passthrough mode), nor allow divergence (strict mode). The fact that the capnproto/parser authors don't realize that the same single protocol can operate in three different scenarios (strictly speaking: at boundaries vs. in middleware) at the same time should not lead you to think there are problems with required fields in protocols. This is one of the most bizarre kinds of FUD in the industry.
Sure! You could certainly imagine extending Protobuf or Cap'n Proto with a way to specify validation that only happens when you explicitly request it. You'd then have separate functions to parse vs. to validate a message, and then you can perform strict validation at the endpoints but skip it in middleware.
This is a perfectly valid feature idea which many people have entertained and even implemented successfully. But I tend to think it's not worth trying to have this in the schema language, because in order to support every kind of validation you might want, you end up needing a complete programming language. Plus, different components might have different requirements and therefore need different validation (e.g. middleware vs. endpoints). In the end I think it is better to write any validation functions in your actual programming language. But I can certainly see where people might disagree.
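A minimal sketch of that parse/validate split, assuming Python protobuf; `somerequest_pb2.SomeRequest` and its `user_id`/`email` fields are hypothetical:
```
# Separate "parse" (wire decoding, run everywhere) from "validate" (endpoint
# rules). `somerequest_pb2.SomeRequest` and its fields are hypothetical.
from myservice import somerequest_pb2  # hypothetical generated module


def parse(raw: bytes) -> somerequest_pb2.SomeRequest:
    # Wire-level decoding only; missing fields simply come back as defaults.
    msg = somerequest_pb2.SomeRequest()
    msg.ParseFromString(raw)
    return msg


def validate(msg: somerequest_pb2.SomeRequest) -> None:
    # Endpoint-level rules the schema language doesn't try to express.
    if not msg.user_id:
        raise ValueError("user_id is required")
    if msg.email and "@" not in msg.email:
        raise ValueError("email looks malformed")


# Endpoints call parse() then validate(); middleware that just forwards the
# message calls parse() alone and skips the strict checks.
```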
A very common example I see is Vec3 (just x, y, z). In proto2 you should be checking for the presence of x,y,z every time you use them, and when you do that in math equations, the incessant existence checks completely obscure the math. Really, you want to validate the presence of these fields during the parse. But in practice, what I see is either just assuming the fields exist in code and crashing on null, or admitting that protos are too clunky to use, and immediately converting every proto into a mirror internal type. It really feels like there's a major design gap here.
Don't get me started on the moronic design of proto3, where every time you see Vec3(0,0,0) you get to wonder whether it's the right value or mistakenly unset.
That's why Protobuf and Cap'n Proto have default values. You should not bother checking for presence of fields that are always supposed to be there. If the sender forgot to set a field, then they get the default value. That's their problem.
> just assuming the fields exist in code and crashing on null
There shouldn't be any nulls you can crash on. If your protobuf implementation is returning null rather than a default value, it's a bad implementation, not just frustrating to use but arguably insecure. No implementation of mine ever worked that way, for sure.
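For what it's worth, a small sketch of what that looks like from Python, assuming a hypothetical proto3 `Vec3` message with float fields x, y, z:
```
# Assuming a hypothetical proto3 message Vec3 { float x = 1; float y = 2; float z = 3; }
# generated into vec3_pb2. Unset scalars read back as defaults, never null.
from geometry import vec3_pb2  # hypothetical generated module

v = vec3_pb2.Vec3()  # sender set nothing
length_sq = v.x * v.x + v.y * v.y + v.z * v.z  # 0.0, not None; no crash

# If "really zero" vs "never set" matters, declaring the field `optional` in
# proto3 restores explicit presence tracking, e.g. v.HasField("x").
```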
It's an incredibly frustrating "feature" to deal with, and causes lots of problems in proto3.
But if you don't check, it should return a default value rather than null. You don't want your server to crash on bad input.
What happens if you mark a field as required and then you need to delete it in the future? You can't because if someone stored that proto somewhere and is no longer seeing the field, you just broke their code.
But in some situations you can be pretty confident that a field will be required always. And if you turn out to be wrong then it's not a huge deal. You add the new field as optional first (with all upgraded clients setting the value) and then once that is rolled out you make it required.
And if a field is in fact semantically required (like the API cannot process a request without the data in a field) then making it optional at the interface level doesn't really solve anything. The message will get deserialized but if the field is not set it's just an immediate error which doesn't seem much worse to me than a deserialization error.
2. This is the problem: software (and protos) can live for a long time. They might be used by other clients elsewhere that you don't control. What you thought was required might not be anymore 10 years down the line. What you "think" is not a huge deal then becomes a huge deal and can cause downtime.
3. You're mixing up business logic and over-the-wire field requirements. If a message is required for an interface to function, you should be checking it anyway and returning the correct error. How does that change with proto supporting required?
It can be required in v2 but not in v1 which was my point. If the client is running v2 while the server is still on v1 temporarily, then there is no problem. The server just ignores the new field until it is upgraded.
> This is the problem: software (and protos) can live for a long time. They might be used by other clients elsewhere that you don't control. What you thought was required might not be anymore 10 years down the line. What you "think" is not a huge deal then becomes a huge deal and can cause downtime.
Part of this is just that trying to create a format that is suitable both as an rpc wire serialization format and ALSO a format suitable for long term storage leads to something that is not great for either use case. But even taking that into account, RDBMS have been dealing with this problem for decades and every RDBMS lets you define fields as non-nullable.
> If a message is required for an interface to function, you should be checking it anyway and returning the correct error. How does that change with proto supporting required?
That's my point: you have to do that check in code, which clutters the implementation with validation noise. That, and you often can't use the wire message in your internal domain model, since you now have to do that defensive null-check everywhere the object is used.
Aside from that, protocol buffers are an interface definition language so should be able to encode some of the validation logic at least (make invalid states unrepresentable and all that). If you are just looking at the proto IDL you have no way of knowing whether a field is really required or not because there is no way to specify that.
It isn't that you can't do it. But the code side of the equation is the cheap side.
Too often I find something mildly interesting, but then realize that in order for me to try to use it I need to set up a personal mirror of half of Google's tech stack to even get it to start.
https://protobuf.dev/design-decisions/nullable-getters-sette...