Posted by gingerBill 6 days ago
Regardless of how they define these terms, producing a list of hashes which function as a commitment to specific versions of dependencies is a technique essential to modern software development. Whatever the tools are called, and whatever they do, they need to spit out a list of hashes that can be checked into version control.
You could just use git submodules, but in practice there are better user experiences provided by language package managers (`go mod` works great).
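To make "a list of hashes" concrete, here is roughly the shape of what `go mod` writes into `go.sum` (the module path and hash values below are made-up placeholders):

```
example.com/somelib v1.4.2 h1:PLACEHOLDERmoduleContentHash=
example.com/somelib v1.4.2/go.mod h1:PLACEHOLDERgoModFileHash=
```

`go mod verify` re-checks downloaded modules against those hashes, and `go mod vendor` copies the pinned sources into `./vendor` if you also want the vendored tree checked in.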
A good amount of this ranting can probably be attributed to projects and communities that aren't even playing the list of hashes game. They are resolving or upgrading dependencies in CI or at runtime or something crazy like that.
Also, use git subtrees, not git submodules. What most people think submodules are is actually what subtrees give you, and most people don't even know subtrees exist.
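For anyone who hasn't used subtrees, a minimal sketch (the URL, prefix, and tags are placeholders):

```
# pull a dependency's source directly into your own tree, squashing its history
git subtree add  --prefix=vendor/somelib https://example.com/somelib.git v1.4.2 --squash

# later, move to a newer pinned version
git subtree pull --prefix=vendor/somelib https://example.com/somelib.git v1.5.0 --squash
```

The vendored code is just files in your repository, so clones and checkouts work without the extra init/update dance that submodules require.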
As for "good" package managers, they are still bad because of what I said in the article.
That said, I think the final takeaway is that systems that let you pin versions, vendor all of those dependencies, and resolve/reproduce the same file tree regardless of whose machine it's on (let's assume matching architectures for simplicity here) are the goal.
Note that if you remove 'manually' here, this still works:
> Copying and vendoring each package {manually}, and fixing the specific versions down is the most practical approach to keeping a code-base stable, reliable, and maintainable.
The article's emphasis on the manual aspect of dependency management is a bit of a loss, as I don't particularly believe it _has to be manual_ in the sense of manually copying files from their origin into your file tree; that certainly is a real-world option, but few (myself included) would take that monk-like path again. I left this exact situation in C land and would not consider going back unless I adopted something like ninja.
What the OP is actually describing is a "good" package manager feature set, and many package managers (sadly not most, let alone all) support exactly that feature set today.
PS: I did chuckle when they defined evil in terms of something that gets you to dependency hell faster. However, we shouldn't be advocating for repeating the sins of our fathers.
And honestly speaking: It is plain stupid.
We can all agree that abusing package management with ~10,000 micro-packages everywhere, like npm/python/ruby do, is completely unproductive and brings its own considerable maintenance burden and complexity.
But ignoring the dependency resolution problem entirely by saying "You do not need dependencies" is even dumber.
Not every person is working in an environment where shipping a giant blob executable built out of vendored static dependencies is even possible. This is a privilege the gamedev industry has, and the author forgets a bit too easily that it is domain-specific.
Some of us work in environments where the final product is an agglomerate of >100 components developed by >20 teams around the world, versioned across ~50 git repositories, and often mixed with proprietary libraries from third-party providers. Gluing, assembling, and testing all of that is far beyond the "LOL, just stick to SDL" mindset proposed here.
Some of us are developing libraries/frameworks that end up embedded in >50 products alongside other libraries, across a hellish number of combinations of compilers / ABIs / platforms. That is not something you want to test or support without automation.
Some of us have to maintain cathedrals built over decades of domain-specific know-how (scientific simulators, solvers, petroleum-prospecting tools, financial frameworks, ...) in multiple languages (Fortran, C, C++, Python, Lua, ...) that cannot just be rewritten in a few weeks because "I tell you: dependencies suck, bro".
Managing all of that manually is just insane, and it generally ends with a home-made, half-baked pile of scripts that badly mimics the behavior of a proper package manager.
So no, there is no replacement for a proper package manager: instead of hating the tool, just learn to use it.
Package managers are tools, and like every tool, they should be used wisely and not as Maslow's hammer.
> So let's handle the hell manually to feel the pain better
This is far from my position. Literally the entire point is to make it clearer you are heading to dependency hell, rather than feel the pain better whilst you are there.
I am not against dependencies but you should know the costs of them and the alternatives. Package managers hide the complexity, costs, trade-offs, and alternative approaches, thus making it easier to slip into dependency hell.
You are against the usage of a tool and you propose no alternative.
Handling dependencies by vendoring them manually, as you propose in your blog, is not an alternative.
It is an oversimplification of the problem (and the problem is complex) that can only be applied to your specific usage and domain.
Again, what is wrong with saying you should know the costs of the dependencies you include AND the alternatives to using them? For example: using the standard library, writing it yourself, reusing another dependency you already have that might fit, etc.
> Some of us work in environments where the final product is an agglomerate of >100 components developed by >20 teams around the world, versioned across ~50 git repositories, and often mixed with proprietary libraries from third-party providers. Gluing, assembling, and testing all of that is far beyond the "LOL, just stick to SDL" mindset proposed here.
Does this somehow prevent you from vendoring everything?
Yes. Because in these environments, sooner or later you will be shipping libraries, not executables.
Shipping libraries means your software will need to be integrated into other stacks where you control neither the full dependency tree nor the versions in it.
Vendoring dependencies in that situation is a guarantee that you will make your customer's life miserable by throwing the diamond-dependency problem right in their face.
In the game development sphere, there's plenty of giant middleware packages for audio playback, physics engines, renderers, and other problems that are 1000x more complex and more useful than any given npm package, and yet I somehow don't have to "manage a dependency tree" and "resolve peer dependency conflicts" when using them.
And you just don't know what you are talking about.
Say I am providing a library with some high-level features for a car ADAS system, sitting on top of a CAN network, with a proprietary library as the driver and interface.
It is not up to me to fix or choose the library and driver version the customer will use. They will choose the certified version they ship, test my software against it, and integrate it.
Vendoring dependencies for anything that is not a final product (product meaning an executable) is plain stupid.
It is a guarantee of pain and ABI madness for anybody having to deal with the integration of your blob later on.
If you want to vendor, do vendor, but stick to executables with well-defined IPC systems.
If you're writing an ADAS system, and you have a "dependency tree" that needs to be "resolved" by a package manager, you should be fired immediately.
Any software that has lives riding on it, if it has dependencies, must be certified against specific versions of them, and those versions must, 100% of the time and without exception, be vendored with the software.
> It is a guarantee of pain and ABI madness for anybody having to deal with the integration of your blob later on.
The exact opposite. Vendoring is the ONLY way to prevent the ABI madness of "v1.3.1 of libfoo exports libfoo_a but not libfoo_b, and v1.3.2 exports libfoo_b but not libfoo_c, and in 1.3.2 libfoo_b takes in a pointer to a struct that has a different layout."
If you MUST have libfoo (which you don't), you link your version of libfoo into your blob and you never expose any libfoo symbols in your library's blob.
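In C terms, a minimal sketch of what "never expose any libfoo symbols" can look like with GCC/Clang on Linux (the file and function names here are made up):

```
/* mylib.c: our library's public API. libfoo is a vendored, pinned copy
 * that is linked in statically and never appears in our exported symbols. */
#include "vendored/libfoo/libfoo.h"

/* With -fvisibility=hidden, only symbols explicitly marked "default"
 * are exported from the shared object. */
__attribute__((visibility("default")))
int mylib_do_thing(void)
{
    return libfoo_a();   /* stays an internal detail of libmylib.so */
}
```

Building everything with `-fvisibility=hidden` keeps the vendored libfoo symbols out of the exported ABI (and `-Wl,--exclude-libs,ALL` does the same at link time if libfoo comes in as a prebuilt static archive), so a second, different copy of libfoo elsewhere in the process cannot collide with yours.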
The vendoring step happens at something like Yocto or equivalent and that's what ends up being certified, not random library repos.
And in addition: Yocto (or the equivalent) will also be the thing that provides the traceability required to guarantee that what you ship is actually what you certified, and not some random garbage compiled in a user directory on someone's laptop.
It used to have a really bad design flaw. Example:

- building package X explicitly depends on A being in the sysroot
- building package Y explicitly depends on B being in the sysroot, but it will implicitly use A if A is present (thanks, autoconf!)

In such a situation, building X before Y results in Y effectively using both A and B, perhaps enabling unintended features. Building Y and then X would produce a different Y.
Coupled with the parallel build environment, it's a recipe for highly non-deterministic binaries, without even getting into reproducibility.
It's better than before but you still need to sandbox manually if you want good reproducibility.
Honestly, for reproducibility alone, there are better options than Yocto nowadays. It is hard to beat Nix at this game, and even Bazel-based build flows are somewhat better.
But in the embedded world, Yocto is pretty widespread and almost the de facto norm for embedded Linux.
When you want reproducibility, you need to specify what you want, not let the computer guess. Why can't you use `Y/configure --without-A`? In the extreme case you can also version `config.status`.
Things using autotools evolved to be friendly to manual users, in the sense that application features are automatically enabled based on auto-detected libraries.
But for automated builds, all those smarts get in the way when the build environment is subject to variation.
In theory, the Yocto recipe will fully specify the application configuration regardless of how the environment varies…
Of course, in theory the most Byzantine build process will always function correctly too!
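For reference, "fully specifying" that in a Yocto recipe usually means naming the dependency and the configure switch explicitly instead of letting autoconf sniff the sysroot; a rough sketch, with recipe and option names entirely hypothetical:

```
# hypothetical fragment of the recipe for package Y
DEPENDS = "b"

# tie the optional A support to an explicit knob:
#   PACKAGECONFIG[<feature>] = "<enable-flag>,<disable-flag>,<build-deps>"
PACKAGECONFIG ??= ""
PACKAGECONFIG[a] = "--with-a,--without-a,a"

# or simply force it off for every build of this recipe:
# EXTRA_OECONF += "--without-a"
```

With the choice spelled out like this, whether A happens to be sitting in the sysroot no longer silently changes what Y's configure step enables.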
You're providing a library. That library has dependencies (although it shouldn't). You've written that library to work against a specific version of those dependencies. Vendoring these dependencies means shipping them with your library, and not relying on your user, or even worse, their package manager to provide said dependencies.
I don't know what industry you work in, who the regulatory body that certifies your code is, or what their procedures are, but if they're not certifying the "random library repos" that are part of your code, I pray I never have to interact with your code.
I've dabbled in enough of them to tame my hubris a bit and to learn that various fields have specific needs that end up reflected in their processes (and this includes gamedev as well). Highly recommended before commenting any further.
You illustrate perfectly the attitude problem of the average "gamedev" here.
You do not know shit about the realities and the development practices of an entire domain (here, the safety-critical domain).
But you still brag confidently about how 'my dev practices are better' and assert without any shame that everybody in this field who disagrees is an idiot.
Just to let you know: in the safety-critical field, the responsibility for the final certification lies with the integrator. That is why we do not want intermediate dependencies randomly vendoring and bundling crap we have no control over.
Additionally, it is common for the entire dependency tree (including proprietary third-party components like AUTOSAR) to be shipped source-available and compiled / assembled from source during integration.
That's why the use of a package manager like Yocto (or equivalent) is widespread in the domain: it lets you precisely track and version what is used and how, for analysis and for traceability back to the requirements.
Additionally, when binary dependencies are the only option available (as with QNX Neutrino and its associated compilers), any serious certification body (like TÜV) will mandate that you have the exact checksum of every certified binary used in your application, and a process to trace them back to the certification documents.
That is not something you do by dumping random fu**ng blobs in a git repository like you are proposing. You do it, again, with a proper set of processes and, generally, a package manager like Yocto or similar.
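At its simplest, the artifact being described is just a pinned checksum manifest that the build verifies; in practice it lives inside the Yocto/build tooling and the release process rather than being run by hand, and the paths below are purely illustrative:

```
# record the checksums of the certified binary deliverables...
sha256sum third_party/qnx/lib/*.so > certified-binaries.sha256

# ...and refuse to build or ship if anything no longer matches
sha256sum --check certified-binaries.sha256
```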
Finally, your comment on "v1.3.1 of libfoo" is completely moronic. You seem to have no idea of the consequences of duplicated symbols across multiple static libraries with vendored dependencies you do not control, nor of what that can mean for functional safety.
Would you also try to build all of them on every CI run?
What about the non-source dependencies, check the binaries into git?
These days, as I evaluate fellow programmers, I'm looking for whether they've discovered that "one more dependency" is like signing up for "one more subscription you have to remember to pay for", and what they do to try to mitigate it.
But as I got bit by the various issues with dependencies multiple times over the years, I have ended up preferring as few as possible and ideally zero beyond the standard library for hobby projects if I can get away with it.
One of the worst things about working at companies shipping C++ was the myriad of meta-build systems that all try to do dependency management as part of the build system, without having a separate concept of what a "package manager" is. This is truly the worst of both worlds, where people are happy to add dependencies, never update them, and never share code between projects and departments. I do not wish that way of working on my worst enemies.
Whatever problems package management brings, they are a much better class of problem to have than not having a package manager at all. That said, I think everyone can get better at being more discriminating about what they add to their project.
If you only use a package manager for libraries that you have high trust in, then you don't need to worry - but there are so few projects you can have high trust in that manual management isn't a big deal. Meanwhile, there are many, many potentially useful packages that can save you a lot of effort if you use them - but you need to manually audit each one, because if you don't, nobody will, and that will bite you.
Yes, shared code has costs:
- more general than you likely need, affecting complexity, compile times, etc.
- comes with risks for today (code) and the future (governance)
But the benefits are big. My theory for one of the reasons Rust has so many good CLIs is Cargo, because it keeps the friction low for pulling in high-quality building blocks so you can better focus on your actual problem.
Instead of resisting dependencies, I think it would be better to spend time finding ways to mitigate the costs, e.g.
- I'd love for crates.io to integrate diff.rs, provenance reporting (https://lawngno.me/blog/2024/06/10/divine-provenance.html), etc
- More direct support for security checking in cargo
- Integrating cargo-vet and/or cargo-crev into cargo
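Today those last two exist as bolt-on subcommands rather than built-ins; a sketch of what using them looks like:

```
cargo install cargo-audit cargo-vet

cargo audit       # check Cargo.lock against the RustSec advisory database
cargo vet init    # set up an audit policy for this project
cargo vet         # fail when a dependency has no recorded audit or exemption
```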
What a great quote.
Isn't this backwards? In real life, if you have a dependent, you are responsible for it. On the other hand, if you have a dependency on something, you rely on that thing; in other words, it should be responsible for you. A package that is widely used in security-critical applications ought to be accountable if its failure causes harm in downstream applications. But because that is in general impossible, and most library authors would never take on the risk of making such guarantees, the risk of each dependency is taken on by the person who decides it is safe to use it, and I agree package managers sometimes make that too easy.