Then I found an HN comment I wrote a few years ago that confirmed this:
“[...] I remember that day pretty clearly because in the same lightning talk session, Solomon Hykes introduced the Python community to docker, while still working on dotCloud. This is what I think might have been the earliest public and recorded tech talk on the subject:”
YouTube link: https://youtu.be/1vui-LupKJI?t=1579
Note: starts at t=1579, which is 26:19.
Just being pedantic though. That’s about 13 years ago. The lightning talk is fun as a bit of computing history.
(Edit: as I was digging through the paper, they do cite this YouTube presentation, or a copy of it anyway, in the footnotes. And they refer to a 2013 release. Perhaps there was a multi-year delay between the paper being submitted to ACM with this title and it being published. Again, just being pedantic!)
The flip side is that the world still hasn’t settle on a language-neutral build tool that works for all languages. Therefore we resort to running arbitrary commands to invoke language-specific package managers. In an alternate timeline where everyone uses Nix or Bazel or some such, docker build would be laughed out of the window.
> running arbitrary commands to invoke language-specific package managers.
This is exactly what we do in Nix. You see this everywhere in nixpkgs.
What sets apart Nix from docker is not that it works well at a finer granularity, i.e. source-file-level, but that it has real hermeticity and thus reliable caching. That is, we also run arbitrary commands, but they don't get to talk to the internet and thus don't get to e.g. `apt update`.
In a Dockerfile, you can `apt update` all you want, and this makes the build layer cache a very leaky abstraction. This is merely an annoyance when working on an individual container build but would be a complete dealbreaker at linux-distro-scale, which is what Nix operates at.
That's not going to work if both parties get different hashes when they build the image, which won't happen as long as file modification timestamps (and other such hazards) are part of what gets hashed.
Personally I love using mkosi and while it has all the composability and deployment options I'd care for, its clear not everyone wants to build starting only with a blank set of OS templates.
In Spack [1] we do one layer per package; it's appealing, but I never checked if besides the layer limit it's actually bad for performance when doing filesystem operations.
Want to throw a requirements.txt in there? No no, why would you even ask that? Meanwhile docker says yeah sure just run pip install, why should I care?
If you care about getting it to work with minimal effort right now more thar about it being sustainable later, then sure.
Like all LLM boosters, you've ignored the fact that the largest time sink in many kinds of software is not initial development, but perpetual maintenance.
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
Its a Buildkit frontend, so you still use "docker build".
Until you need to do something that isn't covered with its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially declarative ones, tend to come up short when _layering_ images quickly and easily. However, I agree they're good if you control the entire declarative spec.
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
What more do you want?
They sounded nice on paper but the work they replaced was somehow more annoying.
I moved over to Docker when it came out because it used shell.
I'd get much better results it I used something else to do the foreach and gave terraform only static rules.
If your dockerfile says “ensure package X is installed at version Y” that’s a lot clearer (and also more easy to make performant/cached and deterministic) than “apt-get update; apt-get install $transitive-at-specific-version; apt-get install $the-thing-you-need-atspecific-version”. I’m not thrilled at how distro-locked the shell version makes you, and how easy it is for accidental transitive changes to occur too.
But neither of those approaches is at a particularly low abstraction level relative to the OS itself; files and system calls are more or less hidden away in both package-manager-via-bash and puppet/terraform/whatever.
But as long as people want to use scripting languages (like php, python etc) i guess docker is the neccessary evil.
I'll tell that to my CI runner, how easy is it for Go to download the Android SDK and to run Gradle? Can I also `go sonarqube` and `go run-my-pullrequest-verifications` ? Or are you also going to tell me that I can replace that with a shitty set of github actions ?
I'll also tell Microsoft they should update the C# definition to mark it down as a scripting language. And to actually give up on the whole language, why would they do anything when they could tell every developer to write if err != nil instead
Just because you have an extremely narrow view of the field doesn't mean it's the only thing that matters.
E.g. systemd exposes a lot of resource control as well as sandboxing options, to the point that I would argue that systemd services can be very similar to "traditional" runtime containers, without any image involved.
Interesting. How does go build my python app?
Have others found this to be the case? Perhaps we're doing something wrong.
Genuinely fascinating and clever solution!
[1] https://github.com/rootless-containers/slirp4netns
[2] https://blog.podman.io/2024/03/podman-5-0-breaking-changes-i...
[3] https://passt.top/passt/about/#pasta-pack-a-subtle-tap-abstr...
There was another component that we didn't have room to cover in the article that has been very stable (for filesystem sharing between the container and the host) that has been endlessly criticised for being slow, but has never corrupted anyone's data! It's interesting that many users preferred potential-dataloss-but-speed using asynchronous IO, but only on desktop environments. I think Docker did the right thing by erring on the side of safety by default.
Sir, this is a hacker news.
SLIRP was useful when you had a dial up shell, and they wouldn't give you slip or ppp; or it would cost extra. SLIRP is just a userspace program that uses the socket apis, so as long as you could run your own programs and make connections to arbitrary destinations, you could make a dial script to connect your computer up like you had a real ppp account. No incomming connections though (afaik), so you weren't really a peer on the internet, a foreshadowing of ubiquitous NAT/CGNAT perhaps.
That's a mistake indeed; "popularised by" might have been better. Before my beloved Palmpilot arrived one Christmas, I was only using SLIRP to ninja in Netscape and MUD sessions onto a dialup connection which wasn't a very mainstream use.
What I want to do when running a Docker container on Mac is to be able to have the container have an IP address separate from the Mac's IP address that applications on the Mac see. No port mapping: if the container has a web server on port 80 I want to access it at container_ip:80, not 127.0.0.1:2000 or something that gets mapped to container port 80.
On Linux I'd just used Docker bridged networking and I believe that would work, but on Mac that just bridges to the Linux VM running under the hypervisor rather than to the Mac.
Is there some officially recommended and supported way to do this?
For a while I did it by running WireGuard on the Linux VM to tunnel between that and the Mac, with forwarding enabled on the Linux VM [1]. That worked great for quite a while, but then stopped and I could not figure out why. Then it worked again. Then it stopped.
I then switched to this [2] which also uses WireGuard but in a much more automated fashion. It worked for quite a while, but also then had some problems with Docker updates sometimes breaking it.
It would be great if Docker on Mac came with something like this built in.
BTW are you trying to avoid port mapping because ports are dynamic and not known in advance? If so you could try running the container with --net=host and in Docker Desktop Settings navigate to Resources / Network and Enable Host Networking. This will automatically set up tunnels when applications listen on a port in the container.
Thanks for the links, I'll dig into those!
Well, before Docker I used to work on Xen and that possible future of massive block devices assembled using Vagrant and Packer has thankfully been avoided...
One thing that's hard to capture in the article -- but that permeated the early Dockercons -- is the (positive) disruption Docker had in how IT shops were run. Before that going to production was a giant effort, and 'shipping your filesystem' quickly was such a change in how people approached their work. We had so many people come up to us grateful that they could suddenly build services more quickly and get them into the hands of users without having to seek permission slips signed in triplicate.
We're seeing the another seismic cultural shift now with coding agents, but I think Docker had a similar impact back then, and it was a really fun community spirit. Less so today with the giant hyperscalars all dominating, sadly, but I'll keep my fond memories :-)
Funny comment considering lightweight/micro-VMs built with tools like Packer are what some in the industry are moving towards.
Some of those talks strangely make more sense today (e.g. Rump Kernels or unikernels + coding agents seems like a really good combination, as the agent could search all the way through the kernel layers as well).
I sort of had the problem in mind. Docker is the answer. Not clever enough to have inventer it.
If I did I would probably have invented octopus deploy as I was a Microsoft/.NET guy.
"Ship your machine to production" isn't so bad when you have a ten-line script to recreate the machine at the push of a button.
Wonder when some enterprising OSS dev will rebrand dynamic linking in the future...
I don't care about glibc or compatibility with /etc/nsswitch.conf.
look at the hack rust does because it uses libc:
> pub unsafe fn set_var<K: AsRef<OsStr>, V: AsRef<OsStr>>(key: K, value: V)
So what do you do when you need to resolve system users? I sure hope you don't parse /etc/passwd, since plenty of users (me included) use other user databases (e.g. sssd or systemd-userdbd).
I think it’s laziness, not difficulty. That’s not meant to be snide or glib: I think gaining expertise in how to package and deploy non-containerized applications isn’t difficult or unattainable for most engineers; rather, it’s tedious and specialized work to gain that expertise, and Docker allowed much of the field to skip doing it.
That’s not good or bad per se, but I do think it’s different from “pre-container deployment was hard”. Pre-container deployment was neglected and not widely recognized as a specialty that needed to be cultivated, so most shops sucked at it. That’s not the same as “hard”.
Minus the kernel of course. What is one to do for workloads requiring special kernel features or modules?
Good luck convincing people to switch!
That means unlike Gentoo, I've never dealt with a "slot conflict" where two packages want conflicting dependencies. And unlike Ubuntu, I have new versions of everything.
Pick 2: share dependencies, be on the bleeding edge, or waste your time resolving conflicts.
Using it, solving problems with it, and building a real community around it tend to make a much greater impact in the long run.
If you have adopted a bad tool then people are likely to want the bad tool in more places. This is the opposite of a virtuous cycle and is a horrible form of tech debt.
"Docker, Guix and NixOS (stable) all had their first releases
during 2013, making that a bumper year for packaging aficionados."
Now we get coding agent updates every week, but has there been a similar year since 2013 where multiple great projects all came out at the same time?When compared to a VM, yes. But shipping a separate userspace for each small app is still bloat. You can reuse software packages and runtime environments across apps. From an I/O, storage, and memory utilization point of view, it feels baffling to me that containers are so popular.
Why? It's not virtualization, it's containerization. It's using the host kennel.
Containers are fast.
You can hardly call this efficient hardware utilization.
For running your own machine, sure. But this would become non maintainable for a sufficiently multi tenant system. Nix is the only thing that really can begin to solve this outside of container orchestration.
I've recently switched from docker compose to process compose and it's super nice not to have to map ports or mount volumes. What I actually needed from docker had to do less with containers and more with images, and nix solves that problem better without getting in the way at runtime.
What process-compose gives me is a single parent with all of that project's processes as children, and a nice TUI/CLI for scrolling through them to see who is happy/unhappy and interrogating their logs, and when I shut it down all of that project's dependencies shut down. Pretty much the same flow as docker-compose.
It's all self-contained so I can run it on MacOS and it'll behave just the same as on Linux (I don't think systemd does this, could be wrong), and without requiring me to solve the docker/podman/rancher/orbstack problem (these are dependencies that are hard to bundle in nix, so while everything else comes for free, they come at the cost of complicating my readme with a bunch of requests that the user set things up beforehand).
As a bonus, since it's a single parent process, if I decide to invoke it through libfaketime, the time inherited by subprocess so it's consistently faked in the database and the services and in observability tools...
My feeling for systemd is that it's more for system-level stuff and less for project-level dependencies. Like, if I have separate projects which need different versions of postgres, systemd commands aren't going to give me a natural way to keep track of which project's postgres I'm talking about. process-compose, however, will show me logs for the correct postgres (or whatever service) in these cases:
~/src/projA$ process-compose process logs postgres
~/src/projB$ process-compose process logs postgres
This is especially helpful because AI agents tend to be scoped to working directory. So if I have one instance of claude code on each monitor and in each directory, which ever one tries to look at postgres logs will end up looking at the correct postgres's logs without having to even know that there are separate ones running.Basically, I'm alergic to configuring my system at all. All dependencies besides nix, my text editor, and my shell are project level dependencies. This makes it easy to hop between machines and not really care about how they're set up. Even on production systems, I'd rather just clone the repo `nix run` in that dir (it then launches process compose which makes everything just like it was in my dev environment). I am however not in charge of any production systems, so perhaps I'm a bit out of touch there.
Why do you think other tools will make a comeback?