How do you deploy in 10 seconds?

Posted by mpweiher 10/28/2024

How do you deploy in 10 seconds? (paravoce.bearblog.dev)

62 points | 54 comments | page 2

hilti 10/29/2024

I’m using rsync too. Works great. Unfortunately, my managed server at Hetzner does not allow running Go apps as services. That’s the last step for me to figure out.

samuli 10/29/2024

Here's how Twitter did it 15 years ago: https://blog.x.com/engineering/en_us/a/2010/murder-fast-data...

n_ary 10/29/2024

While the author only briefly mentions Ansible as the topic of the next post, it was a dramatic improvement for fairly medium-scale deployment and maintenance. The playbook dry run was a godsend that cured a lot of headaches.

from-nibly 10/29/2024

DevOps is about putting friction in the right places, not removing it entirely.

indulona 10/29/2024

I too prefer simplicity and getting things done over wasted time and money disguised as a service by some company that tries to get in between me and what I want to do.

whatever1 10/30/2024

I splurge on a fancier server and just git pull and build on the server. Less than 10 seconds for sure.

BobbyTables2 10/30/2024

Easy - just don’t run any tests!

krick 10/30/2024

I mean, isn't this how everybody used to do it? Some trigger to git pull to one node and rsync to the others. Plus some reverse-proxy configuration to make it smooth. Then came CI/CD, Docker, k8s, ArgoCD. Honestly, I'm still not convinced that the benefits outweigh the costs, but the choice seems to have been made consciously. So the "secret sauce" is a bit banal here.
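
For anyone who hasn't seen that older pattern, here is a minimal Python sketch of it, assuming the script runs on the first node and that node can reach the others over SSH. The function name, the `/srv/app` path, and the host names are hypothetical. It only builds the commands; running them is left to the caller.

```python
def build_deploy_commands(app_dir, other_hosts):
    """Return the commands for a pull-then-fan-out deploy, in order.

    Assumes this runs on the first node, which already has a git checkout
    at app_dir. Paths and host names are illustrative only.
    """
    app_dir = app_dir.rstrip("/")
    # Step 1: update the local checkout, refusing anything but a fast-forward.
    commands = [["git", "-C", app_dir, "pull", "--ff-only"]]
    for host in other_hosts:
        # Step 2: mirror the checkout to each remaining node.
        # The trailing slashes make rsync copy the directory contents;
        # --delete removes files on the target that no longer exist locally.
        commands.append(
            ["rsync", "-a", "--delete", f"{app_dir}/", f"{host}:{app_dir}/"]
        )
    return commands


for command in build_deploy_commands("/srv/app", ["web2", "web3"]):
    print(" ".join(command))
```

The reverse-proxy step is left out on purpose, since how to take a node out of rotation depends entirely on the proxy in use.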

vrosas 10/30/2024

Meh. The original places that created all this tech had great reasons to do so. Some places today still do, but most that implement these practices or pieces of infra aren't doing it to solve a specific problem; they're just doing it because that's what everyone else does. Not OP, but I worked at a billion-dollar company whose whole deployment process was a very simple Makefile.

someothherguyy 10/30/2024

if you don't drain connections, you are slapping users in the face. if your deploy commands fail, you are shipping downtime.
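
On the failing-commands half of that, a rough Python sketch of a fail-fast deploy runner: each step runs in order, and the first non-zero exit stops everything so a later step (say, a restart) never runs against a half-finished upload. `DeployStepFailed`, `run_steps`, and the step names are made up for illustration.

```python
import subprocess
import sys


class DeployStepFailed(Exception):
    """Raised when a deploy step exits non-zero (hypothetical helper type)."""

    def __init__(self, step, returncode, completed):
        super().__init__(f"step {step!r} failed with exit code {returncode}")
        self.step = step
        self.returncode = returncode
        self.completed = completed


def run_steps(steps):
    """Run (name, argv) pairs in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            # Do not continue: later steps assume this one succeeded.
            raise DeployStepFailed(name, result.returncode, list(completed))
        completed.append(name)
    return completed


# Stand-in commands so the sketch runs anywhere Python does.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
print(run_steps([("build", ok), ("upload", ok)]))
```

This is the Python equivalent of `set -e` in a shell script; it does nothing about draining connections, which still has to happen in the service or the proxy.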

deathanatos 10/29/2024

> Gradual rollouts are slow too, taking 5-10 minutes as k8s rolls out updates to pods across the fleet.

If you're hitting this, you need to look at the service as the problem, rather than blaming the infra layer.

k8s can absolutely roll out a deployment in <60s, if not <10s. The bottleneck I see, when it is slow, is slow app termination. If your service takes 5 minutes to terminate, it isn't going to matter what the infrastructure layer does. Sometimes this is a failure to handle SIGTERM (resulting in k8s having to fall back on timing out & SIGKILL'ing) … but sometimes the app is just slow to terminate, 'cause bugs. But it's those bugs that should get fixed.
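
To make "handle SIGTERM" concrete, here is a minimal Python sketch of the app side, not taken from the article: the handler only sets a flag, and the serving loop finishes the request in hand before exiting. `ShutdownFlag`, `serve`, and the callback parameters are hypothetical names.

```python
import signal
import threading


class ShutdownFlag:
    """Records that a stop was requested (hypothetical helper)."""

    def __init__(self):
        self._event = threading.Event()

    def handle(self, signum, frame):
        # Signal handlers should do as little as possible: just set the flag.
        self._event.set()

    def install(self):
        # Must be called from the main thread.
        signal.signal(signal.SIGTERM, self.handle)

    @property
    def requested(self):
        return self._event.is_set()


def serve(next_request, handle_request, shutdown):
    """Handle requests until a stop is requested or the source is exhausted.

    The flag is checked only between requests, so a request that has
    started is always allowed to finish. Returns the number handled.
    """
    handled = 0
    while not shutdown.requested:
        request = next_request()
        if request is None:
            break
        handle_request(request)
        handled += 1
    return handled
```

A real service would call `install()` once at startup. If finishing the in-flight work takes longer than the platform's grace period, the process still gets killed, which is the slow-termination case described above.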

You can somewhat work around it by setting the surge to 100%. (And … even if you have a fast app, 100% surge might be a good idea, too. As always, it depends. If surging is going to eat up all the available RAM or CPU … maybe not.)
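
A back-of-the-envelope Python sketch of that trade-off. The rounding (surge rounds up, unavailable rounds down) and the 25%/25% defaults match the Kubernetes Deployment docs; the "waves" figure is a deliberately crude assumption of mine, since the real controller replaces pods continuously rather than in lockstep batches.

```python
def rollout_budget(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Convert percentage settings to pod counts.

    maxSurge rounds up and maxUnavailable rounds down.
    """
    surge = -(-replicas * max_surge_pct // 100)          # ceiling division
    unavailable = replicas * max_unavailable_pct // 100  # floor division
    return surge, unavailable


def estimate_rollout(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Rough estimate of rollout waves and peak pod count.

    Simplifying assumption: pods are replaced in discrete waves of
    (surge + unavailable) pods each.
    """
    surge, unavailable = rollout_budget(replicas, max_surge_pct, max_unavailable_pct)
    per_wave = surge + unavailable
    if per_wave == 0:
        raise ValueError("surge and unavailable cannot both be zero")
    waves = -(-replicas // per_wave)
    # Peak pods is what matters for the RAM/CPU concern.
    return {"waves": waves, "peak_pods": replicas + surge}


print(estimate_rollout(10))                     # defaults: 25% / 25%
print(estimate_rollout(10, max_surge_pct=100))  # 100% surge
```

Under these assumptions, 10 replicas go from 2 waves at a peak of 13 pods to 1 wave at a peak of 20 pods, so the speed-up is paid for in temporary capacity.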

And most importantly: the underlying principles guiding k8s's behavior apply equally to a shell script. app.service doesn't respond to SIGTERM[1]? You're going to have to decide what to do. Surge or not? Same thing. Potential for surge to result in resource depletion…?
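
As an illustration of that decision, a small Python sketch of the supervisor side on POSIX: send SIGTERM, wait for a grace period, then fall back to SIGKILL. The function name is hypothetical, and the 30-second default simply mirrors the Kubernetes default termination grace period.

```python
import subprocess


def stop_with_grace(proc, grace_seconds=30.0):
    """Ask a child process to stop; force-kill it if it does not comply.

    proc is a subprocess.Popen. Returns "terminated" if the process exited
    within the grace period, or "killed" if it had to be forced.
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The process ignored or was too slow to act on SIGTERM.
        proc.kill()  # SIGKILL on POSIX; in-flight work is lost
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

Whatever wraps this (k8s, systemd, or a deploy script) has to pick the grace period, and every "killed" result is a hint that the service has a shutdown bug worth fixing.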

> bash script

A service's program/code should generally be owned by root:. A service (generally) does not need the ability to re-write its own code.
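
A quick way to check this, sketched in Python under classic Unix permission rules only (it ignores ACLs, capabilities, and write access to the parent directory, any of which can also let a service replace its binary). The function names and the example UID are hypothetical.

```python
import os
import stat


def can_write(st_uid, st_gid, st_mode, uid, gids):
    """Classic Unix write check for a user against a file's owner and mode."""
    if uid == 0:
        return True  # root bypasses mode bits
    if st_uid == uid:
        return bool(st_mode & stat.S_IWUSR)
    if st_gid in gids:
        return bool(st_mode & stat.S_IWGRP)
    return bool(st_mode & stat.S_IWOTH)


def service_can_rewrite(path, service_uid, service_gids=()):
    """Return True if the service account could overwrite the file at path."""
    st = os.stat(path)
    return can_write(st.st_uid, st.st_gid, st.st_mode, service_uid, set(service_gids))


# A root-owned 0755 binary is not writable by an unprivileged service user.
print(can_write(0, 0, 0o755, uid=1001, gids={1001}))
```

If `service_can_rewrite` returns True for the deployed binary and the account the service runs as, a compromised service can persist by rewriting its own code.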

> Only a few at my company understand Docker's layer caching + mounts and Github Actions' yaml-based workflow configuration in any detail.

… the working knowledge of either of those two things is not rocket science. The docker caching is probably the worst of the two; but you only need to understand it to speed up builds.

While GHA's YAML isn't pretty … it's also hardly complex. And for the most part, if your action simply defers to a script in the repository (e.g., I keep these in ci/), then it's mostly reproducible locally, too. (And there are some tools out there to run a GHA workflow locally, too, if you need more completeness than "just run the same script as the workflow".)

> Show you how I provision a Debian server using Ansible

I have spent enough years with Ansible to know all the problems with it, and I'd rather not go back to it.

([1] Although a vanilla systemd service is going to have an "advantage" in that the default SIGTERM handling is different from a container's. So it might look faster, in that a buggy app with no SIGTERM handler will die instantly … but it's probably still a bug, as ax'ing the service is probably also just dropping requests on the floor.)
