
Posted by bumbledraven 8 hours ago

I am building a cloud (crawshaw.io)
541 points | 271 comments
dajonker 5 hours ago|
> Making Kubernetes good is inherently impossible, a project in putting (admittedly high quality) lipstick on a pig.

So well put, my good sir, this describes exactly my feelings with k8s. It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

After spending a lot of time "optimizing" or "hardening" the cluster, cloud spend has doubled or tripled. Incidents have also doubled or tripled, as has downtime. Debugging effort has doubled or tripled as well.

I ended up saying goodbye to those devops folks, nuking the cluster, booting up a single VM with Debian, enabling the firewall, and using Kamal to deploy the app with Docker. Despite having only a single VM rather than a cluster, things have never been more stable and reliable from an infrastructure point of view. Costs have plummeted as well; it's so much cheaper to run. It's also so much easier and more fun to debug.

And yes, a single VM really is fine, you can get REALLY big VMs which is fine for most business applications like we run. Most business applications only have hundreds to thousands of users. The cloud provider (Google in our case) manages hardware failures. In case we need to upgrade with downtime, we spin up a second VM next to it, provision it, and update the IP address in Cloudflare. Not even any need for a load balancer.
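
For context, a Kamal setup like the one described is driven by one small config file. A minimal sketch, assuming a hypothetical service name, image, server IP, and registry secret:

```yaml
# Hypothetical deploy.yml; all names and the IP are illustrative.
service: myapp
image: myorg/myapp
servers:
  - 203.0.113.10          # the single Debian VM
registry:
  username: myorg
  password:
    - KAMAL_REGISTRY_PASSWORD   # read from the environment
```

Roughly speaking, `kamal setup` prepares Docker on the host and `kamal deploy` ships a new image version.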

adamtulinius 5 hours ago||
If you spin up Kubernetes for "a couple of containers to run your web app", I think you're doing something wrong in the first place; the same goes for your comment about adding an SDN layer on top of Kubernetes.

People use Kubernetes for way too small things, and it sounds like you don't have the scale for actually running Kubernetes.

ownagefool 1 hour ago|||
It depends what you're doing with it.

My app is a fairly simple node process with some sidecar worker processes. k8s enables me to deploy it 30 times for 30 PRs, trivially, in a standard way, with standard cleanup.

Can I do that without k8s? Yes. To the same standard with the same amount of effort? Probably not. Here, I'd argue the k8s APIs and interfaces are better than trying to do this on AWS ( or your preferred cloud provider ).
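
That workflow can be sketched with stock Kubernetes objects; here's a minimal, hypothetical manifest (namespace name and image tag invented for illustration) where each PR gets its own namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: pr-123
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: pr-123
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pr-123
  template:
    metadata:
      labels:
        app: pr-123
    spec:
      containers:
        - name: app
          image: myorg/app:pr-123   # built by CI for this PR
```

Teardown for PR 123 is then a single `kubectl delete namespace pr-123`.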

Where things get complicated is k8s itself is borderline cloud provider software. So teams who were previously good using a managed service are now owning more of the stack, and these random devops heros aren't necessarily making good decisions everywhere.

So you really have three obvious use cases:

a) You're doing something interesting with the k8s APIs that isn't easy to do on a cloud provider. Essentially, you're a power user.

b) You want a cloud abstraction layer because you're multi-cloud, or you want a lock-in bargaining chip.

c) You want cloud semantics without being on a cloud provider.

However, if you're a single developer with a single machine, or a very small team and you're happy working through contended static environments, you can pretty much just put a process on a box and call it done. k8s is overkill here, though not as much as people claim until the devops heros start their work.

sdevonoes 1 hour ago||||
Depends. For personal projects, yeah, definitely. But at work? Typically the “Platform” team can only afford to support 1 (maybe 2) ways of deployment, and k8s is quite versatile, so even if you need 1 small service, you’ll go with the self-service-k8s approach your Platform team offers. Because the alternative is for you (or your team) to own the whole infrastructure stack for your new deployment model (ECS? Lambda? Whatever): you need to set up service accounts, secret paths, firewalls, security, pipelines, registries, and a large etc. And most likely, no one will give you access rights for all of that, and your PM won’t accept the overhead either.

So having everyone use the same deployment model (and that’s typically k8s) saves effort. I don’t like it, that’s for sure.

dajonker 5 hours ago||||
I totally agree, but that's not what happens in reality: the average devops knows k8s and will slap it onto anything they see (if only so they can put it on their resume). The average manager hears about k8s, gets convinced they need it, and hires the aforementioned devops to build it.
goombaskoop 4 hours ago|||
> the average devops knows k8s and will slap it onto anything they see

This is certainly the case from all the third person accounts I hear. Online. I never actually met a single one that is like that, if anything, those same people are the ones that are first to tell me about their Hetzner setups.

ownagefool 1 hour ago|||
To be fair, I have k8s on my hetzner :p
hkt 3 hours ago|||
DevOps here.

The trouble is that we are literally expected to do this everywhere we go. I've personally advocated for approaches which use say, a pair of dedicated servers, or VMs as in GPs example. If you want it outside of AWS/GCP/Azure, you're regarded as a crazy person. If you don't adopt "best practices" (as defined by vendors) then management are scared. Management very often trust the sales and marketing departments of big vendors more than their own staff. Many of us have given up fighting this, because what it comes down to is a massive asymmetry of information and trust.

regularfry 2 hours ago|||
There is a kernel of validity lurking in the heart of all this, which is that immutable images you can throw away and refresh regularly are genuinely better than long-running VMs with an OS you've got to maintain, with its scope for vulnerabilities unrelated to the app you actually want to run. Management has absorbed this one good thing and slapped layer after layer of pointless rubbish on it, like a sort of inverse pearl. Being able to say "we've minimised our attack surface with a scratch image" (or alpine, or something from one of the secure image vendors) is a genuinely valuable thing. It's just all of the everything that goes along with it...
dijit 2 hours ago||
Sure.

The challenge is convincing people that "golden images" and containers share a history, and that kubernetes didn't invent containers: they just solved load balancing and storage abstraction for stateless message architectures in a nice way.

If you're doing something highly stateful, or something that requires a heavy deployment (game servers are typically tens of GB and have rich dynamic configuration, in my experience), then kubernetes starts to become round-peg-square-hole. But people buy into it because the surrounding tooling is just so nice; and like GP says: those cloud sales guys are really good at their jobs, and kubernetes is so difficult to run reliably yourself that it gets you hooked on cloud.

There's a literal army of highly charismatic, charming people who are economically incentivised to push this technology, and it can be made to work, so the odds, as they say, are against you.

vladvasiliu 21 minutes ago||||
> If you want it outside of AWS/GCP/Azure, you're regarded as a crazy person. If you don't adopt "best practices" (as defined by vendors) then management are scared. Management very often trust the sales and marketing departments of big vendors more than their own staff. Many of us have given up fighting this, because what it comes down to is a massive asymmetry of information and trust.

I think this is the crux of the matter. Also, "everybody is doing it, so they must be right" is also a very common way of thinking amongst this population.

nz 56 minutes ago||||
The following happened to a friend.

Around the time of the pandemic, a company wanted to run some Javascript code doing a kind of transformation over a large number of web-pages (a billion or so, fetched as WARC files from the web archive). Their engineers suggested setting up SmartOS VMs and deploying Manta (which would have allowed the use of the Javascript code in a totally unmodified way -- map-reduce from the command line, scaling with the number of storage/processing nodes), which should have taken a few weeks at most.

After a bit of googling and meeting, the higher ups decided to use AWS Lambdas and Google Cloud Functions, because that's what everyone else was doing, and they figured that this was a sensible business move because the job-market must be full of people who know how to modify/maintain Lambda/GCF code.

Needless to say, Lambda/GCF were not built for this kind of workload, and they could not scale. In fact, the workload was so out-of-distribution that the GCP folks moved the instances (if you can call them that) to a completely different data-center, because the workload was causing performance problems for _other_ customers in the original data-center.

Once it became clear that this approach could not scale to a billion or so web-pages, it was decided -- no, not to deploy Manta or an equivalent -- to build a custom "pipeline" from scratch that would do this. This system was in development for 6 months or so, and never really worked correctly/reliably.

This is the kind of thing that happens when non-engineers can override or veto engineering decisions -- and the only reason they can do that is because the non-engineers sign the paychecks (it does not matter how big the paycheck is, because the market will find a way to extract all of it).

One of the fallacies of the tech-industry (I do not mean to paint with too broad a brush, there are obviously companies out there that know what they are doing) is that there are trade-offs to be made between business-decisions and engineering-decisions. I think this is more a kind of psychological distortion or a false-choice (forcing an engineering decision on the basis of what the job market will be like some day in the future -- during a pandemic no less -- is practically delusional). Also, if such trade-offs are true trade-offs, then maybe the company is not really an engineering company (which is fine, but that is kind of like a shoe-store having a few podiatrists on staff -- it is wasteful, but they can now walk around in white lab-coats, and pretend to be a healthcare institution instead of a shoe-store).

Personally, I believe that the tech industry sustains itself via technical debt, much like the real economy sustains itself on real debt. In some sense, everyone is trying to gaslight everyone else into incurring as much technical debt as possible, so that a way to service the debt can be sold. Most of the technical debt is not necessary, and if people were empowered to just not incur it, I suspect it would orient tech companies towards making things that actually push the state of the art forward.

jcgrillo 19 minutes ago||
There was a moment ca. 2020 when everyone was losing their minds over Lambda and other cloud services like SQS and S3 because they're "so cheap!!11". Innumeracy is a hell of a drug.
jcgrillo 23 minutes ago|||
> Management very often trust the sales and marketing departments of big vendors more than their own staff.

They're getting kickbacks from cloud vendors. Prove me wrong.

darkwater 4 hours ago||||
And the average developer doesn't even know where to start when deploying things to prod. Once the feature the product folks asked for passes QA... on to the next sprint! We are done!
chrisweekly 2 hours ago||
Whose responsibility is it to establish the prerequisite CI/CD pipelines, HITL workflows, and observability infra so that devs can shepherd changes to prod (and track their impact)? Hint: it's not the developer's.
philipallstar 1 hour ago|||
This was the point of "devops" (the concept, not the job title): the team should be responsible for development and operations, so one isn't prioritised hugely over the other.
liveoneggs 1 hour ago|||
But those things all require more pods on the cluster! We've looped back around to the beginning.
darkwater 1 hour ago||
Exactly my point. But then developers say "I just want my Heroku days back!", and in a sufficiently big company there are maaany developers doing things their slightly different way, and then other effects start compounding, and then costs go up because 15 different teams are using 27 different solutions and and and...

But yeah, let's just spin-up a shadow IT VM with Debian like GP said, it's easy!

throwup238 15 minutes ago||
> But yeah, let's just spin-up a shadow IT VM with Debian like GP said, it's easy!

That’s literally how they sold AWS in the beginning.

Cloud won not because of costs or flexibility but because it allowed teams to provision their own machines from their budget instead of going through all the red tape with their IT departments creating… a bunch of shadow IT VMs!

Everything old is new again, except it works on an accelerated ten year cycle in the IT industry.

tete 2 hours ago|||
> the average devops knows k8s

If you know Kubernetes, you know not to use it. I say that as someone who used to do consulting for it.

The reality is that yet again "making money" completely collides with efficient, quality, sane productive work.

For me, one of the main reasons to leave that space was that I couldn't really deal with the fact that my work collided with a client's success. That said, I have helped companies get off that stuff, and off other things they thought they needed, that just wasted time and money. It feels odd going into a company that hired you to consult on a topic only to end up telling them "the best approach for you is not doing that at all". Often the thinking was "well, what if we have hundreds of thousands or even millions of users", and the reality was that even in those scenarios, once you moved away from the abstract thought and discussed a hypothetical based on their actual product, they realized they'd still be better off without it. Besides, that hypothetical was often far enough in the future that they admitted they'd likely have a completely different setup by then, so preparing for it didn't even make sense.

I think a big thing related to that was/is the microservice craze, where people move to a complex architecture for not many good reasons and then increase complexity way faster than what they actually deliver in terms of product, because it somehow feels good. I know it does; I've been there. In reality the outcome is often just a complex mess out of what could have been a relatively simple monolith. And these monoliths do work. In the vast majority of cases they are easy to scale, because your problem switches from "how do we best allocate this huge amount of very different services across our infrastructure" to (for the most part) "how do we spin up our monolith on one more server", which tends to be a much easier problem to tackle.

And nothing stops you from still using everything else if you want. Just because it's a monolith doesn't mean you need to skip any of the cloud offerings, etc. For some reason there seems to be this idea that if you write a monolith you are somehow barred from using modern tooling, infrastructure, services, etc. Not sure where that comes from.

tjarjoura 1 hour ago||||
In some sense, Kubernetes is just a portable platform for running Linux services, even on a single node using something like K3s. I almost see it as being an extension of the Linux OS layer.
sgt 43 minutes ago|||
Then why can't we put a wrapper onto systemd and make that into a light weight k8s?
enos_feedler 34 minutes ago||
Remember fleet?
acedTrex 14 minutes ago|||
This is what I do for small stuff: a Debian VM with k3s on it, for a nicer HTTP-based deployment API.
Thanemate 4 hours ago||||
I know that "resume-driven development" exists, where the tradeoffs between approaches aren't about the technical fit of the solution but the career trajectory. I've seen people making plain workstation preparation scripts using Rust, only to have something to flex about in interviews.

I'm not surprised even in the slightest that DevOps workers will slap k8s on everything, to show "real industry experience" in a job market where the resume matches the tools.

ororoo 4 hours ago||
there are also people with a devops title who don't know anything but the hammer, and then everything is a hammer problem.

I mean, I worked with people who were surprised that you can run more than just one app inside an EC2 VM.

tete 2 hours ago||
> there are also people with a devops title who don't know anything but the hammer, and then everything is a hammer problem.

To be fair though, that's true for every profession or skill.

> I mean, I worked with people who were surprised that you can run more than just one app inside an EC2 VM.

I've seen something similar, where people were surprised that you can use object storage (so effectively "make HTTP requests") from any server.

littlestymaar 2 hours ago||||
I have no doubt that there are legit use cases for something like k8s at Google or other multi-billion-dollar companies.

But if its use were confined to that use case, pretty much nobody would be using it (except as a customer of some organization's infra) and barely anyone would be talking about it (much like there isn't much talk about Borg).

The reason k8s is a thing in the first place is that it's being used by way too many people for their own good. (Most of us who have worked in startups have met too many architecture astronauts.)

If I had to bet, I'd wager that 99% of k8s users are in the "spin a few containers to run your web app" category (for the simple reason that for every billion-dollar tech business using it for legit reasons, there are many thousands of early startups that aren't).

rantanplan 2 hours ago||
The legit use case for companies like Google/Amazon etc is only to sell it to customers. None of these companies use K8s internally for real critical workloads.
bitexploder 1 hour ago|||
Ehm, that is simply not true. Google built it for themselves first. It is essentially the open source version of the internal architecture. It gets used.
zaphar 42 minutes ago|||
I worked at google. k8s does not really look at all like what they used internally when I was there, aside from sharing some similar looking building blocks.
oblio 30 minutes ago||
Yeah, but is the internal tool simpler? I'd be surprised.
akdev1l 35 minutes ago|||
Also Amazon definitely uses k8s for stuff.

Teams are free to use EKS internally.

oblio 30 minutes ago||||
Google uses Kubernetes' grandpa, called Borg, for everything.

But to quote someone: "you are not Google".

littlestymaar 42 minutes ago|||
I said “something like k8s” above, and Google for sure uses something like k8s called Borg.
altmanaltman 3 hours ago||||
yeah, it's like wanting to drive to the mall in the Space Shuttle and then complaining that it's too complicated
rvz 4 hours ago|||
They use it to inflate their resumes for career progression rather than actually evaluating whether they need it in the first place.

This is why you get many folks over-thinking the solution and picking the most hyped technologies and using them to solve the wrong problems without thinking about what they are selling.

You don't need K8s + AWS EC2 + S3 just to host a web app. That tells me they like lighting money on fire and bankrupting the company and moving to the next one.

eddythompson80 4 hours ago|||
And those devops folks just let your single debian VM be? It sounds like you have, like many of us, an organizational/people problem, not a k8s problem.

Maybe those devops folks only pay attention to k8s clusters and you're flying under their radar with your single Debian VM + Kamal. But the same thinking that results in an overly complex, impossible-to-debug, expensive-to-run k8s cluster can absolutely result in the same using regular VMs; unless, again, you are just left to your own devices because their policies don't apply to VMs, yet.

The problem usually is you're one mistake away from someone shoving their nose in it. "What are you doing again? What about HA and redundancy? Slow rollout and rollback? You must have at least 3 VMs (ideally 5) and can't expose all VMs to the internet, of course. You must define a virtual network with policies that we can control, and no, WireGuard isn't approved. You must split the internet-facing load balancer from the backend resources and assign different identities with proper scoping to them. Install these 4 different security scanners, these 2 log processors, this watchdog and this network monitor. Are you doing mTLS between the VMs on the private network? What if there is an attacker that gains access to your network? What if your proxy is compromised? Do you have visibility into all traffic on the network? Everything must flow through this appliance."

onlybosshaskeys 4 hours ago||
I mean, it's pretty clear the only reason they even got to swap to a single VM and take the glory is because they fired the devops in question. As in, they're the actual boss of a small operation. That's what saying goodbye and nuking the cluster implies here.
jkukul 14 minutes ago|||
> I ended up saying goodbye to those devops folks,

The irony is that "DevOps" was supposed to be a culture and a set of practices, not a job title. The tools that came with it (i.e. Kubernetes) turned out to be so complex that most developers didn't want to deal with them, and DevOps became exactly the siloed role the movement was trying to eliminate.

That's why I get an ick when someone uses DevOps as a job title. Just say "System Admin" or "Infrastructure Engineer". Admit that you failed to eliminate the silos.

icedchai 6 minutes ago||
Yep, "Cloud Infrastructure Engineer" is what I prefer.

I am primarily a backend developer but I do a lot of ops / infra work because nobody else wants to do it. I stay as far away from k8s as possible.

bfivyvysj 5 hours ago|||
I thought we collectively learned this from Stack Overflow's engineering blog years ago.

Scale vertically until you can't because you're unlikely to hit a limit and if you do you'll have enough money to pay someone else to solve it.

Docker is amazing development tooling but it makes for horrible production infrastructure.

KronisLV 3 hours ago|||
Docker is great development tooling (still some rough edges, of course).

Docker Compose is good for running things on a single server as well.

Docker Swarm and Hashicorp Nomad are good for multi-server setups.

Kubernetes is... enterprise and I guess there's a scale where it makes sense. K3s and similar sort of fill the gap, but I guess it's a matter of what you know and prefer at that point.

Throw Portainer on a server and the DX is pretty casual (when it works and doesn't have weird networking issues).

Of course, there's also other options for OCI containers, like Podman.

staticassertion 25 minutes ago|||
This is why there's an endless cycle of shitty SaaS with slow APIs and high downtime. People keep thinking that scale is something you can just add later.
sibellavia 4 hours ago|||
Clearly, Kubernetes wasn’t the right solution for your case, and I also agree that using it for smaller architectures is overkill. That said, it’s the standard for large-scale production platforms that need reproducibility and high availability. As of today I don’t see many *truly* viable alternatives and honestly I haven't even seen them.
yard2010 5 hours ago|||
I don't get it, I think that k8s is the best software written since win95. It redefines computing in the same way IMHO. I have some experience in working with k8s on prod and I loved every moment of it. I'm definitely missing something.
RyanHamilton 4 hours ago||
Can you expand how it redefined computing for you personally?
dobreandl 24 minutes ago|||
We've reduced our costs on Hetzner to about 10% of what we paid on Heroku, for 10x the performance. Kamal really kicks ass, and you can have a pretty complicated infrastructure up in no time. We're using terraform, ansible + kamal for deploys, no issues whatsoever.
psviderski 3 hours ago|||
A single VM is indeed the most pragmatic setup that most apps really need. However, I still prefer to have at least two, for a little redundancy and peace of mind. It’s just less stressful to do any upgrades or changes knowing there is another replica in case of a failure.

And I’m building and happily using Uncloud (https://github.com/psviderski/uncloud) for this (inspired by Kamal). It makes multi-machine setups as simple as a single VM. Creates a zero-config WireGuard overlay network and uses the standard Docker Compose spec to deploy to multiple VMs. There is no orchestrator or control plane complexity. Start with one VM, then add another when needed, can even mix cloud VMs and on-prem.
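
Since it consumes the standard Compose spec, the input for a deploy can be as small as this (service name, image, and port are made-up examples; any Uncloud-specific extensions aside, this is a plain Compose file):

```yaml
services:
  web:
    image: myorg/web:latest
    ports:
      - "8080:8080"
```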

sgt 3 hours ago||
That looks pretty interesting. Is it being used in production yet (I mean serious installs) ?
psviderski 3 hours ago||
Yes but at small scale. Myself and a handful of others from our Discord run it in production. The core build/push/deploy workflows are stable and most of the heavy lifting at runtime is done by battle-tested projects: Docker, Caddy, WireGuard, Corrosion from Fly.io.

Radboud University recently announced they're rolling it out for managing containers across the faculty, which is the most "serious install" I know about, but there could be others: https://cncz.science.ru.nl/en/news/2026-04-15_uncloud/

sgt 2 hours ago||
Did you address the security concerns? E.g. the way it executes things at a `curl | bash` level. I was a bit concerned about that.
projektfu 1 hour ago||
TBF, the documentation says you can download and review the script, then run it. Or use other methods like a homebrew or (unofficial) Debian package, or you can just install the binary where you want it, which is all the install.sh script (107 lines, 407 words) does.

https://uncloud.run/docs/getting-started/install-cli/#instal...

sgt 48 minutes ago||
I mean how commands are run on the servers - directly or indirectly. It's likely a code quality issue?
ferngodfather 4 hours ago|||
Cloud providers have put a lot of time and effort into making you believe every web app needs 99.9999% availability. Making you pay for auto scaled compute, load balancers, shared storage, HA databases, etc, etc.

All of this just adds so much extra complexity. If I'm running Amazon.com then sure, but your average app is just fine on a single VM.

IsTom 2 hours ago|||
And funnily enough, many of the Big Serious Cloud Websites have recently been aggressively shitting the bed on availability.
gloomyday 4 hours ago|||
Marketing has such a gigantic influence in our field. It is absolutely insane. It feels unavoidable, since IT is (was?) constantly filled with new blood that picks up where people left off.
dnnddidiej 26 minutes ago|||
That is good, but at bigger orgs, with massive workloads and the teams to build it out, k8s makes sense. It is a standard and a brilliant tech.
serbrech 4 hours ago|||
Yes, I mean, I’m an engineer on a cloud Kubernetes service, and I don’t run Kubernetes for my home services. I just run podman quadlets (systemd units). But that is entirely different from an enterprise-scale setup with monitoring, alerting, and scale in mind…
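
For the curious, a quadlet is just an ini-style unit file that podman's systemd generator turns into a service; a minimal sketch (path, image, and port are arbitrary examples):

```ini
# ~/.config/containers/systemd/web.container (hypothetical)
[Unit]
Description=Home web service

[Container]
Image=docker.io/library/nginx:alpine
PublishPort=8080:80

[Install]
WantedBy=default.target
```

After a `systemctl --user daemon-reload`, it starts and stops like any other unit.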
BirAdam 1 hour ago|||
So... if you're at the point where you're using a single VM, I have to ask: why bother with Docker at all? You're paying context-switch, memory, and disk overhead that you don't need to. Just make an image of the VM in case you need to drop it behind an LB.
staticassertion 23 minutes ago|||
There's one extra process that takes up a tiny bit of CPU and memory. For that, you get an immutable host, simple configuration, a minimal SBOM, a distributable set of your dependencies, x-platform for dev, etc.
mkj 1 hour ago|||
How is docker a context switch overhead? It's the same processes running on the same kernel.
BirAdam 53 minutes ago||
You're adding all of the other supporting processes within the container that needn't be replicated.
akdev1l 30 minutes ago|||
It depends, you could have an application with something like

  FROM scratch
  COPY my-static-binary /my-static-binary
  ENTRYPOINT ["/my-static-binary"]

Having multiple processes inside one container is a bit of an anti-pattern imo

dnnddidiej 24 minutes ago|||
Sidecars? Not in a simple app.
elAhmo 2 hours ago|||
Not advocating for complexity or k8s, but if your workload can be served by a single VM, then you are magnitudes away from the volume and complexity that would push you to a k8s setup; there's not even a debate.

There are situations where a single VM, no matter how powerful it is, can do the job.

PunchyHamster 3 hours ago|||
Well, you used a tank to plow a field and then complained about maintenance and fuel usage.

If you have an actual need to deploy a few dozen services all talking with each other, k8s isn't a bad way to do it. It has its problems, but it allows your devs to mostly self-service their infrastructure needs, versus having to file a ticket for each VM and firewall rule they need. I say that from the perspective of migrating from the "old way" to a 14-node bare-metal k8s cluster.

It does make debugging harder, as you pretty much need a central logging solution, but at that scale you want central logging anyway, so it isn't a big jump, and developers like it.

Main problem with k8s is frankly nothing technical, just the "ooh shiny" problem developers have where they see tech and want to use tech regardless of anything

m4ck_ 1 hour ago|||
And if you need a cluster, Hashicorp Nomad seems like a more reasonable option than full blown kubernetes. I've never actually used it in prod, only a lab, but I enjoyed it.
ghthor 2 minutes ago||
We run nomad at work. I’m very happy with it from an administrative standpoint.
robshep 5 hours ago|||
If you replaced k8s with a single app on a single VM then you’ve taken a hype fuelled circuitous route to where you should have been anyway.
dgb23 4 hours ago|||
> Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

I'm not familiar with kubernetes, but doesn't it already do SDN out of the box?

mystifyingpoi 4 hours ago||
> doesn't it already do SDN out of the box

Yes and no. Kubernetes defines a specification for network behavior (in the form of CNI), but it contains no actual implementation. You have to install a network plugin basically as the first setup step.

ricardo_lien 3 hours ago|||
Yes, I've had similar experiences. My life has been much easier since I migrated to ECS Fargate - the service just works great. No more 2AM calls (at least not because of infra incidents), no more cost concerns from my boss.
abdjdoeke 2 hours ago|||
I dunno, the more people dig into this approach, the more they'll probably just end up reinventing Kubernetes.

I use k3s/Rancher with Ansible and dedicated VMs on various providers. Flannel with WireGuard connects them all together.

This, I think, is a reasonable solution, as the main problem with cloud providers is that they are just price gouging.

wernerb 5 hours ago|||
DevOps lost the plot with the Operator model. When it was being widely introduced as THE pattern, I was dismayed. These operators abstract entirely complex services like databases behind YAML and custom Go services. At KubeCon I had one guy tell me he collects operators like candy. Questions about lifecycle management, and about the inevitable large architectural changes in an ever-changing operator landscape, were handwaved away with a series of staging and development clusters. This adds so much cost. Fundamentally, the issue is that the abstractions are too much, and sit entirely on the DevOps side of the "shared responsibility model". Taking an RDBMS from AWS or Azure is so vastly superior to taking on all that responsibility yourself in the cluster. Meanwhile (being a bit of an infrastructure snob) I run NixOS with systemd OCI containers at home. With AI this is the easiest setup to maintain ever.
lifty 4 hours ago||
Those managed databases from the big cloud providers have even more machinery and operator patterns behind them to keep them up and running. The fact that it's hidden away is what you like. So the comparison makes no sense.
gregdelhon 2 hours ago|||
Not so surprised that the architecture approach pushed by cloud vendors is... increasing cloud spend!
collimarco 1 hour ago|||
Kubernetes is not bad, it's just low level. Most applications share the exact same needs (proof: you could run any web app on a simple platform like Heroku). That's why some years ago I built an open source tool (with 0 dependencies) that simplifies Kubernetes deployments with a compact syntax which works well for 99% of web apps (instead of allowing any configuration, it makes many "opinionated" choices): https://github.com/cuber-cloud/cuber-gem I have been using it for all the company web apps and web services for years and everything works nicely. It can also auto-scale easily, which allows us to manage huge spikes of traffic for web push (Pushpad) at a reasonable price (good luck doing that with a VM - no scaling - or with a PaaS - very high costs).
wutwutwat 1 hour ago||
It's not just low level, in most cases, it's also overkill.

Most companies aren't "web scale" ™ and don't need an orchestrator built for google level elasticity, they need a vm autoscaling group if anything.

Most apps don't need such granular control over fs access, network policies, root access, etc, they need `ufw allow 80 && ufw enable`

Most apps don't need a 15 stage, docker layer caching optimized, archive promotion build pipeline that takes 30 minutes to get a copy change shipped to prod, they need a `git clone me@github.com:me/mine.git release_01 && ln -s release_01 /var/www/me/mine/current`

This is coming from someone who has had roles both as a backend product engineer and as a devops/platform engineer, who has been around long enough to remember when "deploy" to prod was Eclipse FTPing PHP files straight to the prod server on file save. I manage clusters for a living for companies that went full k8s and never should have gone full k8s. ECS would have worked for 99% of these apps, if they even needed that.

Just like the js ecosystem went bat shit insane until things started to swing back towards sanity and people started to trim the needless bloat, the same is coming or due for the overcomplexity of devops/backend deployments
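The clone-and-symlink deploy mentioned above is small enough to write out in full (a sketch; the repo URL and paths are illustrative placeholders, not a real project):

```shell
#!/bin/sh
# Clone-and-symlink deploy: each release lives in its own directory,
# and "current" atomically points at the live one.
set -eu

deploy() {
  repo=$1; app_dir=$2
  release="$app_dir/release_$(date +%Y%m%d%H%M%S)"
  git clone --depth 1 "$repo" "$release"
  # -n treats an existing "current" symlink as a file (replace it,
  # don't descend into it); -f overwrites. Rollback = repoint the link.
  ln -sfn "$release" "$app_dir/current"
}

# e.g. deploy git@github.com:me/mine.git /var/www/me/mine
```

The web server only ever serves `current`, so the switch is a single atomic rename and old releases stick around for instant rollback.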

marcosscriven 4 hours ago|||
First time I’ve heard of Kamal. Looks ideal!

Do you pair it with some orchestration (to spin up the necessary VM)?

1dom 4 hours ago|||
I think this comment and replies capture the problem with Kubernetes. Nobody gets fired for choosing Kubernetes now.

It's obvious to you, me and the other 2 presumably techie people who've responded within 15 mins that you shouldn't have been using Kubernetes. But you probably work in a company of full of techie people, who ended up using Kubernetes.

We have HN, an environment full of techie people who immediately recognise not to use k8s in 99% of cases, yet in actual paid professional environments, in 99% of cases, the same techie people will tolerate, support, and converge on the idea that they should use k8s.

I feel like there's an element of the emperor's new clothes here.

whalesalad 1 hour ago|||
Your use case is very small and simple. Of course a single VM works. That you're changing a literal A record at CF to deploy confirms this.

That is not what kube is designed for.

znpy 3 hours ago||
> It always starts off all good with just managing a couple of containers to run your web app. Then before you know it, the devops folks have decided that they need to put a gazillion other services and an entire software-defined networking layer on top of it.

As a devops/cloud engineer coming from a pure sysadmin background (you've got a cluster of n machines running RHEL and that's it), I feel this.

The issues i see however are of different nature:

1. résumé-driven development (people get higher-paying jobs if they have the buzzwords in their CV)

2. a general lack of core Linux skills. People don't actually understand how Linux and Kubernetes work, so they can't build the things they need, so they install off-the-shelf products that do 1000 things including the single one they need.

3. marketing, trendy stuff and FOMO... that tell you that you absolutely can't live without product X or that you must absolutely be doing Y

To give you an example of 3: FluxCD/ArgoCD. They're large and clunky, and we're getting pushed to adopt them for managing the services that we run inside the cluster (not developer workloads, but mostly-static stuff like the LGTM stack and a few more things - core services, basically). They're messy: another layer of complexity, more software to run and troubleshoot, more cognitive load.

I'm pushing back on that; for our needs I'm fairly sure we're better off using Terraform to manage Kubernetes stuff via the Kubernetes and Helm providers. I've done some tests and frankly it works beautifully.
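For what it's worth, a sketch of what that Terraform-managed setup can look like (the chart choice, paths, and the v2-style `kubernetes {}` provider block are illustrative assumptions, not our actual config):

```hcl
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

# A core service (here Loki, from the LGTM stack) managed like any
# other piece of infrastructure: pinned, reviewable via `terraform plan`.
resource "helm_release" "loki" {
  name       = "loki"
  namespace  = "monitoring"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "loki"
  values     = [file("${path.module}/values/loki.yaml")]
}
```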

It's also the same tool we use to manage infrastructure, so we get to reuse a lot of skills we already have.

Also, it's fairly easy to inspect... I'm doing some tests using https://pkg.go.dev/github.com/hashicorp/hcl/v2/hclparse and I'm building some internal tooling to do static analysis of our Terraform code and automated refactoring.

I still think Kubernetes is worth the hassle, though (I mostly run EKS, which by the way has been working very well for me).

stingraycharles 7 hours ago||
Potentially useful context: OP is one of the cofounders of Tailscale.

> Traditional Cloud 1.0 companies sell you a VM with a default of 3000 IOPS, while your laptop has 500k. Getting the defaults right (and the cost of those defaults right) requires careful thinking through the stack.

I wish them a lot of luck! I admire the vision and am definitely a target customer, I'm just afraid this goes the way things always go: start with great ideals, but as success grows, so must profit.

Cloud vendor pricing often isn't based on cost. Some services they lose money on, others they profit heavily from. These things are often carefully chosen: the type of costs that only go up when customers are heavily committed—bandwidth, NAT gateway, etc.

But I'm fairly certain OP knows this.

faangguyindia 5 hours ago||
I was just curious, so I actually tested this.

Using fio:

Hetzner (CX23, 2 vCPU, 4 GB): ~3900 IOPS (read/write), ~15.3 MB/s, avg latency ~2.1 ms, 99.9th percentile ≈ ~5 ms, max ≈ ~7 ms

DigitalOcean (SFO1, 2 GB RAM, 30 GB disk): ~3900 IOPS (same!), ~15.7 MB/s (same!), avg latency ~2.1 ms (same!), 99.9th percentile ≈ ~18 ms, max ≈ ~85 ms (!!)

Using sequential dd:

Hetzner: 1.9 GB/s, DO: 850 MB/s

This is the low-end plan on both, but the Hetzner is €4 and the DO instance is $18.

zuhsetaqi 4 hours ago|||
Just for comparison, I use the cheapest netcup root server:

RS 1000 G12: AMD EPYC™ 9645, 8 GB DDR5 RAM (ECC), 4 dedicated cores, 256 GB NVMe. Costs €12.79.

Results with the following command:

fio --name=randreadwrite --filename=testfile --size=5G --bs=4k --rw=randrw --rwmixread=70 --iodepth=32 --ioengine=libaio --direct=1 --numjobs=4 --runtime=60 --time_based --group_reporting

IOPS: read 70.1k, write 30.1k (~100k total)

Throughput: read 274 MiB/s, write 117 MiB/s

Latency: read avg 1.66 ms, P99.9 2.61 ms, max 5.644 ms; write avg 0.39 ms, P99.9 2.97 ms, max 15.307 ms

yread 3 hours ago|||
Nice, on Hetzner AX41-nvme (~50 eur, from 2020) non-raid I get:

IOPS: read 325k, write 139k

Throughput: read 1271MB/s, write 545MB/s

Latency: read avg 0.3ms, P99.9 2.7ms, max 20ms; write: 0.14ms, P99.9 0.35ms max 3.3ms

so roughly 100 times iops and throughput of the cloud VMs

Medowar 3 hours ago|||
That is a bit of an unfair comparison. The Hetzner and DO instances are shared hosting; you are using dedicated resources.

Using a Netcup VPS 1000 G12 is more comparable.

read: IOPS=18.7k, BW=73.1MiB/s

write: IOPS=8053, BW=31.5MiB/s

Latency Read avg: 5.39 ms, P99.9: 85.4 ms, max 482.6 ms

Write avg: 3.36 ms, P99.9: 86.5 ms, max 488.7 ms

Nnnes 2 hours ago|||
Hetzner has dedicated resources too, but they also have 2 levels of shared resources, "Cost-Optimized" and "Regular Performance". The 3900 IOPS CX23 above is "Cost-Optimized".

Here are some "Regular Performance" shared resource stats

Hetzner CPX11 (Ashburn, 2 CPUs, 2GB, 5.49€ or $6.99/month before VAT)

read: IOPS=36.7k, BW=144MiB/s, avg/p99.9/max 2.4/6.1/19.5ms

write: IOPS=15.8k, BW=61.7MiB/s, avg/p99.9/max 2.4/6.1/18.7ms

Hetzner CPX22 (Helsinki, 2 CPUs, 4GB, 7.99€ or $9.49/month before VAT)

read: IOPS=48.2k, BW=188MiB/s, avg/p99.9/max 1.9/5.7/10.8ms

write: IOPS=20.7k, BW=80.8MiB/s, avg/p99.9/max 1.8/5.8/10.9ms

Hetzner CPX32 (Helsinki, 4 CPUs, 8GB, 13.99€ or $16.49/month before VAT)

read: IOPS=48.3k, BW=189MiB/s, avg/p99.9/max 1.9/6.2/36.1ms

write: IOPS=20.7k, BW=81.0MiB/s, avg/p99.9/max 1.8/6.3/36.1ms

trvz 1 hour ago|||
Storage performance is practically always a shared resource, and that's what y'all are talking about here...
yard2010 5 hours ago||||
I love Hetzner so much. I'm not affiliated, I'm just a really happy customer; these guys do everything right.
ratg13 1 hour ago||
As long as you never have to interact with them. If you run into issues they have caused themselves, you'll find yourself dealing with a unique mix of arrogance and incompetence.
drcongo 1 hour ago||
I've been using Hetzner for ~20 years and every single support interaction I've ever had with them has been top tier. Never AI bots, always humans who are helpful, courteous and prompt. I can't think of a single company, let alone hosting company, whose customer service has been so consistently good.
Aeolun 33 minutes ago||
It certainly helps that the service never does anything wonky that requires a support interaction in the first place.
torginus 5 hours ago|||
>3000 IOPS

If that's true, I wonder if this is a deliberate decision by cloud providers to push users towards microservice architectures with proprietary cloud storage like S3, so you can't do on-machine dbs even for simple servers.

AnthonyMouse 3 hours ago||
It's probably a combination of high-density storage nodes getting I/O bound and SSDs having finite write endurance. Anything that improves the first problem costs them money and makes the second problem worse, which costs them money again, so why would they make the default something that costs them more twice over if most people don't need it?

Instead they make the default "meager IOPS" and then charge more to the people who need more.

sroussey 6 hours ago|||
Many cloud vendors have you pay through the nose for IOPS and bandwidth.

Edit: I posted this before reading, and these are the same two he points out.

stingraycharles 5 hours ago||
Yes, but you can't directly compare SAN-style storage with a local NVMe. I agree that it's too expensive, though not nearly as insane as the bandwidth pricing. If you go to a vendor and ask for a petabyte of storage, and it needs to be fully redundant, and you need the ability to take PIT-consistent multi-volume snapshots, be ready to pay up. And this is what's being offered here.

And yes, IO typically happens in 4kb blocks, so you need a decent amount of IOPS to get the full bandwidth.
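To put a number on that: at 4 KiB per operation, an IOPS cap is effectively a bandwidth cap for small random IO. A quick back-of-the-envelope for the 3000 IOPS default quoted upthread:

```shell
# 3000 IOPS * 4 KiB per op, in decimal MB/s (integer-truncated)
echo "$(( 3000 * 4 * 1024 / 1000000 )) MB/s"   # prints "12 MB/s"
```

So a 3000-IOPS volume tops out around 12 MB/s on 4k random IO, which is why the laptop comparison in the post is so stark.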

fragmede 5 hours ago||
> Cloud vendor pricing often isn't based on cost.

Business 101 teaches us that pricing isn't based on cost. Call it top-down vs bottom-up pricing, but the first-principles approach of "it costs me $X to make a widget, so sell it for $Y = 1.y * $X" is not how pricing works in practice.

jeffrallen 5 hours ago|||
Just to spell this out more clearly for the back row of the classroom:

The price is what the customer will pay, regardless of your costs.

barrkel 4 hours ago||
Economics teaches us that a big difference between cost and price attracts competition which should make the price trend towards the cost.
dns_snek 2 hours ago|||
Practice taught me that that "should" is doing a lot of heavy lifting here and it's often not the case, even across long time periods (years) that should allow competitors to emerge.

For example I calculated the cost of a solar install to be approximately: Material + Labour + Generous overhead + Very tidy profit = 10,000€

In practice I keep getting offers for ~14,000€, which will be reduced to 10,000€ with a government subsidy and my request for an itemized invoice is always met with radio silence.

ncruces 3 hours ago||||
Only if the barrier of entry is low.

Which it won't be, if at every turn you choose the hyperscaler.

stingraycharles 2 hours ago||||
If this is the case, cheap bandwidth for AWS, when?
_el1s7 4 hours ago|||
Exactly.
_el1s7 4 hours ago|||
That's not business 101.
lelanthran 4 hours ago||
> That's not a business 101.

It kinda is, but obscured by GP's formula.

More simply; if it costs you $X to produce a product and the market is willing to pay $Y (which has no relation to $X), why would you price it as a function of $X?

If it costs me $10 to make a widget and the market is happy to pay $100, why would I base my pricing on $10 * 1.$MARGIN?

carefree-bob 4 hours ago||
Exactly. The mechanism by which the price ends up as X plus margin is just competition. Others enter the market and compete with you until the returns are driven down to the rental rate of capital. Any barriers to entry result in higher margins.

But that is an equilibrium result, and famously does not apply to monopolies, where elasticity of substitution will determine the premium over the rental rate of capital.

clktmr 6 hours ago||
> Agents, by making it easiest to write code, means there will be a lot more software. Economists would call this an instance of Jevons paradox. Each of us will write more programs, for fun and for work.

There is already so much software out there that isn't used by anyone. Just take a look at any app store. I don't understand why we are so obsessed with cranking out even more, whereas the obvious use case for LLMs should be to write better software. Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.

delbronski 5 hours ago||
I think we, as engineers, are a bit stuck on what “software” has traditionally been. We think of systems that we carefully build, maintain, and update. Deterministic systems for interacting with computers. I think these “traditional” systems will still be around. But AI has already changed the way users interact with computers. This new interaction will give rise to another type of software. A more disposable type of software.

I believe right now we are still in the phase of “how can AI help engineers write better software”, but are slowly shifting to “how can engineers help AI write better software.” This will bring in a new herd of engineers with completely different views on what software is, and how to best go about building computer interactions.

Gareth321 28 minutes ago|||
The most recent software paradigm has been SaaS - software as a service. Capex is distributed among all customers and opex is paid for through the subscription. This avoids the large upfront capex and provides easy cost and revenue projections for both sides of the transaction. The key to SaaS is that the software is maximally generic, meaning it works well for the largest number of people. This necessitates making tough cuts on UX and functionality when they only benefit small parts of the userbase.

Vibe coding or LLM accelerated development is going to turn this on its head. Everyone will be able to afford custom software to fit their specific needs and preferences. Where Salesforce currently has 150,000 customers, imagine 150,000 customers all using their own customised CRM. The scope for software expansion is unbelievably large right now.

skybrian 5 hours ago|||
Sometimes “better” means “customized for my specific use case.” I expect that there will be a lot of custom software that never appears in any app store.
stingraycharles 5 hours ago|||
The amount of single purpose scripts in my ~/playground/ folder has increased dramatically over the past year. Super useful, wouldn’t have had the time for it otherwise, but not in any way shareable. Eg “parse this excel sheet I got from my obscure bank and upload it to my budgeting app’s REST API”. Wouldn’t have had the time or energy to do this before, now I have it and it scratches an itch.
Gareth321 26 minutes ago||||
If we take it a step further, in a few years, why would anyone purchase generic software anymore? If we can perfectly customise software for our needs and preferences for almost free, why would anyone purchase generic software from an App Store? I genuinely think Apple's business model is in jeopardy.
AussieWog93 5 hours ago|||
This. Just today I added a full on shopping list system to our internal dashboard at work (small business) simply because it was slightly annoying and could be solved in 3 prompts and 15 minutes.
cush 5 hours ago|||
> I don't understand why we are so obsessed with cranking out even more... the obvious usecase for LLMs should be to write better software

I honestly think this is ideal. Video games aside, I think one day we'll look back and realize just how insane it was that we built software for millions or even billions of users to use. People can now finally build the software that does exactly what they've wanted their software to do without competing priorities and misaligned revenue models working against them. One could argue this kind of software, by definition, is higher quality.

edot 1 hour ago||
I don't think this will be true for average consumers. Perhaps for nerds like us, who enjoy a bit of tinkering and can put up with weird behaviors. I mean, are you envisioning that everyone would have their own custom messaging app, for example? Or email? Or banking app? I mean, I think most people's demands for those things are all extremely homogenous. I want messages to arrive, I want emails to get spam filtered a little but not too much, and I want my bank to only allow me to log in and see my balances, etc.

I could see maybe more customization of said software, but not totally fresh. I do agree that people will invent more one-off throwaway software, though.

woeirua 21 minutes ago||
I think you’re glossing over a lot of use cases. For example, I want my email’s spam controls much tighter.
croemer 3 hours ago|||
That's not what Jevons paradox means though. He's just name dropping some concept.

Jevons paradox would be if, despite software becoming cheaper to produce, total spend on producing software increased because the increase in production outruns the savings.

Jevons paradox applies when demand is very elastic, i.e. small changes in price cause large changes in quantity demanded. It's a property of the market.

esjeon 5 hours ago|||
> Let's hope the focus shifts from code generation to something else. There are many ways LLMs can assist in writing better code.

My view is actually the opposite. Software is now cattle, not pets. We should use one-offs. We should use micro-scale snippets. Speaking a language should be equivalent to programming. (I know, it's a bit of a pipe dream.)

In that sense, exe.dev (and Tailscale) are a bit like pet-driven projects.

amelius 32 minutes ago|||
Yes, and most applications still have GUIs, where we could be just talking to an LLM instead.
dgb23 5 hours ago|||
Both will likely happen to some degree.

As for the average quality: it’s unclear.

My intuition is that agents lift up the floor to some degree, but at the same time will lead to more software being produced that’s of mediocre quality, with outliers of higher quality emerging at a higher rate than before.

rvz 5 hours ago|||
There will be only 1 Microsoft® Excel, 1 Google Sheets and 1 LibreOffice and the rest are billions of dead vibe-coded "Excel killers" that no-one uses.
fragmede 4 hours ago||
Except that list originally had one item, and that item was VisiCalc. Times change, but that list is going to stop being relevant before Excel gets knocked off it.

If you're doing anything complicated, Excel just doesn't make sense anymore. It'll still be the data-exchange format (at least, something more advanced than CSV), but it's no longer the only frontend.

"No one uses" is no longer the insult it once was. I don't need or want to make software for every last person on the world to use. I have a very very small list of users (aka me) that I serve very well with most of the software that I generate these days outside of work.

rvz 3 hours ago||
> "No one uses" is no longer the insult it once was.

It certainly is for lots of businesses, otherwise they go out of business.

There is something called 'revenue' which they need to make from customers which are their 'users', and that revenue pays for the 'operating costs' which includes payroll, office rent, infrastructure etc.

This just means that it is more important than ever to know what to build, just as much as how it is built. It is unrealistic for a business to disregard that, build anything they want, and end up with zero users.

No users, No revenue. No revenue, No business.

andai 5 hours ago|||
Alas, we shifted from quality to quantity somewhere in the mid 19th century.
appreciatorBus 15 minutes ago|||
Humans have been making quality versus quantity decisions since the time we first grew these big giant brains of ours a million or two years ago, maybe longer.

If you wanted to, you could make an argument about the principal-agent problem - that as hunter-gatherers or subsistence, farmers, our quality versus quantity decisions only affected us, whereas in a market economy, you could argue that one person’s quality versus quantity decision affects someone else.

But dismantling capitalism will not solve this problem. It just moves the decision-making to a different group of people. Those people will face the same trade-offs and the same incentives. After the Revolution, even the most loyal comrade will have to contend with the fact that they can choose to provide the honourable working class with more of a thing if they drop the quality.

fragmede 5 hours ago|||
For software?
bell-cot 5 hours ago||
https://en.wikipedia.org/wiki/Shovelware
fragmede 5 hours ago||
What does that have to do with the mid 19th century?
kyle-rb 4 hours ago||
In the California gold rush, the people who got rich were the ones selling shovelware.
farfatched 6 hours ago||
Nice post. exe.dev is a cool service that I enjoyed.

I agree there is opportunity in making LLM development flows smooth, paired with the flexibility of root-on-a-Linux-machine.

> Time and again I have said “this is the one” only to be betrayed by some half-assed, half-implemented, or half-thought-through abstraction. No thank you.

The irony is that this is my experience of Tailscale.

Finally, networking made easy. Oh god, why is my battery doing so poorly. Oh god, it's modified my firewall rules in a way that's incompatible with some other tool, and the bug tracker is silent. Now I have to understand their implementation, oh dear.

No thank you.

farfatched 45 minutes ago||
> No thank you.

I hope this wasn't interpreted as directed at exe.dev. That really is a cool service!

LoganDark 5 hours ago||
I find it difficult to configure Tailscale for my use case because they seem to completely not support making ACL rules based on the identity of the device rather than a part of the address space. I'm not configuring a router here, I'm configuring a peer-to-peer networking layer... or at least I'm supposed to be...
spockz 5 hours ago|||
I remember from the docs you can use node names. At the very least you can use tags for sure. Assign tags to nodes and define the ACL based on those.
LoganDark 5 hours ago||
Last I read the docs while troubleshooting this very problem, you cannot specify node names as the source or destination of a grant. You can specify direct IP address ranges, node groups (including autogenerated ones) or tags, but not names.

Tags permanently erase the user identity from a device, and disable things like Taildrop. When I tried to assign a tag for ACLs, I found that I then could not remove it, and had to endure a very laborious process to re-register a Tailscale device that I had added to Tailscale for the express purpose of remotely accessing it.

codethief 2 hours ago|||
> because they seem to completely not support making ACL rules based on the identity of the device rather than a part of the address space

Could you rephrase that / elaborate on that? Isn't Tailscale's selling point precisely that they do identity-based networking?

EDIT: Never mind, now I see the sibling comment to which you also responded – I should have reloaded the page. Let's continue there!

jFriedensreich 14 minutes ago||
I have trouble seeing how this is different from Linode. If I invest time in a new VM API, it has to work transparently for the cloud or my own machines. Lastly, as much as I share the disappointment in the k8s promise, this seems a bit too simple; there is a reason homelabs mostly standardised on compose files.
faangguyindia 7 hours ago||
I just use Hetzner.

Everything the cloud companies provide just costs so much; my own Postgres running in an HA setup with backups has cost me 1/10th the price of an RDS or Cloud SQL service, running in production over 10 years with no downtime.

I directly autoscale instances off of metrics harvested from Grafana; it works fine for us. We have the autoscaler configured via webhooks. Very simple, and it has never failed us.

I don't know why I would ever use GCP or AWS anymore.

All my services are fully HA, and backups work like a charm every day.

mattbee 2 hours ago||
I founded a hosting company 25 years ago when User-Mode Linux was the hot new virtualisation tech. We aspired to just replicate the dedicated server experience because that was obviously how you deploy services with the most flexibility, and UML made it so cheap! Through the 2010s I (extremely wrongly) assumed that being metered on each little part of their stack was not something most developers would choose, for the sake of a little convenience.

Does a regular 20-something software engineer still know how to turn some eBay servers & routers into a platform for hosting a high-traffic web application? Because that is still a thing you can do! (I did it last year to make a 50PiB+ data store.) I'm genuinely curious how popular it is for medium-to-big projects.

And Hetzner gives you almost all of that economic upside while taking away much of the physical hassle! Why are they not kings of the hosting world, rather than turning over a modest €367M (2021)?

I find it hard to believe that the knowledge to manage a bunch of dedicated servers is that arcane that people wouldn't choose it for this kind of gigantic saving.

jasongi 1 hour ago||
> I find it hard to believe that the knowledge to manage a bunch of dedicated servers is that arcane that people wouldn't choose it for this kind of gigantic saving.

Managing servers is fine. Managing servers well is hard for the average person. Many hand-rolled hosting setups I've encountered include fun gems such as:

- undocumented config drift.

- one unit of availability (downtime required for offline upgrades, resizing or maintenance)

- very out of date OS/libraries (usually due to the first two issues)

- generally awful security configurations. The easiest configuration being open ports for SSH and/or database connections, which probably have passwords (if they didn't you'd immediately be pwned)

Cloud architecture might be annoying and complex for many use cases, but if you've ever been the person who had to pick up someone else's "pet" and start making changes or just maintaining it, you'll know why it can be nice to have cloud arch put some of its constraints on how infra is provisioned, and be willing to pay for it.

Manfred 6 hours ago|||
Companies buy cloud services because they want to reduce in-house server management and operations; for them it's a trade-off against hiring the right people. But you are right: when you can find the right people, doing it yourself can be a lot cheaper.
mrweasel 5 hours ago|||
In some sense I'm starting to think it has more to do with accounting. Hardware, datacenters and software licenses (unless it's a subscription, which it probably is these days) are capital expenses; cloud is an operating expense. Management in a lot of companies hates capital expenditures, presumably because it forces long-term thinking, i.e. three to five years for server hardware. Better to go the cloud route and have "room for manoeuvrability". I worked for a company that would hire consultants because "you can fire those at two weeks' notice, with no severance". Sure, but they've been here for five years now, at twice the cost of actual staff. Companies like that also love the cloud.

Whether or not cloud is viable for a company is very individual. It's very hard to pinpoint a size or a use case that will always make cloud the "correct" choice.

whyagaindavid 2 hours ago||
Another point (a common observation of mine) is responsibility. By going SaaS or using cloud, any kind of data protection, rules/responsibility etc. is moved away. And in many ways it is better: Google, Dropbox or OneDrive will have better PR to take the pain if something goes crazy. Tickbox compliance is easy.
fnoef 6 hours ago||||
Right... That's why they hire "AWS Certified specialist ninjas".
Tepix 6 hours ago|||
I get the feeling that with LLMs in the mix, in-house server management can do a lot more than it used to.
mattbee 2 hours ago|||
The internet of 20 years ago was awash with info for running dedicated servers, fragmented and badly-written in places but it was all there. I can absolutely believe LLMs would enable more people to find that knowledge more easily.
tgv 6 hours ago|||
Perhaps it saves some time looking through the docs, but do you really trust an LLM to do the actual work?
windex 6 hours ago|||
Yes, and an LLM checks it as well. I have yet to find a sysadmin task that an LLM couldn't solve neatly.
jdkoeck 5 hours ago||
A nice bonus is that sysadmin tasks tend to be light in terms of token usage, that’s very convenient given the increasingly strict usage limits these days.
andoando 4 hours ago||||
Yes, with a lot of reviewing what it's doing and asking questions, 100%.
fragmede 5 hours ago|||
By this point? Absolutely. They still get stuck in rabbit holes and go down the wrong path sometimes, so it's not fully fire and forget, but if you aren't taking advantage of LLMs to perform generic sysadmin drudgery, you're wasting your time that could be better spent elsewhere.
Jn2G3Np8 1 hour ago|||
Also using Hetzner.

But I came across Mythic Beasts (https://www.mythic-beasts.com/) yesterday, similar idea, UK based. Not used them yet but made the account for the next VPS.

tubs 42 minutes ago||
This is way, way more expensive than Hetzner. Not even comparable?
huijzer 6 hours ago|||
Agree. I used to always use Heroku- or Render-style platforms for my own software, but nowadays I just have a Linux server with Docker Compose and a cron job. The cron job every minute runs docker compose pull (downloads the latest image) and docker compose up -d (switches to the new version only if there is one). And I put Caddy in front for HTTPS. This has been very cheap and reliable for years now.
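The whole mechanism is roughly one crontab line (a sketch; `/srv/app` and the log path are placeholders):

```
# Every minute: fetch newer images, then recreate only containers
# whose image actually changed (compose up -d is a no-op otherwise).
* * * * * cd /srv/app && docker compose pull -q && docker compose up -d >>/var/log/deploy.log 2>&1
```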
saltmate 6 hours ago|||
What images are you running that you'd need the latest version up after just a minute?
burner420042 6 hours ago|||
I'm not the OP, but I'd clarify that the cron check for new versions is done every minute. So when new images are pushed, they're picked up quickly.

OP is not saying they push new versions at such a high frequency that they need checks every minute.

The choice of one minute vs 15 minutes is an implementation detail and, architected like this, costs nothing.

I hope that helps. Again this is my own take.

huijzer 37 minutes ago|||
When I push new images via CI, I want them to go into production immediately, like Heroku/Render/Dokku.
RandomBK 3 hours ago|||
One annoyance (I don't know if they've since fixed it) was that Docker Hub would count pulls that don't contain an update towards the rate limit. That ultimately prompted me to switch to alternate repositories.
faangguyindia 3 hours ago||
One way is to host a manifest file (you can host one on R2) and update it on each deploy; when the manifest changes, the new container image is pulled.
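A hedged sketch of that pattern in shell; MANIFEST_URL, the state file, and the compose path are illustrative assumptions, and the manifest can be anything CI rewrites per deploy (an image digest works well):

```shell
#!/bin/sh
# Poll a deploy manifest and pull images only when it changes, so
# no-op runs never hit the registry (and its rate limits).

check_and_deploy() {
    url="${MANIFEST_URL:?set MANIFEST_URL}"   # e.g. https://example.com/deploy.txt
    state="${STATE_FILE:-/var/run/deploy.manifest}"
    compose="${COMPOSE_FILE:-/srv/app/compose.yml}"

    new=$(curl -fsS "$url") || return 1       # current manifest (e.g. image digest)
    old=$(cat "$state" 2>/dev/null || true)   # manifest we last deployed

    if [ "$new" != "$old" ]; then
        # Only now touch the registry, then remember what we deployed.
        docker compose -f "$compose" pull --quiet &&
            docker compose -f "$compose" up -d &&
            printf '%s\n' "$new" >"$state"
    fi
}
```

The state file is only updated after a successful pull and restart, so a failed deploy is retried on the next tick.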
pants2 6 hours ago|||
Especially these days you can SSH to a baremetal server and just tell Claude to set up Postgres. Job done. You don't need autoscaling because you can afford a server that's 5X faster from the start.
i5heu 6 hours ago||
You just use Docker.

It is like 4 lines of config for Postgres; the only line you need to change is the path where Postgres should store the data.
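For illustration, roughly what that compose config looks like; the image tag, password, and host path are placeholder assumptions, and the volume line is the one you'd adapt:

```yaml
services:
  postgres:
    image: postgres:17
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: change-me        # use a real secret
    volumes:
      - /srv/pgdata:/var/lib/postgresql/data   # the one path you change
```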

spockz 5 hours ago||
You also probably want the Postgres storage on a different disk (or set of disks).

Maybe change the filesystem?

swingboy 2 hours ago|||
Do you run containers? What orchestrator or deploy tool do you use?
alishayk 3 hours ago|||
I find it interesting that Hetzner was never a consideration, until... LLMs started recommending them.
alternatex 3 hours ago||
Hetzner was raved about before AI was cool. I know because those good reviews are what made me move half of my apps from DigitalOcean to Hetzner. My DigitalOcean droplet was lacking in RAM, and it was more expensive to grow it than to move some stuff to a small VPS on Hetzner.
kippinsula 4 hours ago|||
We've done both. Hetzner dedicated was genuinely fine, until a disk started throwing SMART warnings on a Sunday morning and we remembered why we pay 10x elsewhere for some things. It's probably less about the raw cost and more about which weekends you want back.
faangguyindia 1 hour ago|||
Well, you have to take all that into consideration before your build-out.

You can use block storage if data matters to you.

Many services do not need to care about data reliability or can use multiple nodes, network storage or many other HA setups.

omnimus 3 hours ago|||
Isn't this the nature of every dedicated server? You also take on the hardware management burden; that's why they can be insanely cheap.

But there is a middle ground in the form of a VPS, where the hardware is managed by the provider. It's still way, way cheaper than some cloud magic service.

RandomBK 3 hours ago||
A VPS comes with the potential for oversubscription, even from more reputable vendors. You never really know if you're actually getting what you're paying for.
faangguyindia 1 hour ago||
They also offer dedicated VPS with guaranteed resource allocation.
TiccyRobby 6 hours ago|||
Honestly, I like Hetzner a lot, but lately it has been very unstable for us. https://status.hetzner.com/ always has a couple of incidents happening at the same time. I really appreciate the services they provide, but I wish they were more stable.
lifty 4 hours ago||
Several things are going on even now, an hour after your comment. But I appreciate that they list them; that hopefully means they have a good culture of honesty and can improve.
omnimus 4 hours ago||
I looked through the issues, and basically the only ongoing thing is that backup power is not working in one of the data centers (which could be a problem). The rest are warnings about planned shutdowns of some services and a speed limitation on object storage in one location.

I'm sure some of it is luck, but we have a few Hetzner VPSes in both German locations, and in the last 5 years, AFAIK, they've never been down. On our HTTP monitoring service they show uptimes of only hundreds of days because we restarted them ourselves.

kubb 6 hours ago|||
[dead]
MagicMoonlight 5 hours ago||
Because if I have a government service with millions of users, I don’t want the cheap shitter servers to crap out on me.

An employee is going to cost anywhere between 8k and 50k per month. Hiring an employee to save 200/month on servers by using a shitty VPS provider is not saving you any money.

kennywinker 5 hours ago||
If you have millions of users, you absolutely need to have someone whose whole job is managing infrastructure. Expecting servers or cloud services to not crap out on you without someone with the skills and time to keep things running seems foolish.
sahil-shubham 4 hours ago||
The point about VMs being the wrong shape because they’re tied to CPU/memory resonates hard. The abstraction forces you to pay for time, not work.

I ended up buying a cheap auctioned Hetzner server and using my self-hostable Firecracker orchestrator on top of it (https://github.com/sahil-shubham/bhatti, https://bhatti.sh) specifically because I wanted the thing he’s describing — buy some hardware, carve it into as many VMs as I want, and not think about provisioning or their lifecycle. Idle VMs snapshot to disk and free all RAM automatically. The hardware is mine, the VMs are disposable, and idle costs nothing.

The thing that surprised me most, though it's obvious in hindsight, is that once you have memory-state snapshots, everything becomes resumable. I make a browser sandbox, get Chromium to a logged-in state, snapshot it, and resume copies of that session on demand. My agents work inside sandboxes, I run docker compose in them for preview environments, and when nothing's active the server is basically idle. One $100/month box does all of it.

martypitt 1 hour ago||
OT - but Bhatti looks really cool! Well done!
sahil-shubham 38 minutes ago||
Thank you :)
codethief 2 hours ago||
> My agents work inside sandboxes

Out of interest, what sandboxing solution do you use?

sahil-shubham 2 hours ago||
Not sure what you mean. I use the above linked personal project, bhatti, which internally uses Firecracker microVMs.
socketcluster 4 hours ago||
Virtual machines are the wrong abstraction. Anyone who has worked with startups knows that average developers cannot produce secure code. If average developers are incapable of producing secure code, why would average non-technical vibe-coders be able to? They don't know what questions to ask.

There's no way vibe coders can produce secure backend software, with or without AI. The average software that AI is trained on is insecure. If the LLM sees a massive pile of fugly vibe-coded spaghetti and you tell it "Make it secure please", it will turn into a game of Whac-a-Mole: patch a vulnerability and two new ones appear.

IMO, the right solution is to not allow vibe-coders to access the backend. It is beyond their capabilities to keep it secure, reliable and scalable, so don't make it their responsibility. I refuse to operate a platform where a non-technical user is "empowered" to build their own backend from scratch. It's too easy to blame the user for building insecure software. But IMO, as a platform provider, if you know that your target users don't have the capability to produce secure software, it's your fault; you're selling them footguns.
celrenheit 5 hours ago||
Shameless plug: https://clawk.work/

`ssh you/repo/branch@box.clawk.work` → jump directly into Claude Code (or Codex) with your repo cloned and credentials injected. Firecracker VMs, 19€/mo.

POC, please be kind.

aayushdutt 3 hours ago||
This looks nice, when did you launch this? Do you have validation / paying users?
celrenheit 2 hours ago||
[dead]
chimpanzee2 4 hours ago||
honestly sounds interesting

At 19€/mo, are you subsidizing it given the sharp rise in LLM costs lately?

Or are you heavily restricting model access? Surely there is no Opus?

celrenheit 4 hours ago||
The 19€/mo is infra only. Claude Code inside the VM signs in via OAuth to the user's own Anthropic account. I'd love to explore bundling open models (Qwen, etc..) into the subscription down the line, but that needs product validation first, not going to ship something I'm not sure people actually want.
qxmat 4 hours ago|
Europe is crying out for sovereign clouds. If this is to be a viable alt cloud, US jurisdiction is a no.

I'm not sure we can move away from CPU/memory/IO budgeting towards total metal saturation, because code isn't what it used to be: no one handles malloc failure any more, we just crash OOM.

Quothling 3 hours ago||
Europe is already moving into the EU cloud: Hetzner, OGH Cloud and so on, as well as local data centers where partner companies set up their own clouds with various offerings to rival Office 365. So far it's mainly the public sector. My own city cut its IT budget by 70% by switching from Microsoft.

The key point is the partner companies. Almost nobody is actually running their own cloud the way they would with various 365 products, AWS, or Azure. They buy the cloud from partners, similar to how they used to (and still do) buy solutions from Microsoft partners. So if you want to "sell cloud" you're probably going to struggle unless you get some of these partners onboard, which would probably be hard: I imagine a lot of what they sell is a package that basically runs on VMs already set up as part of the solutions they have.

effisfor 4 hours ago||
For anybody interested, the meat of 'EU sovereign' is EU companies, not US or UK companies with EU servers (because of the CLOUD Act and the UK-US bilateral arrangement connected to it).

International visitors might tell us more about the benefits of companies with no EU, US, or UK nexus, legally and rights-wise.
