Picking an arbitrary price point of $200/mo, you can get 4(!) vCPUs and 16GB of RAM at AWS. Architectures are different etc., but this is roughly a mid-spec dev laptop of 5 or so years ago.
At Hetzner, you can rent a machine with 48 cores and 128GB of RAM for the same money. It's hard to overstate how far apart these machines are in raw computational capacity.
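To put rough numbers on that gap, here is a back-of-the-envelope sketch using only the figures quoted above ($200/mo for 4 vCPUs/16GB versus 48 cores/128GB). It treats a vCPU and a dedicated core as comparable units, which is a simplifying assumption; the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    monthly_usd: float
    cores: int    # vCPUs or physical cores, treated as equivalent (assumption)
    ram_gb: int

    def usd_per_core(self) -> float:
        return self.monthly_usd / self.cores

    def usd_per_gb_ram(self) -> float:
        return self.monthly_usd / self.ram_gb


# Figures taken from the comment above; nothing else is assumed about pricing.
aws = Offer("AWS", 200.0, 4, 16)
hetzner = Offer("Hetzner", 200.0, 48, 128)

core_ratio = hetzner.cores / aws.cores      # how many times more cores
ram_ratio = hetzner.ram_gb / aws.ram_gb     # how many times more RAM

for offer in (aws, hetzner):
    print(f"{offer.name}: ${offer.usd_per_core():.2f}/core, "
          f"${offer.usd_per_gb_ram():.2f}/GB RAM")
print(f"{core_ratio:.0f}x the cores and {ram_ratio:.0f}x the RAM for the same money")
```

This says nothing about per-core speed, storage, or bandwidth; it only restates the two quoted configurations as unit prices.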
There are approaches to problems that make sense with 10x the capacity that don't make sense on the much smaller node. Critically, those approaches can sometimes save engineering time that would otherwise go into building a more complex system to manage around artificial constraints.
Yes, there are other factors like durability etc. that need to be designed for. But going the other way, dedicated boxes can deliver more consistent performance without worries of noisy neighbors.
No network latency between nodes, less memory bandwidth latency/contention than there is in VMs, and no need for a separate caching architecture when you can just tell e.g. Postgres to use gigs of RAM and then let Linux's disk caching take care of the rest.
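For a sense of what "tell Postgres to use gigs of RAM" might look like in numbers, here is a minimal sketch that derives two `postgresql.conf` values from the machine's RAM. The 25% figure for `shared_buffers` is the commonly cited starting point, and the 75% for `effective_cache_size` is an assumption on my part; both are values to tune from, not recommendations for any particular workload. The function name is made up.

```python
def suggest_postgres_memory(total_ram_gb: int) -> dict[str, str]:
    """Rough starting values for a dedicated database box (assumed ratios)."""
    # shared_buffers: Postgres' own buffer cache, about a quarter of RAM here.
    shared_buffers_gb = total_ram_gb // 4
    # effective_cache_size: a planner hint that also counts the Linux page cache.
    effective_cache_gb = total_ram_gb * 3 // 4
    return {
        "shared_buffers": f"{shared_buffers_gb}GB",
        "effective_cache_size": f"{effective_cache_gb}GB",
    }


# Example: the 128GB machine mentioned earlier in the thread.
for setting, value in suggest_postgres_memory(128).items():
    print(f"{setting} = {value}")
```

Whatever is not given to `shared_buffers` stays available to the kernel, which is what lets the OS disk cache do the rest.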
If you’re running Postgres locally you can turn off the TCP/IP part; nothing more to audit there.
SSH based copying of backups to a remote server is simple.
If not accessible via network, you can stay on whatever version of Postgres you want.
I’ve heard these arguments since AWS launched, and all that time I’ve been running Postgres (since 2004 actually) and have never encountered all these phantom issues that are claimed as being expensive or extremely difficult.
It gets even easier now that you have cheap s3 - just upload the dump to s3 every day and set the s3 deletion policy to whatever is feasible for you.
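As a rough sketch of the "dump daily, let S3 expire old copies" idea: the helpers below only build the `pg_dump` command, a dated object key, and a lifecycle rule. They do not talk to Postgres or AWS. The database name, key prefix, and 30-day retention are made-up placeholders, and the dict follows the shape of an S3 lifecycle configuration as I understand it, so check it against the S3 documentation before relying on it.

```python
import datetime as dt


def dump_command(dbname: str, outfile: str) -> list[str]:
    # -Fc selects pg_dump's custom format, which pg_restore can read.
    return ["pg_dump", "-Fc", "-f", outfile, dbname]


def object_key(prefix: str, dbname: str, day: dt.date) -> str:
    # One object per day, e.g. backups/appdb/2025-01-31.dump
    return f"{prefix}/{dbname}/{day.isoformat()}.dump"


def expiry_rule(prefix: str, keep_days: int) -> dict:
    # Lifecycle rule asking S3 to delete objects under prefix after keep_days.
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix}-after-{keep_days}d",
                "Filter": {"Prefix": f"{prefix}/"},
                "Status": "Enabled",
                "Expiration": {"Days": keep_days},
            }
        ]
    }


# Hypothetical names, for illustration only.
print(dump_command("appdb", "/tmp/appdb.dump"))
print(object_key("backups", "appdb", dt.date.today()))
print(expiry_rule("backups", 30))
```

The upload itself (via the AWS CLI or an SDK) and the cron entry that runs this daily are left out on purpose.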
For backups, including Postgres, I was planning on paying Veeam ~$500 a year for a software license to back up the active node and Postgres database to s3/r2. The standby node would be getting streaming updates via logical replication.
There are free options as well but I didn’t want to cheap out on the backups.
It looks pretty turnkey. I am a software engineer, not a sysadmin, though. It's also still just theory, as I haven't built it out yet.
Either way: 1 day of a mid-level developer in the majority of the world (basically: anywhere except Zurich, NYC or SF) is between €208 and €291. (Yearly salary of €50-€70k)
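The day-rate range above can be reproduced with one division. The 240 working days per year is an assumption (it happens to give the quoted €208 and €291), and employer-side costs on top of salary are ignored.

```python
WORKING_DAYS_PER_YEAR = 240  # assumption: ~52 weeks * 5 days minus holidays and leave


def day_rate(yearly_salary_eur: float, working_days: int = WORKING_DAYS_PER_YEAR) -> float:
    """Gross salary cost of one working day."""
    return yearly_salary_eur / working_days


low = day_rate(50_000)
high = day_rate(70_000)
print(f"one developer day costs roughly €{int(low)} to €{int(high)}")
```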
A junior developer's time for setup and the cost of hardware is practically a one-off expense. It's a few days of work at most.
The alternative you're advocating for (a recurring SaaS fee) is a permanent rent trap. That money is gone forever, with no asset or investment to show for it. Over a few years, you'll have spent tens of thousands of dollars for nothing. The real cost is not what you pay a developer; it's what you lose by never owning your tools.
Not sure where I advocated for that. Could you point it out please?
(what's "medium-size corp" and how did you come up with $100k ?)
[0] A normal sysadmin remains vaguely bemused at their job title and the way it changes every couple years.
Sometimes even the certified cloud engineers can't tell you why an RDS behaves the way it does, nor can they really fix it. Sometimes you really do need a DBA, but that applies equally to on-prem and cloud.
I'm a sysadmin, but have been labelled and sold as: Consultant (sounds expensive), DevOps engineer, Cloud Engineer, Operations Expert and right now a Site Reliability Engineer.... I'm a systems administrator.
RDS has a value. But for many teams the price paid for this value is ridiculously high when compared to other options.
It doesn't need someone who knows how to use the labyrinthine AWS services and console?
These comments sound super absurd to me, because RDS is difficult as hell to set up unless you do it very frequently or already have it in IaC format, since one needs to set up a VPC, subnets, security groups, an internet gateway, etc.
It's not like creating a DynamoDB, Lambda or S3 where a non-technical person can learn it in a few hours.
Sure, one might find some random Terraform file online to do this or vibe-code some CloudFormation, but that's not really a fair comparison.
I also totally understand why some people, with a family to support and a mortgage to pay, can't just walk away from a job at a FAANG or MAMAA type place.
Looking at your comparison, at this point it just seems like a scam.
If your needs go beyond that? Then you need real computers with real configuration and you have OVH/Hetzner/Latitude who will rent you MONSTER machines for the cost of some cheap-ass surplus 2017 Intel on The Cloud.
And if you just want a blog or whatever? Zillion VPS options.
The traditional cloud is for regulatory/process/corruption capture extraction in 2025: its machine economics and developer productivity use case is fucking zero I've seen. Maybe there's some edge case where a completely unencumbered team is better off with DMV trip permissions theatre, remnant Intel racked with noisy neighbors at massive markup, and no support recourse.
(2) What do you do if your large Hetzner server starts to show signs of malfunction? How soon would you be able to replace it, and how easily?
(2a) What do you do when your large Hetzner server just dies? I see that this happens rarely, but what's your contingency plan, if any?
(3) What do you do when your load is highly spiky? Do you reserve bare metal capacity for the biggest peak you expect to serve, because it's so much cheaper than running an elastic serverless architecture of the same capacity anyway?
(4) Considering that your stack still includes many components, how do you manage them, and how expensive is the management overhead? Do you need an extra SRE?
These are not rhetorical questions; I'd love to hear from real practitioners! (E.g. Stack Overflow used to do deep dives into their few-big-servers architecture.)
A key factor underlying all of this is understanding, from a business/organizational perspective, your actual uptime requirements. Google may aim at 5 nines with the budget to achieve it, but many banks have routine planned downtime. If you don't know your objectives, you will have trouble making the tradeoffs necessary to get there. As a hypothetical, would your business choose 99.999% uptime (26 seconds down on average per month) vs 99.99% (4.3 min) if that caused infra costs to rise by 50% or more? If you said we can cut our infra costs by 50% by planning a short weekly maintenance window, how would that resonate?
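The downtime figures in that hypothetical come from simple arithmetic, sketched here. The only assumption is the length of an "average month" (one twelfth of a 365.25-day year); the function name is made up.

```python
SECONDS_PER_MONTH = 365.25 * 24 * 3600 / 12  # average month, about 30.44 days


def downtime_seconds_per_month(availability_pct: float) -> float:
    """Allowed downtime per average month for a given availability target."""
    return SECONDS_PER_MONTH * (1 - availability_pct / 100)


for target in (99.9, 99.99, 99.999):
    seconds = downtime_seconds_per_month(target)
    print(f"{target}% -> {seconds:8.1f} s/month ({seconds / 60:.2f} min)")
```

Each extra nine divides the monthly budget by ten, which is why it helps to settle the target before the architecture.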
Speaking to a few, in my experience:
2) (not at Hetzner specifically, but at a dedicated host). You have backups & recovery plans, and redundancy where it makes sense. You might run your database with a replica. If you are serving Web traffic, maybe you keep a hot spare. Also, you are still allowed to use e.g. cloud services if it makes sense to do so, so you can back up to S3 and use things like SQS or KMS if you don't want to run them yourself. It's worth noting that you may not get advance notice; I recall our service being impacted by a fire at a datacenter that IIRC was caused by a traffic accident on a nearby highway. The point is you have to design resilience into the system. Fortunately, this is well-trod ground.
It would not be a terrible failover option to have something like an autoscale group at AWS ready to step in if the dedicated cluster goes offline. Keep that cluster scaled to 0 until it's needed. Put the cloud behind your cheap dedicated capacity.
3) See above. In my case, we over-provisioned because it's cheap to do so. I did not do this at the time, but I would probably look at running a replicated database with a hot standby on another server.
4) It has not been my experience that "modern" cloud deployments require fewer SRE resources. Like water running downhill, cloud projects seek complexity.
I am nowhere near the level of what you are doing, but my client was paying $100/m for an AWS server, SQS and an S3 bucket, for a small PHP-based web application that uses the Amazon Seller API and the Keepa API for the products he ships. It used MySQL for data storage.
I implemented the whole thing in Python, Django, and PostgreSQL (initially SQLite) and put it on a $25/m unmanaged VPS.
I have not got any complaints about performance, and it's running continuously updating product prices, details, processing PDF invoices using OCR, finding missing products in shipments, while also serving the website, and a 4 core server with 6GB RAM is handling it just fine.
The load is not going to be high enough to require AWS and friends, for now. It's a small internal app that probably won't even get over 100 users, and if it ever does, it's extremely simple to migrate, because the app is so compact, even though it's not exactly monolithic.
And still, it probably won't need a $100 AWS server, unless we are scaling up much larger.
If all you need is "good enough" reliability and basic compute power (which I think is good enough for many businesses, considering AWS isn't exactly outage free either), you're probably better off getting a server or renting one from a cheap cloud host. If you're promising five nines of uptime for some reason, you may want to reconsider.
This is exactly my point. Sorry if I was not clear on my OP.
We are using the Seller API to get different product information. While their API provides the groundwork for communicating with their endpoint, you have to implement your own system on top of it, handle the absurd unreliability of their API's rate limiter, and navigate the spider web of API callbacks to get the information you require.
I do not know how much the original application actually cost.
The app that I was developing was for another purpose, and the reimplementation was added later.
The app replaces an existing commercial app that is being used, which is $200+/m. So it may take 4 or 5 years for the savings to exceed the cost. They have been using the app for 3 years, I think.
And, maybe I am beating my own drum a little, but I believe my implementation works and looks much better than the commercial one or the first implementation.
So, I am really looking forward to this being a success.
There are cheaper ways of building that use case on AWS.
Most AWS sticker shock I’ve seen results from someone who doesn’t really understand cloud trying to build on the cloud. Cost has to be designed in from the start (in addition to security, operational overhead, etc).
In general, I’ve found two types of engineering teams who don’t use the cloud: the mugs and the superstars. And since superstars are few and far between, that means…
I guess those promises about needing fewer expensive people never materialised.
tbh, aside from the really anaemic use-cases where everything actually manages to scale to zero and has very low load: I have genuinely never seen an AWS project (outside of free credits of course) that works out cheaper than what came before.
That's TCO from PNLs, not a "gut feeling". We have a decade of evidence now.
My comment was not that using AWS is bad; it has its uses. My comment was about how in this instance it was simply not needed. And I even speculated about when it might be needed.
Picking the correct tool for the job is what it means to be an engineer, or a person with common sense. With experience, we can get over childish absolutism about a tool or service and look at the broader picture, unless, of course, we are expecting some kind of monetary gain.
For most public cloud providers you have to give them your credit card number so they can charge an arbitrary amount.
For Hetzner, instead of CC#, you give a scan of your ID (of course you can attach your CC too or Paypal). Personally I do my payments via a bank transfer. I recently paid for the whole 2025 and 2026 for all my k8s clusters. It gives unimaginable peace of mind when compared to AWS/GCP/Azure.
Plus, their cloud instances often spin up much faster than EC2.
Data centers all over the country and I get to locate under 10ms from my regional audience.
Just a data point if you want some bigger iron than a VM.
Before that, I used to go for Linode, but I think they've become more pricey?
Too bad, actually, their service was pretty good.
Also in my experience more complex systems tend to have much less reliability/resilience than simple single node systems. Things rarely fail in isolation.
Now, if you actually need to decouple your file storage and make it durable and scalable, or need to dynamically create subdomains, or any number of other things… The effort of learning and integrating different dedicated services at the infrastructure level to run all this seems much more constraining.
I’ve been doing this since before the “Cloud,” and in my view, if you have a project that makes money, cloud costs are a worthwhile investment that will be the last thing that constrains your project. If cloud costs feel too constraining for your project, then perhaps it’s more of a hobby than a business—at least in my experience.
Just thinking about maintaining multiple cluster filesystems and disk arrays—it’s just not what I would want to be doing with most companies’ resources or my time. Maybe it’s like the difference between folks who prefer Arch and setting up Emacs just right, versus those happy with a MacBook. If I felt like changing my kernel scheduler was a constraint, I might recommend Arch; but otherwise, I recommend a MacBook. :)
On the flip side, I’ve also tried to turn a startup idea into a profitable project with no budget, where raw throughput was integral to the idea. In that situation, a dedicated server was absolutely the right choice, saving us thousands of dollars. But the idea did not pan out. If we had gotten more traction, I suspect we would have just vertically scaled for a while. But it’s unusual.
This is because you are looking only at provisioning/deployment. And you are right -- node size does not impact DevOps all that much.
I am looking at the solution space available to the engineers who write the software that ultimately gets deployed on the nodes. And that solution space is different when the nodes have 10x the capability. Yes, cloud providers have tons of aggregate capability. But designing software to run on a fleet of small machines is very different from accomplishing the same tasks on a single large machine.
It would not be controversial to suggest that targeting code at an Apple Watch or Raspberry Pi imposes constraints on developers that do not exist when targeting desktops. I am saying the same dynamic now applies to targeting cloud providers.
This isn't to say there's a single best solution for everything. But there are tradeoffs that are not always apparent. The art is knowing when it makes sense to pay the Cloud Tax, and whether to go 100% Cloud vs some proportion of dedicated.
I’ve never had an issue with moving data.
I think you confuse Hetzner with bare metal. Hetzner has Hetzner Cloud, which is like AWS EC2 but much cheaper. (They also have bare metal servers, which are even cheaper.) With Hetzner Cloud, you can use Terraform, GitHub Actions and whatever else you mentioned.
I think the issue is actually the opposite.
With the cloud, the engineers fail to see the actual cost of their inefficient scaled-out code, because someone else (the CFO) pays the bill; and the answer to any issue, is simply adding more "workers" and more "cloud", since they're basically "free" from the perspective of the employee. (And the more "cloud" something is, like, the serverless, the more "free", completely inverting the economics of making a profit on the service — when the CFO tells you that your AWS bill is too high, you move everything from the EC2 to AWS Lambda, since the salesperson from AWS tells you that serverless is far cheaper, only for the bill to get even higher, for reasons unknown, of course.)
The ones the cloud tax actually constrains are the entrepreneurs and solo-preneurs. If you have to pay $5000/mo to AWS just for the infra, you can only go so long without lots of revenue, and you'd need a whopping $5k/mo+ worth of revenue before breaking even. Yet with a $200/mo server like at OVH or Hetzner, you can afford to let it grow at negligible cost to yourself, and it can basically start being profitable with the first few users.
Don't believe this? Look at the blog entries by the guy who bought Yahoo!'s Delicious, written before they went bankrupt and were up for sale. He was basically pointing out that the services have roughly the same number of users, and require the same engineering resources, yet one is being operated at a loss, whereas the other one makes a profit (guess which one, and guess why).
* https://en.wikipedia.org/wiki/Delicious_(website)
* https://en.wikipedia.org/wiki/Pinboard_(website)
* https://news.ycombinator.com/from?site=blog.pinboard.in
So, literally, the difference between the cloud and renting One Big Server, is making a loss and going out of business, and remaining in business and purchasing your underwater competitor for pennies on the dollar.
However, to the point of microservices as the article mentions, you probably should look at lambda (or fargate, or a mix) unless you can really saturate the capacity of multiple servers.
When we swapped from ECS+EC2 running microservices over to Lambda, our costs dropped sharply. Even serving millions of requests a day, we spend a lot of the time in between idle, especially spread across the services.
Additionally, we have 0 outages now from hardware in the last 5 years. As an engineer, this has made my QoL significantly better.
Probably? It's about 5-10X more expensive than equivalent services from Hetzner.
That said, with a defined workload without a ton of variation or segmentation needs there are lots of ways to deliver a cheaper solution.
What are you getting, and do you need it?
* Centralized logging, log search, log based alerting
* Secrets manager
* Managed kubernetes
* Object store
* Managed load balancers
* Database HA
* Cache solutions
... Can I run all these by myself? Sure. But I'm not in this business. I just want to write software and run that.
And yes, I have needed most of this from day 1 for my startup.
For a personal toy project, or when you reach a certain scale, it may make sense to go the other way.
So while there are areas where you need to introduce distributed systems, this repeated disparaging comment of “toy hobby projects” makes me distrust your judgement heavily. I have replaced many such installations by actually delivering (grand distributed designs often don’t fully deliver), reducing costs, dramatically improving performance, and most importantly reducing complexity by orders of magnitude.
One server means you can handle the equiv of 100+ AWS instances. And if you're into that turf, then having a rack of servers saves even more.
Big corp is pulling back from the cloud for a reason.
It's still useful to have the various services, background jobs, system events, etc. in one indexed place which can also manage retention and alerting. And ideally in a place reachable even if the main service goes down. I've got centralised logging on a small homelab server with a few services on it and it's worth the effort.
> Load balancing? In practice most people for most work don’t use it because of actually outgrowing hardware, but because they have to provision to shared hardware without exclusivity.
Depending on how much you lose in case of downtime, you may want at least 2x of hardware for redundancy and that means some kind of fancy routing (whether it's LB, shared IP, or something else)
> Secrets management? There are no secrets to be constantly distributed to various machines on a network.
Typically businesses grow to more than one service. For example I've got a slack webhook in 3 services in a small company and I want to update it in one place. (+ many other credentials)
> Caching? Distributed systems create latency that doesn’t need to exist at all
This doesn't solve the need for caching results of larger operations. It doesn't matter how much latency you have or not, you still don't want that rarely-changing 1sec long query to run on every request. Caching is rarely only about network latency.
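The point about caching a slow, rarely-changing result holds even on a single box with no network hop at all. A minimal in-process sketch, assuming a fixed time-to-live is acceptable; `TTLCache` and `slow_report` are made-up names, and `slow_report` stands in for the one-second query.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-process cache: recompute a value only after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fresh enough, skip the slow call
        value = compute()          # slow path, e.g. the expensive query
        self._store[key] = (now, value)
        return value


calls: list[float] = []


def slow_report() -> str:
    # Stand-in for the expensive query; records each real execution.
    calls.append(time.monotonic())
    return f"report v{len(calls)}"


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("report", slow_report)
second = cache.get_or_compute("report", slow_report)  # served from the cache
print(first, second, len(calls))
```

This is per-process and not thread-safe; sharing the cache across several app processes is where something like a separate cache service starts to earn its keep.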
That's amazing. I wish I could do the same.
Unfortunately, I cannot run my business on a single server in a cage somewhere for a multitude of reasons. So I use AWS, a couple of colos and SaaS providers to deliver reliable services to my customers. Note I'm not a dogmatic AWS advocate, I seek out the best value -- I can't do what I do in AWS without a lot of capital spend on firewalls and storage appliances, as well as the network infrastructure and people required to make those work.
And as soon as you have two of anything, all the above start mattering.
If none of this is actually an issue for you and your customers, I'd say you are very lucky.
You must be doing truly a lot of growth prior to building. Or perhaps insisting on tiny VMs for your loads?
This happens way too often: early-stage startups build everything on the AWS free tier (t2.micro only!), and then when the time comes they scale everything horizontally.
Hopefully when we can afford to do it, we will.
Do people really use the bare CloudWatch logs as an answer for log search? I find it terrible and pretty much always recommend something like DataDog or Splunk or New Relic.
which in reality is any project under a few hundred thousand users
You can get actual direct-attached SSDs on EC2 (and I'd expect performance to be on par with Hetzner), but those are ephemeral and you lose them when the instance is stopped or terminated.
Thanks for the insight!
The problem that Hetzner and a lot of hardware providing hosts have, is the lack of affordable flexibility.
Hetzner's design is based on a base range of standardized products. These can only be upgraded within a pre-approved range of upgrade options (limited to storage/memory).
Upgrades are often a mixed bag of carefully designed "upgrade paths". As you can expect, upgrades are not cheap. Doubling the storage on a base server often increases the price of your server by 50 to 75%. The typical customizing will cost you dearly.
This is where AWS wins a lot more. Yes, they are expensive as hell, but you often are not stuck with a base config and a limited upgrade path. The ability to scale beyond what Hetzner can offer is there, and you're not forced to overbuy from the start. Transferring between servers is a few buttons and done. With Hetzner, if you did not overspec from the start, you're going to do those fun server migrations.
The ironic part is that buying your own hardware and running it yourself often pays for itself within an 8-12 month period (not counting electricity / internet). And you maintain a lot more flexibility.
* You want to use bifurcation, go for it.
* You want to use consumer 4TB NVMe drives for second-layer read storage (which Hetzner refuses to offer, as they limit those to 2TB and only on a few servers), go for it.
* You want a 10Gbit interlink between your servers, go for it. No need to pay a monthly fee! No need to reserve "future space".
* Oh, you want 25Gbit, go for it (Hetzner = not possible).
* You want 50Gbit ...
* You want to chuck in a few LLM capable GPUs without breaking the bank...
It's ironic that it's 2025 and Hetzner is still limited to a 1Gbit connection on its hardware, when just about any consumer-level hardware has had 2.5Gbit by default for years.
Your own hardware gives you the flexibility of AWS and cost savings beyond Hetzner. Maybe it's just my environment, but I see more and more small to medium companies going back to their own locally run servers. Not even colocation.
The increase in consumer-level fiber, which used to be expensive or not available, has opened the doors for businesses. Most companies do not need insane backbones.
The fact that you can get 10Gbit business fiber for 100 Euro in some EU countries (of course never the north) is insane. I've even seen some folks combining fiber with Starlink & 5G as backup in case their fiber fails/is out.
As long as you fit within a specific use case that is offered by Hetzner, they are cheap. But the moment you step outside that comfort zone, ... This is one of Hetzner's weaknesses and where AWS or self-hosting comes back.
We had a leased server from them, running VMware, and we had Linux virtual machines for our application.
We ran out of RAM. We only had 16 or 32GB at the time. Hey, can we double this? Sure, but our payment would nearly double. How does that make any sense?
If this were a co-located box we owned, I could buy a pair of $125 chips from Crucial (or $250 Dell chips from CDW) and there we go. But we're expected to pay this much more per month?
Their answer was "you can do more with the server so that's what you're paying for"
Storage was a similar situation: we were still on RAID with spinning drives and we wanted to go SSD, not even NVMe. Wasn't going to happen. And if we went to a new server we'd have to get all new IPs and stuff. Ugh.
And 10Gb...that was a pipe dream. Costs were insane.
We ended up having to decide between two things:
1. Move to a co-lo and buy a couple servers, ala StackExchange. This is what I wanted to do.
2. Tweak the current application stack, and re-write the next version to run on AWS.
What did we end up doing? Some half ass solution using the existing server for DB and NGINX proxy, while running the sites on (very slow) Slicehost instances (which Rackspace had recently acquired and roughly integrated into their network). So we still had downtime issues, slow databases, etc.
For storage, Hetzner does offer Volumes, which you can attach to your VM and you can choose exactly how large you want them to be and are charged separately. But your argument about doubling resources and doubling prices still holds for RAM.
The argument was about dedicated hardware. But it still holds for VPS.
Have you seen the price of Cloud Storage? An ARM VPS with 40GB is 4.51 (inc. tax); for 40GB of storage, you're paying 2.10 Euro. So my argument still holds, as you're paying almost 50% more just to go from 40GB to 80GB. And that ratio gets worse if you're renting a higher-end VPS and doubling the storage on it.
Let's be honest, 53.62 Euro for 1TB of SSD storage in 2025 is ridiculous.
Netcup is at 12 Euro/TB for SSD storage (same speed as the VMs, as it's just local storage on the server, not network storage). FYI: an ARM 6-core with 256GB at Netcup is 6.26 Euro.
Hetzner used to be the market leader and pushed others, but you barely see any new products or upgrades from them anymore. I've said it before: if Netcup actually invested in a more modern/scalable VPS solution (instead of their 2010 VPS panels), they would eat a lot of Hetzner's clients.
I have several workloads that just invoke Lambda in parallel. Now I effectively have a 1000 core machine and can blast through large workloads without even thinking about it. I have no VM to maintain or OS image to consider or worry about.
Which highlights the other difference that you failed to mention. Hetzner charges a "one time setup" fee to create that VM. That puts a lot of back pressure on infrastructure decisions and removes any scalability you could otherwise enjoy in the cloud.
If you want to just rent a server then Hetzner is great. If you actually want to run "in the cloud" then Hetzner is a non-starter.
Lambda is a decent choice when you need fast, spiky scaling for a lot of simple self-contained tasks. It is a bad choice for heavy tasks like transcoding long videos, training a model, data analysis, and other compute-heavy tasks.
It's almost exactly the same price as EC2. What you don't get to control is the mix of vCPU and RAM. Lambda ties those two together. For equivalent EC2 instances the cost difference is astronomically small, on the order of pennies per month.
> like transcoding long videos, [...] data analysis, and other compute-heavy tasks
If you aren't breaking these up into multiple smaller independent segments then I would suggest that you're doing this wrong in the first place.
> training a model
You're going to want more than what a basic EC2 instance affords you in this case. The scaling factors and velocity are far less of a factor.
It should be obvious that this is not the best answer for all projects.
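The "break it up into smaller independent segments" suggestion above amounts to a split-then-fan-out pattern. Here is a minimal sketch in which a local thread pool stands in for parallel function invocations; `process_segment` is a hypothetical worker, the one-hour job and ten-minute segments are made up, and real video splitting would also have to respect keyframes, which this ignores.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(total_seconds: float, segment_seconds: float) -> list[tuple[float, float]]:
    """Return (start, end) pairs covering [0, total_seconds) with no gaps or overlap."""
    if total_seconds < 0 or segment_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and segment_seconds > 0")
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def process_segment(segment: tuple[float, float]) -> float:
    # Hypothetical worker: in a serverless setup this would be one invocation.
    start, end = segment
    return end - start  # pretend the result is "seconds processed"


segments = split_into_segments(3600, 600)
with ThreadPoolExecutor(max_workers=8) as pool:
    processed = list(pool.map(process_segment, segments))

print(len(segments), "segments,", sum(processed), "seconds processed")
```

Whether the fan-out target is Lambda or 48 local cores, the splitting logic is the same; only the executor changes.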
https://medium.com/life-at-apollo-division/compare-the-cost-...
Care to elaborate?
If either of these exceed the limitations of the call, which is 6MB or 256kB depending on call type, then you can just use S3. For large distributed task coordination you're going to be doing this anyways.
> deployment .zip sizes
Overlays exist and are powerful.
> max execution time
If your workload depends on long uninterrupted runs of time on single CPUs then you have other problems.
> Plus you'll be locked into AWS.
In the world of serverless your interface to the endpoints and semantics of Lambda are minimal and easily changed.
You're better off using ECS / Fargate for application logic.
Hetzner Cloud, then! In the US, $0.53/hr / $333.59/mo for 48 vCPU/192GB RAM/960GB NVMe. Includes 8 TB/mo traffic, when 8 TB egress would cost $720 on EC2; more traffic is $1.20/TB when the first tier of AWS egress is $90/TB. No setup fee. Not that it's EC2 but there's clearly flexibility there.
More generally, if you want AWS, you want AWS; if you want servers you have options.
Because it's baked into the price. If you run a VPS for a month, you get the listed monthly price. But if you run a VPS for a shorter time, the hourly billing price is a lot more expensive.
The ironic part is that you're better off keeping a VPS active until the end of your monthly period (if you have already crossed 2/3 of it) than you are cancelling early.
I've noticed that few people realize that the hourly price != the monthly price.
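To illustrate the hourly-versus-monthly point with the numbers quoted a few comments up ($0.53/hr, $333.59/mo): the sketch below assumes the bill is hours times the hourly rate, capped at the monthly price. That capping rule is an assumption about how the billing works, not something stated in this thread, and the function names are made up.

```python
def monthly_bill(hours_used: float, hourly_usd: float, monthly_cap_usd: float) -> float:
    """Hourly billing that never exceeds the listed monthly price (assumed rule)."""
    return min(hours_used * hourly_usd, monthly_cap_usd)


def hours_until_cap(hourly_usd: float, monthly_cap_usd: float) -> float:
    """Usage after which extra hours in the same month cost nothing more."""
    return monthly_cap_usd / hourly_usd


HOURLY, MONTHLY = 0.53, 333.59  # figures quoted earlier in the thread

print(f"100 h:  ${monthly_bill(100, HOURLY, MONTHLY):.2f}")
print(f"730 h:  ${monthly_bill(730, HOURLY, MONTHLY):.2f}")
print(f"cap reached after about {hours_until_cap(HOURLY, MONTHLY):.0f} hours")
```

Under that assumption, the pure hourly rate for a full month would come to more than the listed monthly price, which is the hourly != monthly gap described above.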
It's a nice pattern. Just don't make them clones of each other, or they might go BLAM at the same time!
https://news.ycombinator.com/item?id=32049205
https://news.ycombinator.com/item?id=32032235
https://news.ycombinator.com/item?id=32028511 (<-- this is where it got figured out)
---
Edit: both these points are mentioned in the OP.
https://www.servethehome.com/amd-epyc-7002-rome-cpus-hang-af...
HN goes down when we restart the server process, usually as part of updating the code - but only for a few seconds. The message "Restarting the server. Shouldn't take long." displays when that is happening.
There are also, to my exasperation, still moments of brownout during certain traffic spikes or moments of obscure resource contention. But these are at least rarer than they used to be.
Even the ones who do know have been conditioned to tremble with fear at the thought of administrating things like a database or storage. These are people who can code cryptography kernels and network protocols and kernel modules, but the thought of running a K8S cluster or Postgres fills them with terror.
“But what if we have downtime!” That would be a good argument if the cloud didn’t have downtime, but it does. Most of our downtime in previous years has been the cloud, not us.
“What if we have to scale!” If we are big enough to outgrow a 256 core database with terabytes of SSD, we can afford to hire a full time DBA or two and have them babysit a cluster. It’ll still be cheaper.
“What if we lose data?” Ever heard of backups? Streaming backups? Hot spares? Multiple concurrent backup systems? None of this is complex.
“But admin is hard!” So is administrating cloud. I’ve seen the horror of Terraform and Helm and all that shit. Cloud doesn’t make admin easy, just different. It promised simplicity and did not deliver.
… and so on.
So we pay about 1000X what we should pay for hosting.
Every time I look at the numbers I curse myself for letting the camel get its nose under the tent.
If I had it to do over again I’d forbid use of big cloud from day one, no exceptions, no argument, use it and you’re fired. Put it in the articles of incorporation and bylaws.
These days probably the best way of getting these 'cloudy' engineers on board is just to tell them it's Kubernetes and run all of your servers as K3s.
Dev culture is totally fad driven and devs are sheep, so this works.
They could have got the job done by hosting the service on a VPS with a multi-tenant database schema. Instead, they went about learning Kubernetes and drilling deep into the "cloud-native" stack. They spent a year trying to set up the perfect devops pipeline.
Not surprisingly the company went out of business within the next few years.
But the engineers could find new jobs thanks to their acquired k8s experience.
I mean, of the two, the PaaS route certainly burns more money, the exception being the rare shop that is so incompetent they can't even get their own infrastructure configured correctly, like in GP's situation.
There are guaranteed to be more shops that would be better off self-hosting and saving on their current massive cloud bills than rare one-offs that save so much time using cloud services that it takes them from bankruptcy to being functional.
Does it? Vercel is $20/month and Neon starts at $5/month. That obviously goes up as you scale up, but $25/month seems like a fairly cheap place to start to me.
(I don't work for Vercel or Neon, just a happy customer)
And that’s before you factor in 500gb of storage.
Sure you need net/infra admins but the software and hardware these days are pretty management friendly and you'll find you still need (often more expensive "cloud") admins so you're not offsetting much management cost there.
Colocation is plentiful and providers often aggregate and resell bandwidth from their preferred carriers.
At one point we were up to 8 Dell VRTX clusters and a few SANs, with 500+ VMs ranging from huge MS SQL servers to kube clusters. The public cloud bill would have been well into the 6 figures, even with preferred pricing and reserved instances. Our colocation bill was $2400/mo, and that was mostly for power. The one thing that always surprised me was how much faster everything was - every time we had to scale over into the cloud, the public cloud node was noticeably slower, even for identical CPU generations and vCPU counts.
You need to be very keen about server deals, updates, support contracts and licenses - but it's really manageable and interconnecting with the cloud is trivial at this point - you can get a "cloud connect" fiber drop to your preferred cloud provider and connect your colo infra to your vpc.
Once you have an established baseline for your server needs - it's almost always more capital friendly to buy the servers and keep them running for the ~5 reliable years you'll get out of them - usually break even here is 2-3 years vs renting from a provider. If you're running your servers until they fail you'll get 7-10 years out of them, provided the power cost is still worth running them (usually that is also around the 8-10 year mark depending on your power cost).
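To make the buy-vs-rent break-even concrete, here is a rough sketch in Python. All of the numbers (purchase price, monthly colo cost, monthly rental cost) are hypothetical and only picked so the result lands in the 2-3 year range mentioned above; plug in your own quotes.

```python
def break_even_months(purchase_price: float,
                      monthly_colo_cost: float,
                      monthly_rental_cost: float) -> float:
    """Months after which owning the server is cheaper than renting it.

    Owning costs purchase_price up front plus monthly_colo_cost per month
    (power, rack space); renting costs monthly_rental_cost per month.
    """
    monthly_saving = monthly_rental_cost - monthly_colo_cost
    if monthly_saving <= 0:
        # Renting is not more expensive per month, so buying never pays off.
        raise ValueError("buying never breaks even with these inputs")
    return purchase_price / monthly_saving


# Hypothetical example: a $12,000 server, $100/mo to colocate,
# versus $500/mo to rent a comparable machine.
months = break_even_months(12_000, 100, 500)
print(f"break-even after {months:.0f} months")  # break-even after 30 months
```

This ignores interest, tax deductions and resale value, all of which shift the result, so treat it as a first-pass estimate only.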
So there are many reasons you'd buy vs rent - including capital deductions and access to cheap interest rates. You can also get some pretty crazy deals (like 33% of new price) by buying 2-3 year old equipment, then continue to run them for another 4-5 years, which is the lowest cost scenario if you don't need bleeding edge.
Especially for the "one (or a few) big server" scenario in the article, that would seem to me a pretty big factor.
And I can do whatever I want with the hardware. When I bought my servers, they came with disk controllers with non-optional RAID, as almost all of them do. I wanted to run RAIDz2 in FreeBSD/ZFS, so I swapped in non-RAID controllers. They were just a few bucks, but having that ability meant I could choose from a wider range of servers.
These companies all ended up massively increasing their budgets switching to cloud workloads when a simple server in the office was easily enough for their 250 users. Cloud is amazing for some uses and pure marketing BS for others but it seems like a lot of engineers aim for a perfect scalable solution instead of one that is good enough.
A big pain point that I personally don't love is that this non-cloud approach normally means running my own database. It's worth considering a provider who also provides cloud databases.
If you go for an 'active/passive' setup, consider saving even more money by using a cloud VM with auto scaling for the 'passive' part.
In terms of pricing, the deals available on servers these days are amazing: you can get a 4GB RAM VPS with decent CPU and bandwidth for ~$6, or a quad-core bare metal box with 32GB RAM for ~$90. It's worth using sites like serversearcher.com to compare.
Compare that with using your distro's packaged version where you can have version variations, variations in default config or file path locations, etc.
You can abuse git for it if you really want to cut corners.
SQLite uses one reader/writer lock over the whole database. When any thread is writing the database, no other thread is reading it. If one thread is waiting to write, new reads can't begin. Additionally, every read transaction starts by checking if the database has changed since last time, and then re-loading a bunch of caches.
This is suitable for SQLite's intended use case. It's most likely not suitable for a server with 256 hardware threads and a 50Gbps network card. You need proper transaction and concurrency control for heavy workloads.
Additionally, SQLite lacks a bunch of integrity checks, like data types and various kinds of constraints. And things like materialised views, etc.
SQLite is lite. Use it for lite things, not heavy things.
Sqlite (properly configured) will outperform "proper databases" often by an order of magnitude in the context of a single box. You want a single writer for high performance as it lets you batch.
> 256 hardware threads...
Have you tried? I have. Others have too. [1]
> Additionally, SQLite lacks a bunch of integrity checks, like data types and various kinds of constraints. And things like materialised views, etc.
Sqlite has blobs so you can use your own custom encoding which is what you want in a high performance context.
Here's sqlite on a $5 shared VPS that can handle 10000+ checks per second over a billion checkboxes [2]. You're gonna be fine.
- [1] https://use.expensify.com/blog/scaling-sqlite-to-4m-qps-on-a...
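For what the "single writer lets you batch" point can look like in practice, here is a minimal sketch using Python's standard `sqlite3` module. The `events` table, the `STOP` sentinel and the batch size are made up for illustration; this is one way to do it, not how the linked projects do it.

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # hypothetical sentinel telling the writer to shut down


def writer_loop(db_path: str, inbox: queue.Queue, batch_size: int = 500) -> None:
    """The only thread that writes: drain the queue and commit in batches."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT NOT NULL)")
    done = False
    while not done:
        batch = []
        item = inbox.get()  # block until there is work
        while True:
            if item is STOP:
                done = True
                break
            batch.append((item,))
            if len(batch) >= batch_size:
                break
            try:
                item = inbox.get_nowait()  # grab whatever else is queued
            except queue.Empty:
                break
        if batch:
            with conn:  # one transaction (one commit) per batch
                conn.executemany("INSERT INTO events (payload) VALUES (?)", batch)
    conn.close()


def run_demo(db_path: str, n: int = 2000) -> int:
    """Push n events through the single writer and return the row count."""
    inbox: queue.Queue = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(db_path, inbox))
    writer.start()
    for i in range(n):
        inbox.put(f"event-{i}")
    inbox.put(STOP)
    writer.join()
    conn = sqlite3.connect(db_path)
    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return count


count = run_demo(os.path.join(tempfile.mkdtemp(), "events.db"))
```

Request handlers only ever `put()` onto the queue, so there is no write-lock contention, and the cost of each commit is spread over many rows.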
SQLite (actually SQL-ite, like a mineral) may be light, but so are many workloads these days. Even 1000 queries per second is quite doable with SQLite and modest hardware, and I've worked at billion dollar businesses handling fewer queries than that.
SQLite is easily the best scaling DB tech I've used. I've moved all my postgres workloads over to it and the gains have been incredible.
It's not a panacea and not the best in all cases, but it's a very sane default that I recommend everyone start with, and only complicate their stack with an external DB when they start hitting real limits (often never happens).
I moved several projects from sqlite to postgres because sqlite didn't scale enough for any of them.
The out of the box defaults for sqlite are terrible for web apps.
Most if not all of your concerns with SQLite are simply a matter of not using the default configuration. Enable WAL mode, enable strict mode, etc. and it's a lot better.
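As a rough sketch of what "not using the default configuration" can mean, here it is with Python's standard `sqlite3` module. The `checks` table is made up, and the extra pragmas beyond WAL (`synchronous`, `busy_timeout`, `foreign_keys`) are my own assumptions about a reasonable web-app setup, not something stated above. STRICT tables need SQLite 3.37 or newer, so the sketch checks the library version first.

```python
import os
import sqlite3
import tempfile


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a database with settings that suit a web app better than the defaults."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")    # readers no longer block the writer
    conn.execute("PRAGMA synchronous=NORMAL")  # assumption: fewer fsyncs, common with WAL
    conn.execute("PRAGMA busy_timeout=5000")   # assumption: wait up to 5s on a locked db
    conn.execute("PRAGMA foreign_keys=ON")     # foreign keys are not enforced by default
    return conn


def strict_supported() -> bool:
    """STRICT tables were added in SQLite 3.37.0."""
    return sqlite3.sqlite_version_info >= (3, 37, 0)


db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = open_tuned(db_path)

# With STRICT, a column declared INTEGER rejects values that are not integers.
suffix = " STRICT" if strict_supported() else ""
conn.execute(
    "CREATE TABLE checks (id INTEGER PRIMARY KEY, checked INTEGER NOT NULL)" + suffix
)

journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(journal_mode)
```

WAL mode is persistent and stored in the database file, while the other pragmas here are per-connection and need to be set every time you open the database.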
Right now, my plan is to move from a bunch of separate VPSes, to one dedicated server from Hetzner and run a few VMs inside of it with separate public IPs assigned to them alongside some resource limits. You can get them for pretty affordable prices, if you don't need the latest hardware: https://www.hetzner.com/sb/
That way I can limit the blast range if I mess things up inside of a VM, but at the same time benefit from an otherwise pretty simple setup for hosting personal stuff, a CPU with 8 threads and 64 GB of RAM ought to be enough for most stuff I might want to do.
Give me a box, trust me with ssh keys and things are so much easier. Simple is good for the soul and the wallet.
You don't need to buy server hardware(!), the article specifically mentions renting from eg Hetzner.
> The benefits of "just don't think about hardware" are real
Can you expand on this claim, beyond what the article mentioned?
I run a lambda behind a load balancer; hardware dies, it's redundant, it gets replaced. When a database server fails, it doesn't saturate read IO on the SAN while it reprovisions, causing noisy neighbor issues.
I don't deal with any of it, I don't deal with depreciation, I don't deal with data center maintenance.
You don't deal with that either if you rent a dedicated server from a hosting provider. They handle the datacenter and maintenance for you for a flat monthly fee.
But the cloud premium needs reiteration: twenty five times. For the price of the cloud server, you can have twenty-five-way redundancy.
A medium to large size asteroid can cause mass extinction events - this happens sometimes - it's not a theoretical risk.
The risk of the people responsible for managing the platform messing up and losing some of your data is still a risk in the cloud. This thread has even already had the argument "if the cloud provider goes down, it's not your fault" as a cloud benefit. Either cloud is strong and stable and can't break, or cloud breaks often enough that people will just excuse you for it.
Yep, and it's mostly caused by the VC funding model - if your investors are demanding hockey-stick growth, there is no way in hell a startup can justify (or pay for) the resulting Capex.
Whereas a nice, stable business with near-linear growth can afford to price in regular small Capex investments.
Yes, there is.
Honestly, it looks to me that this school of thought is mostly adopted by people that can't do arithmetic or use a calculator. But it does absolutely exist.
That said, no, servers are not nearly expensive enough to move the needle on a company nowadays. The room that often goes around them is, and that's why way more people rent the room than the servers in it.
I ran the IT side of a media company once, and it all worked on a half-empty rack of hardware in a small closet... except for the servers that needed bandwidth. These were colocated. Until we realized that the hoster did not have enough bandwidth, at which point we migrated to two bare metal servers at Hetzner.
The actual space isn't a big deal, but the entire environment has large fixed costs.
In practice, all that except connectivity is relatively easy to have on-site.
Connectivity is highly dependent on the business location, local providers, their business plans and their willingness to go out of their way to serve the clients.
And I am not talking only about bandwidth, but also reserve lines and latency.
Never underestimate the price people are willing to pay to evade responsibility. I estimate this is a multi-billion dollar market.
I for one really miss being able to go see the servers that my code runs on. I thought data centers were really interesting places. But I don't see a lot of effort to decide things based on pure dollar cost analysis at this point. There's a lot of other industry forces besides the microeconomics that predetermine people's hosting choices.
An IBM z17 is effectively one big server too, but provides levels of reliability that are simply not available in most IT environments. It won't outperform the AMD rack, but it will definitely keep up for most practical workloads.
If you sit down and really think honestly about the cost of engineering your systems to an equivalent level of reliability, you may find the cost of the IBM stack to be competitive in a surprising number of cases.
ETA - fixed spelling error
Now, if you can live with the weird environment and your people know how to program what is essentially a distributed system described in terms no one else uses, I guess it's still OK, given the competition is all executing IBM's playbook too.
My understanding is that usually you subdivide into a few LPARs and then reboot the production ones on a schedule to prevent drift and ensure that yes, unplanned IPLs will work.