Use One Big Server (2022)

Posted by antov825 8/31/2025

Use One Big Server (2022) (specbranch.com)

350 points | 323 comments

runako 8/31/2025|

One of the more detrimental aspects of the Cloud Tax is that it constrains the types of solutions engineers even consider.

Picking an arbitrary price point of $200/mo, you can get 4(!) vCPUs and 16GB of RAM at AWS. Architectures are different etc., but this is roughly a mid-spec dev laptop of 5 or so years ago.

At Hetzner, you can rent a machine with 48 cores and 128GB of RAM for the same money. It's hard to overstate how far apart these machines are in raw computational capacity.

There are approaches to problems that make sense with 10x the capacity that don't make sense on the much smaller node. Critically, those approaches can sometimes save engineering time that would otherwise go into building a more complex system to manage around artificial constraints.

Yes, there are other factors like durability etc. that need to be designed for. But going the other way, dedicated boxes can deliver more consistent performance without worries of noisy neighbors.

shrubble 8/31/2025||

It's more than that - it's all the latency that you can remove from the equation with your bare-metal server.

No network latency between nodes, less memory bandwidth latency/contention than there is in VMs, and no latency from a separate caching architecture, because you can just tell e.g. Postgres to use gigs of RAM and then let Linux's disk caching take care of the rest.

matt-p 8/31/2025||

The difference between a fairly expensive ($300) RDS instance + EC2 in the same region vs a $90 dedicated server with an NVMe drive and Postgres in a container is absolutely insane.

bspammer 8/31/2025|||

A fair comparison would include the cost of the DBA who will be responsible for backups, updates, monitoring, security and access control. That’s what RDS is actually competing with.

shrubble 8/31/2025|||

Paying someone $2000 to set that up once should result in the costs being recovered in what, 18 months?

If you’re running Postgres locally you can turn off the TCP/IP part; nothing more to audit there.

SSH based copying of backups to a remote server is simple.

If not accessible via network, you can stay on whatever version of Postgres you want.

I’ve heard these arguments since AWS launched, and all that time I’ve been running Postgres (since 2004 actually) and have never encountered all these phantom issues that are claimed as being expensive or extremely difficult.

sahilagarwal 9/1/2025|||

I guess my non-management / non-business side is showing here, but how can it be that much?? I still remember designing a fairly simple cron job that took database backups when I was a junior developer.

It gets even easier now that you have cheap s3 - just upload the dump to s3 every day and set the s3 deletion policy to whatever is feasible for you.
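A minimal sketch of that daily job, assuming `pg_dump` and the AWS CLI are installed and on the PATH. The database name `appdb`, the bucket `example-backups`, the `/tmp` path, and the helper names are all hypothetical; retention is left to a lifecycle rule on the bucket, as described above.

```python
import datetime
import subprocess


def dump_path(db: str, day: datetime.date) -> str:
    """Local file name for one day's dump, e.g. /tmp/appdb-2025-09-01.dump."""
    return f"/tmp/{db}-{day.isoformat()}.dump"


def backup_commands(db: str, bucket: str, day: datetime.date) -> list[list[str]]:
    """Build the two commands: dump the database, then copy the file to S3."""
    path = dump_path(db, day)
    return [
        # Custom-format dump, which can be restored with pg_restore.
        ["pg_dump", "--format=custom", f"--file={path}", db],
        # Upload with the AWS CLI; expiry is handled by the bucket's lifecycle rule.
        ["aws", "s3", "cp", path, f"s3://{bucket}/{db}/{day.isoformat()}.dump"],
    ]


def run_backup(db: str, bucket: str) -> None:
    """Run both steps in order; check=True raises if either command fails."""
    for cmd in backup_commands(db, bucket, datetime.date.today()):
        subprocess.run(cmd, check=True)
```

Calling `run_backup("appdb", "example-backups")` from a script scheduled once a day by cron would complete the setup. This only covers taking the backup; testing the restore is a separate exercise.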

alemanek 9/1/2025|||

I am not an expert here but I am currently researching for a planned project.

For backups, including Postgres, I was planning on paying Veeam ~$500 a year for a software license to back up the active node and Postgres database to S3/R2. The standby node would be getting streaming updates via logical replication.

There are free options as well but I didn’t want to cheap out on the backups.

It looks pretty turnkey. I am a software engineer, not a sysadmin, though. It's still just theory as well, as I haven’t built it out yet.

nine_k 9/2/2025||||

Taking database backups is relatively simple. What differentiates a good solution is the ease of restoring from a backup. This includes the certainty that the restored state would be a correct point-in-time state from the past, not an amalgamation of several such states.

fragmede 9/1/2025|||

How much were you paid as a jr developer, and how long did it take you to set up? Then round up to mid-level developer, and add in hardware and software costs.

dijit 9/1/2025||

That's a deflection. The question isn't about a developer's salary; it's about the fundamental difference between a one-time investment and a permanent cost.

Either way: 1 day of a mid-level developer in the majority of the world (basically: anywhere except Zurich, NYC or SF) is between €208 and €291. (Yearly salary of €50-€70k)
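The day rates above follow from dividing the yearly salary by the number of working days. A quick check, assuming 240 working days per year (that figure is an assumption, chosen because it reproduces the quoted numbers):

```python
WORKING_DAYS_PER_YEAR = 240  # assumption; not stated in the comment


def day_rate(yearly_salary_eur: float) -> float:
    """Cost of one working day at a given yearly salary."""
    return yearly_salary_eur / WORKING_DAYS_PER_YEAR


for salary in (50_000, 70_000):
    print(f"EUR {salary:,} per year -> EUR {day_rate(salary):.2f} per day")
```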

A junior developer's time for setup and the cost of hardware is practically a one-off expense. It's a few days of work at most.

The alternative you're advocating for (a recurring SaaS fee) is a permanent rent trap. That money is gone forever, with no asset or investment to show for it. Over a few years, you'll have spent tens of thousands of dollars for nothing. The real cost is not what you pay a developer; it's what you lose by never owning your tools.

fragmede 9/2/2025||

> The alternative you're advocating for

Not sure where I advocated for that. Could you point it out please?

applied_heat 9/1/2025|||

$2k? That’s a $100k project for a medium size Corp

christophilus 9/1/2025|||

$200 does seem too low. $100k seems waaay too high. That sounds like an AWS talking point.

sysguest 9/1/2025|||

hmm where did you get the numbers?

(what's "medium-size corp" and how did you come up with $100k ?)

Aeolun 9/1/2025||

I’m assuming he’s talking about the corporate team of DBA’s that will spend weeks discussing the best way to copy a bunch of SQL files to S3

vidarh 9/1/2025||||

I do consulting in this space, and we consistently make more money from people who insist on using cloud services, because their setups tend to need far more work.

benterix 9/1/2025|||

Similar here - but in my case the reason is because of vendor lock-in - they spent years getting into AWS and any thought of getting out seems dreadful.

kiney 9/1/2025|||

same for me

yjftsjthsd-h 8/31/2025||||

As long as you also include the Cloud Certified DevOps Engineer™[0] to set up that RDS instance.

[0] A normal sysadmin remains vaguely bemused at their job title and the way it changes every couple years.

mrweasel 9/1/2025||

It's also interesting that the cloud engineer can apparently be both a DBA, network-, storage- and backup engineer, but if you move the same services on-prem, you apparently need specialists for each task.

Sometimes even the certified cloud engineers can't tell you why an RDS behaves the way it does, nor can they really fix it. Sometimes you really do need a DBA, but that applies equally to on-prem and cloud.

I'm a sysadmin, but have been labelled and sold as: Consultant (sounds expensive), DevOps engineer, Cloud Engineer, Operations Expert and right now a Site Reliability Engineer.... I'm a systems administrator.

Aeolun 9/1/2025|||

If you started working in the industry more than about 15 years ago, all the titles sound quaint.

icedchai 9/1/2025||||

I haven't seen a company that hired DBAs in over 15 years. I think the "DevOps" movement sent them packing, along with SysAdmins.

dijit 9/1/2025||

Sysadmins never left, they just got rebranded.

icedchai 9/2/2025||

I actually agree with this. I meant you never see roles with the "system administrator" job title, not that it actually disappeared as a function. DBAs, on the other hand, I do think have mostly been absorbed into other roles.

data_marsupial 9/1/2025|||

Need to get Platform Engineer for a full house

benterix 9/1/2025||||

You are aware that RDS needs backups, setting up monitoring properly, defining access, providing secrets management etc., and updates between major versions are not automatic?

RDS has a value. But for many teams the price paid for this value is ridiculously high when compared to other options.

pdhborges 9/1/2025||

AWS can make major version upgrades automatically now with less downtime. I think they do the logical replication dance internally.

Cthulhu_ 9/1/2025||||

While that's fair, most organizations I've worked at in the past decade have had a dedicated team for managing their cloud setup, which is also responsible for backups, updates, monitoring, security and access control. I don't think they're competing.

sgarland 9/1/2025||||

You don’t need a DBA for any of those, you need someone who can read some docs. It’s not witchcraft.

Aeolun 9/1/2025||

I’d argue that AWS is witchcraft a lot of the time. They’ll have all these services they claim will work for everything, but you’ll always find that one of the things you’d expect is unavailable.

lelanthran 9/1/2025||||

The RDS solution doesn't need a technical person to set it up?

It doesn't need someone who knows how to use the labyrinthine AWS services and console?

whstl 9/1/2025||

Agree.

These comments sound super absurd to me, because RDS is difficult as hell to set up unless you do it very frequently or already have it in IaC format, since one needs to set up a VPC, subnets, security groups, an internet gateway, etc.

It's not like creating a DynamoDB, Lambda or S3 where a non-technical person can learn it in a few hours.

Sure, one might find some random Terraform file online to do this or vibe-code some CloudFormation, but that's not really a fair comparison.

matt-p 8/31/2025||||

Totally. My frustration isn't even price, though; RDS is literally just dog slow.

steveBK123 9/1/2025|||

My firm paid DBAs for RDS as well so..

zenmac 9/1/2025|||

Yeah, but AWS SREs are the ones making the big bucks! Soooo what can you do? It is nice to see that many people here on HN support open networks and platforms, to the point of making very drastic comments encouraging Google engineers to quit their jobs.

I also totally understand why some people, with a family to support and a mortgage to pay, can't just walk away from a job at a FAANG or MAMAA type place.

Looking at your comparison, at this point it just seems like a scam.

jpgvm 9/1/2025||

Right now the big bucks are in managing massive bare metal GPU clusters.

benterix 9/1/2025|||

Yeah, let's use the opportunity while it lasts.

reactordev 9/1/2025|||

This. Clustering and managing Nvidia at scale is the new hotness demanding half-million dollar salaries.

benreesman 9/1/2025|||

In 2025 if you need convenience and no red tape you've got fly.io in the general case and maybe Vercel or something on a particular framework (there are some good ones for a particular stack).

If your needs go beyond that? Then you need real computers with real configuration and you have OVH/Hetzner/Latitude who will rent you MONSTER machines for the cost of some cheap-ass surplus 2017 Intel on The Cloud.

And if you just want a blog or whatever? Zillion VPS options.

The traditional cloud is for regulatory/process/corruption capture extraction in 2025: its machine economics and developer productivity use case is fucking zero, from what I've seen. Maybe there's some edge case where a completely unencumbered team is better off with DMV trip permissions theatre, remnant Intel racked with noisy neighbors at massive markup, and no support recourse.

nine_k 9/1/2025||

(1) How does fly.io reliability compare to AWS, GCP, or maybe Linode or DO?

(2) What do you do if your large Hetzner server starts to show signs of malfunction? How soon would you be able to replace it, and how easily?

(2a) What do you do when your large Hetzner server just dies? I see that this happens rarely, but what's your contingency plan, if any?

(3) What do you do when your load is highly spiky? Do you reserve bare metal capacity for the biggest peak you expect to serve, because it's so much cheaper than running an elastic serverless architecture of the same capacity anyway?

(4) Considering that your stack still includes many components, how do you manage them, and how expensive is the management overhead? Do you need an extra SRE?

These are not rhetorical questions; I'd love to hear from real practitioners! (E.g. Stack Overflow used to do deep dives into their few-big-servers architecture.)

runako 9/2/2025||

These are great questions.

A key factor underlying all of this is understanding, from a business/organizational perspective, your actual uptime requirements. Google may aim at 5 nines with the budget to achieve it, but many banks have routine planned downtime. If you don't know your objectives, you will have trouble making the tradeoffs necessary to get there. As a hypothetical, would your business choose 99.999% uptime (about 26 seconds down on average per month) vs 99.99% (about 4.4 min) if that caused infra costs to rise by 50% or more? If you said we can cut our infra costs by 50% by planning a short weekly maintenance window, how would that resonate?
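To make the uptime arithmetic concrete, here is a small sketch that converts an uptime target into allowed downtime per month. It assumes an average month of about 30.44 days (365.25 / 12); the function name is just for illustration.

```python
SECONDS_PER_MONTH = 365.25 * 24 * 3600 / 12  # average month, about 30.44 days


def downtime_seconds_per_month(uptime_percent: float) -> float:
    """Allowed downtime in an average month for a given uptime target."""
    return SECONDS_PER_MONTH * (1 - uptime_percent / 100)


for target in (99.9, 99.99, 99.999):
    secs = downtime_seconds_per_month(target)
    print(f"{target}% uptime -> {secs:.0f} s (about {secs / 60:.1f} min) per month")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the cost question above matters so much.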

Speaking to a few, in my experience:

2) (Not at Hetzner specifically, but at a dedicated host.) You have backups & recovery plans, and redundancy where it makes sense. You might run your database with a replica. If you are serving Web traffic, maybe you keep a hot spare. Also, you are still allowed to use cloud services where it makes sense, so you can back up to S3 and use things like SQS or KMS if you don't want to run them yourself. It's worth noting that you may not get advance notice; I recall our service being impacted by a fire at a datacenter that IIRC was caused by a traffic accident on a nearby highway. The point is you have to design resilience into the system. Fortunately, this is well-trod ground.

It would not be a terrible failover option to have something like an autoscale group at AWS ready to step in if the dedicated cluster goes offline. Keep that cluster scaled to 0 until it's needed. Put the cloud behind your cheap dedicated capacity.

3) See above. In my case, we over-provisioned because it's cheap to do so. I did not do this at the time, but I would probably look at running a replicated database with a hot standby on another server.

4) It has not been my experience that "modern" cloud deployments require fewer SRE resources. Like water running downhill, cloud projects seek complexity.

t_mahmood 9/1/2025|||

I don't get why people are so hell-bent on going to AWS, for the most minor applications, without looking at simpler options!

I am nowhere near the level of what you are doing, but my client was paying $100/m for an AWS server, SQS, and an S3 bucket for a small PHP-based web application that uses the Amazon Seller API and the Keepa API for the products he ships. It used MySQL for data storage.

I implemented the whole thing in Python, Django, and PostgreSQL (initially SQLite) and put it on a $25/m unmanaged VPS.

I have not got any complaints about performance, and it's running continuously updating product prices, details, processing PDF invoices using OCR, finding missing products in shipments, while also serving the website, and a 4 core server with 6GB RAM is handling it just fine.

The load is not going to be high enough to require AWS and friends, for now. It's a small internal app that probably won't even get over 100 users, and if it ever does, it's extremely simple to migrate, because the app is so compact, even though it's not exactly monolithic.

And still, it probably won't need a $100 AWS server, unless we are scaling up much larger.

jeroenhd 9/1/2025|||

AWS is useful for big business. Automatic multi region failover and hosted databases may be expensive, but they're a massive pain to manually configure and an easy footgun if you're not used to doing that sort of thing. Plus, with Amazon you already have public toolkits to use those features with all of your services, so you don't need to figure how to integrate/what open source system to use to accomplish all of that. Plus, if you go for your own physical server, you need to arrange parts and maintenance windows for any hardware that will eventually fail.

If all you need is "good enough" reliability and basic compute power (which I think is good enough for many businesses, considering AWS isn't exactly outage free either), you're probably better off getting a server or renting one from a cheap cloud host. If you're promising five nines of uptime for some reason, you may want to reconsider.

t_mahmood 9/1/2025||

> If all you need is "good enough" reliability and basic compute power (which I think is good enough for many businesses, considering AWS isn't exactly outage free either), you're probably better off getting a server or renting one from a cheap cloud host.

This is exactly my point. Sorry if I was not clear in my OP.

We are using the Seller API to get different product information. While their API provides the groundwork for communicating with their endpoint, you'll have to implement your own system to use it, handle the absurd unreliability of their API's rate limiter, and navigate the spider web of API callbacks to get the information that you require.

choeger 9/1/2025||||

How much did that reimplementing cost and when will the savings exceed that cost?

t_mahmood 9/1/2025||

This cost around $10k, which also includes work that is outside the reimplementation.

I do not know the actual cost of the original application.

The app, that I was developing, was for another purpose, and the reimplementation was later added.

The app replaces an existing commercial app that is being used, which is $200+/m. So, maybe 4 or 5 years for the savings to exceed the cost. They have been using the app for 3 years, I think.
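A rough break-even check for those numbers, treating the whole $10k as the one-time cost and $200/m as the only saving. Both are simplifications: the $10k includes other work, and the hosting savings are ignored.

```python
def break_even_months(one_time_cost: float, monthly_saving: float) -> float:
    """Months until cumulative savings cover a one-time cost."""
    if monthly_saving <= 0:
        raise ValueError("monthly_saving must be positive")
    return one_time_cost / monthly_saving


months = break_even_months(10_000, 200)
print(f"{months:.0f} months, or about {months / 12:.1f} years")
```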

And, maybe I am beating my own drum a little, but I believe my implementation works and looks much better than the commercial app or the first implementation.

So, I am really looking forward to this succeeding.

Esophagus4 9/1/2025||||

Without understanding the architecture and use case better, at first read, my gut says that isn’t an AWS problem - it sounds like a solutions architecture problem.

There are cheaper ways of building that use case on AWS.

Most AWS sticker shock I’ve seen results from someone who doesn’t really understand cloud trying to build on the cloud. Cost has to be designed in from the start (in addition to security, operational overhead, etc).

In general, I’ve found two types of engineering teams who don’t use the cloud: the mugs and the superstars. And since superstars are few and far between, that means…

dijit 9/1/2025|||

Sounds like we need a specialist.

I guess those promises about needing fewer expensive people never materialised.

tbh, aside from the really anaemic use-cases where everything actually manages to scale to zero and has very low load: I have genuinely never seen an AWS project (outside of free credits of course) that works out cheaper than what came before.

That's TCO from PNLs, not a "gut feeling". We have a decade of evidence now.

t_mahmood 9/1/2025|||

... you failed at reading comprehension?

My comment was not that using AWS is bad; it has its uses. My comment was about how, in this instance, it was simply not needed. And I even speculated about when it might be needed.

Picking the correct tool for the job is what it means to be an engineer, or a person with common sense. With experience, we can get over childish absolutes about a tool or service and look at the broader aspects, unless, of course, we are expecting some kind of monetary gain.

3shv 9/1/2025||||

What are some cheaper and better hosting providers that you can recommend?

benterix 9/1/2025|||

Hetzner.

For most public cloud providers you have to give them your credit card number so they can charge an arbitrary amount.

For Hetzner, instead of a CC#, you give a scan of your ID (of course you can attach your CC or PayPal too). Personally I do my payments via bank transfer. I recently paid for the whole of 2025 and 2026 for all my k8s clusters. It gives unimaginable peace of mind when compared to AWS/GCP/Azure.

Plus, their cloud instances often spin up much faster than EC2.

drewnick 9/1/2025||||

For bare metal I’ve been using tier.net to get 192 GB RAM, 4TB NVME and 32 cores for $219/mo.

Data centers all over the country and I get to locate under 10ms from my regional audience.

Just a data point if you want some bigger iron than a VM.

t_mahmood 9/1/2025||||

I have used Knownhost previously, it served me really well.

Before that, I used to go for Linode, but I think they've become more pricey?

LamaOfRuin 9/1/2025||

Linode was bought by Akamai. They immediately raised prices, and they have been, if anything, less reliable.

t_mahmood 9/1/2025||

Ahh, yes, I remember now! I think it's almost 8 years now? Stopped using them after the buy out.

Too bad, actually, their service was pretty good.

ferngodfather 9/1/2025|||

Hetzner! They do ask for ID though.

mr_toad 9/1/2025|||

Saving $75 a month at what cost in labour?

andersmurphy 9/1/2025||

You actually save on labour. A VPS is a lot less work than anything involving AWS console.

andersmurphy 8/31/2025|||

100% this. Add an embedded database like SQLite, optimise writes by batching them, and you can go really, really far with Hetzner. It's also why I find the "what about overprovisioning" argument silly (once you look outside of AWS you can get an insane cost/perf ratio).

Also in my experience more complex systems tend to have much less reliability/resilience than simple single node systems. Things rarely fail in isolation.
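A minimal sketch of the write batching mentioned above, using Python's built-in `sqlite3` module. The `prices` table and the `write_batch` helper are hypothetical; the point is that many rows share one transaction, so there is one commit per batch instead of one per row.

```python
import sqlite3


def write_batch(conn: sqlite3.Connection, rows: list[tuple[str, float]]) -> None:
    """Insert many rows inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO prices (sku, price) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")  # a real app would use a file path
conn.execute("CREATE TABLE prices (sku TEXT, price REAL)")

write_batch(conn, [(f"sku-{i}", i * 1.5) for i in range(1000)])
count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(count)
```

In practice the application would collect incoming writes for a short interval and flush them together; how long to wait is a tradeoff between latency and throughput.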

Demiurge 8/31/2025|||

I think it’s the other way around. I’m a huge fan of Hetzner for small sites with a few users. However, for bigger projects, the cloud seems to offer a complete lack of constraints. For projects that can pay for my time, $200/m or $2000/m in hosting costs is a negligible difference. What’s the development cost difference between AWS CDK / Terraform + GitHub Actions vs. Docker / K8s / Ansible + any CI pipeline? I don’t know; in my experience, I don’t see how “bare metal” saves much engineering time. I also don’t see anything complicated about using an IaC Fargate + RDS template.

Now, if you actually need to decouple your file storage and make it durable and scalable, or need to dynamically create subdomains, or any number of other things… The effort of learning and integrating different dedicated services at the infrastructure level to run all this seems much more constraining.

I’ve been doing this since before the “Cloud,” and in my view, if you have a project that makes money, cloud costs are a worthwhile investment that will be the last thing that constrains your project. If cloud costs feel too constraining for your project, then perhaps it’s more of a hobby than a business—at least in my experience.

Just thinking about maintaining multiple cluster filesystems and disk arrays—it’s just not what I would want to be doing with most companies’ resources or my time. Maybe it’s like the difference between folks who prefer Arch and setting up Emacs just right, versus those happy with a MacBook. If I felt like changing my kernel scheduler was a constraint, I might recommend Arch; but otherwise, I recommend a MacBook. :)

On the flip side, I’ve also tried to turn a startup idea into a profitable project with no budget, where raw throughput was integral to the idea. In that situation, a dedicated server was absolutely the right choice, saving us thousands of dollars. But the idea did not pan out. If we had gotten more traction, I suspect we would have just vertically scaled for a while. But it’s unusual.

runako 8/31/2025|||

> I really don't see how "bare metal" saves any engineering time

This is because you are looking only at provisioning/deployment. And you are right -- node size does not impact DevOps all that much.

I am looking at the solution space available to the engineers who write the software that ultimately gets deployed on the nodes. And that solution space is different when the nodes have 10x the capability. Yes, cloud providers have tons of aggregate capability. But designing software to run on a fleet of small machines is very different from accomplishing the same tasks on a single large machine.

It would not be controversial to suggest that targeting code at an Apple Watch or Raspberry Pi imposes constraints on developers that do not exist when targeting desktops. I am saying the same dynamic now applies to targeting cloud providers.

This isn't to say there's a single best solution for everything. But there are tradeoffs that are not always apparent. The art is knowing when it makes sense to pay the Cloud Tax, and whether to go 100% Cloud vs some proportion of dedicated.

Demiurge 8/31/2025|||

Overall, I agree that most people underestimate the runway that the modern dedicated server can give you.

sevensor 9/1/2025|||

I’ve seen multiple projects founder on the complexity of writing software for the cloud. Moving data from here to there ends up being way harder than anybody expected. Maybe teams with more experience build this into their planning, but from what I’ve seen, if you’re using the cloud, your solution ends up being 95% about getting data where it’s supposed to be and 5% application logic.

Esophagus4 9/1/2025||

This sounds like a people problem, not a technology problem.

I’ve never had an issue with moving data.

benterix 9/1/2025|||

> I’m a huge fan of Hetzner ... I don’t see how “bare metal” saves much engineering time.

I think you confuse Hetzner with bare metal. Hetzner has Hetzner Cloud, which is like AWS EC2 but much cheaper. (They also have bare metal servers, which are even cheaper.) With Hetzner Cloud, you can use Terraform, GitHub Actions and whatever else you mentioned.

Demiurge 9/1/2025||

Yeah, I do confuse it, because I've been using Hetzner long before they had "cloud".

cnst 9/1/2025|||

> types of solutions engineers even consider

I think the issue is actually the opposite.

With the cloud, the engineers fail to see the actual cost of their inefficient scaled-out code, because someone else (the CFO) pays the bill; and the answer to any issue, is simply adding more "workers" and more "cloud", since they're basically "free" from the perspective of the employee. (And the more "cloud" something is, like, the serverless, the more "free", completely inverting the economics of making a profit on the service — when the CFO tells you that your AWS bill is too high, you move everything from the EC2 to AWS Lambda, since the salesperson from AWS tells you that serverless is far cheaper, only for the bill to get even higher, for reasons unknown, of course.)

The ones the cloud tax actually constrains are the entrepreneurs and solo-preneurs. If you have to pay $5000/mo to AWS just for the infra, you can only go so long without lots of revenue, and you'd need a whopping $5k/mo+ of revenue just to break even. Yet with a $200/mo server like those at OVH or Hetzner, you can afford to let it grow at negligible cost to yourself, and it can basically start being profitable with the first few users.

Don't believe this? Look at the blog entries by the guy who bought Yahoo!'s Delicious, written before they went bankrupt and were up for sale. He was basically pointing out that the services have roughly the same number of users, and require the same engineering resources, yet one is being operated at a loss, whereas the other one makes a profit (guess which one, and guess why).

* https://en.wikipedia.org/wiki/Delicious_(website)

* https://en.wikipedia.org/wiki/Pinboard_(website)

* https://news.ycombinator.com/from?site=blog.pinboard.in

So, literally, the difference between the cloud and renting One Big Server, is making a loss and going out of business, and remaining in business and purchasing your underwater competitor for pennies on the dollar.

ldoughty 9/1/2025|||

I agree that AWS EC2 is probably too expensive on the whole. It also doesn't really provide any of the greater benefits of the cloud that come from "someone else's server".

However, to the point of microservices as the article mentions, you probably should look at lambda (or fargate, or a mix) unless you can really saturate the capacity of multiple servers.

When we swapped from ECS+EC2 running microservices over to Lambda, our costs dropped sharply. Even serving millions of requests a day, we spend a lot of time idle in between, especially spread across the services.

Additionally, we have had 0 hardware outages in the last 5 years. As an engineer, this has made my QoL significantly better.

jgalt212 9/1/2025||

> I agree that AWS EC2 is probably too expensive on the whole.

Probably? It's about 5-10X more expensive than equivalent services from Hetzner.

Spooky23 9/1/2025|||

It really depends on what you are doing. But when you factor in the network features, the ability to scale the solution, etc., you get a lot of stuff inside that $200/mo EC2 device. The product is more than the VM.

That said, with a defined workload without a ton of variation or segmentation needs there are lots of ways to deliver a cheaper solution.

troupo 9/1/2025||

> you get a lot of stuff inside that $200/mo EC2 device. The product is more than the VM.

What are you getting, and do you need it?

throwaway7783 9/1/2025||

Probably not for $200/mo EC2, but AWS/GCP in general

* Centralized logging, log search, log based alerting

* Secrets manager

* Managed kubernetes

* Object store

* Managed load balancers

* Database HA

* Cache solutions

... Can I run all these by myself? Sure. But I'm not in this business. I just want to write software and run that.

And yes, I have needed most of this from day 1 for my startup.

For a personal toy project, or when you reach a certain scale, it may make sense to go the other way.

eska 9/1/2025|||

Now imagine your solution is not on a distributed system and go through that list. Centralized logging? There is nothing to centralize. Secrets management? There are no secrets to be constantly distributed to various machines on a network. Load balancing? In practice most people for most work don’t use it because of actually outgrowing hardware, but because they have to provision to shared hardware without exclusivity. Caching? Distributed systems create latency that doesn’t need to exist at all, reliability issues that have to be avoided, thundering herd issues that you would otherwise not have, etc.

So while there are areas where you need to introduce distributed systems, this repeated disparaging comment of “toy hobby projects” makes me distrust your judgement heavily. I have replaced many such installations by actually delivering (grand distributed designs often don’t fully deliver), reducing costs, dramatically improving performance, and most importantly reducing complexity by orders of magnitude.

bbarnett 9/1/2025|||

Not to mention scaling. Most clients I know never, ever have scaled once. Ever. Or if they do, it's to save money.

One server means you can handle the equiv of 100+ AWS instances. And if you're into that turf, then having a rack of servers saves even more.

Big corp is pulling back from the cloud for a reason.

throwaway7783 9/2/2025||

I mentioned this in an earlier comment. It is dumb to be on the cloud at a large enough scale.

viraptor 9/1/2025||||

> Centralized logging? There is nothing to centralize.

It's still useful to have the various services, background jobs, system events, etc. in one indexed place which can also manage retention and alerting. And ideally in a place reachable even if the main service goes down. I've got centralised logging on a small homelab server with a few services on it and it's worth the effort.

> Load balancing? In practice most people for most work don’t use it because of actually outgrowing hardware, but because they have to provision to shared hardware without exclusivity.

Depending on how much you lose in case of downtime, you may want at least 2x of hardware for redundancy and that means some kind of fancy routing (whether it's LB, shared IP, or something else)

> Secrets management? There are no secrets to be constantly distributed to various machines on a network.

Typically businesses grow to more than one service. For example I've got a slack webhook in 3 services in a small company and I want to update it in one place. (+ many other credentials)

> Caching? Distributed systems create latency that doesn’t need to exist at all

This doesn't solve the need for caching results of larger operations. It doesn't matter how much latency you have or not, you still don't want that rarely-changing 1sec long query to run on every request. Caching is rarely only about network latency.
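As an illustration of that last point, here is a minimal in-process cache with a time-to-live, which needs no separate caching service. The class name, the `slow_query` stand-in, and the 60-second TTL are assumptions for the sketch.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Keep computed values in memory and recompute them after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the expensive call
        value = compute()
        self._store[key] = (now, value)
        return value


call_log: list[str] = []


def slow_query() -> list[str]:
    """Stand-in for the rarely-changing, one-second query."""
    call_log.append("ran")
    return ["row1", "row2"]


cache = TTLCache(ttl_seconds=60)
first = cache.get("report", slow_query)
second = cache.get("report", slow_query)  # served from memory
print(len(call_log))
```

On a single server this dictionary lives in the application process; once there are several processes or machines, a shared cache becomes the natural next step, which is the tradeoff being debated here.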

Spooky23 9/1/2025||||

It sounds like you make a living doing stuff that has an incredibly small, ninja-like team, has a very low change rate, or is something that nobody really cares about. Things like RPO/RTO, multi-tenancy, logging, etc don't matter.

That's amazing. I wish I could do the same.

Unfortunately, I cannot run my business on a single server in a cage somewhere for a multitude of reasons. So I use AWS, a couple of colos and SaaS providers to deliver reliable services to my customers. Note I'm not a dogmatic AWS advocate, I seek out the best value -- I can't do what I do in AWS without a lot of capital spend on firewalls and storage appliances, as well as the network infrastructure and people required to make those work.

throwaway7783 9/2/2025||

Exactly. I don't quite understand how people say you just need a box. It is certainly much more performant than a cloud VM, but that is not the only thing there is to running software well. It all adds up bit by bit. It surely seems to be the way to go at some scale (or with no customers who care).

throwaway7783 9/2/2025|||

Maybe I'm dumb. I am not even talking about distributed systems here. I'm talking about basic high availability configuration. Two web servers, two (or three) db server instances for HA. I have had paying enterprise customers from day 1, and I don't want them screaming at me for systems going down.

And as soon as you have two of anything, all the above start mattering.

If none of this is actually an issue for you and your customers, I'd say you are very lucky.

doganugurlu 9/1/2025||||

You need database HA and load balancers on day 1?

You must be doing truly a lot of growth prior to building. Or perhaps insisting on tiny VMs for your loads?

swiftcoder 9/1/2025|||

> Or perhaps insisting on tiny VMs for your loads?

This happens way too often. Early-stage startups that build everything on the AWS free tier (t2.micro only!), and then when the time comes they scale everything horizontally

throwaway7783 9/2/2025||

I'll repeat what I said above. It's for availability (aka I don't want my customers screaming at me if the machine goes down). And no, scaling out was not our first solution, scaling up was. I have considered going bare metal so many times, but the number of things we need to build/manage by ourselves to function is too much right now.

Hopefully when we can afford to do it, we will.

throwaway7783 9/2/2025|||

HA is for availability. I don't want downtime for my enterprise customers. Are your customers okay with downtime? And as soon as you have more than one node, you need some kind of a load balancer in the front.

rcxdude 9/3/2025||

In practice, until you're at a certain scale, software bugs are more of a threat to your availability than hardware failures or maintenance downtime, and the cloud does nothing for you there (in fact, the additional complexity is likely to make it worse). Modern hardware is pretty reliable, more so than a given ec2 instance, for example.

throwaway7783 9/4/2025||

Often, software bugs cause issues with machines (uncontrolled logging overwhelming disk space and killing everything instead of an object store absorbing it, or a memory bug that kills the process over a few days and needs manual or custom monitoring scripts instead of k8s handling this until it's root-caused, and so on).

runako 9/1/2025||||

> Centralized logging, log search, log based alerting

Do people really use the bare CloudWatch logs as an answer for log search? I find it terrible and pretty much always recommend something like DataDog or Splunk or New Relic.

throwaway7783 9/2/2025||

We are on GCP. Logs Explorer on GCP is pretty good.

troupo 9/1/2025|||

> For a personal toy project,

which in reality is any project under a few hundred thousand users

throwaway7783 9/2/2025||

.. who are okay with services going down here and there

troupo 9/2/2025||

Which is the vast majority of services.

throwaway7783 9/2/2025||

Okay. If the premise is "you don't have to worry about downtimes and only need to serve a few hundred thousand users and no data intensive use cases", then I guess you can do whatever and it'll still be okay.

cedws 9/1/2025|||

I don’t disagree but “cores” is not a good measure of computational power.

christophilus 9/1/2025||

True, but the cores on a dedicated Hetzner box obliterate the cores on an EC2 machine every time I’ve tested them. So, if anything, it understates the massive performance gap.

andersmurphy 9/1/2025||

Hetzner also tends to have more modern SSDs with the latest nvme. Which can make a massive difference for your DB.

Nextgrid 9/1/2025||

It's less about the modernity of SSDs and more about a fundamental difference: all persistent storage on AWS is actually networked - it's exposed to you as NVME but it's actually on a SAN and all IO requests go over the network.

You can get actual direct-attached SSDs on EC2 (and I'd expect performance to be on-par with Hetzner), but those are ephemeral and you lose them on reboot.

andersmurphy 9/2/2025||

Wow, that's crazy, I was wondering why the numbers I was seeing on AWS were so much worse. I assumed it was the drive modernity. But network makes a lot more sense.

Thanks for the insight!

benjiro 9/1/2025|||

> At Hetzner, you can rent a machine with 48 cores and 128GB of RAM for the same money.

The problem that Hetzner and a lot of hardware providing hosts have, is the lack of affordable flexibility.

Hetzner's design is based on a base range of standardized products. These can only be upgraded within a pre-approved range of upgrade options (limited to storage/memory).

Upgrades are often a mixed bag of carefully designed "upgrade paths". As you can expect, upgrades are not cheap. Doubling the storage on a base server, often increases the price of your server by 50 to 75%. The typical customizing will cost you dearly.

This is where AWS wins a lot more. Yes, they are expensive as hell, but you often are not stuck to a base config and a limited upgrade path. The ability to scale beyond what Hetzner can offer is there, and you're not forced to overbuy from the start. Transferring between servers is a few buttons and done. With Hetzner, if you did not overspec from the start, you're going to do those fun server migrations.

The ironic part is that buying your own hardware and running it yourself often pays for itself within an 8-12 month period (not counting electricity / internet). And you maintain a lot more flexibility.

* You want to use bifurcation, go for it.

* You want to use consumer 4TB NVMe drives for second-layer read storage (which Hetzner refuses to offer, as they limit those to 2TB and only on a few servers), go for it.

* You want a 10Gbit interlink between your servers, go for it. No need to pay a monthly fee! No need to reserve "future space".

* Oh, you want 25Gbit, go for it (Hetzner = not possible).

* You want 50Gbit ...

* You want to chuck in a few LLM capable GPUs without breaking the bank...

It's ironic that we are in 2025 and Hetzner is still limited to a 1Gbit connection on its hardware, when just about any consumer-level hardware has had 2.5Gbit by default for years.

Your own hardware gives you the flexibility of AWS and cost savings beyond Hetzner. Maybe it's just my environment, but I see more and more small to medium companies going back to their own locally run servers. Not even colocation.

The increase in consumer-level fiber, which used to be expensive or not available, has opened the doors for businesses. Most companies do not need insane backbones.

The fact that you can get business fiber 10Gbit for a 100 Euro price in some EU countries (of course never the north), is insane. I've even seen some folks combining fiber with Starlink & 5G as backup in case their fiber fails/is out.

As long as you fit within a specific use case that is being offered by Hetzner, they are cheap. But it's the moment you step outside that comfort zone, ... This is one of Hetzner's weaknesses and where AWS or self-hosting comes back.

bluedino 9/1/2025|||

Almost reminds me of Rackspace back in...2011

We had a leased server from them, running VMware, and we had Linux virtual machines for our application.

We ran out of RAM. We only had 16 or 32GB at the time. Hey, can we double this? Sure, but our payment would nearly double. How does that make any sense?

If this were a co-located box we owned, I could buy a pair of $125 chips from Crucial (or $250 Dell chips from CDW) and there we go. But we're expected to pay this much more per month?

Their answer was "you can do more with the server so that's what you're paying for"

Storage was a similar situation, we were still on RAID with spinning drives and we wanted to go SSD, not even NVME. Wasn't going to happen. And if we went to a new server we'd have to get all new IP's and stuff. Ugh.

And 10Gb...that was a pipe dream. Costs were insane.

We ended up having to decide between two things:

1. Move to a co-lo and buy a couple servers, ala StackExchange. This is what I wanted to do.

2. Tweak the current application stack, and re-write the next version to run on AWS.

What did we end up doing? Some half ass solution using the existing server for DB and NGINX proxy, while running the sites on (very slow) Slicehost instances (which Rackspace had recently acquired and roughly integrated into their network). So we still had downtime issues, slow databases, etc.

radiator 9/1/2025|||

> Doubling the storage on a base server, often increases the price of your server by 50 to 75%

For storage, Hetzner does offer Volumes, which you can attach to your VM and you can choose exactly how large you want them to be and are charged separately. But your argument about doubling resources and doubling prices still holds for RAM.

Nextgrid 9/1/2025|||

FYI he's talking about dedicated servers (or "root servers" as they call them).

benjiro 9/1/2025|||

> For storage, Hetzner does offer Volumes, which you can attach to your VM

The argument was about dedicated hardware. But it still holds for VPS.

Have you seen the price of Cloud Storage? An ARM VPS with 40GB is 4.51 (inc tax); for 40GB of extra storage, you're paying 2.10 Euro. So my argument still holds, as you're paying almost 50% more just to go from 40GB to 80GB. And that ratio gets worse if you're renting a higher-end VPS and double your storage on it.
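
The arithmetic behind that "almost 50%" can be checked directly. This is only a sketch using the two prices quoted above; the variable names are made up for illustration:

```python
# Figures quoted in the comment above (EUR per month, incl. tax).
vps_price = 4.51      # ARM VPS that ships with 40 GB of disk
volume_price = 2.10   # an extra 40 GB of volume storage
volume_size_gb = 40

increase = volume_price / vps_price      # fraction added to the monthly bill
per_gb = volume_price / volume_size_gb   # EUR per GB of volume storage

print(f"doubling the disk adds {increase:.1%} to the bill")
print(f"volume storage costs {per_gb:.4f} EUR per GB")
```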

Lets be honest, 53.62 Euro for 1TB of SSD storage in 2025, is ridiculous.

Netcup is at 12 Euro/TB for SSD storage (same speed as the VMs, as it's just local storage on the server, not network storage). FYI: an ARM 6-core with 256GB at Netcup is 6.26 Euro.

Hetzner used to be the market leader and pushed others, but you barely see any new products or upgrades from them anymore. I said it before: if Netcup actually invested in a more modern/scalable VPS solution (instead of their 2010 VPS panels), they would eat a lot of Hetzner's clients.

themafia 8/31/2025||

On AWS if you want raw computational capacity you use Lambda and not EC2. EC2 is for legacy type workloads and doesn't have nearly the same scaling power and speed that Lambda does.

I have several workloads that just invoke Lambda in parallel. Now I effectively have a 1000 core machine and can blast through large workloads without even thinking about it. I have no VM to maintain or OS image to consider or worry about.

Which highlights the other difference that you failed to mention. Hetzner charges a "one time setup" fee to create that VM. That puts a lot of back pressure on infrastructure decisions and removes any scalability you could otherwise enjoy in the cloud.

If you want to just rent a server then Hetzner is great. If you actually want to run "in the cloud" then Hetzner is a non-starter.

solid_fuel 8/31/2025|||

Strong disagree here. Lambda is significantly more expensive per vCPU hour and introduces tight restrictions on your workflow and architecture, one of the most significant being maximum runtime duration.

Lambda is a decent choice when you need fast, spiky scaling for a lot of simple self-contained tasks. It is a bad choice for heavy tasks like transcoding long videos, training a model, data analysis, and other compute-heavy tasks.

themafia 9/1/2025||

> significantly more expensive per vCPU hour

It's almost exactly the same price as EC2. What you don't get to control is the mix of vCPU and RAM. Lambda ties those two together. For equivalent EC2 instances the cost difference is astronomically small, on the order of pennies per month.

> like transcoding long videos, [...] data analysis, and other compute-heavy tasks

If you aren't breaking these up into multiple smaller independent segments then I would suggest that you're doing this wrong in the first place.
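
A rough sketch of what "smaller independent segments" could look like, assuming the work really does split cleanly (whether a given workload does is a separate question). `process_segment` is a hypothetical placeholder; in a Lambda setup it might be one function invocation per segment, and here a local thread pool plays that role:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_into_segments(total: int, segment_size: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges that together cover [0, total)."""
    return [(start, min(start + segment_size, total))
            for start in range(0, total, segment_size)]


def process_segment(segment: Tuple[int, int]) -> int:
    """Hypothetical worker: placeholder for transcoding or analysis work."""
    start, end = segment
    return sum(range(start, end))


def run_in_parallel(total: int, segment_size: int, workers: int = 8) -> int:
    segments = split_into_segments(total, segment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in segment order, whatever order they finish in.
        partials = list(pool.map(process_segment, segments))
    return sum(partials)  # combine the independent partial results


result = run_in_parallel(total=1_000, segment_size=128)
```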

> training a model

You're going to want more than what a basic EC2 instance affords you in this case. The scaling factors and velocity are far less of a factor.

runako 9/1/2025|||

This is a great example of what I meant when I said that a part of the Cloud Tax is it constrains the solution space available to developers. In an era where one can purchase, off-the-shelf, a 256-core machine with terabytes of RAM, developers are still counting megabytes(!) of file sizes due to the constraints of AWS.

It should be obvious that this is not the best answer for all projects.

jalk 9/1/2025||||

This article (from Nov. 2022) shows that "utilizing Lambda is preferable until Lambda is utilized about 40 to 50 % of the time"

https://medium.com/life-at-apollo-division/compare-the-cost-...

eska 9/1/2025|||

> If you aren't breaking these up into multiple smaller independent segments then I would suggest that you're doing this wrong in the first place.

Care to elaborate?

icedchai 9/1/2025||

You are expected to work around Lambda limitations because it's the "right way", not because the limitations make things overly complex. /s

icedchai 8/31/2025||||

That's fine, except for all of Lambda's weird limitations: request and response sizes, deployment .zip sizes, max execution time, etc. For anything complicated you'll eventually run into all this stuff. Plus you'll be locked into AWS.

themafia 9/1/2025||

> request and response sizes

If either of these exceed the limitations of the call, which is 6MB or 256kB depending on call type, then you can just use S3. For large distributed task coordination you're going to be doing this anyways.
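
A minimal sketch of that pattern, assuming the limits quoted above (check the current AWS documentation before relying on them) and a hypothetical `upload` callable that stores the bytes somewhere like S3 and returns a key. No AWS SDK is called here, and the size check ignores any envelope overhead:

```python
import json
from typing import Callable, Dict

# Limits as quoted in the comment above, not verified against AWS docs here.
SYNC_LIMIT_BYTES = 6 * 1024 * 1024
ASYNC_LIMIT_BYTES = 256 * 1024


def build_event(payload: dict, asynchronous: bool,
                upload: Callable[[bytes], str]) -> Dict[str, str]:
    """Inline small payloads; store large ones and pass a reference instead."""
    body = json.dumps(payload).encode("utf-8")
    limit = ASYNC_LIMIT_BYTES if asynchronous else SYNC_LIMIT_BYTES
    if len(body) <= limit:
        return {"mode": "inline", "body": body.decode("utf-8")}
    # Too big for the call itself: the worker fetches it by key.
    return {"mode": "reference", "key": upload(body)}


stored: Dict[str, bytes] = {}


def fake_upload(data: bytes) -> str:
    """In-memory stand-in for an object store, for illustration only."""
    key = f"payloads/{len(stored)}"
    stored[key] = data
    return key


small = build_event({"job": 1}, asynchronous=True, upload=fake_upload)
large = build_event({"blob": "x" * 300_000}, asynchronous=True, upload=fake_upload)
```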

> deployment .zip sizes

Overlays exist and are powerful.

> max execution time

If your workload depends on long uninterrupted runs of time on single CPUs then you have other problems.

> Plus you'll be locked into AWS.

In the world of serverless your interface to the endpoints and semantics of Lambda are minimal and easily changed.

icedchai 9/1/2025||

Of course, we can generally work around all these things. The point is it is annoying to do so. It adds friction and further couples you to a proprietary platform.

You're better off using ECS / Fargate for application logic.

twotwotwo 9/1/2025||||

> [Hetzner] charges a "one time setup" fee to create that VM. That puts a lot of back pressure on infrastructure decisions and removes any scalability you could otherwise enjoy in the cloud.

Hetzner Cloud, then! In the US, $0.53/hr / $333.59/mo for 48 vCPU/192GB RAM/960GB NVMe. Includes 8 TB/mo traffic, when 8 TB egress would cost $720 on EC2; more traffic is $1.20/TB when the first tier of AWS egress is $90/TB. No setup fee. Not that it's EC2 but there's clearly flexibility there.
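
The egress comparison works out as follows. This is a sketch using only the rates quoted above; it treats the AWS price as flat at the first-tier rate and ignores any free allowance, so it is not a substitute for a real pricing calculator:

```python
def egress_cost(traffic_tb: float, included_tb: float, price_per_tb: float) -> float:
    """Monthly egress bill: traffic beyond the included quota times the rate."""
    billable_tb = max(0.0, traffic_tb - included_tb)
    return billable_tb * price_per_tb


# 8 TB in a month, using the per-TB figures from the comment above.
ec2_8tb = egress_cost(8, included_tb=0, price_per_tb=90.0)
hetzner_8tb = egress_cost(8, included_tb=8, price_per_tb=1.20)

# Going 2 TB over the included Hetzner quota.
hetzner_10tb = egress_cost(10, included_tb=8, price_per_tb=1.20)

print(f"EC2, 8 TB:      ${ec2_8tb:.2f}")
print(f"Hetzner, 8 TB:  ${hetzner_8tb:.2f}")
print(f"Hetzner, 10 TB: ${hetzner_10tb:.2f}")
```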

More generally, if you want AWS, you want AWS; if you want servers you have options.

matt-p 8/31/2025||||

Very few providers charge setup; some will provision a server within 90s of an API call.

themafia 9/1/2025||

Hetzner does on the server the OP was referencing:

https://www.hetzner.com/dedicated-rootserver/ax162-s/

Aeolun 9/1/2025|||

If you are scared off by the €80 setup on a server that costs €200 a month, it seems like the setup fee did its intended job no?

ferngodfather 9/1/2025||||

Most providers do for dedicated servers, or make you agree to a fixed term. I don't believe they do the same for VPS / Cloud servers.

benjiro 9/1/2025||

> I don't believe they do the same for VPS / Cloud servers.

Because it's baked into the price. If you run a VPS for a month, you get the listed monthly price. But if you run a VPS for a shorter time, the hourly billing price is a lot more expensive.

The ironic part is that you're better off keeping a VPS active until the end of your monthly period (if you've already crossed 2/3) than canceling early.

Noticed that few people realize that the hourly price != the monthly price.

matt-p 9/1/2025|||

I don't think that negates the point I was making. Most don't, for example none of the providers on https://www.serversearcher.com/ seem to charge setup.

lachiflippi 9/1/2025|||

Hetzner does not charge any provisioning fees for VMs and never has.

dang 8/31/2025||

HN uses two—one live and one backup, so we can fail over if there's a hardware issue or we need to upgrade something.

It's a nice pattern. Just don't make them clones of each other, or they might go BLAM at the same time!

https://news.ycombinator.com/item?id=32049205

https://news.ycombinator.com/item?id=32032235

https://news.ycombinator.com/item?id=32028511 (<-- this is where it got figured out)

---

Edit: both these points are mentioned in the OP.

bpye 8/31/2025||

Whilst not as fatal as a failing SSD, AMD also had a fun errata where a CPU core would hang in CC6 after ~1044 days.

https://www.servethehome.com/amd-epyc-7002-rome-cpus-hang-af...

d_burfoot 9/1/2025||

Any stats on HN downtime over the years? I remember one or two outages in the last decade or so, but I would guess the uptime is about 99.99%.

dang 9/1/2025||

We don't specifically track that, no. The worst one was when we went down for (IIRC) a couple days because of a disk failure, I think in Jan 2014. It was after that that we added a failover box.

HN goes down when we restart the server process, usually as part of updating the code - but only for a few seconds. The message "Restarting the server. Shouldn't take long." displays when that is happening.

There are also, to my exasperation, still moments of brownout during certain traffic spikes or moments of obscure resource contention. But these are at least rarer than they used to be.

api 8/31/2025||

I’ve found that it’s hard to even hire engineers who aren’t all in on cloud and who even know how to build without it.

Even the ones who do know have been conditioned to tremble with fear at the thought of administrating things like a database or storage. These are people who can code cryptography kernels and network protocols and kernel modules, but the thought of running a K8S cluster or Postgres fills them with terror.

“But what if we have downtime!” That would be a good argument if the cloud didn’t have downtime, but it does. Most of our downtime in previous years has been the cloud, not us.

“What if we have to scale!” If we are big enough to outgrow a 256 core database with terabytes of SSD, we can afford to hire a full time DBA or two and have them babysit a cluster. It’ll still be cheaper.

“What if we lose data?” Ever heard of backups? Streaming backups? Hot spares? Multiple concurrent backup systems? None of this is complex.

“But admin is hard!” So is administrating cloud. I’ve seen the horror of Terraform and Helm and all that shit. Cloud doesn’t make admin easy, just different. It promised simplicity and did not deliver.

… and so on.

So we pay about 1000X what we should pay for hosting.

Every time I look at the numbers I curse myself for letting the camel get its nose under the tent.

If I had it to do over again I’d forbid use of big cloud from day one, no exceptions, no argument, use it and you’re fired. Put it in the articles of incorporation and bylaws.

matt-p 8/31/2025|

I have also found this happening. It's actually really funny because I think even I'm less inclined to run postgres myself these days, when I used to run literally hundreds of instances with not much more than pg_dump, cron and two read-only replicas.

These days probably the best way of getting these 'cloudy' engineers on board is just to tell them it's Kubernetes and run all of your servers as K3s.

api 8/31/2025||

I’m convinced that cloud companies have been intentionally shaping dev culture. Microservices in particular seem like a pattern designed to push managed cloud lock in. It’s not that you have to have cloud to use them, but it creates a lot of opportunities to reach for managed services like event queues to replace what used to be a simple function call or queue.

Dev culture is totally fad driven and devs are sheep, so this works.

matt-p 8/31/2025||

Yeah I think that's fair. I'm very pro containers though, that's a genuine step forward from deploy scripts or VM images.

lewisjoe 8/31/2025||

I helped bootstrap a company that made an enterprise automation engine. The team wanted to make the service available as SaaS for boosting sales.

They could have got the job done by hosting the service in a VPS with a multi-tenant database schema. Instead, they went about learning Kubernetes and drilling deep into the "cloud-native" stack. They spent a year trying to set up the perfect devops pipeline.

Not surprisingly the company went out of business within the next few years.

rixed 9/1/2025||

> Not surprisingly the company went out of business within the next few years.

But the engineers could find new jobs thanks to their acquired k8s experience.

doganugurlu 9/1/2025||

Get paid to learn and build your career instead, baby!

joshmn 8/31/2025|||

This is my experience too—there’s too much time wasted trying to solve a problem that might exist 5 years down the road. So many projects and early-stage companies would be just fine either with a PaaS or nginx in front of a docker container. You’ll know when you hit your pain point.

cpursley 8/31/2025||

Yep, this is why I'm a proponent of PaaS until the bill actually hurts. Just pay the heroku/render/fly tax and focus on product market fit. Or, play with servers and K8s, burning your investors' money, then move on to the next gig and repeat...

Aeolun 9/1/2025|||

The moment I sign up for a PaaS the bill hurts. I can never get over the fact I can get 1000x more compute for the same price, never mind that I never use it and have to set everything up myself. I’ll just never pay to lock myself in to something so restricted. My dedicated server allows me to do anything I want or need.

cpursley 9/1/2025||

If you enjoy playing with servers instead of shipping features, enjoy!

Aeolun 9/2/2025||

That’s only true if you still have to learn how to deploy to a server. I have the opposite problem. I need to learn how to deploy to these wonky services, and it never seems to transfer from one to the other.

cpursley 9/2/2025||

I moved from Heroku -> to Render.com in a day, then later Render -> Fly in a couple hours because everything was already dockerized. I’ve never really had to think about my servers on any of these providers, they just run.

DaSHacka 8/31/2025||||

> Or, play with servers and K8s, burning your investors money, then move on to the next gig and repeat...

I mean, of the two, the PaaS route certainly burns more money, the exception being the rare shop that is so incompetent they can't even get their own infrastructure configured correctly, like in GP's situation.

There are guaranteed more shops that would be better off self-hosting and saving on their current massive cloud bills than the rare one-offs that actually save so much time using cloud services, it takes them from bankruptcy to being functional.

fragmede 8/31/2025||

> the PaaS route certainly burns more money,

Does it? Vercel is $20/month and Neon starts at $5/month. That obviously goes up as you scale up, but $25/month seems like a fairly cheap place to start to me.

(I don't work for Vercel or Neon, just a happy customer)

cpursley 8/31/2025||

Yeah, also a happy neon customer - but they can get pricy. Still prefer them over AWS. For compute, Fly is pretty competitive.

theaniketmaurya 8/31/2025||

I’m using Neon too and upgraded to the scale up version today. Curious, what do you mean that they can get pricey?

Aeolun 9/1/2025||

Like, you keep your server running for a month and you need to pay $255 pricey? I can get about 64 cores of dedicated compute for the price of a single neon compute (4c/16gb) unit.

And that’s before you factor in 500gb of storage.

cpursley 9/1/2025||

And how much time are you spending babysitting all of this? What’s your upgrade, deploy and rollback story? Because I don’t have to even think about these things.

fragmede 8/31/2025|||

Yeah, same. Vercel + Neon and then if you actually have customers and actually end up paying them enough money that it becomes significant, then you can refactor and move platforms, but until you do, there are bigger fish to fry.

matt-p 8/31/2025||

100%. Making it a docker container and deploying it is literally a few hours at most.

tgtweak 9/1/2025||

I've been doing hybrid colo+public cloud for over a decade and it's always been the most cost effective route at a certain scale. That specific break even point is lowering over time with the density and cost effectiveness of hardware.

Sure you need net/infra admins but the software and hardware these days are pretty management friendly and you'll find you still need (often more expensive "cloud") admins so you're not offsetting much management cost there.

Colocation is plentiful and providers often aggregate and resell bandwidth from their preferred carriers.

At one point we were up to 8 Dell VRTX clusters and a few SANs, with 500+ VMs ranging from huge MSSQL servers to kube clusters. The public cloud bill would have been well into the 6 figures even with preferred pricing and reserved instances. Our colocation bill was $2400/mo and that was mostly for power. The one thing that always surprised me was how much faster everything was - every time we had to scale over into the cloud, the public cloud node was noticeably slower even for identical CPU generations and vCPU counts.

You need to be very keen about server deals, updates, support contracts and licenses - but it's really manageable and interconnecting with the cloud is trivial at this point - you can get a "cloud connect" fiber drop to your preferred cloud provider and connect your colo infra to your vpc.

brazzy 9/1/2025|

Colocation to me means you buy your own hardware and rent only the rack space (and power and connectivity) from the datacenter. Is that really what you're talking about? If so, why do you choose this over renting bare metal servers?

tgtweak 9/1/2025|||

Not always - you can lease your servers from the vendor as well, in which case you're renting the rack space, power and cooling from the datacenter and you're renting the servers from the vendor - most of the leases are designed so you can refresh your hardware every 4-5 years and it's usually still cheaper than renting from a dedicated hosting company.

Once you have an established baseline for your server needs - it's almost always more capital friendly to buy the servers and keep them running for the ~5 reliable years you'll get out of them - usually break even here is 2-3 years vs renting from a provider. If you're running your servers until they fail you'll get 7-10 years out of them, provided the power cost is still worth running them (usually that is also around the 8-10 year mark depending on your power cost).
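
The break-even reasoning is simple enough to sketch. The figures below are hypothetical, picked only to land inside the 2-3 year range mentioned above, and the function ignores financing, resale value and staff time:

```python
import math


def break_even_months(purchase_price: float, colo_monthly: float,
                      rental_monthly: float) -> float:
    """Months until buying plus colocation becomes cheaper than renting.

    Returns math.inf when colocation alone costs at least as much as renting,
    because the purchase price is then never recovered.
    """
    monthly_saving = rental_monthly - colo_monthly
    if monthly_saving <= 0:
        return math.inf
    return purchase_price / monthly_saving


# Hypothetical figures, chosen only to illustrate the calculation.
months = break_even_months(purchase_price=9_000, colo_monthly=100, rental_monthly=400)
print(f"break-even after {months:.0f} months ({months / 12:.1f} years)")
```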

So there are many reasons you'd buy vs rent - including capital deductions and access to cheap interest rates. You can also get some pretty crazy deals (like 33% of new price) by buying 2-3 year old equipment, then continue to run them for another 4-5 years, which is the lowest cost scenario if you don't need bleeding edge.

brazzy 9/1/2025||

What about the cost of having people actually go to the datacenter to install hardware, and go again whenever there is a hardware problem, possibly resulting in much longer downtimes than with a rented server?

Especially for the "one (or a few) big server" scenario in the article, that would seem to me a pretty big factor.

tgtweak 9/2/2025||

At 1 rack scale you're saving ~20-30k/mo in cloud fees - you can hire an excellent sysadmin in the 12-15k/mo range and they can do a lot more than just go to the datacenter as needed.

brazzy 9/2/2025||

But we're not comparing colo to cloud fees, we're comparing colo to renting a server.

fragmede 9/1/2025||||

Because it's your hardware in the colo, so if money becomes dire, you can extend the servers' lifetime beyond the standard depreciation schedule. Your rented bare metal servers might be slightly cheaper than a respective EC2 instance, but if you stop paying that bill, it's gonna go poof, same as the EC2 instance.

HankStallone 9/2/2025|||

I went with buying and colocation because I found I sleep better this way than when I used to rent servers in a distant datacenter and have to count on techs I'd never met working on hardware I'd never seen if anything went wrong. In my case, I live near the datacenter, so I can be hands-on fairly quickly if something goes wrong that I can't handle remotely.

And I can do whatever I want with the hardware. When I bought my servers, they came with disk controllers with non-optional RAID, as almost all of them do. I wanted to run RAIDz2 in FreeBSD/ZFS, so I swapped in non-RAID controllers. They were just a few bucks, but having that ability meant I could choose from a wider range of servers.

AuthAuth 9/1/2025||

A lot of the time businesses just aren't that important. The number of places I've seen that stress over uptime when nothing they run is at all critical is remarkable. Hell, you could drop the production environment in the middle of the day and yes it would suck and you'd get a few phone calls but life would go on.

These companies all ended up massively increasing their budgets switching to cloud workloads when a simple server in the office was easily enough for their 250 users. Cloud is amazing for some uses and pure marketing BS for others but it seems like a lot of engineers aim for a perfect scalable solution instead of one that is good enough.

ehnto 9/1/2025|

I had a team member who would reiterate that during tough times. They come from much more consequential work, so they would often remark that at least nobody dies when we fuck up.

winternewt 9/1/2025||

Every corporate meeting should start with reminding ourselves that we're all going to die. And it most likely won't be from anything happening at the office.

matt-p 8/31/2025||

A thoroughly good article. It's probably worth also considering adding a CDN if you take this approach at scale. You get to use their WAF and DNS failover.

A big pain point that I personally don't love is that this non-cloud approach normally means running my own database. It's worth considering a provider who also provides cloud databases.

If you go for an 'active/passive' setup, consider saving even more money by using a cloud VM with auto scaling for the 'passive' part.

In terms of pricing, the deals available these days on servers are amazing: you can get 4GB RAM VPSs with decent CPU and bandwidth for ~$6, or bare metal for ~$90 with 32GB RAM and a quad core. It's worth using sites like serversearcher.com to compare.

railorsi 8/31/2025||

What’s the issue with running Postgres inside a docker container + regular backups? Never had problem and relatively easy to manage.

Biganon 9/1/2025|||

Why use a docker container? I run Postgres as is, what would I gain with running it in a container?

aflukasz 6 days ago|||

You can decouple Postgres and surrounding userspace upgrade cycles from your host os, if this is something that you want. Or run multiple different PG versions (have independent upgrades schedule) without being tied to the host os specific mechanisms for that.

Nextgrid 9/1/2025|||

It means the whole thing is configured in a docker-compose file (or your raw Docker CLI invocation) plus the data volume. So as long as you have those two things you can replicate it and move it to other hosts regardless of their distro.

Compare that with using your distro's packaged version where you can have version variations, variations in default config or file path locations, etc.

matt-p 8/31/2025|||

No PITB, but mostly it's just hassle. For the application server I literally don't need backups, just automated provisioning, a Docker container, etc. Adding Postgres then means I need full backups including PITB, because I don't want to lose even an hour's data.

doganugurlu 9/1/2025||

Or use SQLite and your backups are literally a copy of a file.

You can abuse git for it if you really want to cut corners.

vanviegen 9/1/2025||

Only if you can freeze your application for that long, in which case your statement is true for all non-broken databases.

markusw 9/2/2025|||

You can easily do a consistent backup on a live database. There’s a backup command and API.
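A minimal sketch of what that looks like from Python, assuming the standard-library `sqlite3` module, whose `Connection.backup` method wraps SQLite's online backup API. The file names, the `notes` table, and the `backup_live_db` helper are made up for illustration:

```python
import os
import sqlite3
import tempfile


def backup_live_db(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database that may be in use into dest_path."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup uses SQLite's online backup API, so the copy is
        # a consistent snapshot even if other connections are still open.
        src.backup(dest)
    finally:
        dest.close()
        src.close()


# Demo with a throwaway database (hypothetical schema).
workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "app.db")
backup_path = os.path.join(workdir, "app-backup.db")

live = sqlite3.connect(live_path)
live.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
live.executemany("INSERT INTO notes (body) VALUES (?)", [("a",), ("b",), ("c",)])
live.commit()

# The application's connection stays open while the backup runs.
backup_live_db(live_path, backup_path)

restored = sqlite3.connect(backup_path)
row_count = restored.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
restored.close()
live.close()
print(row_count)
```

The result is still a single file you can ship off the box; the difference from a plain `cp` is that SQLite produces the snapshot rather than the filesystem.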

vanviegen 9/2/2025||

Sure. But then it's not "just a file" copy, like GP said.

wild_egg 9/1/2025|||

It only freezes your application if you've misconfigured it.

vanviegen 9/2/2025||

If you want to backup your database using just a file copy, you'd better freeze your database if you value your data. Or use a fancy snapshotting filesystem.

andersmurphy 8/31/2025||

If you're running on a single machine then you'll get way more performance with something like sqlite (instead of postgres/MySQL) which also makes managing the database quite trivial.

immibis 8/31/2025|||

SQLite has serious concurrency concerns which have to be evaluated. You should consider running postgres or mysql/mariadb even if it's on the same server.

SQLite uses one reader/writer lock over the whole database. When any thread is writing the database, no other thread is reading it. If one thread is waiting to write, new reads can't begin. Additionally, every read transaction starts by checking if the database has changed since last time, and then re-loading a bunch of caches.

This is suitable for SQLite's intended use case. It's most likely not suitable for a server with 256 hardware threads and a 50Gbps network card. You need proper transaction and concurrency control for heavy workloads.

Additionally, SQLite lacks a bunch of integrity checks, like data types and various kinds of constraints. And things like materialised views, etc.

SQLite is lite. Use it for lite things, not heavy things.

andersmurphy 9/1/2025|||

Not sure what you are talking about? In WAL mode (which is what you should be using) writes don't block reads and reads don't block writes. If you are connection pooling (which you should) the cache will stay hot.

Sqlite (properly configured) will outperform "proper databases" often by an order of magnitude in the context of a single box. You want a single writer for high performance as it lets you batch.
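A rough sketch of the single-writer batching idea, assuming Python's standard `sqlite3`, `queue`, and `threading` modules. The `events` table, the `STOP` sentinel, the `writer_loop` helper, and the batch size of 500 are all invented for the example, not taken from any real setup:

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # sentinel that tells the writer to finish


def writer_loop(db_path, jobs, stats, max_batch=500):
    """Drain `jobs` from one thread, committing once per batch."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT NOT NULL)")
    running = True
    while running:
        batch = [jobs.get()]  # block until there is at least one job
        while len(batch) < max_batch:
            try:
                batch.append(jobs.get_nowait())
            except queue.Empty:
                break
        if any(item is STOP for item in batch):
            running = False
            batch = [item for item in batch if item is not STOP]
        if batch:
            with conn:  # one transaction, and so one commit, per batch
                conn.executemany(
                    "INSERT INTO events (payload) VALUES (?)",
                    [(payload,) for payload in batch],
                )
            stats["commits"] += 1
    conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "events.db")
jobs = queue.Queue()
stats = {"commits": 0}
for i in range(2000):
    jobs.put(f"event-{i}")
jobs.put(STOP)

writer = threading.Thread(target=writer_loop, args=(db_path, jobs, stats))
writer.start()
writer.join()

check = sqlite3.connect(db_path)
row_count = check.execute("SELECT COUNT(*) FROM events").fetchone()[0]
check.close()
print(row_count, stats["commits"])
```

Request handlers would push onto the queue instead of writing directly, so many logical writes share one commit rather than paying for one each.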

> 256 hardware threads...

Have you tried? I have. Others have too. [1]

> Additionally, SQLite lacks a bunch of integrity checks, like data types and various kinds of constraints. And things like materialised views, etc.

Sqlite has blobs so you can use your own custom encoding which is what you want in a high performance context.

Here's sqlite on a $5 shared VPS that can handle 10000+ checks per second over a billion checkboxes [2]. You're gonna be fine.

- [1] https://use.expensify.com/blog/scaling-sqlite-to-4m-qps-on-a...

- [2] https://checkboxes.andersmurphy.com

hruk 9/1/2025||||

Agree on many things here, but SQLite does support WAL mode, which supports 1 writer/N readers with snapshot isolation on reads. Writes are serialized but still quite fast.

SQLite (actually SQL-ite, like a mineral) may be light, but so are many workloads these days. Even 1000 queries per second is quite doable with SQLite and modest hardware, and I've worked at billion dollar businesses handling fewer queries than that.

wild_egg 9/1/2025||||

You know, it's ok to say that you're out of your element and don't have direct experience with the thing you're commenting on.

SQLite is easily the best scaling DB tech I've used. I've moved all my postgres workloads over to it and the gains have been incredible.

It's not a panacea and not the best in all cases, but it's a very sane default that I recommend everyone start with, and only complicate their stack with an external DB when they start hitting real limits (often never happens).

immibis 9/1/2025||

> You know, it's ok to say that you're out of your element and don't have direct experience with the thing you're commenting on.

I moved several projects from sqlite to postgres because sqlite didn't scale enough for any of them.

andersmurphy 9/1/2025||

May I suggest you could have been holding it wrong?

The out of the box defaults for sqlite are terrible for web apps.

immibis 9/2/2025||

Are you aware of the irony behind saying "You're holding it wrong"? Do you know where that phrase came from?

Rohansi 9/1/2025|||

Is any SQL database suitable for 50Gbps of network traffic hitting it?

Most if not all of your concerns with SQLite are simply a matter of not using the default configuration. Enable WAL mode, enable strict mode, etc. and it's a lot better.
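For reference, a sketch of that non-default setup, assuming Python's standard `sqlite3` module. The `users` table is hypothetical, the extra pragmas (`synchronous`, `busy_timeout`, `foreign_keys`) are common companions rather than anything named above, and `STRICT` tables need SQLite 3.37 or newer, so the example checks the linked version first:

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db_path)

# WAL mode is stored in the database file, so it only has to be set once.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("PRAGMA synchronous=NORMAL")  # assumption: a common pairing with WAL
conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5s on a lock instead of failing
conn.execute("PRAGMA foreign_keys=ON")     # foreign keys are off by default

# STRICT tables enforce the declared column types (SQLite >= 3.37).
strict_supported = sqlite3.sqlite_version_info >= (3, 37, 0)
table_options = " STRICT" if strict_supported else ""
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, age INTEGER NOT NULL)"
    + table_options
)

type_error_raised = False
try:
    conn.execute("INSERT INTO users (age) VALUES (?)", ("not a number",))
except sqlite3.DatabaseError:
    # With STRICT, text that cannot be converted to INTEGER is rejected.
    type_error_raised = True

conn.close()
print(journal_mode, strict_supported, type_error_raised)
```

Without `STRICT`, the same insert succeeds and stores the text in the `INTEGER` column, which is the loose typing complained about upthread.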

rixed 9/1/2025|||

If you have a single request at a time and need little integrity checks.

KronisLV 8/31/2025||

Just today I wasted some time due to an unexpected Tailscale key expiry and some other issues related to running a container cluster: https://blog.kronis.dev/blog/the-great-container-crashout

Right now, my plan is to move from a bunch of separate VPSes, to one dedicated server from Hetzner and run a few VMs inside of it with separate public IPs assigned to them alongside some resource limits. You can get them for pretty affordable prices, if you don't need the latest hardware: https://www.hetzner.com/sb/

That way I can limit the blast range if I mess things up inside of a VM, but at the same time benefit from an otherwise pretty simple setup for hosting personal stuff, a CPU with 8 threads and 64 GB of RAM ought to be enough for most stuff I might want to do.

ehnto 9/1/2025|

That's the worst part of stringing a bunch of cloud together. Auth, keys, config, credentials expiring, logging back into everything all day. It smooths out the brain.

Give me a box, trust me with ssh keys and things are so much easier. Simple is good for the soul and the wallet.

decasia 8/31/2025||

Regardless of the cost and capacity analysis, it's just hard to fight the industry trends. The benefits of "just don't think about hardware" are real. I think there is a school of thought that capex should be avoided at all costs (and server hardware is expensive up front). And above all, if an AWS region goes down, it doesn't seem like your org's fault, but if your bespoke private hosting arrangement goes down, then that kinda does seem like your org's fault.

logifail 8/31/2025||

> and server hardware is expensive up front

You don't need to buy server hardware(!), the article specifically mentions renting from eg Hetzner.

> The benefits of "just don't think about hardware" are real

Can you expand on this claim, beyond what the article mentioned?

bearjaws 8/31/2025||

> Can you expand on this claim, beyond what the article mentioned?

I run a lambda behind a load balancer; hardware dies, it's redundant, it gets replaced. I have a database server fail; while it reprovisions, it doesn't saturate read IO on the SAN and cause noisy neighbor issues.

I don't deal with any of it, I don't deal with depreciation, I don't deal with data center maintenance.

Nextgrid 8/31/2025||

> I don't deal with depreciation, I don't deal with data center maintenance.

You don't deal with that either if you rent a dedicated server from a hosting provider. They handle the datacenter and maintenance for you for a flat monthly fee.

immibis 8/31/2025||

They do rely on you to tell them if hardware fails, however, and they'll still unplug your server and physically fix it. And there's a risk they'll replace the wrong drive in your RAID pair and you'll lose all your data - this happens sometimes - it's not a theoretical risk.

But the cloud premium needs reiteration: twenty-five times. For the price of the cloud server, you can have twenty-five-way redundancy.

1dom 9/1/2025||

> And there's a risk they'll replace the wrong drive in your RAID pair and you'll lose all your data - this happens sometimes - it's not a theoretical risk.

A medium to large size asteroid can cause mass extinction events - this happens sometimes - it's not a theoretical risk.

The risk of the people responsible for managing the platform messing up and losing some of your data is still a risk in the cloud. This thread has even already had the argument "if the cloud provider goes down, it's not your fault" as a cloud benefit. Either cloud is strong and stable and can't break, or cloud breaks often enough that people will just excuse you for it.

immibis 9/1/2025|||

Many people have already had their data destroyed by remote hands replacing the wrong side of a RAID. Nobody's already had their server destroyed by a mass-extincting meteor.

namibj 9/1/2025|||

There's a reason semiconductor manufacturing is so highly automated, and it's not labor cost. Humans err. Computers only err when told. But they'll repeat a task reliably without random mistakes if told what to do by a competent (manufacturing process) engineering organization. Yes it takes more than one engineer.

swiftcoder 9/1/2025|||

> I think there is a school of thought that capex should be avoided at all costs

Yep, and it's mostly caused by the VC funding model - if your investors are demanding hockey-stick growth, there is no way in hell a startup can justify (or pay for) the resulting Capex.

Whereas a nice, stable business with near-linear growth can afford to price in regular small Capex investments.

marcosdumay 8/31/2025|||

> I think there is a school of thought that capex should be avoided at all costs (and server hardware is expensive up front).

Yes, there is.

Honestly, it looks to me that this school of thought is mostly adopted by people that can't do arithmetic or use a calculator. But it does absolutely exist.

That said, no, servers are not nearly expensive enough to move the needle on a company nowadays. The room that often goes around them is, and that's why way more people rent the room than the servers in it.

sam_lowry_ 8/31/2025||

Connectivity is a problem, not the room.

I ran the IT side of a media company once, and it all worked on a half-empty rack of hardware in a small closet... except for the servers that needed bandwidth. These were colocated. Until we realized that the hoster did not have enough bandwidth, at which point we migrated to two bare metal servers at Hetzner.

marcosdumay 8/31/2025||

It's connectivity, reliable power, reliable cooling, and security.

The actual space isn't a big deal, but the entire environment has large fixed costs.

sam_lowry_ 9/1/2025||

In abstract yeah.

In practice, all that except connectivity is relatively easy to have on-site.

Connectivity is highly dependent on the business location, local providers, their business plans and their willingness to go out of their way to serve the clients.

And I am not talking only about bandwidth, but also reserve lines and latency.

matt-p 8/31/2025|||

If you rent dedicated servers, then you're not worrying about any of the capex or maintenance stuff.

qaq 8/31/2025|||

The benefits of "don't write a distributed system unless you really have to" are also very real.

ehnto 9/1/2025||

Exactly, same for microservices I feel. Why have enterprise org problems if you don't have an enterprise org.

ehnto 9/1/2025|||

I think you hit the nail on the head. What enterprises are paying for is abstraction of responsibility. Suits would never criticise going with Microsoft or Amazon.

grg0 9/1/2025|||

> if an AWS region goes down, it doesn't seem like your org's fault, but if your bespoke private hosting arrangement goes down, then that kinda does seem like your org's fault.

Never underestimate the price people are willing to pay to evade responsibility. I estimate this is a multi-billion dollar market.

wongarsu 8/31/2025|||

For anything up to about 128GB RAM you can still easily avoid capex by just renting servers. Above that it gets a bit trickier

IshKebab 8/31/2025|||

It's not like it's a huge capex for that level of server anyway. Probably less than the cost of one employee's laptop.

matt-p 8/31/2025|||

Renting (hosted) servers above 128GB RAM is still pretty easy, but I agree pricing levels out. 128GB RAM server ~$200/Month, 384 GB ~$580, 1024 GB ~$940/Month
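Quick arithmetic on those numbers, as a sketch: the three quotes are the approximate figures above, and the only thing computed is dollars per GB of RAM per month.

```python
# Approximate monthly prices quoted above: {RAM in GB: USD per month}.
quotes = {128: 200, 384: 580, 1024: 940}

# Dollars per GB of RAM per month, rounded to cents.
per_gb = {ram_gb: round(price / ram_gb, 2) for ram_gb, price in quotes.items()}

for ram_gb, cost in per_gb.items():
    print(f"{ram_gb} GB: ~${cost:.2f}/GB/month")
```

On these figures the per-GB price is roughly flat from 128 GB to 384 GB and then drops noticeably at 1024 GB.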

decasia 8/31/2025||

To be clear - this isn't an endorsement on my part, just observations of why cloud-only deployment seems common. I guess we shouldn't neglect the pressure towards resume-oriented development either, as it undoubtedly plays a part in infra folks' careers. It probably makes you sound obsolete to be someone who works in a physical data center.

I for one really miss being able to go see the servers that my code runs on. I thought data centers were really interesting places. But I don't see a lot of effort to decide things based on pure dollar cost analysis at this point. There's a lot of other industry forces besides the microeconomics that predetermine people's hosting choices.

bob1029 8/31/2025|

This isn't even the end game for "one big server". AMD will give the most bang per rack, but there are other factors.

An IBM z17 is effectively one big server too, but provides levels of reliability that are simply not available in most IT environments. It won't outperform the AMD rack, but it will definitely keep up for most practical workloads.

If you sit down and really think honestly about the cost of engineering your systems to an equivalent level of reliability, you may find the cost of the IBM stack to be competitive in a surprising number of cases.

dardeaup 8/31/2025||

At what cost politically? I would expect political battles to be far more intense than any of the technical ones.

sgarland 9/1/2025||

That’s because 75% (citation: wild-ass estimate) of tech workers are incapable of critical thinking, and blindly parrot whatever they’ve heard / read. The number of times I’ve seen something on HN, thought “that doesn’t sound right,” and then spent a day disproving it locally is too damn high. Of course, by then no one gives a shit, and they’ve all moved on patting each other on the back about how New Shiny is better.

grg0 9/1/2025||

I do wish this field were more scientific and factual. Rather, it more closely resembles cults.

dardeaup 9/1/2025||

I agree. I always cringe when I see a job posting where they're wanting to hire a "passionate" xxx engineer. I always think to myself, "no, you really don't. you want to hire a dispassionate engineer who is objective". It's very difficult to be objective when you're passionate about something (especially a technology). And then what do you do with that passionate person when the organization gets rid of the technology that they're passionate about?

ETA - fixed spelling error

fock 9/1/2025||

No. In the short time I've worked at a z/OS shop, they had to IPL twice. And the IPL takes ages...

Now, if you can live with the weird environment and your people know how to program what is essentially a distributed system described in terms no one else uses: I guess it's still OK, given the competition is all executing IBM's playbook too.

p_l 9/1/2025||

Entire mainframe IPL, or just LPAR?

My understanding is that usually you subdivide into a few LPARs and then reboot the production ones on a schedule to prevent drift and ensure that yes, unplanned IPLs will work.
