
Posted by pavel_lishin 3 days ago

Go ahead, self-host Postgres (pierce.dev)
671 points | 395 comments
jsight 2 days ago|
I often find it sad how many things we used to do almost without thinking are considered hard today. Take a stroll through this thread and you will find that everything from RAID to basic configuration management is an ultra-hard thing that will leave you with a bus factor of 1.

What went so wrong during the past 25 years?

Beltiras 2 days ago||
I've had my hair on fire because my app code shit the bed. I've never, ever (throughout 15 years of using it in everything I do) had to even think about Postgres, and yes, I always set it up self-hosted. The only concern I've had is migrations where I had to upgrade PG to match upgrades in the ORM's database layer. That made for some interesting stepping-stone upgrades once in a while, but mostly it was just careful sysadmining.
cosmodust 2 days ago||
I would suggest that if you do host your database yourself, you take the data seriously. A few easy solutions: use a multi-zonal disk [1] with scheduled automatic snapshots [2].

[1] https://docs.cloud.google.com/compute/docs/disks/hd-types/hy... [2] https://docs.cloud.google.com/compute/docs/disks/create-snap...

cube00 1 day ago||
Scheduled automatic snapshots are not the kind of consistent snapshots you need for a filesystem-based backup.
cosmodust 1 day ago||
Snapshots might break ACID for the last few transactions, but the snapshot process flushes all in-memory writes before freezing the filesystem. Considering it's a one-click solution, isn't that better than losing everything?
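For what it's worth, the rough shape of that on GCE, with a checkpoint first so the restored snapshot has less WAL to replay (disk and zone names are made up):

    # force Postgres to flush a checkpoint, then trigger the disk snapshot
    psql -c "CHECKPOINT;"
    gcloud compute disks snapshot pg-data-disk --zone=us-central1-a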
nhumrich 3 days ago||
What do you Postgres self-hosters use for performance analysis? Both GCP Cloud SQL and RDS have performance-analysis features built into the hosted DB, and they're incredible. Probably my favorite reason for using them.
cuu508 3 days ago||
I use pgdash and netdata for monitoring and alerting, and plain psql for analyzing specific queries.
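The plain-psql part is mostly just this kind of thing (the orders table and query are made up; pg_stat_statements needs to be enabled):

    # find the statements with the worst average runtime
    psql -d mydb -c "SELECT query, calls, mean_exec_time
                     FROM pg_stat_statements
                     ORDER BY mean_exec_time DESC LIMIT 10;"

    # then dig into a suspicious one
    psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS)
                     SELECT * FROM orders WHERE customer_id = 42;"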
sa46 3 days ago||
I’ve been very happy with Pganalyze.
wreath 3 days ago||
> Take AWS RDS. Under the hood, it's:

    Standard Postgres compiled with some AWS-specific monitoring hooks
    A custom backup system using EBS snapshots
    Automated configuration management via Chef/Puppet/Ansible
    Load balancers and connection pooling (PgBouncer)
    Monitoring integration with CloudWatch
    Automated failover scripting
I didn't know RDS had PgBouncer under the hood; is this really accurate?

The problem I find with RDS (and most other managed Postgres) is that they limit your options for how you design your database architecture. For instance, if write consistency is important to you and you want synchronous replication, there is no way to do it in RDS without either Aurora or having the readers in another AZ. The other issue is that you only have access to logical replication, because you don't have access to your WAL archive, which makes moving off RDS much more difficult.
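For reference, synchronous replication is only a couple of settings once you run Postgres yourself (the standby name below is made up):

    # postgresql.conf on the primary: commits wait for one named standby
    synchronous_commit = on
    synchronous_standby_names = 'FIRST 1 (standby1)'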

mystifyingpoi 3 days ago|
> I didn't know RDS had PgBouncer under the hood

I don't think it does. AWS has this feature under RDS Proxy, but it's an extra service and comes with extra cost (and it's a bit cumbersome to use, in my opinion; it should have been designed as a checkbox, not an entire separate thing to maintain).

Although it technically has a "load balancer", in the form of a DNS entry that resolves to a random read replica, if I recall correctly.

zsoltkacsandi 3 days ago||
I've operated both self-hosted and managed database clusters with complex topologies and mission-critical data at well-known tech companies.

Managed database services mostly automate a subset of routine operational work, things like backups, some configuration management, and software upgrades. But they don't remove the need for real database operations. You still have to validate restores, build and rehearse a disaster recovery plan, design and review schemas, review and optimize queries, tune indexes, and fine-tune configuration, among other essentials.
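To make "validate restores" concrete, a minimal sketch, assuming pg_basebackup-style backups (paths, port, and table name are all made up):

    # check the base backup against its manifest (PostgreSQL 13+)
    pg_verifybackup /backups/base/latest

    # restore a copy on a scratch port and run a smoke query
    cp -a /backups/base/latest /tmp/restore_test
    pg_ctl -D /tmp/restore_test -o "-p 5499" -l /tmp/restore.log start
    psql -p 5499 -d mydb -c "SELECT count(*) FROM some_important_table;"
    pg_ctl -D /tmp/restore_test stop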

In one incident, AWS support couldn't determine what was wrong with an RDS cluster and advised us to "try restarting it".

Bottom line: even with managed databases, you still need people on the team who are strong in DBOps. You need standard operating procedures and automation, built by your team. Without that expertise, you're taking on serious risk, including potentially catastrophic failure modes.

Nextgrid 3 days ago|
I've had an RDS instance run out of disk space and then get stuck in "modifying" for 24 hours (until an AWS operator manually SSH'd in I guess). We had to restore from the latest snapshot and manually rebuild the missing data from logs/other artifacts in the meantime to restore service.

I would've very much preferred being able to SSH in myself and fix it on the spot. Ironically, the only reason it ran out of space in the first place is that the AWS markup on storage is so huge that we were operating with little margin for error; none of that would have happened on a bare-metal host where I can rent 1 TB of NVMe for a mere 20 bucks a month.

As far as I know we never got any kind of compensation for this, so RDS ended up being a net negative for this company: tens of thousands spent over a few years for laptop-grade performance, and it couldn't even do its promised job the only time it was needed.

ergonaught 3 days ago||
> Self-hosting a database sounds terrifying.

Is this actually the "common" view (in this context)?

I've got decades with databases, so I cannot even begin to fathom where such an attitude would develop. But is it?

Boggling.

Nextgrid 3 days ago||
Over a decade of cloud provider propaganda achieves that. We appear to have lost the basic skill of operating a *nix machine, so anything even remotely close to that now sounds terrifying.

You mean you need to SSH into the box? Horrifying!

gnusi 2 days ago||
Couldn't agree more.
npn 2 days ago||
> I sleep just fine at night thank you.

I've also been self-hosting my webapp for 4+ years and have never had any trouble with databases.

pg_basebackup and WAL archiving work wonders. And since I always pull the database (the backup version) for local development, the backup is constantly verified, too.
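Roughly this, if anyone wants the shape of it (host and user names are made up):

    # take a fresh base backup from the primary, streaming WAL alongside
    pg_basebackup -h prod-db.internal -U replicator -D ~/pgdev -P -X stream

    # the result is a startable data directory, so booting it for local
    # dev doubles as a restore test
    pg_ctl -D ~/pgdev -o "-p 5433" start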

vbezhenar 2 days ago||
I don't feel like it's easy to self-host postgres.

Here are my gripes:

1. Backups are super important. Losing production data is simply not an option. Postgres offers pg_dump, which is not the appropriate tool, so you should set up WAL archiving or something like that. This is complicated to get right.

2. Horizontal scalability with read replicas is hard to implement.

3. Tuning various postgres parameters is not a trivial task.

4. Upgrading major version is complicated.

5. You probably need to use something like pgbouncer.

6. Database usually is the most important piece of infrastructure. So it's especially painful when it fails.

I guess it's not that hard once you've done it before and have all the scripts and memory to look back on. But otherwise it's hard. Clicking a few buttons in the hoster's panel is much easier.

npn 2 days ago||
WAL archiving is piss easy. You can also just use pg_basebackup. With Postgres 17 it's easier than ever, thanks to the incremental backup feature.
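The whole setup is roughly this (archive path is made up; the archive_command pattern is the one from the Postgres docs):

    # postgresql.conf: copy each completed WAL segment to an archive
    archive_mode = on
    archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'

    # Postgres 17 (needs summarize_wal = on): a full backup, then cheap
    # incrementals taken against its manifest
    pg_basebackup -D /backups/full
    pg_basebackup -D /backups/incr1 --incremental=/backups/full/backup_manifest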

You don't need horizontal scalability when a single server can have 384 real CPU cores, 6 TB of RAM, petabytes of PCIe 5.0 SSD, and a 100 Gbps NIC.

For tuning Postgres parameters, you can start with pgtune.leopard.in.ua or pgconfig.org.
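What those generate is just a handful of lines like these (values below are for an imaginary 16 GB, 4-core box, not a recommendation):

    # postgresql.conf: pgtune-style starting points for a 16 GB server
    shared_buffers = 4GB
    effective_cache_size = 12GB
    work_mem = 16MB
    maintenance_work_mem = 1GB
    max_wal_size = 4GB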

Upgrading a major version has been piss easy since Postgres 10 or so. It's just a single command.
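That is, something like this (paths are made up):

    # pg_upgrade with --link hard-links data files instead of copying them
    pg_upgrade \
        -b /usr/lib/postgresql/16/bin -B /usr/lib/postgresql/17/bin \
        -d /var/lib/postgresql/16/main -D /var/lib/postgresql/17/main \
        --link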

You do not need pgbouncer if your database adapter library already provides connection pooling (most of them do).

For me, managed databases need that same amount of effort anyway, due to shitty documentation and garbage user interfaces (AWS, GCP, and Azure are all the same), not to mention they change all the time.

tonyhart7 2 days ago|||
"all scripts and memory to look back. But otherwise it's hard. Clicking few buttons in hoster panel is much easier."

so we need open source way to do that, coolify/dokploy comes to mind and it exactly do that way

I would say 80% of your point wouldnt be hit at certain scale, as most application grows and therefore outgrow your tech stack. you would replace them anyway at some point

nrhrjrjrjtntbt 2 days ago||
Scaling to a different instance size is also easy on AWS.

That said, a self-hosted DB on a dedicated Hetzner box flies. At that price, it does things that may save you the time of reworking your app to be more cost-efficient on AWS.

So swings and roundabouts.

lukaslalinsky 2 days ago|
I'm not a cloud-hosting fan, but comparing RDS to a single-instance DB seems crazy to me. Even for a hobby project, I couldn't accept losing data since the last snapshot. If you are going to self-host PostgreSQL in production, make sure you have at least some knowledge of how to set up streaming replication, and have monitoring in place to make sure the replication works. Ideally, use something like Patroni for automatic failover. I'm saying this as someone running fairly large self-hosted HA PostgreSQL databases in production.
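The minimal version of that streaming-replication setup looks something like this (host and user names are made up):

    # on the standby: clone the primary; -R writes standby.signal and
    # primary_conninfo so it starts streaming immediately
    pg_basebackup -h primary.internal -U replicator \
        -D /var/lib/postgresql/17/main -R -X stream

    # on the primary: confirm the standby is connected and streaming
    psql -c "SELECT client_addr, state FROM pg_stat_replication;"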
tgtweak 2 days ago|
RDS is not, by default, multi-instance, multi-region, or fault-tolerant at all; you choose all of that in your instance config. The number of single-instance, single-region, zero-backup RDS setups I've seen in the wild is honestly concerning. Do devs think an RDS instance on its own, without explicit configuration, is fault-tolerant and backed up? If you have an EC2 instance with EBS and auto-restart, you have almost identical fault tolerance (yes, there are some slight nuances on RDS regarding recovery after a failure).

I just find that assumption a bit dangerous. Setting all of this up on RDS is easy, but it's not on by default.
