Posted by Imustaskforhelp 2 days ago
But, yes, PostgreSQL is all I ever use for anything that needs to be big. I ported a big old web app that had ScyllaDB, Elastic Search, Redis, and probably some other stuff I've forgotten. It got PostgreSQL+PostGIS (it's a mapping app), that's it. I'm sure there's some situation where it would be worth looking at all that other stuff, but it's ridiculous to build all that complexity in before you even have users.
which while are an interesting problem to think through because well, it makes me feel smarter knowing these databases. I still felt like postgresql was enough, (at best something like clickhouse or redis combined with postgresql might be more than enough for many things)
But the fact of the matter is that sometimes you don't even need postgresql. I actually just deploy sqlite apps in golang/rust, a website that I had made that is a single binary: https://mirror.forum
Sqlite is more than enough as well and there are a lot of people pushing sqlite above its weight as well (Turso and the likes)
but its nice knowing that Sqlite/postgresql are more than enough for many use cases. Both are really amazing for a lot of things :-D
Its amazing how we've for the most part atleast now simplified the architecture to have good options. Like using Sqlite and especially Postgresql in the start is a decision that you would rarely go wrong at. I personally feel like although I use sqlite a lot more in projects because i value the simplicity it gives but postgresql also holds a somewhat good relationship in my heart :-D
I personally believe (notice the caveat here) that debugging relatively well designed databases on a good RDBMS is an easy task, if given access to logs, data (or even a redacted replica), and docs. Choose the docs along with one of the first two and it shouldnt remain a mystery for long. I've worked with postgresql a lot over the last couple decades, and even the most convoluted live-in-prod bug isn't wasn't half as bad as the vibe coded nonsense I see and debug weekly (not hating on AI; fwiw this is more a symptom of process/lack there of around integration of the tech IMO).
Kafka? No one wants to operate Kafka, if it's a serious contender it's because you need things only it can do. Same with Elasticsearch, it sucks to operate, sucks to build a second stack just for search, so you'd only consider it at the point that Postgres is no longer suitable. Same with Snowflake.
People reach for the things you mentioned wayyyyyy before they should.
Just because you want something queue shaped or search, doesn't mean you should reach for the big, specialized, expensive technology, when Postgres can already support it in your existing infrastructure up until some significant scale you often won't surpass.
Most orgs are probably due for a technical reassessment. Scale that I believe would have melted postgres in the past now just runs fine. CPUs got a ton faster and memory got cheaper.
Projects that would have required a complex distributed setup 10 years ago can now be handled by a single machine. For a lot of use cases, CPUs got faster than population growth.
A well-tuned Postgres installation on fast hardware and intelligent schema design can scale incredibly far, even if you’re asking it to double as a message bus and full-text search tool.
I have seen many cases of people decrying the use of some of these without understanding the requirements though. People with strong opinions like to make up their own requirements without necessarily listening to the business, considering the team, considering the existing solutions, etc.
Requiring HA, partitioning, and replication are good problems to have.
The alternative is spending engineering time on setting all these up for a failed service with like 100 users.
As for scale... Just use a larger machine. This works for regular transactional data until you're at something like Amazon scale.
Edit:
Think about this, suppose that you store 1 megabyte of data for each of your customers. So if you have a million customers, it's just 1Tb. And these days, you can have a server with 10Tb RAM delivered overnight. Although you might have to sell your firstborn son (offer applies only to royal families) to fund it.
A lot of sharding/no-sql/... development happened in the late 2000-s when computers were about ~100 times less powerful than now. You _could_ get a system with 10Tb RAM in 2010, but as a specially-designed supercomputer.
For most other problems, Postgres works well, or at least well enough.
I really don’t understand why everyone insists that you should use it as a work/message queue.
There are lots of purpose-built bullet proof queuing systems that are simple to setup and administer (or just use SQS).
Your queue is likely to have very different access patterns than the rest of your data, and sticking it in Postgres means you’re probably going to end up setting up partitions or optimizing auto-vacuum on that table way earlier than you probably need to mess around with this things in your scaling.
If your queue has more than a few hundred jobs a day (or you anticipate that like anytime soon), just use a queue.
Because in 99% of cases you don't need a purpose built solution (Even if engineers often think that) at the scale that most people operate in. Nothing is easier to setup and administer than the database you already use.
We are using Postgres as a worker queue in production for many years, with millions of items being processed at any time and it's been perfectly fine. If you have hundreds like in your example...might as well use sqlite.
There's great projects like https://github.com/NikolayS/pgque and https://lucumr.pocoo.org/2026/4/4/absurd-in-production/ that give you even some tooling around that.
What about the filesystem you already use? 99% of projects don't need a relational database at all.
Using Postgres gives you transactions and consistency if you have to restore from a backup. Most of the time this doesn't matter (and is a liability) and you can just use some external queue system but sometimes it does matter.
I'm thinking of the problem as using a small amount of text to represent the work that needs to be done and then using a postgres table where some entries are being added as work that needs to be done, and then a worker is pulling the rows of work out of that table, and maybe putting a completion message somewhere in postgres. I'll concede that is more transient data than probably most of the other tables, it might benefit from vacuuming more often. Does the autovacuuming system not figure out it needs to run more often and do it?
Wouldn't the issue would be more overall queries per second, the amount of writes you're already doing, and the general load on the database. We just added some audit tables that are quickly growing to millions of rows, and it seems like Postgres isn't even breaking a sweat. I'm mostly spit balling here and probably glossing over some details.
But, like you said SQS is pretty easy too.
A db can be performant, but at a certain point the global locks incremental primary keys create just strangle throughput. What makes a good db design normal form, is almost guaranteed to be inefficient at scale sooner or later. =3
You have pushed your entire infrastructure into a single failure domain, for one thing. You make it certain that if your database fails, EVERYTHING fails.
In addition, there is resource contention and workload variability. As you start to push your postgres instance, all the workloads hitting your database are going to be competing for resources. While postgres itself is good at parallelizing the work it is doing, all that work is still going to be hitting the same database, and competing for the same kcache. Your entire infrastructure might degrade in performance at the same time.
Any issue with one component can cascade very easily if they all share the same database. If your login functionality has a bug and is creating churn on your database, it can lock everything.
With multiple databases, you have a much smaller blast radius when you do database operations. You isolate your workloads and can independently scale them.
Admittedly, all of these issues occurred at a place that had high traffic and high availability requirements. Honestly, though, if your load is so low that you never feel infrastructure pressure, it probably doesn't matter what strategy you use.
I had always thought that trading firms might be some of the sole exceptions of over-engineering as I had went into the rabbit-holes of FPGA's and found that some of the use-cases of that technology was trading firms using it to literally shave off even a few milliseconds and so the amount of optimization there
So I am a bit curious how postgres/some dotnet services and SQS queues is able to create a trading firm and scale it and even sell it for 140M$. It feels like my understanding of the situation was a bit uncertain then but could you please elaborate more if possible?
Why do I need to push Postgres to its limits before using a different solution? Throwing a hosted Redis in front of some hot-path API calls is very straightforward and easier to reason about than materialized views or UNLOGGED tables.
You already have a connection string to your database with a password or authn/z with your cloud provider. If this is a "serious" application, you have backups, monitoring, user roles, pgbouncer, partitioning, and other Postgres-specific things to think about. With just a little bit of care, you can make whatever queries you are running fast enough to not need redis.
But ok, you think adding redis is going to solve your performance problem because you can just cache API responses in redis instead of hitting the DB. Maybe, but now you have to think about cache invalidation, eviction behavior, sizing the redis instance, another set of authn/z roles to think about, and of course more cost.
I realize we're speaking past each other, but IME Postgres will work well into the terabyte range and if you can't tune your database setup for performance then reaching for cache is a form of premature optimization.
For redis, it seems there's no "out of the box" way to take some data from my DB and cache it. It seems it needs to be hand rolled per query you're optimising, you lose any structural link to the source data (redis doesn't know about my table structure), and now I have another service I need to worry about. Or is it much easier nowadays than I am thinking?
cached_val = redis.get(MY_CACHED_KEY);
if (cached_val) return cached_val;
db_val = pg.select(...);
redis.set(MY_CACHED_KEY, db_val, ttl=60s);
return db_val;
Imo this is much easier to grok/reason about than materialized views, and has the added benefits of: it is much faster than your db query if cache is set, there is some amount of robustness if your db is temporarily unavailable, you don't need to worry about "is my cache being backed up and costing me money because its part of my persistent db", etc
Basically, the point is: if you are using some hyperscaler to host your stuff (most people), then you already have trivial access to other services beyond postgres, and shoving everything into postgres might not actually be easier than using things that were purpose-built for your problems-that-are-not-actually-the-shape-of-a-persistent-relational-db.
Fwiw I love Postgres.
Here's Malcom Gladwell discussing spagetti sauce which feels oddly relevant: https://youtu.be/iIiAAhUeR6Y?si=UJUUiF6H0j6IY3lL