
Posted by davidgu 2 days ago

Postgres Postmaster does not scale (www.recall.ai)
128 points | 90 comments
truekonrads 1 day ago|
Cool debugging, but… 1) if you have very spiky loads (on the hour) and you can distribute them a little, it's obvious that this will be a good thing. 2) they had the answer all along in their telemetry. Sometimes wisdom beats effort.
j16sdiz 1 day ago||
>... we run an unusual workload

Yeah, right. Just make up some reason for not following best practices.

atherton94027 2 days ago||
I'm a bit confused here, do they have a single database they're writing to? Wouldn't it be easier and more reliable to shard the data per customer?
hinkley 2 days ago||
When one customer is 50 times bigger than your average customer, sharding doesn't do much.
BatteryMountain 2 days ago|||
Combination of partitioning + sharding perhaps? Often it is only a handful of tables that grow large (even fewer for a single large customer), so sharding that customer out and then partitioning the data by a common/natural boundary should get you 90% there. The majority of data can be partitioned, and it doesn't have to be by date - it pays dividends to go sit with the data and reflect on what is being stored, its read/write pattern and its overall shape, to determine where to slice the partitions best. Sometimes splitting a wide table into two or three smaller tables can work if your joins aren't too frequent or complex. It can also help if you can determine which rows are hot or cold, so you move the colder/hotter data to separate tables to speed up reads/writes. There are always opportunities for storage optimization on large datasets, but it does take time & careful attention to get it right.
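A minimal sketch of the partition-plus-hot/cold idea above, assuming a hypothetical meeting_events table and using psycopg2 to issue standard Postgres declarative-partitioning DDL; a real schema would pick the partition key from the actual read/write pattern rather than this guess:

    import psycopg2

    # Hash-partition the large table by customer so one big tenant is spread
    # across several physical tables; cold rows get a plain archive table.
    DDL = """
    CREATE TABLE meeting_events (
        customer_id bigint      NOT NULL,
        created_at  timestamptz NOT NULL,
        payload     jsonb
    ) PARTITION BY HASH (customer_id);

    CREATE TABLE meeting_events_p0 PARTITION OF meeting_events
        FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    CREATE TABLE meeting_events_p1 PARTITION OF meeting_events
        FOR VALUES WITH (MODULUS 4, REMAINDER 1);
    CREATE TABLE meeting_events_p2 PARTITION OF meeting_events
        FOR VALUES WITH (MODULUS 4, REMAINDER 2);
    CREATE TABLE meeting_events_p3 PARTITION OF meeting_events
        FOR VALUES WITH (MODULUS 4, REMAINDER 3);

    -- Colder data moves here so the hot partitions stay small.
    CREATE TABLE meeting_events_archive (LIKE meeting_events);
    """

    with psycopg2.connect("dbname=app") as conn, conn.cursor() as cur:
        cur.execute(DDL)

Range-partitioning by created_at instead would make aging out cold data a cheap DETACH/DROP of whole partitions; which axis wins depends on the shape of the data, as the parent says.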
PunchyHamster 2 days ago|||
It does if you have thousands of customers.
hinkley 1 day ago||
Orders of complexity don’t give a tinker’s damn about how many other customers you have.
atsjie 2 days ago|||
I wouldn't call that "easier" per se.
thayne 2 days ago||
Sharding is often not easy. Depending on the application, it can add significant complexity. For example, what do you do if you have data related to multiple customers? How do you handle customers of significantly different sizes?

And that is assuming you have a solution for things like balancing, and routing to the correct shard.
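For the "routing to the correct shard" part specifically, here is a minimal sketch of hash-based routing in Python (DSNs and table name are hypothetical); rebalancing when the shard count changes, and queries that span customers, are exactly the hard parts that sit outside this happy path:

    import hashlib
    import psycopg2

    # Hypothetical shard map: each DSN points at an independent Postgres instance.
    SHARD_DSNS = [
        "host=pg-shard-0 dbname=app",
        "host=pg-shard-1 dbname=app",
        "host=pg-shard-2 dbname=app",
    ]

    def shard_for(customer_id: int) -> str:
        # Stable hash so a given customer always lands on the same shard.
        digest = hashlib.sha1(str(customer_id).encode()).hexdigest()
        return SHARD_DSNS[int(digest, 16) % len(SHARD_DSNS)]

    def fetch_meetings(customer_id: int):
        # Cross-customer queries would instead need to fan out to every shard.
        with psycopg2.connect(shard_for(customer_id)) as conn, conn.cursor() as cur:
            cur.execute(
                "SELECT id, started_at FROM meetings WHERE customer_id = %s",
                (customer_id,),
            )
            return cur.fetchall()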

atherton94027 1 day ago|||
Presumably sharding is a lot easier than trying to debug lockups in individual postgres processes? It's well known; we've been doing it as an industry for 30+ years.
nextaccountic 1 day ago|||
deja vu

did you comment exactly the same things some months ago?

thayne 1 day ago||
Not that I recall
levkk 2 days ago||
One of the many problems PgDog will solve for you!
eatonphil 2 days ago|
The article addresses this, sort of. I don't understand how you can run multiple postmasters.

> Most online resources chalk this up to connection churn, citing fork rates and the pid-per-backend yada, yada. This is all true but in my opinion misses the forest from the trees. The real bottleneck is the single-threaded main loop in the postmaster. Every operation requiring postmaster involvement is pulling from a fixed pool, the size of a single CPU core. A rudimentary experiment shows that we can linearly increase connection throughput by adding additional postmasters on the same host.

btown 2 days ago|||
You don't need multiple postmasters to spawn connection processes, if you have a set of Postgres proxies each maintaining a set pool of long-standing connections, and parceling them out to application servers upon request. When your proxies use up all their allocated connections, they throttle the application servers rather than overwhelming Postgres itself (either postmaster or query-serving systems).

That said, proxies aren't perfect. https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html outlines some dangers of using them (particularly when you might need session-level variables). My understanding is that PgDog does more tracking that mitigates some of these issues, but some of these are fundamental to the model. They're not a drop-in component the way other "proxies" might be.
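A client-side approximation of that throttling behaviour, as a minimal sketch using psycopg2's built-in pool (connection string hypothetical); a real deployment would put PgBouncer/PgDog in front of Postgres instead, but the effect is similar in that callers fail fast at the pool rather than forcing the postmaster to fork another backend:

    from contextlib import contextmanager
    from psycopg2.pool import ThreadedConnectionPool

    # Bounded pool: at most 20 backend processes, however many app threads run.
    pool = ThreadedConnectionPool(minconn=2, maxconn=20, dsn="dbname=app")

    @contextmanager
    def connection():
        conn = pool.getconn()      # raises PoolError when the pool is exhausted
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            pool.putconn(conn)     # return the connection instead of closing it

    with connection() as conn, conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())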

evanelias 1 day ago|||
> I don't understand how you can run multiple postmasters.

I believe they're just referring to having several completely-independent postgres instances on the same host.

In other words: say that postgres is maxing out at 2000 conns/sec. If the bottleneck actually was fork rate on the host, then having 2 independent copies of postgres on a host wouldn't improve the total number of connections per second that could be handled: each instance would max out at ~1000 conns/sec, since they're competing for process-spawning. But in reality that isn't the case, indicating that the fork rate isn't the bottleneck.
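A rough, single-client sketch of that kind of measurement (ports and DSNs are hypothetical); run one copy per instance in parallel to mirror the two-postmaster comparison, and if each instance holds its single-instance rate, host-level process creation clearly isn't the bottleneck:

    import time
    import psycopg2

    def conns_per_second(dsn: str, seconds: float = 5.0) -> float:
        # Open and immediately close fresh connections: pure connection churn.
        deadline = time.monotonic() + seconds
        count = 0
        while time.monotonic() < deadline:
            psycopg2.connect(dsn).close()
            count += 1
        return count / seconds

    # Two completely independent Postgres instances on the same host.
    print("5432:", conns_per_second("host=localhost port=5432 dbname=postgres"))
    print("5433:", conns_per_second("host=localhost port=5433 dbname=postgres"))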

eatonphil 1 day ago||
That makes sense, thanks.
vivzkestrel 2 days ago||
Very stupid question: similar to how we got a GIL replacement in Python, can't we replace the postmaster with something better?
lfittl 2 days ago|
Specifically on the cost of forking a process for each connection (vs using threads), there are active efforts to make Postgres multi-threaded.

Since Postgres is a mature project, this is a non-trivial effort. See the Postgres wiki for some context: https://wiki.postgresql.org/wiki/Multithreading

But I'm hopeful that 2-3 years from now we'll see this come to fruition. The recent asynchronous read I/O improvements in Postgres 18 show that Postgres can evolve; one just needs to be patient, potentially help contribute, and find workarounds (connection pooling, in this case).

jabl 2 days ago||
Would be nice if the OrioleDB improvements were incorporated into postgresql proper some day... https://www.slideshare.net/slideshow/solving-postgresql-wick...
iamleppert 1 day ago||
Why do you need a connection to a database during the meeting? Doesn't it make more sense to record the meeting data to some local state first, and then serialize it to the database at the end of the meeting, or when a database connection is available? Or better yet, have a lightweight API service that can be scaled horizontally, is responsible for talking to the database, and maintains its own pool of connections.

They probably don't even need a database anyway for data that is likely write once, read many. You could store the JSON of the meeting in S3 (see the sketch below). It's not like people are going back in time and updating meeting records. It's more like a log file, and logging systems and data structures should be enough here. You can then take that data and ingest it into a database later, or into some kind of search system, vector database, etc.

Database connections are designed this way on purpose; it's why connection pools exist. This design is suboptimal.
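A minimal sketch of the "buffer locally, write once at the end" idea with boto3 (bucket name and key layout are hypothetical); nothing here holds a database connection while the meeting is running:

    import json
    import boto3

    s3 = boto3.client("s3")

    def flush_meeting(meeting_id: str, events: list[dict]) -> None:
        # One immutable, write-once JSON object per meeting; ingestion into
        # Postgres, a search index, or a vector store can happen later.
        s3.put_object(
            Bucket="meeting-recordings",          # hypothetical bucket
            Key=f"meetings/{meeting_id}.json",
            Body=json.dumps(events).encode("utf-8"),
            ContentType="application/json",
        )

    # Accumulate events in memory (or on local disk) during the call, then flush once.
    flush_meeting("abc123", [{"t": 0.0, "speaker": "alice", "text": "hello"}])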

xyzzy_plugh 1 day ago|
It took me a long time to realize this, but yes, asking people to just open and write to files (or S3) is in fact asking a lot.

What you describe makes sense, of course, but few can build it without it being drastically worse than abusing a database like postgres. It's a sad state of affairs.

moomoo11 2 days ago||
Maybe this is silly, but these days cloud resources are so cheap. Just loading up instances, putting this stuff into memory, and processing it is so fast and scalable. Even if you have billions of things to process daily, you can just split the work if needed.

You can keep things synced across databases easily and keep it super duper simple.

carshodev 2 days ago||
Yeah, you can get an AMD 9454P with 1TB of memory and 20TB of redundant NVMe storage for like $1000 a month; it's crazy how cheap compute and storage are these days.

If people are building things which actually require massive amounts of data stored in databases they should be able to charge accordingly.

sgt 2 days ago||
It's not really my experience that cloud resources are very cheap.
PunchyHamster 2 days ago|||
They are expensive compared to buying a server and running it 24/7.

They are cheap if you use a tiny fraction of a server for $20/mo, or have 50 engineers working on the code.

moomoo11 1 day ago|||
I guess I'm talking about a company actually making money.

I would much rather spend $5k per month to make $1 million, keeping things extremely simple.

parentheses 2 days ago|
I think this is the kind of investigation that AI can really accelerate. I imagine it did. I would love to see someone walk through a challenging investigation assisted by AI.