Posted by charleshn 12/20/2025

What Does a Database for SSDs Look Like? (brooker.co.za)
148 points | 121 comments
firesteelrain 12/20/2025
At first glance this reads like a storage interface argument, but it's really about media characteristics. SSDs collapse the random vs sequential gap, yet most DB engines still optimize for throughput instead of latency variance and write amplification. That mismatch is the interesting part.
Havoc 12/20/2025
I'm a little surprised enterprise isn't sticking to Optane for this. It's EoL tech at this point, but it'll still smoke top-of-the-line NVMe drives at low queue depth (QD1), which I'd think you'd want for some databases.
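Queue depth 1 is easy to probe directly: issue one read at a time and wait for it. A minimal sketch, assuming Linux and Python 3.7+; the path is a placeholder for a large file on the drive under test:

    # QD1 random-read latency probe: one outstanding 4K read at a time.
    # O_DIRECT bypasses the page cache and needs an aligned buffer,
    # which an anonymous mmap provides.
    import mmap, os, random, time

    PATH = "/mnt/ssd/testfile"   # placeholder: large file on the test drive
    BLOCK = 4096
    COUNT = 10_000

    fd = os.open(PATH, os.O_RDONLY | os.O_DIRECT)
    size = os.lseek(fd, 0, os.SEEK_END)
    buf = mmap.mmap(-1, BLOCK)                # page-aligned buffer

    lat_us = []
    for _ in range(COUNT):
        off = random.randrange(size // BLOCK) * BLOCK   # aligned offset
        t0 = time.perf_counter()
        os.preadv(fd, [buf], off)             # QD1: wait for each read
        lat_us.append((time.perf_counter() - t0) * 1e6)
    os.close(fd)

    lat_us.sort()
    print(f"p50 {lat_us[len(lat_us) // 2]:.1f} us")
    print(f"p99 {lat_us[int(len(lat_us) * 0.99)]:.1f} us")

Optane's edge shows up exactly in those p50/p99 numbers, since NAND hides latency behind parallelism that QD1 never exploits.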
hyperman1 12/20/2025
Postgres allows you to choose a different page size, but only at compile time (--with-blocksize), not at initdb time. The default is 8K. I've always wondered if 32K wouldn't be a better value, and this article points in the same direction.
taffer 12/20/2025
On the other hand, smaller pages mean that more pages can fit in your CPU cache. Since CPU speed has improved much more than memory bus speed, and since cache is a scarce resource, it is important to use your cache lines as efficiently as possible.

Ultimately, it's a trade-off: larger pages mean faster I/O, while smaller pages mean better CPU utilisation.
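To put rough numbers on the I/O side of that trade-off, here is a back-of-the-envelope B-tree sketch; the ~16 bytes per separator key plus child pointer and the billion-row table are assumptions for illustration:

    # How page size changes B-tree fanout, height, and bytes per lookup.
    import math

    ENTRY_BYTES = 16        # assumed separator key + child pointer size
    ROWS = 1_000_000_000    # assumed table size

    for page in (8 * 1024, 32 * 1024):
        fanout = page // ENTRY_BYTES
        height = math.ceil(math.log(ROWS, fanout))
        print(f"{page // 1024}K pages: fanout {fanout}, height {height}, "
              f"~{height * page // 1024}K read per point lookup")

With these numbers, 32K pages shave one level off the tree (3 instead of 4) but triple the bytes touched per point lookup (96K vs 32K), which is the cache-efficiency cost described above.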

didgetmaster 12/20/2025
Back in college (for me, the 80s), I learned that storing table data in rows would greatly increase performance due to the high seek times of hard disks. SELECT * FROM table WHERE ... could read the entire row in a single seek. This was very valuable when your table had 100 columns.

However, a different query (e.g. SELECT name, phone_number FROM table) might result in fewer seeks if the data is stored by column instead of by row.

The article seems to address data structures only with respect to indexes, not the actual table data itself.
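The difference is easy to see with toy arithmetic; the table shape below (a million rows, 100 columns, 8 bytes per value) is made up for illustration:

    # Bytes touched by row-major vs column-major layouts.
    ROWS, COLS, VAL = 1_000_000, 100, 8

    # SELECT * FROM table WHERE id = ?  -> row store reads one contiguous row
    print(COLS * VAL, "bytes, one seek (row store, point lookup)")

    # SELECT name, phone_number FROM table  -> scan just two columns
    print(2 * ROWS * VAL // 10**6, "MB scanned (column store)")
    print(ROWS * COLS * VAL // 10**6, "MB scanned (row store)")

The row store wins the wide point lookup (one 800-byte read) and the column store wins the narrow scan (16 MB instead of 800 MB), which is the seek-count argument in a nutshell.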

ksec 12/20/2025
It may be worth pointing out that the highest-capacity EDSFF drives currently offer ~8PB in 1U. That is 320PB per rack, and current roadmaps reach 1000+ PB, i.e. 1EB, per rack within 10 years.

Designing a database for SSDs would still go a very, very long way before we reach what I think the author is suggesting, which is designing for the cloud or the datacenter.
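For what it's worth, the per-rack figure checks out under an assumed ~40 storage units per rack:

    PB_PER_U = 8                  # from the comment above
    STORAGE_U = 40                # assumption for a 42U rack
    print(PB_PER_U * STORAGE_U)   # 320 PB per rack today
    # A bit over 3x density growth reaches the 1000+ PB (1 EB) figure.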

dbzero 12/20/2025
Please give dbzero a try. It eliminates the database from the developer's stack completely by replacing it with the DISTIC memory model (durable, infinite, shared, transactional, isolated, composable). It's built for the SSD/NVMe drive era.
danielfalbo 12/20/2025
Reminds me of: Databases on SSDs, Initial Ideas on Tuning (2010) [1]

[1] https://www.dr-josiah.com/2010/08/databases-on-ssds-initial-...

sscdotopen 12/20/2025
Umbra: A Disk-Based System with In-Memory Performance, CIDR'20

https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf

cmrdporcupine 12/20/2025
Yep, and the work on https://www.cs.cit.tum.de/dis/research/leanstore/ that preceded it.

And CedarDB (https://cedardb.com/), the more commercialized product that is following up on some of this research, including employing many of the key researchers.

gethly 12/20/2025
> I’d move durability, read and write scale, and high availability into being distributed

So, essentially just CQRS, which is usually handled at the application level with event sourcing and similar techniques.
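For anyone who hasn't seen the pattern, a minimal event-sourcing/CQRS sketch; the names and the bank-balance example are invented for illustration:

    # Write side: an append-only event log carries durability.
    # Read side: a view projected from the log answers queries.
    from dataclasses import dataclass, field

    @dataclass
    class Event:
        account: str
        delta: int

    @dataclass
    class EventLog:
        events: list = field(default_factory=list)

        def append(self, event):          # in practice: fsync'd/replicated
            self.events.append(event)

    class BalanceView:
        def __init__(self):
            self.balances = {}

        def apply(self, event):           # projection, often asynchronous
            self.balances[event.account] = (
                self.balances.get(event.account, 0) + event.delta)

    log, view = EventLog(), BalanceView()
    for e in (Event("alice", 100), Event("alice", -30)):
        log.append(e)                     # command path
        view.apply(e)                     # projection path
    print(view.balances["alice"])         # query path -> 70

The split maps onto the quote above: durability and write scale live in the log, while read scale and availability live in the projections, which can be rebuilt or replicated independently.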

ritcgab 12/20/2025
SSDs are largely black boxes: the FTL adds another layer of indirection, and FTLs are mostly proprietary and vendor-specific, so SSD performance doesn't generalize across devices.
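To make "another layer of indirection" concrete, here is a toy FTL sketch; real FTLs are proprietary and vastly more complex, and everything below is invented for illustration:

    # NAND pages can't be overwritten in place, so the FTL maps logical
    # pages to physical pages; every overwrite goes out-of-place and
    # leaves a stale page behind for garbage collection.
    class ToyFTL:
        def __init__(self, physical_pages):
            self.pages = [None] * physical_pages
            self.mapping = {}                  # logical -> physical
            self.free = list(range(physical_pages))
            self.stale = set()                 # awaiting GC / erase

        def write(self, logical, data):
            if logical in self.mapping:
                self.stale.add(self.mapping[logical])  # old copy is garbage
            phys = self.free.pop()
            self.pages[phys] = data
            self.mapping[logical] = phys

        def read(self, logical):
            return self.pages[self.mapping[logical]]   # extra hop per read

    ftl = ToyFTL(physical_pages=8)
    ftl.write(0, b"v1")
    ftl.write(0, b"v2")                # overwrite relocates the page
    print(ftl.read(0), ftl.stale)      # b'v2' and one stale physical page

When the firmware garbage-collects those stale pages is exactly what you can't see from outside, which is where the latency variance between drives comes from.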