Do you even need a database?

Posted by upmostly 4 days ago

Do you even need a database? (www.dbpro.app)

289 points | 293 comments

ghc 3 days ago

I'm so old I remember working on databases that were designed to use RAW, not files. I'm betting some databases still do, but probably only for mainframe systems nowadays.

bob1029 3 days ago

https://docs.oracle.com/cd/B16276_01/doc/win.102/b14305/arch...

ghc 3 days ago

> Oracle® Database Platform Guide 10g Release 2 (10.2) for Microsoft Windows Itanium (64-Bit)

Well, I guess that at least confirms Oracle on Itanium (!?) still supported RAW 5 years ago.

I'm guessing everyone's on ASM by now though, if they're still upgrading. I ran into a company not long ago with a huge Oracle cluster that still employed physical database admins and logical database admins as separate roles... I would bet they're still paying millions for an out-of-date version of Oracle and using RAW.

evanelias 3 days ago

> still supported RAW 5 years ago

I seem to remember Oracle 10g was first released over 20 years ago? It has been EOL for much longer than 5 years...

ghc 2 days ago

Oh you're right! I was looking at the last documentation update timestamp, but the original release was 2006. That makes a lot more sense than Itanium support in 2021.

rglover 3 days ago

A few months back I decided to write an embedded db for my firm's internal JS framework. Learned a lot about how/why databases work the way they do. I use stuff like reading memory cached markdown files for static sites, but there are certain things that a database gives you (chief of which for me was query ergonomics—I loved MongoDB's query language but grew too frustrated with the actual runtime) that you'll miss once you move past a trivial data set.

I think a better way to ask this question is "does this application and its constraints necessitate a database? And if so, which database is the correct tool for this context?"

tracker1 3 days ago

For me, I just wish MongoDB had scaling options closer to how Elastic/Cassandra and other horizontally scalable databases work, in that the data is sharded in a circle with redundancy metrics... as opposed to Mongo, which afaik is still limited to either sharding or replication (or layers of them). FWIW, I wish that RethinkDB had seen more attention and success, and for that matter I might be more inclined to use CockroachDB over Mongo, where I can get some of the scaling features while still being able to have some level of structured data.

orthogonal_cube 3 days ago

SQLite did decently well but I think they should’ve done an additional benchmark with the database loaded completely into memory.

Since they’re using Go to accept requests and forwarding them to their SQLite connection, it may have been worthwhile to produce the same interface with Rust to demonstrate whether or not SQLite itself was hitting its performance limit or if Go had some hand in that.

Other than that, it’s a good demonstration of how a custom solution for a lightweight task can pay off. Keep it simple but don’t reinvent the wheel if the needs are very general.
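The in-memory variant suggested above could be sketched roughly like this. It is only an illustration in Python, not the article's Go benchmark: the `users` table, the row counts, and the helper names are all made up here. It copies an on-disk database into a `:memory:` connection with the standard library's `Connection.backup` and times the same primary-key lookups against both.

```python
import os
import sqlite3
import tempfile
import time


def make_db(path, rows):
    """Create a small on-disk database with a hypothetical `users` table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        ((i, f"user-{i}") for i in range(rows)),
    )
    conn.commit()
    return conn


def load_into_memory(disk_conn):
    """Copy the whole database into RAM using SQLite's online backup API."""
    mem_conn = sqlite3.connect(":memory:")
    disk_conn.backup(mem_conn)
    return mem_conn


def time_lookups(conn, rows, n=10_000):
    """Run n primary-key lookups and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        conn.execute("SELECT name FROM users WHERE id = ?", (i % rows,)).fetchone()
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        disk = make_db(os.path.join(tmp, "bench.db"), rows=1000)
        mem = load_into_memory(disk)
        print(f"disk:   {time_lookups(disk, 1000):.4f}s")
        print(f"memory: {time_lookups(mem, 1000):.4f}s")
        disk.close()
        mem.close()
```

For a read-only workload that already fits in the OS page cache the two numbers may end up close, which would itself be useful to know when deciding where the bottleneck is.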

waldrews 3 days ago

File systems are nice if you need to do manual or transparent script-based manipulations. Like 'oh hey, I just want to duplicate this entry and hand-modify it, and put these others in an archive.' Or use your OS's access control and network sharing easily with heterogeneous tools accessing the data from multiple machines. Or if you've got a lot of large blobs that aren't going to get modified in place.

What the world needs is a hybrid - database ACID/transaction semantics with the ability to cd/mv/cp file-like objects.

gavinray 3 days ago

Not to nitpick, but it would be interesting to see profiling info of the benchmarks.

Different languages and stdlib methods can often spend time doing unexpected things that make what look like apples-to-apples comparisons not quite equivalent.

matja 3 days ago

If you think files are easier than a database, check out https://danluu.com/file-consistency/
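To give a sense of how much ceremony even a single "safe" file update takes, here is a minimal sketch of the common write-to-temp, fsync, rename pattern in Python. The function name `atomic_write` is made up for this example, and the sketch assumes a POSIX-style filesystem where rename is atomic; it does not cover every failure mode (concurrent writers, filesystems with weaker guarantees, and so on).

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push file contents to stable storage
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash (POSIX only).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

That is roughly twenty lines to update one file, with no transactions across files, which is much of what a database handles for you.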

inasio 3 days ago

There's a whole thing these days about building solvers (e.g. SAT or Ising) out of exotic hardware that does compute in memory. A while back I wondered if one could leverage distributed DB logic to build solvers for massive problems, something like compute in DB.

a34729t 3 days ago

I sympathize with this so hard. I frequently conduct system design interviews where the problem could easily be handled on a single machine with a flat file, let alone SQLite. Only the rare candidate mentions this; mostly I get a horde of microservices and queues and massive distributed databases that are totally unnecessary.

pdimitar 3 days ago

I mostly agree but I had plenty of cases where the project / team was forced to reinvent a query engine over flat files and/or in-memory caches.

From that POV I moved from the extreme of "you don't need state at all (and hence no database)" to the somewhat more moderate "you usually don't need a file or a DB, but you almost certainly will want to query whatever state you store, so just get a DB early".

I strongly sympathize with "no microservices" of course. That's overkill for at least 99% of all projects I've ever seen (was a contractor for a long time). But state + querying is an emergent property as a lot of projects move beyond the prototype phase.

thutch76 3 days ago

I love reading posts like these.

I will still reach for a database 99% of the time, because I like things like SQL and transactions. However, I've recently been working on a 100% personal project to manage some private data; extracting insights, graphing trends, etc. It's not high volume data, so I decided to use just the file system, with data backed by YAML files, with some simple indexing, and I haven't run into any performance issues yet. I probably never will at my scale and volume.

In this particular case having something that was human readable, and more importantly diffable, was more valuable to me than outright performance.

Having said that, I will still gladly reach for a database with a query language and all the guarantees that come with it 99% of the time.
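For anyone curious what "files plus some simple indexing" can look like, here is a rough sketch, not the actual project. It uses JSON from the Python standard library as a stand-in for YAML (which would need a third-party parser), and the `FileStore` class, its methods, and the field names are all invented for the example. It keeps one human-readable, diffable file per record and an in-memory index on a single field whose values are assumed to be plain strings or numbers.

```python
import json
import os
from collections import defaultdict


class FileStore:
    """One JSON file per record, plus an in-memory index on a single field."""

    def __init__(self, root, index_field):
        self.root = root
        self.index_field = index_field
        self.index = defaultdict(set)  # field value -> set of record ids
        os.makedirs(root, exist_ok=True)
        self.reindex()

    def _path(self, record_id):
        return os.path.join(self.root, f"{record_id}.json")

    def reindex(self):
        """Rebuild the index by scanning every record file on disk."""
        self.index.clear()
        for name in os.listdir(self.root):
            if name.endswith(".json"):
                record_id = name[: -len(".json")]
                record = self.get(record_id)
                self.index[record.get(self.index_field)].add(record_id)

    def put(self, record_id, record):
        """Write a record and keep the index in sync, dropping any stale entry."""
        if os.path.exists(self._path(record_id)):
            old = self.get(record_id)
            self.index[old.get(self.index_field)].discard(record_id)
        # sort_keys and indent keep the files stable and diff-friendly.
        with open(self._path(record_id), "w", encoding="utf-8") as f:
            json.dump(record, f, indent=2, sort_keys=True)
        self.index[record.get(self.index_field)].add(record_id)

    def get(self, record_id):
        with open(self._path(record_id), encoding="utf-8") as f:
            return json.load(f)

    def find(self, value):
        """Return all records whose indexed field equals `value`."""
        return [self.get(rid) for rid in sorted(self.index.get(value, ()))]
```

Usage would be along the lines of `store = FileStore("data", "category")`, then `store.put("2024-01-rent", {"category": "rent", "amount": 900})` and `store.find("rent")`. The index is rebuilt from disk on startup, so hand-edited files are picked up the next time the store is opened.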
