Posted by malgamves 14 hours ago

Files are the interface humans and agents interact with(madalitso.me)
167 points | 103 comments
TacticalCoder 10 hours ago|
As TFA basically says: files on a filesystem are a DB. Just a very crude one. There aren't nice indexes for a variety of things. "Views" are not really there (arguably you can create different views with links, but that is, once again, very crude). But it's definitely a DB, represented as a tree, as TFA mentions.
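To make the "views with links" idea concrete, here is a minimal sketch in Python. It assumes a POSIX-style filesystem where symlinks are allowed; `build_view` and `by_extension` are hypothetical names made up for illustration, not something from TFA.

```python
import os
from pathlib import Path


def build_view(source_dir, view_dir, key):
    """Create view_dir/<key(file)>/<name> symlinks pointing at files in source_dir.

    The originals are never moved or copied; the "view" is only a tree of links.
    view_dir is assumed to live outside source_dir.
    """
    source_dir, view_dir = Path(source_dir), Path(view_dir)
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        bucket = view_dir / key(path)
        bucket.mkdir(parents=True, exist_ok=True)
        link = bucket / path.name
        # Crude on purpose: if two files share a name, only the first gets a link.
        if not (link.is_symlink() or link.exists()):
            os.symlink(path.resolve(), link)


def by_extension(path):
    """Hypothetical grouping key: the file extension without the dot."""
    return path.suffix.lstrip(".").lower() or "no-extension"
```

Calling `build_view("photos", "views/by-type", by_extension)` would give one directory per extension. It also shows why this is crude: nothing keeps the view up to date when the source changes.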

My life's data, including all the official stuff (bank statements, notary acts, statements made to the police [witness, etc.], insurance, property titles), all my coding projects, all the family pictures (not just the ones I took) and all the stuff I forgot, is in files, not in a dedicated DB. But these files are definitely a database.

And because I don't want to deal with data corruption, and even less with syncing now-corrupted data, many of my files contain a partial cryptographic checksum in their filename. E.g. "dsc239879879.jpg" becomes "dsc239879879-b3-6f338201b7.jpg" (meaning the Blake3 hash of that file has to begin with 6f338201b7 or the file is corrupted).
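A rough sketch of what checking such filenames could look like in Python. Two loud assumptions: the `<name>-b3-<hex prefix>.<ext>` pattern is inferred from the single example above, and the default hash here is `hashlib.blake2b`, a standard-library stand-in that is NOT BLAKE3. To match real BLAKE3 prefixes you would pass a BLAKE3 constructor as `new_hasher` (for instance from the third-party `blake3` package, assuming it offers the usual `update`/`hexdigest` interface).

```python
import hashlib
import re
from pathlib import Path

# Naming scheme inferred from the example: <stem>-b3-<hex prefix>.<ext>
TAGGED_NAME = re.compile(r"^(?P<stem>.+)-b3-(?P<prefix>[0-9a-f]+)(?P<ext>\.[^.]+)$")


def file_digest_hex(path, new_hasher=hashlib.blake2b):
    """Hash a file in 1 MiB chunks. The blake2b default is a stand-in, not BLAKE3."""
    hasher = new_hasher()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def tag_filename(path, prefix_len=10, new_hasher=hashlib.blake2b):
    """Return the name with a hash prefix embedded (the file is not renamed)."""
    path = Path(path)
    prefix = file_digest_hex(path, new_hasher)[:prefix_len]
    return f"{path.stem}-b3-{prefix}{path.suffix}"


def is_intact(path, new_hasher=hashlib.blake2b):
    """True if the hash prefix in the filename matches the file's current content."""
    match = TAGGED_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"no embedded checksum in {str(path)!r}")
    return file_digest_hex(path, new_hasher).startswith(match["prefix"])
```

The point of the scheme is that the check needs nothing but the file itself: no sidecar manifest, no database, and it survives a copy to any medium.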

At any time, if I want to, I can import these into "real" dedicated DBs. For example I can pass my pictures as read-only to "I'm Mich" (immich) and then query my pictures: "Find me all the pictures of Eliza" or "Find me all the pictures taken in 2016 on the French Riviera".

But the real database of all my life is and shall always be files on a filesystem.

With a "real" database, a backup can be as simple as a dump. With files, backing up involves... making sure you keep a proper version of all your files.

I'd say files are even more important than the filesystem: a backup on a Blu-ray disc or on an ext4-formatted SSD or on an exFAT-formatted SSD or on a tape... doesn't matter: the files are the data.

A filesystem is the first "database" with these data: a crude one, with only simple queries. But a filesystem is definitely a database.

The main advantage of this very simple database is that as long as the files are accessible, you know your data is safe, and you can always use it to populate more advanced databases if needed.

euroderf 6 hours ago||
It's not "crude" if you get hierarchical organization without having to screw around with RECURSIVE, or "closure this" and "closure that". It just works.
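For anyone who hasn't had to do it the SQL way, a small side-by-side sketch in Python. The `nodes(id, parent_id, name)` adjacency-list table is a hypothetical schema picked for illustration; the filesystem half is just `pathlib`.

```python
import sqlite3
from pathlib import Path


def descendants_fs(root):
    """Everything below root, as POSIX-style paths relative to root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


def descendants_sql(conn, root_id):
    """Same question against a hypothetical table nodes(id, parent_id, name)."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, path) AS (
            -- direct children of the root
            SELECT id, name FROM nodes WHERE parent_id = ?
            UNION ALL
            -- then children of anything already found
            SELECT n.id, s.path || '/' || n.name
            FROM nodes AS n JOIN subtree AS s ON n.parent_id = s.id
        )
        SELECT path FROM subtree
        """,
        (root_id,),
    )
    return sorted(row[0] for row in rows)
```

Both return the same list for the same tree. The filesystem version is one line because the hierarchy is already built into the path.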
rzerowan 7 hours ago|||
Were it more portable, BeOS/Haiku's BeFS would have been a perfect fit in this instance, seeing that it is a filesystem that has database properties via extended attributes[1] and indexing.

Were Haiku more mature/stable, it would have been a nice fit as the OS for LLM/AI personal use cases.

[1] https://arstechnica.com/information-technology/2018/07/the-b...

ciupicri 8 hours ago|||
Why Blake3 and not say XXH3 64/128 bits (https://xxhash.com/)?
heavyset_go 8 hours ago||
You can get views by using namespaces/cgroups
istillwritecode 8 hours ago||
Except android and iOS are both trying to keep you away from your own files.
Gigachad 3 hours ago|
Kind of? iOS does have a file manager which explicitly shows you your own files. They just made a separation between OS/program files and the user's own files. What killed files more was cloud programs where multiple users can edit at the same time, which required a system more sophisticated than syncing a file.
jnsaff2 7 hours ago||
Here’s me getting excited that a new file system is being developed but alas, just talk about text files.
galsapir 10 hours ago||
nice, esp. liked - "our memories, our thoughts, our designs should outlive the software we used to create them"
SoftTalker 5 hours ago|
Weird. My memories and thoughts are not created by software.
jonstewart 9 hours ago||
It reminds me a lot of Hans Reiser’s original white paper, which can be found at https://web.archive.org/web/20070927003401/http://www.namesy.... Add some embeddings and boom.
fogzen 5 hours ago||
Does this really have to do with file systems? Replacing RAG/context stuffing with tool calls for data access seems like the actual change. Whether the tool call is backed by a file system or DB or whatever shouldn’t matter, right?
naaqq 10 hours ago||
This article said some things I couldn’t put into words about different AI tools. Thanks for sharing.
BoredPositron 8 hours ago||
I revived my Johnny Decimal system as my single source of truth for almost anything and couldn't be happier. The filing is done mostly by agents now but I still have the overview myself.
ciupicri 8 hours ago|
Could you give us more details about your system?
rafaepta 9 hours ago||
Great read. Thanks for sharing
staplung 6 hours ago|
Not knocking the article in any way but from the headline I was expecting - perhaps hoping - this would be about some innovation in filesystems research like it was the 90's again. That's not what this is.

It's about how filesystems as they are (and have been for decades) are proving to be powerful tools for LLMs/agents.

alecco 4 hours ago||
And by filesystem they mean CLI (command line interface) and a full *nix system. Like the hundreds of similar articles about it for the past year said.
Gigachad 3 hours ago|||
I feel like every article on HN now disguises itself as interesting but the content is just the same boring AI slop.
palata 3 hours ago||
I have been reading HN for a few years, and my feeling is that I find fewer and fewer interesting articles. Maybe it's just me, and the average articles are the same quality.

Now I tend to skim through it to see if a title looks like it may bring interesting discussions, and then I skim through the discussions. Because there are very knowledgeable people who sometimes share valuable insights.

Interestingly, the last time I asked a question, hoping to get interesting people to share insights, I was told that I "should learn how to use an LLM instead of asking questions" :-).

fragmede 5 hours ago|||
Yeah, none of it was really about file systems. There was a brief mention that file systems look like a graph, and that you build roughly an index so it looks like a graph and thus database-y, but for all the detail about file systems this post went into, you could store it all in a SQLite database with a column called filename and a column called content. I too was expecting something more in-depth about file systems; for instance, cluster file systems have made little to no advancement. ZFS is not a cluster file system, and we've been needing a good one of those for decades, ever since VMs became feasible on consumer-grade hardware. Still, files on disk are better than having to pay Oracle a fee per skill on today's modern, open Internet. That was never going to happen.
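For the curious, the two-column version really is about this small. A sketch in Python with the standard `sqlite3` module; the table name `files` and the helper names are made up here, only the `filename` and `content` columns come from the comment above.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) a store with one row per 'file'."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "filename TEXT PRIMARY KEY, content BLOB NOT NULL)"
    )
    return conn


def write_file(conn, filename, content):
    """Create or overwrite a file; content is bytes."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO files (filename, content) VALUES (?, ?)",
            (filename, content),
        )


def read_file(conn, filename):
    """Return the stored bytes, or raise FileNotFoundError like open() would."""
    row = conn.execute(
        "SELECT content FROM files WHERE filename = ?", (filename,)
    ).fetchone()
    if row is None:
        raise FileNotFoundError(filename)
    return row[0]


def list_files(conn, prefix=""):
    """Poor man's 'ls -R': every filename that starts with prefix."""
    rows = conn.execute(
        "SELECT filename FROM files "
        "WHERE substr(filename, 1, length(?)) = ? ORDER BY filename",
        (prefix, prefix),
    )
    return [row[0] for row in rows]
```

An agent's read/write/list tools could sit on top of either this or a real directory tree without noticing the difference, which is roughly the point being made.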
mangogogo 6 hours ago||
I was hoping the same, but then it turned out to be another article about LLMs.