Posted by malgamves 11 hours ago
It's about how filesystems as they are (and have been for decades) are proving to be powerful tools for LLMs/agents.
Now I tend to skim through it to see if a title looks like it may bring interesting discussions, and then I skim through the discussions. Because there are very knowledgeable people who sometimes share valuable insights.
Interestingly, last time I asked a question, hoping to get knowledgeable people to share insights, I was told that I "should learn how to use an LLM instead of asking questions" :-).
IMO it's insulting to the audience: it says your time and attention aren't worth the author spending their own time and attention putting their own thoughts into their own words.
If you're going to do that at least mention it's LLM output or just give me your outline prompts. I don't care what your LLM has to say, I'm capable of prompting your outline in my own model myself if I feel like it.
Yes, this! Please label AI-generated content. Pull request written by an AI? Label it as AI-generated. Blog post? Article generated with AI? Say so! It’s OK to use AI models, especially if English is your second language. But put a disclaimer in. Don’t make the reader guess.
Eg:
> This content was partially generated by ChatGPT
Or
> Blog post text written entirely by human hand, code examples by Claude code
It is easy to spot the compacted token distribution unique to each model, but search engines still seem to promote nonsense content. =3
"Bad Bot Problem - Computerphile"
https://www.youtube.com/watch?v=AjQNDCYL5Rg
"A Day in the Life of an Ensh*ttificator"
The problem I think with AI generated posts is that you feel like you can't trust the content once it's AI. It could be partly hallucinated, or misrepresented.
> That's not a technical argument. It's a values argument. And it's one that the filesystem, for all its age and simplicity, is uniquely positioned to serve. Not because it's the best technology. But because it's the one technology that already belongs to you.
That's a bit vague. Was the article written without the aid of LLMs? Yes or no.
Are you saying this post is a few edits away from becoming a New York Times bestseller?
But you're right, it did hit the front page, and that says more about my sensibilities not lining up with whoever is voting the article up.
"It's not a website you go to — it's a little spirit that lives on your machine.
Not a chatbot. A tool that reads and writes files on your filesystem.
That's not a technical argument. It's a values argument."
There are a lot of unique aspects of the writing in this post that LLMs don't typically generate on their own.
And there's not a "delve" or "tapestry" or even a bullet point to be found.
Also, accusations and complaints like this are off-topic and uninteresting.
We should be talking about filesystems here, not your gut instinct AI detector that has a sky-high false-positive rate.
I swear there needs to be some convention against throwing wild accusations at people you don't know based exclusively on vibes and with zero actual evidence.
The problem today is that we build specific, short-lived apps that lock data into formats only they can read. If you don't use universal formats, your system is fragile. We can still open JPEGs from 1995 because the files don't depend on the software used to make them. Using obscure or proprietary formats is just technical debt that will eventually kill your project. File or forget.
It is convenient to be able to undo crops or filters, but I wish the industry would standardize so these changes are portable.
It’s proven several times over that it’s the correct approach. Abstractions (formerly Google Photos, currently Immich) should just be built on top; the proprietary databases are only for convenience.
For work, I’m having the same experience as the author and everything is just markdown and csv files for Claude Code (for research and document writing).
Ostensibly, things like macOS Spotlight can bring real utility and value to the file system, along with extended attributes, sidecar indexing, etc. But Spotlight is infamous for its unreliability.
The other issue with file systems is simply that the user (potentially) has "direct access" to them, in that they can readily move files around whimsically. The "structure" is laid bare for them to interfere with, or, as is the case with extended attributes, drag a file to a USB fob and then copy it back -- inadvertently stripping those attributes.
And that's how we end up with everything being stuffed into a SQLite DB.
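A minimal sketch of where that ends up, with a made-up schema: metadata lives in one SQLite file keyed by path, so it survives copies to filesystems that drop attributes, at the cost of going stale the moment the user moves a file behind the app's back.

```python
# Hypothetical sketch of the "stuff it into SQLite" workaround: keep file
# metadata in one database instead of fragile extended attributes.
# The schema, paths, and keys here are invented for illustration.
import sqlite3

db = sqlite3.connect(":memory:")  # a real app would use an on-disk .db file
db.execute("""
    CREATE TABLE meta (
        path  TEXT,   -- the file the metadata describes
        key   TEXT,   -- attribute name, e.g. 'rating'
        value TEXT,
        PRIMARY KEY (path, key)
    )
""")
db.execute("INSERT INTO meta VALUES (?, ?, ?)", ("photo.jpg", "rating", "5"))

# The metadata survives any copy of the database file, but nothing keeps it
# in sync if the user renames photo.jpg out from under the app.
row = db.execute(
    "SELECT value FROM meta WHERE path = ? AND key = ?", ("photo.jpg", "rating")
).fetchone()
```

The trade is exactly the one the parent comment describes: robustness against the filesystem, fragility against the user.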
What are your thoughts on XMP sidecar files? I'm torn right now between digital negative + external metadata versus all-in-one image with mutable properties. Portability vs. Durability etc.
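For what it's worth, a sidecar can be tiny. Here's a hedged sketch of the sidecar approach (a bare-bones XMP skeleton, not a spec-complete writer; the filename and title are made up): the negative is never touched, and the mutable metadata lives in a `.xmp` file beside it.

```python
# Minimal sketch of the XMP sidecar approach: the image file stays immutable
# and metadata lives in an XML sidecar next to it. This emits only a
# skeleton (x:xmpmeta / rdf:RDF / rdf:Description), not full XMP.
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

def write_sidecar(image_path: str, title: str) -> str:
    meta = ET.Element("{adobe:ns:meta/}xmpmeta")
    rdf = ET.SubElement(meta, f"{{{RDF}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    sidecar = image_path + ".xmp"  # convention: sidecar sits next to the image
    ET.ElementTree(meta).write(sidecar, encoding="utf-8", xml_declaration=True)
    return sidecar

sidecar = write_sidecar("photo.dng", "Sunset over the pier")
```

Portability-wise this wins (the sidecar is plain XML any tool can read), durability-wise it loses the moment someone copies the image without its sidecar.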
Thanks for starring the repo and let me know if you need any help.
It brought Plan 9 back to mind, and I was shocked: this is exactly what we need today, as I'm convinced we need to think about minimizing agent permissions the exact same way companies do. Plan 9 was just too early.
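Short of Plan 9's per-process namespaces, even an application-level guard captures the spirit of least privilege. A rough sketch (the function name and layout are my own, not from any agent framework): resolve every path the agent asks for and refuse anything outside its sandbox root.

```python
# Hypothetical sketch of minimizing an agent's filesystem permissions:
# resolve every requested path and reject anything that escapes a sandbox
# root. This is an application-level stand-in for what Plan 9 did with
# per-process namespaces (and what chroot/containers do at the OS level).
from pathlib import Path

def safe_open(root: str, user_path: str, mode: str = "r"):
    base = Path(root).resolve()
    # resolve() collapses '..' components and follows symlinks, so the
    # check below can't be bypassed with '../../etc/passwd' tricks
    target = (base / user_path).resolve()
    if not target.is_relative_to(base):  # Python 3.9+
        raise PermissionError(f"{user_path} escapes the sandbox")
    return open(target, mode)
```

Real deployments would enforce this at the OS boundary rather than in application code, but the mental model is the same.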
The article gets some fundamentals completely wrong, though: with hard links and symlinks, file systems are full graphs, not strict trees, and they are definitely not acyclic.
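The non-tree point is easy to demonstrate: a single symlink pointing back up its own branch turns the hierarchy into a cyclic graph. A small sketch (the directory names are made up):

```python
# Sketch showing a filesystem "tree" turning into a cyclic graph:
# one symlink pointing back at its own grandparent creates a loop.
import os, tempfile

root = tempfile.mkdtemp()
inner = os.path.join(root, "a", "b")
os.makedirs(inner)

# a/b/loop -> a, so a/b/loop/b/loop/b/... can be extended forever
os.symlink(os.path.join(root, "a"), os.path.join(inner, "loop"))

looped = os.path.join(inner, "loop", "b", "loop", "b")
print(os.path.isdir(looped))  # True: the path revisits the same directory
```

Hard links make the same point for regular files: two distinct paths naming one inode already rules out a strict tree, even before cycles enter the picture.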
Production grade systems might be written by agents running on filesystem skills, but the production systems themselves will run on consistent and scalable data structures.
Meanwhile the UI of AI agents will almost certainly evolve away from desktop computers and toward audio/visual interfaces. An agent might get more context from a zoom call with you, once tone and body language can be used to increase the bandwidth between you.
Saw this video recently, by an AI company working to get contextual cues from tone and body language. I think they're converting it to text and feeding it into a LLM, so not natively multimodal, but I still thought it was really cool.
The challenge is how to structure messy data as a filesystem the agent can use. That is a lot harder than running a semantic query against a vector DB.
The code bases we’ve been using agents in had been pruned and maintained over years, we’ve got principles like DRY that pushed us to put the answer in one place… implicitly building and maintaining that graph with all the actors in the system invested in maintaining this. This is not the case for messy data, so while I see the authors point and agree that a filesystem is a better structure for context over time, we haven’t supplanted search yet for non-code data.
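To make the contrast concrete, here is a toy version of the filesystem side (the function and file extensions are invented; a real agent would shell out to grep or ripgrep): exact-match search over a tree of markdown/CSV files. It works well on a pruned, DRY codebase where each answer lives in one findable place, and poorly on messy prose where the query and the text don't share words.

```python
# Toy sketch of filesystem-style retrieval: exact substring search over a
# tree of text files. Everything here is illustrative, not a real agent tool.
import os

def grep_tree(root: str, needle: str) -> list[str]:
    """Return paths of text-ish files under root whose contents contain needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith((".md", ".csv", ".txt")):
                continue  # skip binaries and anything else
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    if needle in f.read():
                        hits.append(path)
            except OSError:
                pass  # unreadable file: skip it, as grep -s would
    return hits
```

A vector DB answers "what is *about* this?" while this answers "what *contains* this?"; the comment's point is that messy data usually needs the former.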
File systems are not a good abstraction mechanism for remote procedure calls, though. I think it's important to distinguish between the two, since I find there are a lot of articles conflating both - comparing MCPs to SKILLs, which are completely different things.
I think the confusion comes from the fact that MCP came before SKILLs, and there's a mental model where SKILLs are somehow "better than" MCPs. This is like saying local Word documents are better than a fully integrated collaborative office suite. It's just not the same thing.
The reason SKILLs work so well is that there's 50 years of accumulated knowledge of how to run rudimentary Unix tools.
The TL;DR:

- File systems: organising information
- MCP/APIs: remote procedure calls