Posted by denssumesh 4 days ago
I would have used FUSE if it got to that point, since then it's an actual filesystem.
RIP RAG: lasted one year as a skillset that recruiters would list in job descriptions, then got collectively shut down by industry professionals.
Am I the only one who read this and thought this is fucking insane? Who in their right mind would even consider spinning up a virtual machine and cloning a repo on every search query? And if all you need is a real filesystem, why would you emulate one on top of a database (Chroma)? If you need a filesystem, just use an actual filesystem! This sounds like insane gymnastics just to fit a “serverless” workflow. 850,000 searches a month (well under 1 request per second, roughly one every three seconds on average) sounds like something a single Raspberry Pi or Mac Mini could handle.
Not to be "that guy" [0], but (especially for users who aren't already in ChromaDB) -- how would this be different for us from using a RAM disk?
> "ChromaFs is built on just-bash ... a TypeScript reimplementation of bash that supports grep, cat, ls, find, and cd. just-bash exposes a pluggable IFileSystem interface, so it handles all the parsing, piping, and flag logic while ChromaFs translates every underlying filesystem call into a Chroma query."
It sounds like the expected use case is that agents interact with the data via standard CLI tools (grep, cat, ls, find, etc.), and that there is nothing Chroma-specific in the final implementation. Do I have that right?
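To make sure I'm picturing the layering correctly, here is a minimal sketch of the idea in Python. Everything in it is hypothetical: `FileSystem`, `DictFileSystem`, `cat`, and `grep` are names I made up, and this is not just-bash's actual `IFileSystem` interface or anything from ChromaFs. The point is only that the commands talk to an interface, so the backend (a dict here, a database in the article) can be swapped without the caller noticing.

```python
from typing import Protocol


class FileSystem(Protocol):
    """Hypothetical backend interface (NOT just-bash's real IFileSystem)."""

    def list_dir(self, path: str) -> list[str]: ...
    def read_file(self, path: str) -> str: ...


class DictFileSystem:
    """Toy backend: a flat dict of path -> contents, standing in for a database."""

    def __init__(self, files: dict[str, str]) -> None:
        self.files = files

    def list_dir(self, path: str) -> list[str]:
        prefix = path.rstrip("/") + "/"
        # Keep only the first path segment below the prefix (immediate children).
        names = {
            p[len(prefix):].split("/", 1)[0]
            for p in self.files
            if p.startswith(prefix)
        }
        return sorted(names)

    def read_file(self, path: str) -> str:
        return self.files[path]


def cat(fs: FileSystem, path: str) -> str:
    """Return the file contents, whatever the backend is."""
    return fs.read_file(path)


def grep(fs: FileSystem, pattern: str, path: str) -> list[str]:
    """Return 'path:line' for lines containing a literal substring (no regex)."""
    return [
        f"{path}:{line}"
        for line in fs.read_file(path).splitlines()
        if pattern in line
    ]


fs = DictFileSystem({"/docs/a.md": "hello\nworld", "/docs/sub/b.md": "hello again"})
```

If that picture is right, `grep(fs, "hello", "/docs/a.md")` never needs to know whether `fs` is backed by a dict, a disk, or a vector store.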
The author compares the speed of the Chroma implementation against a physical HDD, but I wonder how the benchmark would look against a RAM disk holding the same data and running the same queries.
I'm very willing to believe that Chroma would still be faster or better for X/Y/Z reason, but I would be interested in seeing the comparison. For the many people who already have their data in a hierarchical directory tree, I bet there could be some massive speedups from mounting the memory directories in RAM instead of on an HDD.
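For anyone who wants to try the comparison themselves, here is a rough sketch of the kind of timing I have in mind. It is not the article's benchmark: `make_corpus`, `scan_tree`, and `time_scan` are made-up helpers, and the scan is a crude stand-in for `grep -rl`. Copy the same tree to a disk-backed directory and to a tmpfs mount (on many Linux systems `/dev/shm` is one, but that is an assumption about your machine) and pass both paths on the command line.

```python
import os
import sys
import time


def make_corpus(root: str, n_files: int) -> int:
    """Write n_files small text files under root; every 5th contains 'needle'.

    Hypothetical helper for trying the script without real data.
    Returns the number of files that contain 'needle'.
    """
    expected = 0
    for i in range(n_files):
        subdir = os.path.join(root, f"dir{i % 3}")
        os.makedirs(subdir, exist_ok=True)
        has_needle = i % 5 == 0
        expected += has_needle
        with open(os.path.join(subdir, f"file{i}.txt"), "w") as fh:
            fh.write("some filler text\n" + ("needle\n" if has_needle else ""))
    return expected


def scan_tree(root: str, needle: str) -> int:
    """Count files under root whose bytes contain needle (a crude grep -rl)."""
    target = needle.encode()
    hits = 0
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as fh:
                if target in fh.read():
                    hits += 1
    return hits


def time_scan(root: str, needle: str, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several scans."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        scan_tree(root, needle)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Usage: python bench.py /path/on/disk /path/on/tmpfs
    for root in sys.argv[1:]:
        print(f"{root}: {time_scan(root, 'needle'):.4f}s")
```

One caveat: after the first pass the OS page cache will usually serve the disk-backed files from RAM anyway, so taking the best of several runs mostly measures warm reads. The interesting difference would show up on cold reads, which means clearing the cache between runs.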