Posted by denssumesh 1 day ago

We replaced RAG with a virtual filesystem for our AI documentation assistant (www.mintlify.com)
141 points | 69 comments
tylergetsay 3 hours ago|
I don't understand the additional complexity of mocking bash when they could just provide grep, ls, find, etc. as tools to the LLM
skeptrune 3 hours ago||
I agree that would have been the way to go given more time and resources. However, setting up a FUSE mount would have taken significantly longer and required additional infrastructure.
wahnfrieden 3 hours ago||
agents are trained on bash grep/ls/find, not on tool-calling grep/ls/find
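To illustrate the distinction, here is a sketch of the two approaches as hypothetical tool schemas in the common JSON-schema function-calling style (the names and shapes are illustrative, not any vendor's actual API):

```python
# Option A: a single bash tool. The model writes the same grep/ls/find
# invocations it saw millions of times in its training data.
bash_tool = {
    "name": "bash",
    "description": "Run a shell command in the docs sandbox",
    "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}

# Option B: one bespoke schema per command. The model must learn each
# tool's argument shape instead of reusing familiar shell syntax.
grep_tool = {
    "name": "grep",
    "description": "Search file contents for a pattern",
    "parameters": {
        "type": "object",
        "properties": {
            "pattern": {"type": "string"},
            "path": {"type": "string"},
            "recursive": {"type": "boolean"},
        },
        "required": ["pattern", "path"],
    },
}
```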
MeetRickAI 2 hours ago||
[dead]
pboulos 4 hours ago||
I think this is a great approach for a startup like Mintlify. I do have skepticism around how practical this would be in some of the “messier” organisations where RAG stands to add the most value. From personal experience, getting RAG to work well in places where the structure of the organisation and the information contained therein is far from hierarchical or partition-able is a very hard task.
khalic 3 hours ago||
The use case is well defined here, let’s not jump the gun. Text search, like with code, is a relatively simple problem compared to intrinsic semantic content in a book for example. I think the moral here is that RAG is not a silver bullet, the claude code team came to the same conclusion.
pboulos 2 hours ago|||
I agree with your assessment.
dominotw 1 hour ago|||
> the claude code team came to the same conclusion.

github copilot uses rag

skeptrune 3 hours ago|||
Modern OCR tooling is quite good. If the knowledge you are adding into your search database is able to be OCR'd then I think the approach we took here is able to be generalized.
GandalfHN 2 hours ago||
Layering a virtual FS over a spaghetti-doc org is an indexer in drag, and you still need access control or it's a compliance disaster.
kenforthewin 3 hours ago||
I don't get it - everybody in this thread is talking about the death of vector DBs and files being all you need. The article clearly states that this is a layer on top of their existing Chroma db.
dominotw 3 hours ago|
what value is chromadb adding in that setup
skeptrune 2 hours ago||
yea chromadb is not the point. multiple data storage solutions work
kenforthewin 2 hours ago||
I see .. so you're not using the vectors at all. Where are the evaluations showing this chromaFS approach is performing better than vectors?
skeptrune 30 minutes ago||
Working on publishing those, but publishing benchmarks requires a lot of attention to detail so it will likely be a bit longer.
jdthedisciple 2 hours ago||
But SQLite is notoriously 35% faster than the filesystem [0], so why not use that?

[0] https://news.ycombinator.com/item?id=14550060

tomComb 2 hours ago|
And Turso has built a Virtual Filesystem on top of their SQLite.

AgentFS https://agentfs.ai/ https://github.com/tursodatabase/agentfs

Which sounds like a great idea, except that it uses NFS instead of FUSE (note that macFUSE now has a FSKit backend so FUSE seems like the best solution for both Mac and Linux).

dmix 3 hours ago||
This puts a lot of LLM in front of the information discovery. That would require far more sophisticated prompting and guardrails. I'd be curious to see how people architect an LLM->document approach with tool calling, rather than RAG->reranker->LLM. I'm also curious what the response times are like since it's more variable.
skeptrune 3 hours ago|
Hmmm, the post is an attempt to explain that Mintlify migrated from embedding-retrieval->reranker->LLM to an agent loop with access to call POSIX tools as it desires. Perhaps we didn't provide enough detail?
dmix 3 hours ago||
That matches what I'm curious about. Where an LLM is doing the bulk of information discovery and tool calling directly. Most simpler RAGs have an LLM on the frontend mostly just doing simpler query clean up, subqueries and taxonomy, then again later to rerank and parse the data. So I'd imagine the prompting and guardrails part is much more complicated in an agent loop approach, since it's more powerful and open ended.
bluegatty 3 hours ago||
RAG should not have been represented as a context tool but rather just as vector querying, a variation of search/query - and that's it.

We were bitten by our own nomenclature.

Just a small variation in chosen acronym ... may have wrought a different outcome.

Different ways to find context are welcome, we have a long way to go!

skeptrune 3 hours ago|
agreed!
mandeepj 4 hours ago||
> even a minimal setup (1 vCPU, 2 GiB RAM, 5-minute session lifetime) would put us north of $70,000 a year based on Daytona's per-second sandbox pricing ($0.0504/h per vCPU, $0.0162/h per GiB RAM)

$70k?

how about if we round off one zero? Give us $7000.

That number still seems to be very high.
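A quick back-of-the-envelope using only the rates quoted above (the post doesn't state its concurrency, so the implied session count at the end is just what those rates work out to, not a figure from the article):

```python
# Rates quoted in the article: $0.0504/h per vCPU, $0.0162/h per GiB RAM.
VCPU_RATE = 0.0504       # dollars per hour per vCPU
RAM_RATE = 0.0162        # dollars per hour per GiB
HOURS_PER_YEAR = 24 * 365

# The "minimal setup" described: 1 vCPU, 2 GiB RAM.
hourly = 1 * VCPU_RATE + 2 * RAM_RATE
yearly_per_sandbox = hourly * HOURS_PER_YEAR

print(f"${hourly:.4f}/h -> ${yearly_per_sandbox:,.0f}/yr per always-on sandbox")
# How many concurrent always-on sandboxes would it take to reach $70k/yr?
print(round(70_000 / yearly_per_sandbox))
```

One always-on 1 vCPU / 2 GiB sandbox comes out to roughly $725/yr at these rates, so the $70k figure implies on the order of a hundred concurrent sessions.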

all2 29 minutes ago||
At that point I would buy an old mini PC off of ebay and just put it on my desk.
lstodd 3 hours ago||
Hm. I think a dedicated 16-core box with 64 ram can be had for under $1000/year.

It being dedicated, there are no limits on session lifetime and it'd run 16 of those sessions no problem, so the real price should be around ~$70/year for that load.

maille 4 hours ago||
Let's say I want a free, local or free-tier-LLM, simple solution to search information mostly from my emails and a little bit from text, doc, and pdf files. Are there any tools I should try to get Ollama or Gemini to reply using my own knowledge base?
ghywertelling 3 hours ago|
https://onyx.app/

This could be useful.

tschellenbach 3 hours ago||
I think generally we are going from vector based search, to agentic tool use, and hierarchy based systems like skills.
ghywertelling 3 hours ago||
Agents doing retrieval has been around for quite a while

https://huggingface.co/docs/smolagents/en/examples/rag

> Agentic RAG: A More Powerful Approach
>
> We can overcome these limitations by implementing an Agentic RAG system - essentially an agent equipped with retrieval capabilities. This approach transforms RAG from a rigid pipeline into an interactive, reasoning-driven process.

The innovation of the blogpost is in the retrieval step.
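The shape of that interactive loop can be sketched in a few lines. Everything here is illustrative (the `fake_llm`-style model stub, the toy in-Python `grep`); a real agent would call an actual LLM API and real tools:

```python
import pathlib

def run_tool(name, args):
    """Toy stand-in for 'grep -rl': files under path whose text contains pattern."""
    if name == "grep":
        hits = [str(p) for p in pathlib.Path(args["path"]).rglob("*")
                if p.is_file() and args["pattern"] in p.read_text(errors="ignore")]
        return "\n".join(hits)
    raise ValueError(f"unknown tool: {name}")

def agent_loop(question, llm, max_steps=5):
    """Let the model decide when to retrieve and when to answer.

    `llm` takes the transcript and returns either
    {"type": "tool", "tool": ..., "args": ...} or {"type": "answer", "text": ...}.
    """
    transcript = [("user", question)]
    for _ in range(max_steps):
        action = llm(transcript)
        if action["type"] == "answer":
            return action["text"]
        # Tool output goes back into the transcript for the next model turn.
        transcript.append(("tool", run_tool(action["tool"], action["args"])))
    return "step budget exhausted"
```

The key difference from a fixed retrieve-then-generate pipeline is that the model, not the pipeline, chooses how many retrieval rounds to run.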

skeptrune 3 hours ago||
Vector search has moved from a "complete solution" to just one tool among many which you should likely provide to an agent.
dust42 3 hours ago|
If grep and ls do the trick, then sure you don't need RAG/embeddings. But you also don't need an LLM: a full text search in a database will be a lot more performant, faster and use less resources.
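For comparison, the database full-text search being described is only a few lines with SQLite's built-in FTS5 extension (the table and sample rows below are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 is compiled into most stock SQLite builds.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("guides/install.md", "Install the CLI with npm install -g mintlify"),
        ("guides/deploy.md", "Deploy your docs by pushing to the main branch"),
    ],
)
# BM25-ranked full-text query: no embeddings, no LLM in the loop.
rows = conn.execute(
    "SELECT path FROM docs WHERE docs MATCH ? ORDER BY rank", ("install",)
).fetchall()
print(rows)  # matching paths, best match first
```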