Posted by tmaly 1 day ago
Ask HN: How are you doing RAG locally?
Are you using a vector database, some type of semantic search, a knowledge graph, a hypergraph?
ragtune explain "your query" --collection prod
Shows scores, sources, and diagnostics. Helps catch when your chunking or embeddings are silently failing, or when you need numeric estimates to base your judgments on.
Open source: https://github.com/metawake/ragtune
So I use a hosted one to prevent this. My business uses a vector DB, so I created a new DB to vectorize and host my knowledge base.
1. All my knowledge base is markdown files, so I split them by header tags.
2. Each split is hashed and the hash value is stored in SQLite.
3. The hashed split is vectorized and pushed to the cloud DB.
4. Whenever I make changes, I run a script which splits the files and checks the hashes. If a hash has changed, I upsert the document. If not, I don't do anything.
This helps me keep the store up to date.
For search I have a CLI query which searches and fetches from the vector store.
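For anyone who wants to try the split, hash, and upsert-if-changed loop described above, here is a minimal sketch of how it could look. It is not the commenter's actual script. The `chunks` table, the position-based keys, and the `upsert` callback (standing in for whatever embeds a section and pushes it to the hosted vector DB) are all assumptions made for illustration.

```python
import hashlib
import re
import sqlite3


def split_by_headers(markdown: str) -> list[str]:
    """Split a markdown document into sections, one per header line."""
    # Split right before every line starting with 1-6 '#' characters and a space.
    # Naive on purpose: a '#' line inside a fenced code block would also split.
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [p.strip() for p in parts if p.strip()]


def sync(doc_id: str, markdown: str, db: sqlite3.Connection, upsert) -> list[str]:
    """Upsert only the sections whose hash changed and return their keys.

    `upsert(key, text)` is a hypothetical callback that embeds `text` and
    writes it to the vector store under `key`.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks (key TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )
    changed = []
    for i, section in enumerate(split_by_headers(markdown)):
        # Assumption: sections are keyed by position, so inserting a section
        # in the middle of a file re-uploads everything after it.
        key = f"{doc_id}#{i}"
        digest = hashlib.sha256(section.encode("utf-8")).hexdigest()
        row = db.execute("SELECT hash FROM chunks WHERE key = ?", (key,)).fetchone()
        if row is not None and row[0] == digest:
            continue  # unchanged since the last run, nothing to do
        upsert(key, section)
        db.execute(
            "INSERT OR REPLACE INTO chunks (key, hash) VALUES (?, ?)", (key, digest)
        )
        changed.append(key)
    db.commit()
    return changed
```

Running `sync` twice on the same file should call `upsert` only on the first run. Sections that were deleted from the file are not removed from the store in this sketch.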
The real challenge wasn't model quality - it was the chunking strategy. Financial data is weirdly structured and breaking it into sensible chunks that preserve context took more iteration than expected. Eventually settled on treating each complete record as a chunk rather than doing sliding windows over raw text. The "obvious" approaches from tutorials didn't work well at all for structured tabular-ish data.
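To make the "one complete record per chunk" idea concrete, here is a rough sketch that assumes the records arrive as CSV. The comment above does not say what the data format was, so the CSV input and the `column: value` rendering are assumptions. Each row becomes its own chunk with the column names repeated, so no chunk depends on a header row that a sliding window might have cut off.

```python
import csv
import io


def records_to_chunks(csv_text: str) -> list[str]:
    """Turn each CSV row into one self-contained chunk of 'column: value' lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row in reader:
        # Repeat the column names in every chunk so the record keeps its context.
        chunks.append("\n".join(f"{name}: {value}" for name, value in row.items()))
    return chunks
```

Each returned string can then be embedded on its own. Very wide records might still need trimming or summarizing before embedding, which this sketch does not attempt.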
Not sure how useful it is for what you need specifically: https://blog.yakkomajuri.com/blog/local-rag
https://aws.amazon.com/blogs/machine-learning/use-language-e...
The code for it is here: https://github.com/aws-samples/rss-aggregator-using-cohere-e...
The example link no longer works, as I no longer work at AWS.