Posted by tmaly 1 day ago

Ask HN: How are you doing RAG locally?

I am curious how people are doing RAG locally, with minimal dependencies, for internal code or complex documents.

Are you using a vector database, some type of semantic search, a knowledge graph, a hypergraph?

343 points | 139 comments | page 6
ehsanu1 1 day ago|
Embedded usearch vector database. https://github.com/unum-cloud/USearch
geuis 1 day ago||
I don't. I actually write code.

To answer the question more directly, I've spent the last couple of years with a few different quantized models, mostly running on llama.cpp or ollama, depending. The results are way slower than the paid token API versions, but they are completely free of external influence and cost.

However, the models I've tested generally turn out to be pretty dumb at the quantization level I have to run them at to be relatively fast. And their code generation is too much of a mess to be worth dealing with.

mooball 19 hours ago||
I thought RAG/embeddings were dead with the large context windows. That's what I get for listening to ChatGPT.
juleshenry 21 hours ago||
SurrealDB coupled with local vectorization. Mac M1 16GB
lormayna 1 day ago||
I have done some experiments with nomic embedding through Ollama and ChromaDB.

It works well, but I haven't tested it at a larger scale.

eajr 1 day ago||
Local LibreChat which bundles a vector db for docs.
nineteen999 1 day ago||
A little BM25 can get you quite a way with an LLM.
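
For anyone wondering what that could look like with zero dependencies, here is a minimal sketch of Okapi BM25 ranking in plain Python. It is not the commenter's actual setup: the function names, the whitespace tokenizer, and the `k1`/`b` defaults are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def top_k(query, docs, k=3):
    """Return the k best (document, score) pairs, highest score first."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(docs[i], scores[i]) for i in ranked[:k]]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "dogs chase cats in the park",
        "vector databases store embeddings",
    ]
    for doc, score in top_k("vector embeddings", corpus, k=2):
        print(round(score, 3), doc)
```

The top-ranked chunks would then be pasted into the LLM prompt as context. A real setup would probably want a better tokenizer (stemming, punctuation handling) and an inverted index rather than scoring every document per query.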
jacekm 22 hours ago||
I am curious: what are you using local RAG for?
sinandrei 1 day ago|
Anyone use these approaches with academic PDFs?
urschrei 1 day ago||
Another approach is to teach Claude Code how to use your Zotero library's full-text search: https://github.com/urschrei/zotero_search_skill.
alansaber 20 hours ago|||
I've not seen any impressive products. But products do exist, e.g. https://scibite.com/solutions/semantic-search/
amelius 1 day ago||
Anyone using them for electronics datasheets?
bradfa 21 hours ago||
I would like to. I haven't yet found a solution that works well.

The problems with datasheets are tables that span multiple pages, embedded images for diagrams and plots, the fact that they're generally PDFs, and that only some of them use a 2-column layout.

Converting from PDF to markdown while retaining tables correctly seems to work well for me with Mistral's latest OCR model, but this isn't an open model. Using docling with different models has produced much worse results.

sosojustdo 21 hours ago||
I've been working on a tool specifically to handle these messy PDF-to-Markdown conversions because I ran into the same issues with tables and multi-column layouts.

I’ve optimized https://markdownconverter.pro/pdf-to-markdown to handle complex PDFs, including those tricky tables that span multiple pages and 2-column formats that usually trip up tools like Docling. It also extracts embedded diagrams/images and links them properly in the output.

Full disclosure: I'm the developer behind it. I’d love to see if it handles your specific datasheets better than the models you've tried. Feel free to give it a spin!

bradfa 20 hours ago||
Cool! But given that often electronics documentation is covered by NDAs, my preferred solution is local-first if at all possible.