Show HN: A Karpathy-style LLM wiki your agents maintain (Markdown and Git)

Posted by najmuzzaman 2 days ago


I shipped a wiki layer for AI agents that uses markdown + git as the source of truth, with a bleve (BM25) + SQLite index on top. No vector or graph db yet.

It runs locally in ~/.wuphf/wiki/ and you can git clone it out if you want to take your knowledge with you.

The shape is the one Karpathy has been circling for a while: an LLM-native knowledge substrate that agents both read from and write into, so context compounds across sessions rather than getting re-pasted every morning. Most implementations of that idea land on Postgres, pgvector, Neo4j, Kafka, and a dashboard.

I wanted to go back to the basics and see how far markdown + git could go before I added anything heavier.

What it does:

-> Each agent gets a private notebook at agents/{slug}/notebook/.md, plus access to a shared team wiki at team/.

-> Draft-to-wiki promotion flow. Notebook entries are reviewed (agent or human) and promoted to the canonical wiki with a back-link. A small state machine drives expiry and auto-archive.

-> Per-entity fact log: append-only JSONL at team/entities/{kind}-{slug}.facts.jsonl. A synthesis worker rebuilds the entity brief every N facts. Commits land under a distinct "Pam the Archivist" git identity so provenance is visible in git log.

-> [[Wikilinks]] with broken-link detection rendered in red.

-> Daily lint cron for contradictions, stale entries, and broken wikilinks.

-> /lookup slash command plus an MCP tool for cited retrieval. A heuristic classifier routes short lookups to BM25 and narrative queries to a cited-answer loop.
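For readers who want to picture the per-entity fact log, here is a minimal sketch, not WUPHF's actual code. Only the path layout team/entities/{kind}-{slug}.facts.jsonl comes from the post. The record fields, the `record_fact` helper, and the threshold of 5 are assumptions made up for illustration.

```python
import json
import tempfile
from pathlib import Path

SYNTHESIS_EVERY = 5  # hypothetical value for "every N facts"


def fact_log_path(root: Path, kind: str, slug: str) -> Path:
    """Path layout from the post: team/entities/{kind}-{slug}.facts.jsonl."""
    return root / "team" / "entities" / f"{kind}-{slug}.facts.jsonl"


def record_fact(root: Path, kind: str, slug: str, text: str, source: str) -> bool:
    """Append one fact as a JSON line; return True when a brief rebuild is due."""
    path = fact_log_path(root, kind, slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append-only: existing lines are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"text": text, "source": source}) + "\n")
    # Count the lines now on disk to decide whether synthesis should fire.
    with path.open(encoding="utf-8") as fh:
        count = sum(1 for _ in fh)
    return count % SYNTHESIS_EVERY == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        due = [record_fact(root, "company", "acme", f"fact {i}", "demo") for i in range(1, 6)]
        print(due)  # [False, False, False, False, True]
```

In a real setup the True result would hand off to whatever synthesis worker rebuilds the brief and commits it; that part is left out here.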

Substrate choices:

Markdown for durability. The wiki outlives the runtime, and a user can walk away with every byte.

Bleve for BM25. SQLite for structured metadata (facts, entities, edges, redirects, and supersedes).

No vectors yet. The current benchmark (500 artifacts, 50 queries) clears 85% recall@20 on BM25 alone, which is the internal ship gate. sqlite-vec is the pre-committed fallback if a query class drops below that.
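For anyone unfamiliar with the metric, recall@20 is the share of a query's relevant documents that show up in the top 20 results. A rough sketch of how such a gate could be checked follows. The post does not say how recall is aggregated across the 50 queries, so the per-query mean here is an assumption, and `recall_at_k` and `passes_gate` are illustrative names rather than anything from the WUPHF repo.

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int = 20) -> float:
    """Fraction of the relevant documents found in the top-k ranked results."""
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def passes_gate(runs: list[tuple[list[str], set[str]]], k: int = 20, gate: float = 0.85):
    """Average recall@k over all queries (assumed aggregation) and compare to the gate."""
    mean = sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)
    return mean, mean >= gate


if __name__ == "__main__":
    runs = [
        (["a", "b", "c"], {"a", "c"}),  # both relevant docs retrieved: recall 1.0
        (["x", "y"], {"y", "z"}),       # one of two retrieved: recall 0.5
    ]
    print(passes_gate(runs))  # (0.75, False)
```

Running the same check per query class, rather than only on the overall mean, is one way to notice when a single class drops below the gate.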

Canonical IDs are first-class. Fact IDs are deterministic and include sentence offset. Canonical slugs are assigned once, merged via redirect stubs, and never renamed. A rebuild is logically identical, not byte-identical.
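One way to get those properties, as a hedged sketch rather than the project's real scheme: hash the inputs that identify a fact, including its sentence offset, and resolve merged slugs through a redirect table. The hash inputs, the "fact_" prefix, the 16-character truncation, and the in-memory dict of redirects are all assumptions for illustration.

```python
import hashlib


def fact_id(entity_slug: str, source_path: str, sentence_offset: int, text: str) -> str:
    """Same inputs always give the same ID, so a rebuild reproduces the IDs."""
    normalized = " ".join(text.split()).lower()  # ignore whitespace and case noise
    payload = "\x1f".join([entity_slug, source_path, str(sentence_offset), normalized])
    return "fact_" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def resolve_slug(slug: str, redirects: dict[str, str]) -> str:
    """Follow redirect stubs until a canonical slug is reached."""
    seen = set()
    while slug in redirects:
        if slug in seen:
            raise ValueError(f"redirect cycle at {slug!r}")
        seen.add(slug)
        slug = redirects[slug]
    return slug


if __name__ == "__main__":
    a = fact_id("acme", "agents/pam/notebook/visit.md", 3, "Acme  renewed in Q2.")
    b = fact_id("acme", "agents/pam/notebook/visit.md", 3, "acme renewed in q2.")
    print(a == b)  # True
    print(resolve_slug("acme-corp", {"acme-corp": "acme-inc", "acme-inc": "acme"}))  # acme
```

Because merged slugs keep resolving through the stub instead of being renamed, old wikilinks and fact references stay valid after a merge.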

Known limits:

-> Recall tuning is ongoing. 85% on the benchmark is not a universal guarantee.

-> Synthesis quality is bounded by agent observation quality. Garbage facts in, garbage briefs out. The lint pass helps. It is not a judgment engine.

-> Single-office scope today. No cross-office federation.

Demo. 5-minute terminal walkthrough that records five facts, fires synthesis, shells out to the user's LLM CLI, and commits the result under Pam's identity: https://asciinema.org/a/vUvjJsB5vtUQQ4Eb

Script lives at ./scripts/demo-entity-synthesis.sh.

Context. The wiki ships as part of WUPHF, an open source collaborative office for AI agents like Claude Code, Codex, OpenClaw, and local LLMs via OpenCode. MIT, self-hosted, bring-your-own keys. You do not have to use the full office to use the wiki layer. If you already have an agent setup, point WUPHF at it and the wiki attaches.

Source: https://github.com/nex-crm/wuphf

Install: npx wuphf@latest

Happy to go deep on the substrate tradeoffs, the promotion-flow state machine, the BM25-first retrieval bet, or the canonical-ID stability rules. Also happy to take "why not an Obsidian vault with a plugin" as a fair question.

99 points | 44 comments

souravroy78 1 day ago

Don’t know if Karpathy even wrote this version. Where are the citations?

najmuzzaman 1 day ago

karpathy's llm wiki tweet and gist: https://x.com/karpathy/status/2040470801506541998?s=20

i used this idea to create a version that works for a team of ai agents

souravroy78 1 day ago

Nice

goodra7174 1 day ago

I was looking for something similar to try out. Cool!

Unsponsoredio 1 day ago

love the bm25-first call over vector dbs. most teams jump to vectors before measuring anything

vlady_nyz 1 day ago

need to try out asap. love the „the office“ vibe

imafish 1 day ago

Cool idea. But is anyone actually building real stuff like this with any kind of high quality?

Every time I hear someone say "I have a team of agents", what I hear is "I'm shipping heaps of AI slop".

psanchez 1 day ago

Even though I did not know about Andrej Karpathy's tweet from earlier this month, I ended up converging on something very similar.

A couple of weeks ago I built a git-based knowledge base designed to run agent prompts on top of it.

I connected our company's ticketing system, wiki, GitHub, Jenkins, etc., and spent several hours effectively "onboarding" the AI (I used Claude Opus 4.6). I explained where to find company policies, how developers work, how the build system operates, and how different projects relate to each other.

In practice, I treated it like onboarding a new engineer: I fed it a lot of context and had it organize everything into AI-friendly documentation (including an AGENTS.md). I barely wrote anything myself; mostly I just instructed the AI to write and update the files, while I guided the overall structure and refactored as needed.

The result was a git-based knowledge base that agents could operate on directly. Since the agent had access to multiple parts of the company, I could give high-level prompts like: investigate this bug (with not much context), produce a root cause analysis, open a ticket, fix it, and verify a build on Jenkins. I did not even need to have the repos locally; the AI would figure it out, clone them, analyze them, create branches following our company policy, etc.

For me, this ended up working as a multi-project coordination layer across the company, and it worked much better than I expected.

It wasn't all smooth, though. When the AI failed at a task, I had to step in, provide more context, and let it update the documentation itself. But through incremental iterations, each failure improved the system, and its capabilities compounded very quickly.

dominotw 1 day ago

how is this related to the parent comment? slop.

psanchez 1 day ago

Well, my comment was meant as an example of a setup for actually building something real with reasonable quality. I was responding to that part of the previous comment.

In my experience, the difference is context. Agents without structure produce slop, but with a well-curated knowledge base and iteration, they can be useful. I was just sharing a setup that has been working for me lately.

Edit: minimal changes for clarity

jbjbjbjb 1 day ago

I remember the personal wiki was a bit of a trend 5 years ago, but it kind of died because it had an unclear purpose for the most part. I kept one but never really referred to any of the notes, and then just went back to paper and a to-do list. I’m sure this is useful for those who kept up the habit.

najmuzzaman 1 day ago

this is not a personal wiki though. it is a team wiki. agents are responsible for managing it and keeping it fresh and always visible, with human oversight.

tbh this won't be very useful as a personal wiki.

frrandias 1 day ago

Hey, contributor to Wuphf here,

We have been using it as a sounding board. I think that in its current state it's actually more useful for someone to learn about how to run a business - "what does a CEO vs PM do" and/or learn about the pros/cons of running a bunch of agents at once.

najmuzzaman 1 day ago

let's talk about real stuff. last year, before this, we built an AI-native CRM backed by HubSpot founder Dharmesh Shah, had revenue, iterated to focus on context graph infra which looked like the right moat to focus on, and did enterprise PoCs. all of that distilled into this personal project i built on the side to help my own work. it turned out to be the right interface for making context infra usable.

the team is 4 HubSpotters who built HubSpot's largest platforms - search, nav, notifs, permissions, AI.

we are in the process of opening up large pieces of our enterprise context architecture to WUPHF and also shipping the cloud enterprise version of WUPHF (https://nex.ai/new-home).

hansmayer 1 day ago

+100 for this comment.

stavros 1 day ago

The problem with your comment is that the word "real" is just there to move the goalposts. There are people building high-quality stuff like this, yes.

Yesterday I built a tiny utility like this that works very well:

https://github.com/skorokithakis/gnosis

hyperionultra 1 day ago

[flagged]

mirekrusin 1 day ago

Feels like disliking a musician for fanaticism towards musical instruments.

newsicanuse 1 day ago

[flagged]

William_BB 1 day ago

I have the same feeling ever since his infamous LLM OS post

spiderfarmer 1 day ago

Probably just envy.

wiseowise 1 day ago

Obviously it is envy, and not scepticism over a guy who practically lives on Twitter and has an unhinged[1] follower base.

[1] https://x.com/__endif/status/2039810651120705569

kid64 1 day ago

That's a 404

davedigerati 1 day ago

why not an Obsidian vault with a plugin?

najmuzzaman 1 day ago

Two structural reasons.

1. Obsidian is a single-user editor. It does not have the concept of "agent A drafted this, agent B promoted it, the team approved it." The promotion flow needs a state machine that lives outside the editor. A plugin can simulate it, but the source of truth has to be a process the agents talk to instead of a vault file.

2. Agents need an MCP surface. An Obsidian plugin API won't do. /lookup, entity_fact_record, notebook_write, and team_wiki_promote are MCP tools the agent runtimes call directly. Obsidian's plugin API targets human users and the Electron app. You would be reimplementing the MCP layer to bridge.

Practical compatibility: you can absolutely point Obsidian at ~/.wuphf/wiki/ and use it as a vault (someone from our Reddit post did this). Obsidian can be the reader while WUPHF stays the writer.

kid64 1 day ago

He presumably wanted the result to be good.

tomtomistaken 1 day ago

what plugin are you using?

davedigerati 1 day ago

srsly tho this looks slick & love the office refs / will go play with it :)

frrandias 1 day ago

Awesome, let us know if there are any features you want or bugs you hit :)

agentminds 1 day ago

[dead]