Show HN: Recall – Local project memory for Claude Code

Posted by mateenah 2 days ago

Show HN: Recall – Local project memory for Claude Code (github.com)

134 points | 84 comments

mikeocool 2 days ago|

I apparently use Claude differently than most people who talk about using Claude on the internet.

I’ll typically have a bunch of short sessions over the course of a day. Anytime I start a task that isn’t going to very directly benefit from the existing context I start fresh.

I don’t find a lot of benefit in explaining the project overall to Claude — I’ve deleted a lot of that explanation from my Claude.md because it didn’t seem to impact much.

I typically start a task by pointing it to 1-2 files and giving it some explanation of what I want done, and it figures it out.

Basically never hit context window limits or compactions, and can’t remember the last time I hit a 5 hour or a weekly limit.

egamirorrim 2 days ago||

I'm the same. Every time I see people talking about explaining the project every session I'm puzzled. I've just never had to do that.

majormajor 2 days ago|||

I do something similar. I've recently started having a few "starting point" files to re-explain common context (less than thirty lines per markdown file, usually) that I can point the agent at at the start of a new session, each tightly scoped to a certain domain and/or task type. That's been nice to avoid repeating myself, without the side-tracking or over-aggressive biasing-towards-previous-conversations that I've seen happen if I use long sessions or let it try to decide on its own what to pull in from larger files or trees of files. Sometimes I'll tell it to update that file based on new info from a current task, but I keep tight control over what gets pulled into task start context.

They aren't really "explaining the project" either, but more module- or task-specific preferences, handy reference pointers, or other things like "there are mixed examples of how to do certain things in this project, prefer X to Y." I use a write-everything-twice approach. After I find myself having to correct an implementation because it didn't figure out one of these things on its own from the existing code, I'll add an entry. That also avoids bloating things with "I think this is relevant" compared to "I have noticed that this is necessary."

I keep doing this because it lets me experiment with different approaches to problems without risk of it fixating on things from a previous abandoned attempt, and particularly because sometimes I'm wrong and I haven't found the agent harnesses particularly reliable at taking my word for it from a POV of "yes I know I said we need xyz earlier, but let's please entirely forget about that."

bengotow 2 days ago|||

This is my usage pattern and I agree it works really well. I start almost every conversation by asking Claude to read, not write. Then once it's explored a particular slice I let it rip.

This takes a couple minutes (and I suppose I'm spending tokens each time), but sessions rarely reach compaction length and I like that I'm not trying to keep a whole separate pile of docs in sync.

smt88 2 days ago||

Claude seems to read when it needs to when I ask it to do something. Did you have a different experience?

mock-possum 2 days ago|||

I keep a plan file that records what I’m doing, how I’m doing it, and what I’ve done so far - every time I sit down to have another session, I feed Claude the plan file first, then tell it to begin on the next unchecked todo. Every time I run into something new, I tell it to add it to the todos in the plan file.

It basically takes care of itself, or at least as close as it can.
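A minimal sketch of the plan-file loop described above. It assumes the todos are written as Markdown task-list items (`- [ ]` open, `- [x]` done); that format and the helper names `next_unchecked` and `add_todo` are illustrative assumptions, not something stated in the comment.

```python
import re

# Assumption: todos are Markdown task-list items,
# "- [ ] text" for open and "- [x] text" for done.
TODO_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def next_unchecked(plan_text):
    """Return the text of the first unchecked todo, or None if all are done."""
    for line in plan_text.splitlines():
        match = TODO_RE.match(line)
        if match and match.group(1) == " ":
            return match.group(2).strip()
    return None


def add_todo(plan_text, item):
    """Append a new unchecked todo to the end of the plan."""
    return plan_text.rstrip("\n") + f"\n- [ ] {item}\n"


plan = """# Plan
- [x] Set up project skeleton
- [ ] Add config loader
- [ ] Write tests
"""

print(next_unchecked(plan))  # Add config loader
```

The output of `next_unchecked` is what a fresh session would be told to start on; `add_todo` covers the "add it to the todos" step.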

blitzar 2 days ago|||

In some sessions I find claude to be incredibly dumb - other times we are on the same wavelength and everything flows.

I guess I need to do some Claude.md work or find other ways to prime the session so I get the good personality and not the evil twin.

andai 2 days ago||

Claude Code has a big system prompt, most of which isn't necessary for the more recent models. (Codex too.)

I've been running Claude and GPT in my own agent harness. The main difference I notice is that tasks take about 7x longer to complete if they're run in the official Claude or Codex harness (and cost me 7x more).

You would think this would lead to increased correctness, but that doesn't seem to be the case. Today I tested both side by side. They both resulted in data loss. (I had a backup obviously.)

GPT running in the official harness did a bunch of extra tests and double checking, and ended up with the same result regardless (it permanently deleted a bunch of documentation).

All else being equal, I like getting my data loss 7x faster and cheaper ;)

jack_pp 2 days ago||

So you are using the API directly in your own harness without the subscription?

andai 2 days ago||

Yeah, I made a simple agent based on this tutorial:

https://minimal-agent.com/

It's slightly bigger now, but here's a ~50 line version for reference. I added the missing outer while-loop, so it takes user input etc.

https://gist.github.com/a-n-d-a-i/bd50aaa4bdb15f9a4cc8176ee3...

I mostly use it with GLM via their coding plan, I got a year for like $20 when it was on sale. But I also hooked it up to Sonnet, Opus, GPT, etc.

canpan 2 days ago|||

I work similarly, but still have architecture mds for a few selected cross-cutting features. As useful for human readers as for AI.

Normally I do it exactly as you say, point at a few files, but if I know these features are involved I point at the corresponding mds instead. It's a shortcut for me to type less.

whalesalad 2 days ago|||

Honestly sounds like you’re not doing anything difficult. If I’m doing easy tasks it’s fine but if you need to do a major architectural project that spans 3 codebases and 2 clouds you’re gonna have a hard time without substantial context/memory management.

theshrike79 1 day ago||

But that's an anomaly. I'm pretty sure you're not doing major architectural changes over 3 codebases and 2 clouds daily?

whalesalad 1 day ago||

you'd be surprised.

computerex 2 days ago||

I work in a completely different manner. I have a chief of staff agent, which is one Claude Code instance that orchestrates work across all my projects simultaneously with subagents. In this way the agent helps me context switch and drive work forward on everything I’m working on. I only use 1 session, and I compact when necessary, with todos and files on the file system to track WIP.

nibbleyou 2 days ago||

Mind sharing details of your setup?

computerex 1 day ago||

It's very simple. I have told Claude that it's my chief of staff and that it must delegate all tasks to subagents. That as my COS it must show initiative and only come to me for vision/decision making. I have hooks further enforcing this. It works really well because I am usually working on like 12 things at once. Feels like playing simultaneous chess. The COS agent helps me context switch/orchestrate everything. Subagents have their own context window so the COS context does not get polluted/filled up with low level subtask details. COS can effectively prompt the subagents. My main session does grow and I compact it from time to time. But since all the WIP is already captured in external state, compaction of the main session does not degrade performance.

I have used this same pattern in my own harness and it works well there too. https://github.com/computerex/z

I hooked an instance of my harness up to Telegram and now I talk to it from everywhere and it dispatches work out to subagents.

serial_dev 2 days ago||

I might be missing out on something but I never had to explain my project. Just give it a task, or if you really want to, type it quickly, then you are good to go.

I can’t imagine this being worth optimizing. The issue is never that Claude can’t figure out what the project is about…

Am I missing something or does this project not solve a problem most regular people have?

suprjami 2 days ago||

There are many other posts here which agree with you. Filling context with what you think the model needs adds nothing and possibly just inflates context which is harmful.

A good method seems to be to only make a skill or memory when the LLM gets something wrong, or if you actually observe it's always doing the same step and you can get the model to the same place with fewer tokens.

chatmasta 2 days ago|||

I’ve basically never edited a skill or memory myself. I make the LLM do it as part of the /handoff skill before I clear a session. That also includes pruning existing skills/memories and resolving any drift.

Even the /handoff skill was written by the model…

airstrike 2 days ago|||

It's funny because with so many different implementations of /handoff, I wonder if anyone has benchmarked handoff-and-resume to figure out what the best performance implementation looks like.

I also imagine that varies by model.

chatmasta 1 day ago||

Mine are project-specific, which is a bit annoying since I’d like it to be global but there are some project-specific additions. Maybe I’ll (ask Claude to) refactor that to be more composable.

It should be a first class feature of the harness, tbh. It kind of is with the /compact [focus] parameter but this is coarse and leaves no record. I find keeping the handoff files in the repo to be useful for historical context and later debugging.

sdesol 2 days ago|||

> Filling context with what you think the model needs adds nothing and possibly just inflates context which is harmful.

The solution that I've developed is, let the agent figure things out efficiently, without inflating the context. I have what I call a smart repo that better explains this at

https://github.com/gitsense/smart-ripgrep

The basic idea is, when the agent does a ripgrep it gets back files + matching lines + context.

bfeynman 2 days ago|||

What I've finally come to understand is that there is a large number of people who are now able to write and use software through Claude and coding agents. Those people have different needs than more traditional software engineers, who have more knowledge. Even the best LLMs often need steering, correction, and refactoring suggestions when iterating on code, and for engineers it's fine to let it lose context because, exactly like you said, you tell it to read a file and then have it regurgitate its understanding so you can correct or validate it before continuing.

Those for whom the code is almost entirely a black box cannot easily recover when something goes wrong. They are much more keen on this context management and planning because recovering from derailments is much harder (and takes longer): it's often a conversation with the LLM to try to get back to where they were before.

rkochanowski 2 days ago|||

There are a bunch of tools to manage context or fix what Claude does wrong. They may be popular because non-software engineers want to improve their workflows, like you mentioned.

But are they really working instead of making it worse? Are there any tests or real case studies done by users, not the tool's author? From my experience, removing from context works more often than adding.

catlifeonmars 1 day ago|||

Git commit?

airstrike 2 days ago|||

Depending on the scale of the project and the complexity of the specific thing you need to work on, it's advantageous to bring specific context into the session instead of hoping the model will connect the right dots.

theshrike79 1 day ago|||

I specifically have instructions for claude explaining the purpose of the project in pretty much all repos. Just a simple PROJECT.md is enough - and referenced from AGENTS.md

There I usually lay out stuff like "this is a personal greenfield project" and "don't bother with multi-user support" etc. Or Claude will default to creating something WEBSCALE for a simple tool that won't run outside of my local LAN-only Proxmox setup. And that'll also skip massive database migration support for a project that's 3 days old - the agent doesn't know that. I'm just dropping it on the project after a full memory wipe.

coldtea 2 days ago|||

>I might be missing out on something but I never had to explain my project. Just give it a task, or if you really want to, type it quickly, then you are good to go.

This means its changes will either be out of alignment with the overall project and its "style" and goals, or it wastes tokens re-getting to know the basics about the project each time.

No third case.

torben-friis 2 days ago||

I guess that depends on the kind of project, how common the intent is, how self contained, etc.

SubiculumCode 2 days ago||

Sometimes it's good to start fresh. LLMs need large context restarts sometimes so they can better identify holes that they become blind to.

derwiki 2 days ago|

Back in the human age of coding, I felt the same way sometimes

mohamedkoubaa 2 days ago|||

This was always one of the reasons to hire interns

senectus1 2 days ago|||

what a depressing statement.

cootsnuck 2 days ago||

Hm, I just keep a folder called something like `status_docs/` in any project I work on and I create a new file in that folder any day I'm working on a project that's dated (e.g. `project/status_docs/2026_06_21_status.md`). It's basically a project diary that both me and the LLM can reference.

I have the LLM at some point in the day while working on the project create that file with all the relevant context. And then I'll periodically have it update that file (often before I compact the context window, or before I switch to a new task). And then I just have the LLM update it whenever I'm done working on the project for the day.

Then no matter what, if I come back to that project again a day later, a week later, a month later, whatever – I just literally point a fresh session at the most recent status doc to help both me and the LLM orient ourselves to the work at hand. What's really nice too is having it reference the status docs from previous days to help orient it for creating the new status doc for the current day.

I've been doing this informally for probably over a year now, and have started formalizing it so I do it with every project. It's been a big help to me personally given all the context switching between projects I've been doing more and more since using AI coding tools.
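A rough sketch of the dated status-doc convention described above, assuming the `status_docs/YYYY_MM_DD_status.md` naming from the example path. The helper names `status_path` and `latest_status` are hypothetical, and the demo writes into a temporary directory rather than a real project.

```python
from datetime import date
from pathlib import Path
import tempfile


def status_path(project_dir, day):
    """Path of the status doc for a given day, e.g. status_docs/2026_06_21_status.md."""
    return Path(project_dir) / "status_docs" / f"{day:%Y_%m_%d}_status.md"


def latest_status(project_dir):
    """Most recent status doc, or None if there are none.

    Zero-padded year_month_day names sort chronologically as plain text.
    """
    docs = sorted((Path(project_dir) / "status_docs").glob("*_status.md"))
    return docs[-1] if docs else None


# Demo in a throwaway directory: create three days out of order.
with tempfile.TemporaryDirectory() as project:
    for day in (date(2026, 6, 19), date(2026, 6, 21), date(2026, 6, 20)):
        path = status_path(project, day)
        path.parent.mkdir(exist_ok=True)
        path.write_text(f"# Status {day.isoformat()}\n")
    newest = latest_status(project).name

print(newest)  # 2026_06_21_status.md
```

Because the date is zero-padded and ordered year, month, day, "the most recent status doc" is simply the last filename in sorted order, so no metadata or index is needed.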

comrade1234 2 days ago||

IntelliJ handles this for you. Basically it sends half your project to Claude even if you're asking some question about Star Wars.

lxgr 2 days ago|

> IntelliJ handles this for you. Basically it sends half your project to Claude

Not sure I’d call that “stopping wasting my tokens”.

comrade1234 2 days ago||

Yeah but it could have sent the other half of the project too.

anigbrowl 2 days ago||

I use Deepseek and just ask it to generate a state.md file with a summary of the project every time I've reached a goal or milestone. I then take a few minutes to edit this and add in or take out details. Between token pricing and generous cache discounts, this has proved very efficient so far. I reset every few hours of work and bypass the muddle of having too many priorities or over-extrapolation from un-nuanced instructions I gave at an earlier stage.

I do think that this project is interesting in several ways - prioritizing privacy, minimizing spend, and using objective semantic markers to sift and consolidate the key takeaways from long sessions. I'd like to try it on my cline project history. But while it would make a great recording of project history, I wonder if a lot of it doesn't end up detailing blind alleys the project went down and had to back out of.

Generally when this happens I feel that it's due to vague specification on my part, or avoiding architectural decisions I didn't want to deal with and implicitly inviting the model to implement a lowest-common-denominator solution.

KetoManx64 17 hours ago|

> But while it would make a great recording of project history, I wonder if a lot of it doesn't end up detailing blind alleys the project went down and had to back out of

Yes, I've run into this as well. I had the agent document the changes it would make along the way, but it would want to keep making note of things that we switched away from. E.g., "In phase3 we used mysql instead of sqlite". On some projects I let the agent do all the documentation and plan.md and state.md files, and every so often go through manually and delete some of the cruft that's no longer needed.

felixlu2026 2 days ago||

The hard part with project memory isn’t saving more stuff, it’s deciding what not to trust later. Stale plans and failed debugging guesses can quietly poison an agent pretty fast.

snovv_crash 2 days ago|

"correctly invalidating your cache" rears its ugly head once again.

nberkman 23 hours ago||

Somewhat related, I built ccrider [1], which indexes the session transcripts that agents already write to disk (Claude Code, Codex, Copilot) into a local SQLite FTS database. It has a TUI, CLI, and an MCP server so the agent can search past sessions itself. Same local-first idea, based on the raw logs rather than a maintained digest. Should work nicely with recall as well.

[1] https://github.com/neilberkman/ccrider
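For readers unfamiliar with SQLite full-text search, here is a small sketch of the general technique using Python's built-in `sqlite3` module and an FTS5 virtual table. It is not ccrider's schema or code: the table name, columns, and sample rows are made up for illustration, and it assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# Hypothetical schema: one row per session transcript.
# session_id is stored but not indexed for search; body is full-text indexed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(session_id UNINDEXED, body)"
)

# Made-up sample transcripts.
rows = [
    ("s1", "Switched the cache layer from MySQL to SQLite"),
    ("s2", "Fixed flaky login test by mocking the clock"),
    ("s3", "Added SQLite migration for the users table"),
]
conn.executemany("INSERT INTO transcripts (session_id, body) VALUES (?, ?)", rows)


def search(query):
    """Return ids of sessions whose body matches the FTS5 query."""
    cur = conn.execute(
        "SELECT session_id FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY session_id",
        (query,),
    )
    return [row[0] for row in cur]


print(search("sqlite"))  # ['s1', 's3']
```

The default tokenizer matches case-insensitively, which is why the lowercase query finds both "SQLite" rows.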

tt_dev 2 days ago||

How does this beat a Session specific README?

intothemild 2 days ago|

I think the majority here have stated the same... That CLAUDE.md or AGENTS.md effectively do this. Either that or the readme.

The only tip I can give is about your skill that builds or wraps up work: you should have it update those files if anything has changed.

Claude/Agents files shouldn't be bloated, but should imho act as a basic amount of context on the project so your agent and skills can pick up and go, with even the most basic initial prompt.

devmor 2 days ago|

> The only tip I can give is about your skill that builds or wraps up work: you should have it update those files if anything has changed.

Depending on the scope of work you’re doing, it might be better to have this removed from the context of the work that was done.

I keep a “Last Updated Hash” in my md and every so often will have the LLM pull a diff from that hash to the current head, then determine what doesn’t match.
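One way the hash-and-diff check above could be wired up, as a sketch only. It assumes the Markdown file has a line of the form `Last Updated Hash: <commit>`; that exact label and the helper names are assumptions. `git diff <hash>..HEAD` is standard Git; the `--stat` flag just keeps the output short.

```python
import re
import subprocess

# Assumption: the doc contains a line like "Last Updated Hash: 3f2a9c1".
HASH_RE = re.compile(r"^Last Updated Hash:\s*([0-9a-f]{7,40})[ \t]*$", re.MULTILINE)


def read_hash(md_text):
    """Extract the recorded commit hash, or None if the line is missing."""
    match = HASH_RE.search(md_text)
    return match.group(1) if match else None


def diff_command(base_hash):
    """Git command summarising everything changed since the recorded commit."""
    return ["git", "diff", "--stat", f"{base_hash}..HEAD"]


def changed_since(md_text, repo_dir="."):
    """Run the diff in repo_dir and return its output (needs a real Git repo)."""
    base = read_hash(md_text)
    if base is None:
        raise ValueError("no 'Last Updated Hash' line found")
    result = subprocess.run(
        diff_command(base), cwd=repo_dir, capture_output=True, text=True, check=True
    )
    return result.stdout


doc = "# Project notes\n\nLast Updated Hash: 3f2a9c1\n\n## Architecture\n"
print(read_hash(doc))                 # 3f2a9c1
print(" ".join(diff_command("3f2a9c1")))  # git diff --stat 3f2a9c1..HEAD
```

The output of `changed_since` is what would be handed to the LLM to decide which parts of the doc no longer match the code.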
