Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do

Posted by alash3al 10 hours ago

Open source memory layer so any AI agent can do what Claude.ai and ChatGPT do (alash3al.github.io)

73 points | 28 comments

jFriedensreich 7 minutes ago|

All these agent memory systems seem simultaneously over- and under-engineered, and like a certain dead end. I cannot imagine any reality in which this does not rot and get out of sync with what the latest models need. For the one time you build a payment provider, how many sessions will be tilted towards thinking about payments because of the "don't use stripe" memory?

ting0 5 minutes ago||

It's not clear to me how or why this works, and how it compares to just using md files in my project. For something like this, we really need benchmarks.

dwb 2 hours ago||

I’m certainly on the lookout for something like this and I’m happy to see your account has published software from before the LLM boom as well. I guess I’d like some kind of LLM-use-statement attached to projects: did you use an LLM to generate this, and if so, how much and what stages (design, build, test)? How carefully did you review the output? Do you feel the quality is at least what you could have produced by yourself? That sort of thing.

Not casting aspersions on you personally, I’d really like this from every project, and would do the same myself.

codebolt 43 minutes ago||

There are many ways to use an LLM to generate a piece of software. I base most of my projects these days around sets of Markdown files where I use AI first to research, then plan and finally track the progress of implementation (which I do step-wise with the plan, always reviewing as I go along). If I was asked to provide documentation for my workflow those files would be it. My code is 99% generated, but I take care to ensure the LLM generates it in a way that I am happy with. I'd argue the result is often better than what I'd have managed on my own.

dennisy 2 hours ago|||

This is a fair question, but not one I feel we can let people self answer.

I doubt many people will honestly admit they did no design, testing and that they believe the code is sub par.

It does give me an idea that maybe we need a third party system which can try and answer some of the questions you are asking… of course it too would be LLM driven and quite subjective.

embedding-shape 1 hour ago|||

> I doubt many people will honestly admit they did no design, testing and that they believe the code is sub par

I'd doubt any engineer who doesn't call most of their own code subpar when looking back at it a week or two later. "Hacking" also famously involves little design or (automated) testing, so sharing something like that doesn't mean much unless you're trying to launch a business, and I see no evidence of that for this project.

dwb 1 hour ago|||

> I doubt many people will honestly admit they did no design, testing and that they believe the code is sub par.

Well no, but if people want to see a statement like this, and given that most people will want to be at least halfway honest and not admit to slop, maybe it will help nudge things in the right direction.

chickensong 1 hour ago||

What's the point? You can make good or bad software, with or without LLMs. Do you ask a carpenter if they use a hammer or nail gun? Did they only use the nail gun for the roof and the deck?

If you care that much and don't have a foundation of trust, you need to either verify the construction is good, or build it yourself. Anything else is just wishful thinking.

hirako2000 40 minutes ago||

We do ask whether it's handmade or factory.

We even ask when cakes are made in house or frozen even though they look and taste great (at first).

tedggh 48 minutes ago||

A few things seem to work well for me (Codex):

1) An up-to-date detailed functional specification.

2) A codebase structured and organized in multiple projects.

3) Well documented code including good naming conventions; each class, variable or function name should clearly state what its purpose is, no matter how long and silly the name is. These naming conventions are part of a coding guidelines section in Agent.md.

My functional specification acts as the Project.md for the agent.

Then before each agentic code review I create a tree of my project directory, merge it with the codebase into one single file, and add a timestamp to the file name. This last bit seems to matter to keep the LLM from referring to older versions, and it’s also useful for quick diffs without sending the agent to git.

So far this simple workflow has been working very well in a fairly large and complex codebase.

Not very efficient token-wise, but it just works.

By the way I don’t need to merge the entire codebase every time, I may decide to leave projects out because I consider them done and tested or irrelevant to the area I want to be working on.

However I do include them in the printed directory tree so the agent at least knows about them and could request seeing a particular file if it needs to.
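A minimal sketch of the snapshot step described above, assuming a Python script and a repository laid out as one top-level directory per project. The names `build_tree`, `write_snapshot`, `skip_projects` and the `snapshot-<timestamp>.txt` file name are hypothetical, not taken from the comment; the actual tooling used is not stated.

```python
from datetime import datetime
from pathlib import Path


def build_tree(root: Path) -> str:
    """Return an indented listing of every file and directory under root."""
    lines = [root.name + "/"]
    for path in sorted(root.rglob("*")):
        depth = len(path.relative_to(root).parts)
        suffix = "/" if path.is_dir() else ""
        lines.append("  " * depth + path.name + suffix)
    return "\n".join(lines)


def write_snapshot(root, out_dir, skip_projects=frozenset(), suffixes=(".py", ".md")):
    """Write the full tree plus selected file contents into one timestamped file.

    Projects named in skip_projects (hypothetical parameter) still appear in
    the tree, but their file contents are left out of the merged snapshot.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out_path = out_dir / f"snapshot-{stamp}.txt"

    sections = ["# DIRECTORY TREE", build_tree(root), ""]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if rel.parts[0] in skip_projects:
            continue  # listed in the tree above, contents omitted
        sections += [f"# FILE: {rel.as_posix()}", path.read_text(encoding="utf-8"), ""]

    out_path.write_text("\n".join(sections), encoding="utf-8")
    return out_path
```

Calling something like `write_snapshot("repo", "snapshots", skip_projects={"legacy"})` would produce a fresh file per review, so two snapshots can be compared with any ordinary diff tool.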

swingboy 39 minutes ago|

Interesting approach. How do you do the merging? Is it manual? Just changed files? A hybrid?

_pdp_ 3 hours ago||

Well, the project is promising something without providing any details on how exactly this is achieved, which to me is always a huge red flag.

Digging deeper I can see it is effectively pg_vector plus mcp with two functions: "recall" and "remember".

It is effectively a RAG.

You can make the argument that perhaps the data structure matters but all of these "memory" systems effectively do the same and none of them have so far proven that retrieval is improved compared to baseline vector db search.
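To make the "recall"/"remember" shape concrete, here is a toy sketch of that pattern. It is not the project's code: `MemoryStore`, `embed` and `cosine` are hypothetical names, the bag-of-words `embed` is a stand-in for a real embedding model, and the in-memory list stands in for a pgvector table.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical two-function memory: store text, retrieve by similarity."""

    def __init__(self):
        self._rows = []  # list of (text, vector) pairs

    def remember(self, text: str) -> None:
        self._rows.append((text, embed(text)))

    def recall(self, query: str, k: int = 3) -> list:
        """Return up to k stored texts, most similar first, dropping zero scores."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), text) for text, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:k] if score > 0]
```

Under these assumptions, whatever was stored is returned whenever a query is similar enough, which is plain nearest-neighbour retrieval; anything beyond that would have to come from how memories are written, merged or expired.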

hirako2000 26 minutes ago|

It's a cool website. It says memory. It shows that LLMs suck and that this product magically just works.

In a way, if it does accomplish that, it is a vectordb needing glorification.

Incipient 2 hours ago||

I still haven't found useful "memory". It's either an agents.md with a high level summary, which is fairly useless for specific details (eg "editing this element needs to mark this other element as a draft") or something detailed and explaining the nitty gritty, which seems to give too much detail such that it gets ignored, or detail from one functional area contaminates the intended changes in another functional area.

The only approach I've found that works is no memory, and manually choosing the context that matters for a given agent session/prompt.

jvwww 2 hours ago||

Yeah I feel the same way. Wonder when/if we'll get continual learning from these models. I feel like they are smart enough already but their lack of real memory makes them a pain to deal with.

hirako2000 23 minutes ago||

Google Gemini does this sort of thing. External to the model, I presume. And it's very annoying.

A friend told me he would like Claude to remember his personality, which is exactly what Gemini is trying to do.

A machine pretending to be human is disturbing enough. A machine pretending to understand you will spiral very far into spitting out exactly what we want to read.

clutter55561 2 hours ago||

All the memories Claude created for me fell in the category remember-to-not-forget, so I disabled it altogether.

great_psy 4 hours ago||

LLM memory (in general, any implementation) is good in theory.

In practice, as it grows it gets just as messy as not having it.

In the example you have on front page you say “continue working on my project”, but you’re rarely working on just one project, you might want to have 5 or 10 in memory, each one made sense to have at the time.

So now you still have to say “continue working on the sass project”. Sure, there’s some context around details, but you pay for it by filling up your LLM context and making extra MCP calls.

dennisy 4 hours ago||

True! But this is a very naive implementation; a proper implementation could overcome these challenges.

awestroke 2 hours ago||

Well let's talk again when the problems have been solved, then. Until then, manually curated skills and documentation will beat this

vasco 3 hours ago||

And once you're being specific about what it needs to remember you are 0 steps away from having just told AI to write and read files with the "memory"

adithyassekhar 2 hours ago||

Is this only for vibecoders who work alone?

If I am working on a real project with real people, it won’t have the complete memory of the project. I won’t have the complete memory. My memory will be outdated when other PRs are merged. I only care about my tickets.

I am starting to think this is not meant for that kind of work.

braiamp 21 minutes ago||

What the heck is happening on this site with the pointer disappearing? For some reason the body tag has "cursor: none" which is never good.

kgeist 1 hour ago|

>Stash makes your AI remember you. Every session. Forever.

How does it fight context pollution?
