Memorizing session transcripts isn't useful

Posted by theahura 6 hours ago

Memorizing session transcripts isn't useful (12gramsofcarbon.com)

166 points | 142 comments | page 3

oefrha 4 hours ago|

I have this in my global CLAUDE.md after being annoyed by all the random crap memories.

> Don't start generating an auto-memory entry before asking me. Ask first, write only if I confirm — no speculative drafting.

No more crap after this.

Incidentally I don’t recall Opus 4.8 asking me once in the past few weeks. Older models did ask semi-frequently.

SaltyAstronaut 3 hours ago||

Those small, random items that pop up later on in conversation actually make the experience feel better. But that's just my own personal experience.

aranw 4 hours ago||

I once had to tell Claude 3-4 times to stop assuming the state of a system was the way it kept reiterating it was, because that was in its memory. I repeatedly told it otherwise, and it just never updated its memory and instead kept referencing its memory about the state of that particular system.

andai 3 hours ago||

In my harness I have all the code auto injected at startup (doing mostly very small codebases).

I found that every model will still manually check every file/function, they immediately assume that anything in context is stale.

That's sensible because often the user edits stuff while they're running.

What it does is save it from having to grep blindly about the codebase. But I think I'd get roughly the same benefit by just dumping the function headers then.
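The "just dump the function headers" idea could look something like the sketch below. It assumes a Python codebase and uses the standard-library `ast` module (Python 3.9+ for `ast.unparse`); the name `function_headers` is made up for illustration and is not part of any particular harness.

```python
import ast


def function_headers(source: str) -> list[str]:
    """Return one 'def name(args)' line for every function in the source."""
    tree = ast.parse(source)
    headers = []
    # ast.walk visits every node, so methods and nested functions are included.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse turns the arguments node back into source text.
            headers.append(f"{prefix} {node.name}({ast.unparse(node.args)})")
    return headers


if __name__ == "__main__":
    example = "def add(a, b=1):\n    return a + b\n"
    for header in function_headers(example):
        print(header)
```

Running this over each file at startup and injecting only the result would give the model a map of the codebase without the stale-looking function bodies, which it apparently re-reads anyway.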

syntheticcdo 4 hours ago|||

Did you try to delete the memory yourself?

ErroneousBosh 2 hours ago||

Yeah but you also know humans that do that too, right? I know I do.

bluegatty 3 hours ago||

There's a lot of valuable information in there, but it's too noisy.

dofm 4 hours ago||

Blog posts like this just blow me away.

> I believed this so strongly that my company built an entire product around this concept. I used to tell folks that "session transcripts were the new oil," that they were more valuable than the code itself.

> […]

> We don't really write code by hand anymore.

Honestly, isn't this just influencer spam? What possible value is there in reading about people who used to have products, but no longer write their own code, complaining about the inscrutable prediction machine they have handed that job and their livelihoods to?

Like, if you have complaints about the thing, perhaps you should address them to your supplier directly. None of your readers can help, and nobody's magic folk solution to your problem is better than yours.

And there are so many of these sorts of posts. Are we not entirely cooked?

(I think I have concluded that if people writing about AI aren't writing about interesting things they have achieved with small, local LLMs — which for clarity I am fully interested in reading - then I'm done reading. This whole blogging-about-cloud-AI genre is just weird and irresponsible now)

general_reveal 4 hours ago||

Look man, I’ve got a MMO that I’m working on that’s set in 2014 where everyone is a programmer in SV (might call it World of Legacy). It’s a period piece. I NEED as much blog training data of this type so that my NPCs can talk in a historically accurate way (god bless Medium.com, a historical treasure trove of a bygone medieval era).

It’s gonna be a living breathing world, you see. You’re going to be like “omg, this game even accurately captured the blog posts, woah”.

bryanrasmussen 3 hours ago|||

The perfect world was a dream that your primitive cerebrum kept trying to wake up from. Which is why the Matrix was redesigned to this: the peak of your civilization. I say your civilization, because as soon as we started thinking for you it really became our civilization, but the peak of your civilization was an MMO where everyone is a programmer in SV.

dofm 4 hours ago||||

I … I… don't want to play this, thanks ;-)

general_reveal 4 hours ago||

It’s the only way you’ll ever be able to pretend to be a programmer again though.

dofm 3 hours ago|||

Oh god, I just realised this really is the logical parallel to all those TV crime dramas set in the early 1900s.

general_reveal 3 hours ago||

It’ll be the programmers version of those civil war reenactments.

koolba 3 hours ago|||

The two sides would be the strongly typed union and the duck typed heretics of the confederacy.

dofm 3 hours ago||

Singing two different battle songs set to the tune of Code Monkey.

nerdsniper 3 hours ago|||

Leet code competitions will be as relevant as sailing regattas.

whateveracct 3 hours ago||||

so far, i just keep writing by hand and keep getting paid for it. weird.

pr337h4m 3 hours ago|||

I'm pretty sure there's an element of sarcasm here, but if this game is real, it does sound super promising.

goostavos 3 hours ago|||

>session transcripts were the new oil

Something about this idea really resonates with certain personality types. I equate it to the Zettelkasten hype phase from several years ago. People (...like me..) got really wrapped up in the belief that the process was more important than the content. "Linking" was an "activity." Something good will happen as long as you (a) take notes on stuff and (b) link them to other notes on stuff.

You see the same thing with the session transcripts people. They're building ever more sophisticated setups of indexing and storing and cross referencing every conversation they've ever had on the (I would argue) mistaken belief that the transcripts are the valuable part, rather than the uncomfortable part where you go do something. A lot of it, I say from falling in the trap, is fancy procrastination.

(Although, I have found myself jealous on many occasions where their fancy system retrieves something they vaguely recall from a conversation they had 3 months ago. So, who knows.)

theahura 3 hours ago|||

> Like, if you have complaints about the thing, perhaps you should address them to your supplier directly. None of your readers can help, and nobody's magic folk solution to your problem is better than yours.

I think you may just misunderstand the point of having / writing a personal blog. I write because it's fun! Whether the reader gets any value out of reading it is almost entirely beside the point.

(Also several comments here directly post a fix to the problem stated in the blog post, so readers can and do often help)

dofm 3 hours ago||

> I think you may just misunderstand the point of having / writing a personal blog.

I used to blog, as it goes, and I have supported and enabled many more, so no, not really.

LPisGood 4 hours ago|||

I have to ask: do you still write a lot of code yourself? I and most people I know do not.

dofm 4 hours ago|||

I am a freelancer recovering from severe burnout so the answer is a sort of irrelevant no.

I'm trying to rebuild my life so I am in an experimenting and learning phase rather than a massive coding phase, and most of my code work is maintenance of things I have built. That which I do code, I am still coding by hand, though I am dealing with other people's Claude output and I am really unimpressed by it. It's often rather crass.

But I would say to you that if you personally don't write code now but you do have a dependency on one of two presumably unprofitable cloud AI providers, aren't you in trouble? How is this not a three-alarm fire for you?

estearum 4 hours ago|||

> That which I do code, I am still coding by hand, though I am dealing with other people's Claude output and I am really unimpressed by it. It's often rather crass.

Unfortunately the point of code is rarely to impress people (certainly not other engineers) or to avoid being "crass." 99.99% of code exists to achieve business outcomes, and velocity matters a lot in many contexts. A lot more than elegance or impressiveness.

The platform risk is a valid concern but alleviated by China's theft and redistribution of open models.

dofm 4 hours ago|||

I'm not talking about impressing people.

We used to be concerned about code quality. Are we not anymore?

Crassness was a signal. Still is, to me — in a human I find that people who write crass code are going to cause me trouble.

estearum 4 hours ago|||

"Code quality" encompasses a lot of dimensions, one of which is impressing your colleagues, and many of which there's virtually no reason to care about now.

Arainach 3 hours ago||

On the contrary, it's more important than ever. With ever more code being generated, it's essential that the code be understandable and maintainable - by human and machine.

michaelchisari 3 hours ago||

And quality is the new differentiator when everyone can generate slop.

pydry 3 hours ago|||

Nobody cares about code quality /s

They only care about the things which you can only get with good code quality like reliability and speed of development.

estearum 3 hours ago||

Right, and to the extent that your coding practices contribute to reliability and speed of development, they are "of quality."

Now do the same exercise for "impressiveness" and "crassness."

Here, I'll do it for you:

> Nobody cares about code quality /s

> They only care about the things which you can only get with good code quality like impressiveness and lack of crassness.

Sounds silly doesn't it?

slopinthebag 3 hours ago||||

It doesn’t matter what materials or techniques you use to build a house. 99.99% of construction exists to achieve business outcomes, and velocity matters a lot more than using the right materials or techniques.

Of course the house must pass safety inspections and stuff, but the materials and techniques don’t matter one bit for that. All that matters is you achieve the desired outcome, and I will ignore the glaring fact that you achieve the desired outcome by using the right materials and techniques. The materials and techniques don’t matter, just the outcome.

yoyohello13 3 hours ago|||

> Of course the house must pass safety inspections and stuff, but the materials and techniques don’t matter one bit for that. All that matters is you achieve the desired outcome, and I will ignore the glaring fact that you achieve the desired outcome by using the right materials and techniques.

This analogy is more true than you think. This is why modern homes/apartments are trash. You can pass safety inspections using subpar materials and the house will fall apart after a few years, but who cares right? At least you achieved the business outcome!

This mentality is so infuriating. This is why I need to buy new shoes every year. Or why my washer/dryer motherboard craps out in 2 years instead of 10. Nobody gives a shit about quality anymore, this is why society is crumbling around us. Profit driven incentive for fast/cheap over everything else. And now I need to spend my day prompting an AI to fix AI slop code to keep the business hobbling along another day. What a fucking joke.

dofm 3 hours ago|||

It does feel like a good analogy.

e.g. the bill is definitely coming due for a lot of "non-traditional construction" materials and methods in immediately post-war properties in the UK. There are many unmortgageable properties using Mundic Block in Cornwall and to some extent Devon, in the heavily bombed south east there was a lot of pre-stressed concrete with catastrophic rebar failure, not to mention Orlit construction, and all across the country a lot of RAAC. Almost all of it for good, necessary, upbeat reasons.

It feels a bit like this kind of crisis from AI generated code could hit in ten, fifteen years time; people often fail to understand how long a bit of website code can last.

slopinthebag 3 hours ago|||

Yeah I agree. And you have people on this forum who gleefully point out that quality doesn’t matter to the business, as if they think they’re so intelligent because they noticed that employees are there to make the company money. Not realizing that A) it’s a very antisocial attitude and B) it’s not a tenable long term strategy.

estearum 3 hours ago||

Hang on now. GP didn't say "I care about quality" and I didn't say caring about quality is wrong.

GP said Claude's code "doesn't impress" them and that it's "crass."

Do you think a valid "long term strategy" is to create code that impresses GP and is not crass, but doesn't achieve the business outcomes it's meant to?

Inversely, do you think one can achieve business outcomes if "quality" is so abysmal that the code doesn't work or is unmaintainable?

Is it possible to write perfectly good, maintainable, performant, legible code that "doesn't impress" GP, or feels "crass" to them? Well gee, probably! Because "impressiveness" and "crassness" are literally meaningless.

dofm 3 hours ago||

Crassness, in the context I meant it, is not "literally meaningless" at all.

I will accept "of fully subjective value". But not "literally meaningless".

estearum 3 hours ago|||

No the materials and techniques matter a lot. This is why we need to build houses with sticks and jute cord, just like we always have. It's vital also that we paint our special symbols above the door to ward off the spirits.

It's insane to me that you're implying we could build houses with pre-fabricated materials or pneumatic nail guns and still somehow "have houses?" No sticks/jute cord and special symbols, then no house.

slopinthebag 3 hours ago||

The argument isn’t to not use better materials or techniques, it’s that inferior materials and techniques are fine because they don’t impact the end result, which is so obviously false when it comes to pretty much anything, but supposedly true when it comes to software.

estearum 3 hours ago||

I'm not sure who you saw arguing for inferior materials and techniques, but let me know when you find them.

What you saw in this thread was someone arguing against the dimensions of "impressiveness" and "crassness" as valid things to care about when it comes to code.

It's your mistake to assume that those are related to any meaningful concept of actual quality.

dofm 2 hours ago||

FWIW I never suggested that they were indicative of problems with the code. Unimpressive, crass code can run, after all.

I clearly said elsewhere that I think they are predictive of problems with the person who writes it, and I fear I can generalise that to LLM tooling that generates it.

techpression 3 hours ago|||

I’ve worked at many companies where this idea of velocity was claimed to matter, and it never did. The only thing it mattered for was to make it look like middle managers were worth anything, but the success was always in the foundational idea/concept.

jenniferhooley 3 hours ago||||

Programmers can use smaller models like deepseek v4 flash for 98% of the same productivity as SOTA models and cost (true cost) around $10-$30 a month. So I doubt most people who heavily use them are too concerned. It's only vibe/hobby coders who really need SOTA and they probably don't think about it much.

dofm 3 hours ago|||

To what extent does that ameliorate the problem?

Are you not, by developing this way, making yourself more interchangeable, less indispensable, than ever before?

Exoristos 3 hours ago|||

At this point just use a JetBrains product, get deterministic assistance, and 5x your speed. It's unfortunate the resistance to a true IDE just keeps going up. The blind lead the blind, I guess.

vidarh 4 hours ago||||

Personally I use 5 different model families, 3 of which are open weights with 3rd party inference providers (GLM, DeepSeek, Kimi), so if the frontier labs were to shut down it'd be a nuisance, nothing more.

andai 4 hours ago|||

Worst case scenario you just switch to a free model, which are 2025-ish in quality.

dofm 4 hours ago||

The open weights models I am interested in, and testing, learning, experimenting with etc.; I am confused and cynical, not insane.

I am not convinced it isn't vulnerable to the same problems but the whole tenor of the community around open source/open weights models just doesn't have the same YOLO madness to it.

AlotOfReading 3 hours ago||||

Of course? I'm still better than sonnet or opus, just slower and much more expensive.

Sometimes it takes me a day or more to find the one line fix or abstraction necessary, while claude can hammer through a hundred line fix in under an hour.

qup 3 hours ago||

Sounds like your definition of better is pretty narrow.

Quick and cheap are two of the three fabled: "Fast, cheap, and good: choose two"

AlotOfReading 3 hours ago|||

"good" can take lots of different meanings. Generally though, I want as little code as I can get away with. A majority of code lifecycle cost isn't in writing it.

dofm 3 hours ago||||

Are you perhaps missing the true message of that aphorism?

Or are you saying the industry is (because it is)

twister2920 3 hours ago|||

"more good" seems like a pretty decent definition of better to me. The words you are looking for are "cheaper" and "faster"

qup 3 hours ago||

In coding we usually change it to "cheap, fast or correct: choose two"

I reject your correction: I present the options as nouns, not modifiers to the work. Maybe I should say "Cheap, Fast, or Good" as a compromise.

Ronsenshi 4 hours ago||||

I am. I have Codex running, doing some tasks which I don't care much about, but anything I want to understand I write myself.

Same thing with hobby projects - I might ask ChatGPT or Gemini some questions about best practices in Swift for example, but writing code is done by hand.

As others said - if you don't use it, you'll lose it. And I'd rather keep my skills up to date.

hirako2000 3 hours ago|||

You have the privilege to keep yourself sharp, most businesses favor productivity over their workers' long term relevancy.

dofm 3 hours ago||

This is the thing that makes me saddest. Second to the fact that none of the management tier promoting and weaponising this insanity will meaningfully suffer consequences.

Right now I am lucky that I have the time to recover and learn.

kelnos 3 hours ago||||

Yes, nearly all of it. Having the agent write code for me doesn't really save me much time, and the code quality is usually worse (and it takes even more time if I insist on better code quality from the agent).

And I don't think I'm unique. I see enough posts like https://news.ycombinator.com/item?id=48777257 pop up that I'm reasonably confident all the hype around LLMs saving so much time and increasing productivity so much is, well, just that: hype.

Sure, if you can't code at all and want to build something, an LLM is going to be great for you, even if you can't evaluate the code quality or determine if there are bugs just by looking at the code. But I've been coding professionally for 25 years, and as a hobby since I was like 8 years old. I like to code! It's a passion of mine. If the LLM isn't doing it faster or better (and most of the time it isn't), why wouldn't I write code myself?

I'll have the LLM write boilerplate stuff or do tedious refactoring, because I just don't feel like it (even if it does take longer). But for the real work? Of course I do most of it myself.

One area where the LLM shines for me is finding the root causes of bugs. It can generally do that much faster than I do. Often orders of magnitude faster (like minutes instead of hours or days). But when it comes to writing the fix for the bug? It's usually faster and better if I do it myself.

dofm 3 hours ago||

I am more fully invested in finding out ways AI can support me (documentation, code analysis, bughunting), though my experience with Claude as a bughunter is that it can miss the absolutely obvious if it is not in the shape it is expecting.

More generally I am interested in burnout-avoidance tools; things that help me start, finish, things that write tests I guess, certainly code scaffolding.

But I am fully unconvinced that my burnout will be improved by ending up owning the responsibility for wobbly or inscrutable AI-generated code with potential landmines in it; that will keep me up at night just the same.

LastTrain 3 hours ago||||

I still write code and sometimes it works well. I also use Claude and it writes code and sometimes that goes well. We have better success together, where I do the interesting stuff and let Claude write my unit tests, reconcile my documentation. That is to say, I’m using it for quality not quantity. There aren’t enough humans to deploy or consume all the sloppy shit it could write on its own.

walt_grata 4 hours ago||||

I write code by hand every day. I do the main part of the feature implementation myself and leave comments for the code i want the agent to write. I have some skills and a command that sets the stage to get the agent to fill in the rest

csomar 3 hours ago||||

I am now in the process of fixing code I wrote using AI. I have come to the realization that AI can't really write software and I am annoyed that it took me that long (months) to realize that.

techpression 3 hours ago||

This is quite terrifying to me, because I have a feeling I will soon come to the same conclusion. I’m starting to see some really glaring omissions in code I’m responsible for (using Opus) that at first (and second) look seemed fine, but really aren’t.

csomar 3 hours ago||

I talked with a friend in a different field (academic) and he had to re-review all things written by AI. Basically, he used AI to read/summarize/find stuff in large academic papers but realized later that many times AI makes glaring mistakes that on a first read pass the smell test.

andai 4 hours ago|||

I force myself to do it at least once a week, you know, like cardio. Keeps the doctor away.

dofm 3 hours ago||

Picard should have been a bergamot grower, not a winemaker.

ungreased0675 4 hours ago|||

It reminds me of the peak crypto days. Lots of resources consumed, many late nights, little to no value created.

singpolyma3 3 hours ago|||

I don't understand this line of reasoning. People use various cryptocurrencies to buy and sell legitimate products and services every day. Is the argument just that they could probably have done it some other way?

csomar 3 hours ago|||

I mean at least crypto provided value to criminals, tax evaders and Trump? (regardless of what you think of that). I don't see a parallel with AI.

operatingthetan 3 hours ago|||

This is pretty funny because it's about the depth of understanding of every 'AI expert' on Linkedin. People who praise the context window as basically magic have no idea how any of this works.

micromacrofoot 3 hours ago|||

Occasionally posts like this do get the attention of the company responsible, more than an email does... but indeed that's like a one in a million situation

ErroneousBosh 2 hours ago|||

> inscrutable prediction machine

"Spicy Autocomplete", I've heard it called.

fortyseven 4 hours ago||

[flagged]

dofm 4 hours ago||

I'm not the Blog Police, I'm a very naughty boy.

I have opinions people apparently don't like, for no subscriber money.

bigyabai 4 hours ago||

Settings > Capabilities > "Generate memory from chat history"

Toggle it off and never think about it again.

saagarjha 3 hours ago||

I mean, it’s pretty clear the people who work on Claude Code aren’t actually looking at what they’re implementing. The thought behind this feature seems like it goes nowhere beyond “oh wouldn’t it be nice if Claude could remember things about you? Ok Claude go implement this” and nobody bothered to see if it was useful or helpful.

charcircuit 4 hours ago||

>We have found zero performance benefit on SWE tasks when agents have search access to their previous transcript sessions

I refuse to believe this is true. The ability for an agent to find information from before a compaction is incredibly useful. At compaction time it's impossible to know what exactly may be still needed.
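As a rough illustration of what "search access to previous transcripts" might mean in its simplest form, here is a keyword-scoring sketch. Everything in it is an assumption for illustration: the name `search_transcript`, the plain list-of-strings transcript format, and the term-count scoring are not taken from the article or from any real agent harness.

```python
def search_transcript(
    messages: list[str], query: str, limit: int = 3
) -> list[tuple[int, str]]:
    """Return up to `limit` (index, message) pairs ranked by query-term hits."""
    terms = query.lower().split()
    scored = []
    for index, text in enumerate(messages):
        lowered = text.lower()
        # Score is the total number of times any query term appears.
        score = sum(lowered.count(term) for term in terms)
        if score > 0:
            scored.append((score, index, text))
    # Highest score first; earlier messages win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(index, text) for _, index, text in scored[:limit]]


if __name__ == "__main__":
    history = [
        "ran migration 0042 on staging db",
        "tests failed: timeout in auth service",
        "fixed timeout by raising auth client timeout to 30s",
    ]
    for index, text in search_transcript(history, "auth timeout"):
        print(index, text)
```

A lookup like this lets the agent recover a detail that was dropped at compaction time without keeping the whole history in context.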

theahura 3 hours ago||

With the million-context-window models we never hit compaction, observed over hundreds of sessions. What are you doing that has you hitting compaction regularly?

charcircuit 1 hour ago||

For me logs can chew through a lot of tokens. And when the agent is trying a bunch of different experiments, it may need to refer to what happened previously.

Million context models also are still not effective for the entire context size.

beepbooptheory 4 hours ago||

There has been this slow transition inside me, as someone who likes to not touch the AI as much as possible, where I've gone from skeptical and argumentative about it all to starting to just feel sad for all the Claude et al heads. Like, this is such a ridiculous house of cards you have to deal with all the time, which isn't even directly concerning the task at hand, presumably. Like you're cooking yourself a meal but it's just nuking a burrito and then still somehow needing to wash the dishes for an hour.

Not that this isolated article is super damning or anything, but the accumulated set of all these reports has left me only empathetic, I think, of these other devs. Like, I just want to tell them, "it can be ok, it doesn't need to be like this.."

andai 3 hours ago|

I've been having a very nice time with Fable. I cooked up an Anki clone in like half an hour, with tech it's not familiar with. Nothing too ground breaking, but I was very pleased!

I think Opus might be on similar level for most of what I'm doing, but I haven't used it much recently, so I can't remember the difference. So I guess I'll find out on the 7th when they pull the plug again! (Free-ish trial of Fable ending.)

That being said, I tried using other frontier models to help with a Pong clone the other day and they were introducing new bugs at approximately the same rate as they were fixing them. On Pong!! I found that amusing because I couldn't think of a simpler game, so it didn't inspire confidence.

Fable's doing just fine on an online multiplayer game though. I have no idea how that works. (Maybe it would fail Pong too?? I haven't tested that!)

dijksterhuis 3 hours ago|

non-deterministic system behaves non deterministically. in other news, water is wet.
