Potential session/cache leakage between workspace instances or consumer accounts

Posted by chatmasta 4 hours ago

Potential session/cache leakage between workspace instances or consumer accounts(github.com)

208 points | 96 comments

throwaway260704 1 hour ago|

Using a throwaway account for obvious reasons, but I’m very involved in this space using LLMs from multiple providers. I’m aware of at least two instances in which the intermediate infrastructure “swapped” responses, once impacting Claude models and once impacting GPT models, from two different providers.

One gave us a proper postmortem in which their API gateway was incorrectly handling HTTP 100 status codes, putting them into an error state where there was effectively an off by one error - you would receive the response to the prompt that came in before yours and would pay it forward (your response would go to the next caller).

The other instance never had root cause explained to us, and we were just told to trust it wouldn’t happen again.

Both of these are from $1T+ companies.

ZDR wasn’t compromised in these cases since it was responses being swapped in flight. I wouldn’t be surprised if this is a similar issue - it’s not that data is being retained, it’s just not being safely isolated in intermediate infrastructure.

pocksuppet 1 hour ago||

This attack is called "HTTP desync" or "request smuggling". It's often done intentionally by a client to try and spy on other clients' responses.

Every time you multiplex requests from multiple clients onto one upstream connection, you are probably vulnerable to this, because (despite its superficial simplicity) HTTP is just too complex to reliably match the requests and responses to upstream.

For example a desync can be triggered in some systems by having more than one Content-Length header, by mixing Content-Length with chunked encoding, or by passing an HTTP/2 header called Content-Length that doesn't match the actual content length.

Here's a DEF CON talk (6 years ago) on this topic: https://www.youtube.com/watch?v=w-eJM2Pc0KI

The same attack has been applied to SMTP by messing up the line endings surrounding the end-of-message delimiter, where it's called SMTP smuggling. It may also apply to other protocols.

markasoftware 1 hour ago||

Very true, this was likely an attack. Worth noting that mr kettle has done a defcon talk nearly every year on some variant of this attack, the most recent one titled "HTTP/1.1 must die" because he rightfully believes that switching to the binary headers of http/2 (specifically in reverse proxy connections to upstream servers) is the only way to systematically prevent these.

albinowax_ 53 minutes ago||

I’ll be back next month with a load of fresh vectors in “Can AI Do Novel Security Research? Meet the HTTP Terminator”

https://portswigger.net/research/talks?talkId=36

Maybe my last presentation on the topic! Possibly.

tejusarora 1 hour ago|||

Woah. Sounds plausible. However, wouldn’t that still be an implicit violation of ZDR since now the response is possibly egressed out of the enterprise network? So if I were working with PHI, the response egress is a potential violation of HIPAA even though claude didn’t retain anything — but the whole Point was to comply with HIPAA. Thoughts?

theplumber 1 hour ago||

These companies(at least one of them) seem lead by idiots(Hint:his name is Dario) so I wouldn’t be surprised to have multiple wtf moment if you were to see how they treat our data…Let’s just start pushing for opening up AI models because they are too dangerous behind paid walls. That would be a great regulation.

minhaz23 1 hour ago||

Curious why you feel that way about Dario?

solenoid0937 1 hour ago|||

HN thinks the safety crowd is dumb, and has never seriously engaged with the AI safety space.

HN doesn't believe superintelligence will be a thing; while the AI safety crowd believes they are building it. So the decisionmaking of the safety crowd is incomprehensible to HN.

pseudony 1 hour ago|||

Funny how Dario’s and Sam’s concern for our safety dovetails so nicely with their companies’ strategies. How fortunate.

Grow up. Whenever push comes to shove, they reduce safety and alignment departments, rush out releases over the heads of the same departments. If you engaged with the news these last years you’d see it for what it is “models for me, but not for thee”.

solenoid0937 30 minutes ago|||

It's clear you haven't engaged with the subject matter beyond the typical "internet-forum cynic" mindset.

Both companies were founded on the basis of AI Safety.

- There are tons of great safety people doing real work at OpenAI. Releases are held back, models are evaluated, etc.

- Anthropic goes even further - constrained themselves with a PBC/LTBT structure, treat safety even more rigorously, and notably delayed the release of Mythos (literally the opposite of what you alleged) and continue to hold their two red lines.

You should actually talk to some of the people at these labs. Nearly everyone working at these places genuinely believe AGI/ASI is actually happening, so they do take safety seriously.

To imply these companies don't care about safety is typical internet-brand nihilism/cynicism that helps you feel smart while being literally the opposite of the truth.

SubiculumCode 51 minutes ago|||

There is no reason for you to make personal attacks like that. Not on HN.

Moreover, your take on Dario is over simplistic, and undersells the extent to which Anthropic takes seriously safety. It's not lip service, there are real dollars and attention spent on alignment at Anthropic.

DrewADesign 1 hour ago|||

Reductionist. Many of us think they’re all dumb.

politician 1 hour ago|||

Dario quit OpenAI to hype the AI apocalypse for quick cash and attention. Then, he walked right into an obvious crisis with the Pentagon by continuing to try to play both sides of the AGI doom story that even his own AI would've pointed out. Then, after being labelled a supply chain risk, he starts a new roadshow with the newest most dangerous AI model that definitely cannot be released to the public and its safer little brother Fable. A move that gets both his premier models shut down globally once the same government that labelled them a supply chain risk learns that Fable isn't actually safe from jailbreaks. Just prior to his planned IPO.

Dario might not be a literal idiot, but he might strongly benefit from training a model to do strategic thinking for Anthropic.

throwatdem12311 55 minutes ago||

All of these things have people frothing at the mouth to give up all their data to Anthropic to use their models and to buy in when the IPO eventually happens.

Seems to me Dario is actually a genius. These are all things that I would to make people believe that my “basically the same as the other guy” product is ackshually best thing ever for real. Trust me bro.

The entire bubble is hype and fear mongering. The technical merits of the products are completely irrelevant at this point. Dario is doing exactly what someone that understands this would do and they are winning.

dofm 2 hours ago||

Just add a line in AGENTS.md that says "never talk about Minecraft unless you're explicitly asked", I'm sure it'll be fine after that.

repeekad 2 hours ago|

CLAUDE.md, Anthropic is too exclusive and next level to use a standard idiomatic pattern like AGENTS.md

notnmeyer 1 hour ago||

echo “read @AGENTS.md” > CLAUDE.md

folkrav 1 hour ago|||

When I still used Claude outside of work, my CLAUDE.md was just a symlink to my AGENTS.md.

jasonjmcghee 1 hour ago||||

Just use a symbolic link

dofm 1 hour ago|||

Yep that should work 100% of the time.

jonhohle 1 hour ago||

I’ve been seeing this in Gemini in the past few days. Often during a prompt with a reasonably large input set, I’ll get answers that appear to belong to someone else. It may be trigger hallucination, but it seems like it may be cache collisions or something else. I’ve not seen anything to suggest private information is leaking, but it’s disconcerting to be researching something and then get what appears to be a math tutoring response.

weitendorf 53 minutes ago||

I’ve also had problems with Gemini when accessed through their UI in the past few weeks. That’s concerning that you are also seeing it several days later in a different context.

I wonder if there could be a large security situation playing out behind the scenes right now.

I’ve been working on using AI to assist me in writing meta parsing grammars. Fortunately I have not launched most of them yet. I know for a fact that the next generation of models represent a major step change in basic vulnerability identification and exploitation, especially if you know where to point them. They’ve found several bugs and at least one exploit in my parsing tools so far, I can’t imagine how many there still are waiting to be discovered across the entire modern tech ecosystem.

malfist 1 hour ago||

My whole company is doing mid year reviews and Gemini is the only allowed tool and its been flumoxing people with seemingly random unrelated responses. Often in different languages.

That is when it bothers to respond instead of just sending back an 1099 error code

mwnn 29 minutes ago||

I am facing a billing/subscription problem and there's nothing I can do or get help on. Their chatbot support shuts me down. Their email is also handled by the chatbot (not even sure whether it's the "same chatbot"). It has been a dead-end. I contacted my bank (credit card issuer) and finally a staffed said I am better off just marking the card lost and having it reissued and that's what I did in the end. I hope that works.

I've never understood in what world this world decided it was okay to hand over these much unchecked power to such corporations. But this is how it has always been one way or the other.

nullbio 7 minutes ago||

Don't worry guys, Anthropic are the experts at security and no one else should have access to bug fixing LLMs because that would be dangerous.

Tiberium 3 hours ago||

Sounds like a hallucination unless proven otherwise, even the leading LLMs can do those from time to time, and they will always appear plausible like that. Also could be the session having a lot previous context, like 800K+, which (I think) makes hallucinations more likely.

Relevant comment from the OP which makes a hallucination more likely:

> There is one tool call result that includes a string that printed a pathname including minecraft.py because it was listing the files in a Python virtual environment and the Pygments package has a lexer called minecraft.py

andy99 2 hours ago||

I realize hallucination has no precise definition but this doesn’t sound at all like anything I’ve ever heard called hallucination. Hallucination is usually plausible wrong answers or made up info that ends up fitting the most likely response (like a manufactured citation) and comes from the way LLMs work at predicting tokens. This example demonstrates completely implausible output, it’s not something that fits with hallucination.

All that said, it doesn’t require cross session leakage, it could just be training data or like those nightingale (probably the wrong bird*) data generations where they just prompt an LLM with nothing and it starts spitting out conversations.

I see a bunch of downstream comments about caching, sounds like maybe there’s an error where it loads nothing instead of the cache and so starts spitting out random generations.

* edit: it’s magpie. Worth looking at the concept, I’m not sure people realize they LLMs generate random conversations when prompted with nothing, this seems at least as likely as sessions leaking: https://github.com/magpie-align/magpie

solenoid0937 1 hour ago||

One of his tool results mentioned the word minecraft.py, and the response was about Minecraft.

It's a hallucination.

macNchz 3 hours ago|||

The person posting this claims to have reproduced in a separate context down the thread:

> Same thing just happened on a Claude Mobile session in same Enterprise account. Common theme in both is Sonnet 5, first response after more than 5 minutes (cache miss).

xyzzy_plugh 3 hours ago|||

I don't disagree but this sort of thing has to be investigated regardless.

It's unfortunate that there is so little transparency that even if they deny there was a leak we will never know for certain.

alserio 2 hours ago|||

Why? what does make it more likely?

paulddraper 1 hour ago|||

Exactly.

If you've never had an LLM (all models) suddenly start spouting nonsense in a completely different language...you haven't been using LLMs that much. They will go absolutely insane some % of the time.

andy99 1 hour ago||

Worth looking at https://www.anthropic.com/engineering/a-postmortem-of-three-...

They can “go insane” but it seems often to be infra related as opposed to anything one would consider hallucination. Smaller models will often get stuck repeating a word or phrase forever but that’s a bit different and nobody would call it hallucination.

prima-facie 1 hour ago||

[dead]

bix6 3 hours ago||

So the options are this amazing tech is so stupid it just randomly brings up Minecraft or it’s got a major security issue?

bee_rider 1 hour ago||

It’s the weekend so we’re allowed to anthropomorphize.

I’ve known some brilliant engineers who would also just randomly bring up Minecraft (more likely Factorio these days) so this makes sense.

27183 3 hours ago|||

¿Por qué no los dos?

paulddraper 1 hour ago||

Not that different than people, amiright?

---

Note that the author did have a minecraft.py file. So not quite 100% random.

andy99 2 hours ago||

Interesting to see the claudeslop reply as the first comment to the gh post and the reaction to it.

Avicebron 3 hours ago||

In order Fable 5 has rejected:

"Recipe for red-braised pork, I have pork shoulder"

"Write up a framework for MCP patterns I can give to claude code"

"explain the biomechanics of motion in c. elegans" (I get this one, I mostly did it to test and it's related to my hobby project)

Do we get an extra day of functional Fable 5 because it's down?

andy99 2 hours ago||

Not sure the relevance of this comment, but normally if someone built a classifier that bad they’d be fired. Anthropic obviously thinks they have some monopoly power they can use to foist garbage on consumers, I think they don’t.

wongarsu 27 minutes ago|||

The consequence of a too strict classifier are annoyed customers who will spend less on Fable. The consequence of a too lax classifier are export restrictions that prevent a huge chunk of their customers from using Fable

I'm annoyed but not surprised at the overeager classification

gojomo 1 hour ago|||

If people are complaining about Anthropic (on an only-vaguely related thread) rather than simply switching to a suitable competitor, then Anthropic clearly has some 'monopoly' power over the specific capabilities the complainer wants from them.

leoqa 1 hour ago|||

Fable/Opus 4.8 outperform Codex 5.5 for me at the general architecture/refactoring/performance work I’m doing, to the point where it’s not worth using Codex. Codex will often spit out non idiomatic code that overcomplicates things.

andy99 1 hour ago|||

Not to argue the point but that statement isn’t logical, look at all the complaints about restaurants. Publicly complaining about something doesn’t require it be a monopoly.

slashdave 8 minutes ago|||

I'm impressed that folks are using this frontier model for cooking

HumanOstrich 2 hours ago|||

What does this have to do with anything? Who are you talking to? This is Hacker News, not Anthropic support.

asveikau 2 hours ago||

HN becoming anthropic support would certainly explain a lot of threads and comments I've seen here lately. Thank you for this.

nijave 2 hours ago|||

The safety filter rejected or the model was down?

stavros 1 hour ago||

I asked it how people get blue eyes from their parents and it downgraded me to Opus because of safety.

_def 2 hours ago|

Reminds me of a session I had recently (on web!) where claude insisted that i prefixed all my messages with statements about code execution or something, which was not the case. I interrogated it about that and it confirmed that it came from somewhere else, but could not get rid of it and each response mentioned that its gonna ignore those instructions. Eerie.

andy99 1 hour ago||

Anthropic injects text into the conversation triggered by certain conversation topics. This happened to me in relation to some red-teaming related discussion that was adjacent to something “sensitive”, I think sex, and Claude got confused about why I had said some kind of warning and mentioned it it’s response. After a back and forth it was clear that some extra warning to answer but avoid anything inappropriate had been inserted into the conversation.

wongarsu 31 minutes ago||

Claude also sometimes mentions getting messages from classifiers, probably related to auto mode. Amusingly enough, when this happens to a subagent/fork, the orchestrator will call these " hallucinations by the subagent"

More comments...