Posted by chatmasta 4 hours ago
One gave us a proper postmortem in which their API gateway was incorrectly handling HTTP 100 status codes, putting them into an error state where there was effectively an off-by-one error: you would receive the response to the prompt that came in before yours and would pay it forward (your response would go to the next caller).
The other instance never had root cause explained to us, and we were just told to trust it wouldn’t happen again.
Both of these are from $1T+ companies.
ZDR wasn’t compromised in these cases since it was responses being swapped in flight. I wouldn’t be surprised if this is a similar issue - it’s not that data is being retained, it’s just not being safely isolated in intermediate infrastructure.
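For anyone who wants to see how little it takes, here is a toy sketch of the off-by-one described above. It is not the vendor's actual gateway; the `deliver` function, the caller names, and the message bodies are all made up for illustration. It assumes responses are paired with callers purely by arrival order, and that the bug is treating an interim 1xx response as a final one.

```python
from collections import deque

def deliver(callers, upstream_messages, skip_interim):
    """Pair upstream messages with waiting callers by arrival order.

    Hypothetical model: with skip_interim=False, an interim 1xx message
    is handed to a caller as if it were the final response, which shifts
    every later response by one position.
    """
    waiting = deque(callers)
    delivered = {}
    for status, body in upstream_messages:
        if skip_interim and 100 <= status < 200:
            continue  # interim response: not a final answer for anyone
        if not waiting:
            break  # leftover response would go to whoever calls next
        delivered[waiting.popleft()] = body
    return delivered

# Illustrative traffic: the first request triggers a 100 before its real answer.
upstream_messages = [
    (100, ""),
    (200, "answer for alice"),
    (200, "answer for bob"),
    (200, "answer for carol"),
]
callers = ["alice", "bob", "carol"]

buggy = deliver(callers, upstream_messages, skip_interim=False)
fixed = deliver(callers, upstream_messages, skip_interim=True)
print(buggy)
print(fixed)
```

In the buggy run bob receives alice's answer and carol receives bob's, while carol's own answer is left queued for the next caller. Nothing is retained anywhere, which is consistent with ZDR holding while isolation fails.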
Every time you multiplex requests from multiple clients onto one upstream connection, you are probably vulnerable to this, because (despite its superficial simplicity) HTTP is just too complex to reliably match the requests and responses to upstream.
For example a desync can be triggered in some systems by having more than one Content-Length header, by mixing Content-Length with chunked encoding, or by passing an HTTP/2 header called Content-Length that doesn't match the actual content length.
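A rough sketch of the kind of check a proxy could run before forwarding, covering the three cases above. The function name `framing_problems` and the header representation (a list of name/value pairs) are my own assumptions, not any particular proxy's API, and real servers need far more than this.

```python
def framing_problems(headers, body=None):
    """Return a list of framing ambiguities found in a request.

    headers: list of (name, value) pairs, duplicates preserved.
    body: optional bytes, used only to compare against Content-Length.
    Illustrative only; not a complete desync defense.
    """
    problems = []
    lengths = [v.strip() for k, v in headers if k.lower() == "content-length"]
    has_te = any(k.lower() == "transfer-encoding" for k, _ in headers)

    # Case 1: more than one Content-Length with different values.
    if len(set(lengths)) > 1:
        problems.append("conflicting Content-Length values")

    # Case 2: Content-Length mixed with Transfer-Encoding (e.g. chunked).
    if lengths and has_te:
        problems.append("Content-Length combined with Transfer-Encoding")

    # Case 3: declared length does not match the bytes actually received,
    # as can happen when an HTTP/2 content-length header is trusted blindly.
    if body is not None and lengths and not has_te:
        if any(not v.isdigit() or int(v) != len(body) for v in lengths):
            problems.append("Content-Length does not match body length")

    return problems

print(framing_problems([("Content-Length", "5"), ("Content-Length", "6")]))
print(framing_problems([("Content-Length", "5"), ("Transfer-Encoding", "chunked")]))
print(framing_problems([("content-length", "10")], body=b"hello"))
```

The safe reaction to any non-empty result is to reject the request and close the upstream connection instead of guessing which interpretation the backend will pick.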
Here's a DEF CON talk (6 years ago) on this topic: https://www.youtube.com/watch?v=w-eJM2Pc0KI
The same attack has been applied to SMTP by messing up the line endings surrounding the end-of-message delimiter, where it's called SMTP smuggling. It may also apply to other protocols.
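To make the SMTP case concrete, here is a small sketch of two parsers disagreeing about where a message ends. The standard end-of-data delimiter is CRLF, dot, CRLF; the "lenient" pattern below is a made-up stand-in for a server that also accepts bare LF, and the byte stream is invented for illustration.

```python
import re

# Strict: only <CR><LF>.<CR><LF> ends a message.
STRICT_END = re.compile(rb"\r\n\.\r\n")
# Lenient (hypothetical): also accepts a bare <LF> around the dot.
LENIENT_END = re.compile(rb"\r?\n\.\r?\n")
# Any CR not followed by LF, or LF not preceded by CR.
BARE_LINE_ENDING = re.compile(rb"(?<!\r)\n|\r(?!\n)")

def count_message_ends(stream, end_pattern):
    """Count how many end-of-message delimiters a parser would see."""
    return len(end_pattern.findall(stream))

def has_bare_line_endings(stream):
    """True if the data contains a lone CR or lone LF."""
    return BARE_LINE_ENDING.search(stream) is not None

stream = b"first message\n.\nsmuggled message\r\n.\r\n"

print(count_message_ends(stream, STRICT_END))   # strict parser's view
print(count_message_ends(stream, LENIENT_END))  # lenient parser's view
print(has_bare_line_endings(stream))
```

The strict parser sees one message and the lenient one sees two, which is the whole disagreement the attack relies on. Rejecting data with bare line endings removes the ambiguity at the cost of refusing some sloppy but harmless senders.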
https://portswigger.net/research/talks?talkId=36
Maybe my last presentation on the topic! Possibly.
HN doesn't believe superintelligence will be a thing, while the AI safety crowd believes they are building it. So the decision-making of the safety crowd is incomprehensible to HN.
Grow up. Whenever push comes to shove, they cut safety and alignment departments and rush out releases over the heads of those same departments. If you had engaged with the news these last few years you’d see it for what it is: “models for me, but not for thee”.
Both companies were founded on the basis of AI Safety.
- There are tons of great safety people doing real work at OpenAI. Releases are held back, models are evaluated, etc.
- Anthropic goes even further: it constrained itself with a PBC/LTBT structure, treats safety even more rigorously, notably delayed the release of Mythos (literally the opposite of what you alleged), and continues to hold its two red lines.
You should actually talk to some of the people at these labs. Nearly everyone working at these places genuinely believes AGI/ASI is actually happening, so they do take safety seriously.
To imply these companies don't care about safety is typical internet-brand nihilism/cynicism that helps you feel smart while being literally the opposite of the truth.
Moreover, your take on Dario is oversimplistic and undersells the extent to which Anthropic takes safety seriously. It's not lip service; there are real dollars and attention spent on alignment at Anthropic.
Dario might not be a literal idiot, but he might strongly benefit from training a model to do strategic thinking for Anthropic.
Seems to me Dario is actually a genius. These are all things that I would do to make people believe that my “basically the same as the other guy” product is ackshually the best thing ever for real. Trust me bro.
The entire bubble is hype and fear mongering. The technical merits of the products are completely irrelevant at this point. Dario is doing exactly what someone that understands this would do and they are winning.
I wonder if there could be a large security situation playing out behind the scenes right now.
I’ve been working on using AI to assist me in writing meta parsing grammars. Fortunately I have not launched most of them yet. I know for a fact that the next generation of models represents a major step change in basic vulnerability identification and exploitation, especially if you know where to point them. They’ve found several bugs and at least one exploit in my parsing tools so far; I can’t imagine how many are still waiting to be discovered across the entire modern tech ecosystem.
That is when it bothers to respond instead of just sending back a 1099 error code.
I've never understood how this world decided it was okay to hand over this much unchecked power to such corporations. But this is how it has always been, one way or the other.
Relevant comment from the OP which makes a hallucination more likely:
> There is one tool call result that includes a string that printed a pathname including minecraft.py because it was listing the files in a Python virtual environment and the Pygments package has a lexer called minecraft.py
All that said, it doesn’t require cross session leakage, it could just be training data or like those nightingale (probably the wrong bird*) data generations where they just prompt an LLM with nothing and it starts spitting out conversations.
I see a bunch of downstream comments about caching, sounds like maybe there’s an error where it loads nothing instead of the cache and so starts spitting out random generations.
* edit: it’s magpie. Worth looking at the concept; I’m not sure people realize that LLMs generate random conversations when prompted with nothing. This seems at least as likely as sessions leaking: https://github.com/magpie-align/magpie
It's a hallucination.
> Same thing just happened on a Claude Mobile session in same Enterprise account. Common theme in both is Sonnet 5, first response after more than 5 minutes (cache miss).
It's unfortunate that there is so little transparency that even if they deny there was a leak we will never know for certain.
If you've never had an LLM (all models) suddenly start spouting nonsense in a completely different language...you haven't been using LLMs that much. They will go absolutely insane some % of the time.
They can “go insane” but it seems often to be infra related as opposed to anything one would consider hallucination. Smaller models will often get stuck repeating a word or phrase forever but that’s a bit different and nobody would call it hallucination.
I’ve known some brilliant engineers who would also just randomly bring up Minecraft (more likely Factorio these days) so this makes sense.
---
Note that the author did have a minecraft.py file. So not quite 100% random.
"Recipe for red-braised pork, I have pork shoulder"
"Write up a framework for MCP patterns I can give to claude code"
"explain the biomechanics of motion in c. elegans" (I get this one, I mostly did it to test and it's related to my hobby project)
Do we get an extra day of functional Fable 5 because it's down?
I'm annoyed but not surprised at the overeager classification