Top
Best
New

Posted by speckx 12/19/2025

Prepare for That Stupid World(ploum.net)
175 points | 96 commentspage 2
mlsu 12/19/2025|
This piece is pretty ineffective. Not that I like the world of "AI", I probably share the author's opinion that its just another evolution in the bullshittification of the human experience.

But, the point of the article is not that you would implement an agent based vending machine business. Humans restock the machine because its a red-team exercise. As a red-team exercise it looks very effective.

> Why do you ever want to add a chatbot to a snack vending machine? The video states it clearly: the vending machine must be stocked by humans. Customers must order and take their snack by themselves. The AI has no value at all.

Like this is like watching the simpsons and being like "why are the people in the simpsons yellow? people in real life aren't yellow!!"

The point isn't to run a profitable vending machine, or even validate that an AI business agent could become profitable. The point is to conduct an experiment and gather useful information about how people can pwn LLMs.

At some level the red team guy at Anthropic understands that it is impossible by definition for models to be secure, so long as they accept inputs from the real world. Putting instructions into an LLM to tell it what to do is the equivalent of exposing an `eval()` to a web form: even if you have heuristics to check for bad input, you will eventually be pwned. I think this is actually totally intractable without putting constraints on the model from outside. You'll always need a human in the loop to pull the plug on the vending machine when it starts ordering playstations. The question is how do you improve that capability, and that is the anthropic red-team guy's job.

layer8 12/19/2025|
> The point isn't to run a profitable vending machine, or even validate that an AI business agent could become profitable.

Having an AI run an organization autonomously is exactly the point of Andon Labs [0], who provided the system that WSJ tested.

[0] https://andonlabs.com/

mrandish 12/20/2025||
I read that WSJ article before seeing this blog post. I found it mildly interesting and a little bit funny but unsurprising that the AI failed. However, I think this blog about the article misses a key point. Anthropic's goal was never to develop an AI-based vending machine. The WSJ clearly says:

> "Logan Graham, head of Anthropic’s Frontier Red Team, told me the company chose a vending machine because it’s the simplest real-world version of a business. “What’s more straightforward than a box where things go in, things go out and you pay for them?” he said."

This was a project of Anthropic's Red Team, not a product development team. Deploying the AI in a vending machine context was chosen as a minimal "toy model" with which to expose how LLMs can't even handle a grossly simplified "business" with the fewest possible variables.

> "That was the point, Anthropic says. The Project Vend experiment was designed by the company’s stress testers (aka “red team”) to see what happens when an AI agent is given autonomy, money—and human colleagues."

Anthropic had already done this experiment internally and it succeeded - by failing to operate even the simplest business but doing so in ways that informed Anthropic's researchers about failure modes. Later, Anthropic offered to allow the WSJ to repeat the experiment, an obvious PR move to promote Anthropic's AI safety efforts by highlighting the kinds of experiments their Red Team does to expose failure modes. Anthropic knew it would fail abjectly at the WSJ. The whole concept of an AI vending machine with the latitude to set prices, manage inventory and select new products was intended to be ludicrous from the start.

spit2wind 12/19/2025||
Excuse me if someone already asked and I missed it: how does one prepare for such a world?

Is it some Viktor Frankl level acceptance or should I buy a copy of the Art of Electronics or what?

Advice welcome.

chunkmonke99 12/19/2025|
I don't think there is anything more than the standard advice. Just stay curious, make friends/build a community, keep learning, stay healthy. Why not get the AoE? you can also, check out "Practical Electronics for Inventors": AoE assumes you have some Electronics background imo. But seriously, I don't get the doom/gloom: things are going to be rough ... but maybe they won't? Many things I learned I did for their own sake! Things have always been uncertain and absurd I guess we might as well embrace it!
conorcleary 12/20/2025||
(also get age of empires)
chunkmonke99 12/20/2025||
Hahah absolutely!! Man that brings back memories.
brador 12/19/2025||
It was always tasks reaching obsolescence, but now it’s the human organism. But the human as a unit is the only known conscious being in the universe, the only entity capable of generating meaningful goals (even if only to them) not related to the 4fs.

Humans were just not needed anymore, and it terrifies.

sallveburrpi 12/19/2025||
Other beings than humans have demonstrated consciousness and “meaningful” goals besides humans. Crows for instance, but there are many others.

Humans were never needed (for what?)

neogodless 12/19/2025||
What are "4fs"? Is that the "4X" e.g. games where you eXplore, eXpand, eXploit, eXterminate?
stryan 12/19/2025||
The four basic actions in evolutionary biology: Feeding, Fleeing, Fighting, "Mating".
tim-tday 12/26/2025||
The author is missing the whole point. The exercise answers the implicit question: “can Ai do real work in this world? Can it replace workers, can it run a business?” And the experiment answers that question decisively “no it cannot”.

Why not? Read the article and ponder the seven to eleven distinct ways the AI goes wrong. Every AI enabled workflow will fail the same way. It even raises the question “is this technology even deserving of the title AI at all?” (If you ask me, probably not)

It lacks common sense, it lacks a theory of the world it lacks the ability to identify errors and self-correct. It lacks the ability to introspect. Now consider the thousands of people out there pushing the idea of fully autonomous AI “agents” doing work in the real world. Are we really for that? Again the exercise answers the question definitely. “No, the technology is not ready and cannot be trusted, and maybe in the current incarnation (Namely LLM based) cannot ever be trusted”

Now consider the tens of millions of people who think the technology IS ready for that. Anthropic is in the business of studying AI safety, publishing research, examining what AI is and how far we can trust it. They did that job smashingly.

snickerbockers 12/19/2025||
Someday the mcdonalds kiosk will want to be your friend. It will remember who you are and ask you how your kids are doing. It will recommend new specials and maybe even give you "specials friend" deals. And I'll just tell it to shut the fuck up and queue me an order for the egg mcmuffin combo with a coffee and the fried potato patty because this bullshit is fucking obnoxious.
sanbor 12/19/2025||
I have a different point of view. This was a test to see if the AI could perform a specific task. Asking AI to draw a pelican riding a bike is another test. I find the experiment interesting because it proves that currently LLMs are not able to perform a simple task reliably for a long period of time.

If the journalist was not asking the right questions, or was too obvious the article was PR it’s another thing (I haven’t read WSJ’s piece, only the original post by Anthropic)

inatreecrown2 12/29/2025||
some of the most intelligent things I read this year.
ursAxZA 12/20/2025|
If vending machines are the benchmark now, the logical next step is obvious: let AI run AI.
More comments...