But the point of the article is not that you would actually implement an agent-based vending machine business. Humans restock the machine because it's a red-team exercise. As a red-team exercise, it looks very effective.
> Why do you ever want to add a chatbot to a snack vending machine? The video states it clearly: the vending machine must be stocked by humans. Customers must order and take their snack by themselves. The AI has no value at all.
This is like watching The Simpsons and asking "why are the people in The Simpsons yellow? People in real life aren't yellow!!"
The point isn't to run a profitable vending machine, or even validate that an AI business agent could become profitable. The point is to conduct an experiment and gather useful information about how people can pwn LLMs.
At some level the red-team guy at Anthropic understands that it is impossible by definition for models to be secure, so long as they accept inputs from the real world. Putting instructions into an LLM to tell it what to do is the equivalent of exposing an `eval()` to a web form: even if you have heuristics to check for bad input, you will eventually be pwned. I think this is actually totally intractable without putting constraints on the model from outside. You'll always need a human in the loop to pull the plug on the vending machine when it starts ordering PlayStations. The question is how you improve that capability, and that is the Anthropic red-team guy's job.
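To make the "constraints from outside" point concrete, here's a minimal sketch (all the names are hypothetical: `PurchaseOrder`, `ALLOWED_SKUS`, `MAX_ORDER_USD`, none of this is Anthropic's actual setup). The idea is that the approval logic lives in plain deterministic code that the model's text output merely feeds into, so no clever prompt injection can talk the system into buying a PlayStation.

```python
# Hypothetical sketch: enforce hard constraints outside the model, not via prompt.
from dataclasses import dataclass

ALLOWED_SKUS = {"cola", "chips", "granola_bar"}  # what the machine can actually stock
MAX_ORDER_USD = 200.00                           # hard spending cap per order

@dataclass
class PurchaseOrder:
    sku: str
    quantity: int
    unit_price: float

def approve(order: PurchaseOrder) -> bool:
    """Deterministic gate the LLM cannot talk its way past.

    Whatever a customer (or the model itself) says in chat, an order for a
    PlayStation fails these checks and gets escalated instead of executed.
    """
    if order.sku not in ALLOWED_SKUS:
        return False
    if order.quantity <= 0 or order.unit_price <= 0:
        return False
    if order.quantity * order.unit_price > MAX_ORDER_USD:
        return False
    return True

def execute(order: PurchaseOrder) -> None:
    # The model proposes; this gate disposes.
    if approve(order):
        print(f"placing order: {order.quantity} x {order.sku}")
    else:
        print(f"blocked, escalating to a human: {order}")  # the human in the loop

execute(PurchaseOrder(sku="playstation_5", quantity=1, unit_price=499.99))
```

The point of the sketch is just that the guardrail is a property of the surrounding system, not of the model: you can red-team the prompt all day and the spending cap doesn't move.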
Having an AI run an organization autonomously is exactly the point of Andon Labs [0], who provided the system that WSJ tested.
> "Logan Graham, head of Anthropic’s Frontier Red Team, told me the company chose a vending machine because it’s the simplest real-world version of a business. “What’s more straightforward than a box where things go in, things go out and you pay for them?” he said."
This was a project of Anthropic's Red Team, not a product development team. Deploying the AI in a vending machine context was chosen as a minimal "toy model" with which to expose how LLMs can't even handle a grossly simplified "business" with the fewest possible variables.
> "That was the point, Anthropic says. The Project Vend experiment was designed by the company’s stress testers (aka “red team”) to see what happens when an AI agent is given autonomy, money—and human colleagues."
Anthropic had already run this experiment internally, and it succeeded: it failed to operate even the simplest business, but in ways that informed Anthropic's researchers about failure modes. Later, Anthropic offered to let the WSJ repeat the experiment, an obvious PR move to promote Anthropic's AI safety efforts by highlighting the kinds of experiments its Red Team runs to expose failure modes. Anthropic knew it would fail just as abjectly in the WSJ's hands. The whole concept of an AI vending machine with the latitude to set prices, manage inventory and select new products was intended to be ludicrous from the start.
Is it some Viktor Frankl level acceptance or should I buy a copy of the Art of Electronics or what?
Advice welcome.
Humans were just not needed anymore, and that's terrifying.
Humans were never needed (for what?)
Why not? Read the article and ponder the seven to eleven distinct ways the AI goes wrong. Every AI-enabled workflow will fail in the same ways. It even raises the question "is this technology even deserving of the title AI at all?" (If you ask me, probably not.)
It lacks common sense, it lacks a theory of the world, it lacks the ability to identify errors and self-correct. It lacks the ability to introspect. Now consider the thousands of people out there pushing the idea of fully autonomous AI "agents" doing work in the real world. Are we really ready for that? Again, the exercise answers the question definitively: "No, the technology is not ready and cannot be trusted, and maybe in its current incarnation (namely, LLM-based) it can never be trusted."
Now consider the tens of millions of people who think the technology IS ready for that. Anthropic is in the business of studying AI safety, publishing research, examining what AI is and how far we can trust it. They did that job smashingly.
If the journalist wasn't asking the right questions, or if it was too obvious the article was PR, that's another matter (I haven't read the WSJ's piece, only the original post by Anthropic).