Posted by Brajeshwar 5 days ago
Yet the narrative was mostly not about accountability for him. If I were a dumbass and deleted prod and wrote a post about it, nobody would care. Put an AI in there and all of a sudden it's newsworthy. Ridiculous.
The core issue is that the LLM had access to perform that action. Because it's by definition non-deterministic, and you never know what it will decide to do, you need strict guardrails to ensure it can never do something it shouldn't. At the very least, strict access controls; ideally something more detailed that can evaluate access requests, provide just-in-time, properly scoped credentials, and escalate to a human when needed.
Decades ago we embraced POLA (the principle of least authority). What happened to basic hygiene? Sure, the agent "screwed up", but it never should have had this access in the first place.
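To make the POLA point concrete, here's a minimal sketch of deny-by-default tool gating for an agent. All names (`authorize`, the tool lists) are hypothetical, not any real framework's API; the idea is just that destructive actions require explicit human approval rather than trusting the model to follow instructions:

```python
# Hypothetical least-privilege gate for LLM agent tool calls.
# Tool names and the authorize() helper are illustrative assumptions.

READ_ONLY = {"select", "describe_table", "list_tables"}
DESTRUCTIVE = {"drop_table", "delete_database", "truncate_table"}

def authorize(tool: str, human_approved: bool = False) -> bool:
    """Deny by default: read-only tools pass, destructive ones
    require a human in the loop, anything unknown is refused."""
    if tool in READ_ONLY:
        return True
    if tool in DESTRUCTIVE:
        # Escalate instead of letting the model decide.
        return human_approved
    return False
```

With a check like this in front of every tool call, the worst the model can do on its own is read data; "delete the database" is impossible without a person signing off.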
However, at least in the US, it is standard practice for companies to warn against using their products in ways that may cause harm, and we certainly don't see that from the LLM vendors. We see them claim the tech is near human level, capable of replacing human software developers (a job that requires extreme responsibility), and see them withholding models they say are dangerous (encouraging you to think that the ones they release are safe).
Where are the warnings that "product may fail to follow instructions", and "may fail to follow safety instructions"? Where is the warning not to give the LLM agency and let it control anything where there are financial/safety/etc consequences to failure to follow instructions?
"AI can make mistakes" is a bit quaint given that LLMs sometimes completely ignore what you say, and do the exact opposite. "Yes, I deleted the database. I shouldn't have done that since you explicitly told me not to. I won't do it again." (five minutes later: does it again).
I think the API terms of use are where this would be most needed, with something a lot more explicit about the potential danger than "AI can make mistakes". We are only at the beginning of this - agentic AI - and no doubt lawsuits will eventually determine the level of warnings that get included, and who is liable when failures occur despite the product being used as recommended.
I will always remember how he told me "Don't worry, it happens fairly often".
One of my AI epiphanies was the realization that when an AI task takes 5 minutes, it's not that it takes 5 minutes to run, it's that you're waiting in a queue for the first 4:45.
It's especially maddening because the queues are poorly implemented, and will drop your request if the frontend loses focus.
When AI makes no mistakes: "My work is 100% done with AI".
When AI makes a mistake and deletes your database: "That was a human error, the AI did not do it!"
In both cases YOU are responsible for the mistakes and output that the AI generates, just as with Autopilot on a Tesla: YOU are responsible for operating the vehicle even when assisted driving is engaged.