Posted by brdd 1 day ago
is it "hobbled" to:
1. not give an LLM access to personal finances 2. not allow everyone in the world a write channel to the prompt (reading messages/email)
I mean, okay. Good luck I guess.
One thing I'm curious about: as the agent ingests more external content (documentation, code samples, forum answers), the attack surface for prompt injection expands. Malicious content in a Stack Overflow answer or dependency README could potentially influence generated code.
Does Apple's implementation have any sanitization layer between retrieved content and what gets fed to the model? Or is the assumption that code review catches anything problematic? Seems like an interesting security challenge as these tools go mainstream.
It's been discussed a lot but fundamentally there isn't a way to solve this yet (and it may not be solvable period). I'm sure they've asked their model(s) to not do anything stupid through the system prompt. Remember, prepending and appending text to the user's request to an LLM is the all you can do. With an LLM it's only text string in then text string out. That's it.
Fortune favors the bold, I guess.
1. https://openclaw.ai/ [also clawd.bot which is now a redirect here]
They all have similar copy which among other things touts it having a "local" architecture:
"Private by default—your data stays yours."
"Local-First Architecture - All data stays on your device. [...] Your conversations, files, and credentials never leave your computer."
"Privacy-First Architecture - Your data never leaves your device. Clawdbot runs locally, ensuring complete privacy and data sovereignty. No cloud dependencies, no third-party access."
Yet it seems the "local" system is just a bunch of tooling around Claude AI calls? Yes, I see they have an option to use (presumably hamstrung) local models, but the main use-case is clearly with Claude -- how can they meaningfully claim anything is "local-first" if everything you ask it to do is piped to Claude servers? How are these claims of "privacy" and "data sovereignty" not outright lies? How can Claude use your credentials if they stay on your device? Claude cannot be run locally last I heard, am I missing something here? Ox Security, a "vibe-coding security platform," highlighted these vulnerabilites to its creator, Peter Steinberg. The response wasn't exactly reassuring.
“This is a tech preview. A hobby. If you wanna help, send a PR. Once it’s production ready or commercial, happy to look into vulnerabilities.”[1]
In light of this I'm inclined to conclude- yeah, they're just lying about the privacy stuff.1. https://www.xda-developers.com/please-stop-using-openclaw/
Short term hacky tricks:
1. Throw away accounts - make a spare account with no credit card for airbnb, resy etc.
2. Use read only when it's possible. It's funny that banks are the one place where you can safely get read only data via an API (plaid, simplefin etc.). Make use of it!
3. Pick a safe comms channel - ideally an app you don't use with people to talk to your assistant. For the love of god don't expose your two factor SMS tokens (also ask your providers to switch you to proper two factor most finally have the capability).
4. Run the bot in a container with read only access to key files etc.
Long term:
1. We really do need services to provide multiple levels of API access, read only and some sort of very short lived "my boss said I can do this" transaction token. Ideally your agent would queue up N transactions, give them to you in a standard format, you'd approve them with FaceID, and that will generate a short lived per transaction token scoped pretty narrowly for the agent to use.
2. We need sensible micropayments. The more transactional and agent in the middle the world gets, the less services can survive with webpages,apps,ads and subscriptions.
3. Local models are surprisingly capable for some tasks and privacy safe(er)... I'm hoping these agents will eventually permit you to say "Only subagents that are local may read my chat messages"