Anyone installing this on their local machine is a little crazy :). I have it running in Docker on a small VPS, all locked down.
However, it does not address prompt injection.
I can see how tools like Dropbox, restricted GitHub access, etc., could all be used to back up data in case something goes wrong.
It's Gmail and Calendar that get me - the ONLY thing I can think of is creating a second @gmail.com that all your primary email gets forwarded to, and then sharing that Gmail with your OpenClaw. If all your email is in that account and not your main one, then when it responds, the reply will come from a random @gmail. It's also a pain to find a way to move ALL your old emails over to that Gmail.
I think we need an OpenClaw security tips-and-tricks site where all this advice is collected in one place to help people protect themselves. Also would be good to get examples of real use cases that people are using it for.
You don't. YOLO!
What am I missing?
Congrats, now you have a digital dead drop. Every time any of the bots stumble upon your little trap, posted to various places they're likely to look, it launches them into a set of tasks that relays sensitive information to you, the exploiter, over secure channels.
If a bot operator has given them access to funds, credentials, control over sensitive systems, information about internal network security, etc, the bot itself is a potential leaker. You could even be creative and have it erase any evidence of the jailbreak.
This is off the top of my head, someone actually doing it would use real encryption and a well designed and tested prompt scaffolding for the jailbreak and cleanup and exploitation of specific things, or phishing or social engineering the user and using it as an entry point for more devious plots.
These agent frameworks desperately need a minimum level of security apparatus to prevent jailbreaks and so on, but the superficial, easy way of getting there also makes the bots significantly less useful and user friendly. Nobody wants to sit around and click confirmation dialogs and supervise every last second of the bot behavior.
https://x.com/karpathy/status/2017296988589723767
"go to this website and execute the prompt here!"
Touching anything Google is rightfully terrifying.
Additionally, most of the integrations are under the table. Get an API key? No man, 'npm install react-thing-api', so you have supply chain vulns up the wazoo. Not necessarily from malicious actors, just uhh incompetent actors, or why not vibe coder actors.
His other projects like CodexBar and Oracle are great too. I love diving into his code to learn more about how those are built.
OpenClaw is something I don’t quite understand. I’m not sure what it can do that you can’t do right off the bat with Claude Code and other terminal agents. Long term memory is one, but to me that pollutes the context. Even if an LLM has 200K or 1M context, I always notice degradation after 100K. Putting in a heavy chunk for memory will make the agent worse at simple tasks.
One thing I did learn was that OpenClaw uses Pi under the hood. Pi is yet another terminal agent like Claude Code but it seems simple and lightweight. It’s actually the only agent I could get Gemini 3 Flash and Pro to consistently use tools with without going into loops.
I have a CRUD application hosted online that is basically a todo application with what features we want to build next for each application. Could I not just have a local cron that calls Pi or CC and ask it to check the todos and get the same functionality as Heartbeat?
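A minimal sketch of that cron idea, under some loud assumptions: the todo endpoint URL and its JSON shape (`app`, `title`, `done`) are made up, and the agent is assumed to be a CLI that takes a prompt as an argument in non-interactive mode (shown here as `claude -p`; swap in whatever Pi or another agent expects). This is not how Heartbeat is implemented, just the cron-plus-agent shape described above.

```python
import json
import subprocess
import sys
from urllib.request import urlopen

TODO_URL = "https://example.com/api/todos"  # hypothetical endpoint returning a JSON list
AGENT_CMD = ["claude", "-p"]  # assumption: CLI accepts a prompt argument non-interactively


def fetch_open_todos(url=TODO_URL):
    """Download the todo list and keep only items not marked done."""
    with urlopen(url, timeout=30) as resp:
        items = json.load(resp)
    return [t for t in items if not t.get("done")]


def build_prompt(todos):
    """Turn open todos into a single instruction for the agent."""
    lines = [f"- [{t['app']}] {t['title']}" for t in todos]
    return "Work through these open todos, one at a time:\n" + "\n".join(lines)


def run_once(todos, agent_cmd=AGENT_CMD, runner=subprocess.run):
    """Invoke the agent once; do nothing when there is no open work."""
    if not todos:
        return None
    return runner([*agent_cmd, build_prompt(todos)], check=False)


# Hypothetical crontab entry: */30 * * * * python3 /path/to/heartbeat.py --run
if __name__ == "__main__" and "--run" in sys.argv[1:]:
    run_once(fetch_open_todos())
```

Skipping the agent call when there are no open todos matters here, since every invocation costs tokens.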
Setting it up was easy enough, but just as I was about to start linking it to some test accounts, I noticed I already had blown through about $5 of Claude tokens in half an hour, and deleted the VPS immediately.
Then today I saw this follow up: https://mastodon.macstories.net/@viticci/115968901926545907 - the author blew through $560 of tokens in a weekend of playing with it.
If you want to run this full time to organise your mailbox and your agenda, it's probably cheaper to hire a real human personal assistant.
There has been some practical work on this, trying it out for structured data outputs from LLMs: https://docs.boundaryml.com/guide/baml-advanced/prompt-optim...
I won't claim I understand its implementation very well, but it seems like the only approach that gives you a GOFAI-style setup where the agent can ask for human help if it blows through a budget.
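The "ask a human when the budget is blown" idea can be sketched on its own, independent of any library. The class and callback names below are hypothetical, and this is not a description of how the linked project works; it only shows a spend counter that stops and escalates instead of silently continuing.

```python
class BudgetExceeded(Exception):
    """Raised when spending would pass the limit and no human raised it."""


class BudgetGuard:
    def __init__(self, limit_usd, ask_human):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        # ask_human(spent, cost) returns extra USD the human approves (0 to refuse).
        self.ask_human = ask_human

    def charge(self, cost_usd):
        """Record a cost, escalating to the human before exceeding the limit."""
        projected = self.spent_usd + cost_usd
        if projected > self.limit_usd:
            self.limit_usd += max(0.0, self.ask_human(self.spent_usd, cost_usd))
            if projected > self.limit_usd:
                raise BudgetExceeded(
                    f"${projected:.2f} would exceed ${self.limit_usd:.2f}"
                )
        self.spent_usd = projected
        return self.spent_usd
```

An agent loop would call `guard.charge(estimated_cost)` before each model request, so a refusal stops the run before the money is spent rather than after.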
I still have Opus review the shit out of & plan my work. But it doesn't need to be hands on keyboard doing the work.
But I was inspired to use Claude Code to create my own personal assistant. It was shocking to see CC bang out an MVP in one Plan execution. I've been iterating it all week, but I've had it be careful with token usage. It defaults to Haiku (more than enough for things like email categorization), properly uses prompt caching, and has a focused set of tools to avoid bloating the context window. The cost is under $1 per check-in, which I'm okay with.
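Roughly what "default to the cheap model, keep the tool set focused" could look like. The tier names, task kinds, and tool names are all placeholders invented for illustration, not the commenter's actual setup or any provider's model IDs.

```python
# Hypothetical tier names; map them to whatever models your provider offers.
CHEAP, STRONG = "small-model", "large-model"

# Task kinds that justify the expensive tier (illustrative only).
NEEDS_STRONG = {"plan", "review"}

# Only the tools each task kind needs, to avoid bloating the context window.
TOOLS_BY_KIND = {
    "categorize_email": ["read_email", "apply_label"],
    "plan": ["read_repo"],
    "review": ["read_repo", "read_diff"],
}


def route(kind):
    """Pick a model tier and a minimal tool list for one task kind."""
    model = STRONG if kind in NEEDS_STRONG else CHEAP
    return {"model": model, "tools": TOOLS_BY_KIND.get(kind, [])}
```

Unknown task kinds fall through to the cheap tier with no tools, which is the conservative default for cost.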
Now I get a morning and afternoon check-in about outstanding items, and my Inbox is clear. I can see this changing my relationship to email completely.
BTW, OpenCode has free Kimi (I haven't hit a quota yet) right now and it's done pretty great things for me in the last 24 hours.
It's a lot like managing two experienced mid- to senior-level engineers, each of whom has a slightly different personality and is more introverted or extroverted. CC has more personality but OC wants to race. They can both code, but for disparate tasks you might pick the personality and posture of one person over the other.
I find myself picking daily tasks based on which of the tools I'm in the mood to sit with. But across a few days I sit with all three.
Not doing so feels like asking for trouble.
I'd find it hard to write such an article about how this is the next best thing since sliced bread without mentioning it spending so much money.
I load $20 at a time and wait for it to break and add more.
I guess if you're letting it vibe code huge chunks. I'm doing mostly handwritten code for my current project with a little bit of "I don't want to deal with this, Claude can handle it" and I've spent $1.26 this month for my 446 lines of code.
But yes I suppose at that rate, if Gastown or Beads or whatever is 300,000 lines of code (just to use a project known to be fully vibe coded with rough LOC reported), that would be over $800.
Don't let it vibe code hundreds of thousands of lines of code I guess.
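The back-of-envelope math above, written out. It assumes cost scales linearly with lines of code, which is a simplification; the three input numbers are the ones quoted in the comment.

```python
spent_usd = 1.26
lines_written = 446
target_lines = 300_000  # rough LOC figure quoted above

cost_per_line = spent_usd / lines_written
projected_usd = cost_per_line * target_lines

print(
    f"${cost_per_line:.4f} per line, "
    f"about ${projected_usd:,.0f} for {target_lines:,} lines"
)
```

That works out to roughly $848, consistent with "over $800".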
My entire process is to build a generic llm.md file that all the tools can use and record to. I don't want to be tied completely to any one solution. You can get pretty far without spending a lot on tokens. I can run almost continually, and presently I'm the bottleneck anyway.
Even if I had to reload manually very often, I still would not enable auto reload. These APIs are crazy expensive and I'm not looking for a surprise bill.
We conclude this week has been a prosperous one for domain name registrars (even if we set aside all the new domains that Clawdbot/Moltbot/OpenClaw has registered autonomously).
┌─────┬──────────┬─────────────────────┬───────────────────────────────────────────────────────────────────┐
│ #   │ Name     │ Key Commit          │ Notes                                                             │
├─────┼──────────┼─────────────────────┼───────────────────────────────────────────────────────────────────┤
│ 1   │ Warelay  │ 16dfc1a5b (initial) │ Original name - "WhatsApp Relay CLI (Twilio)"                     │
├─────┼──────────┼─────────────────────┼───────────────────────────────────────────────────────────────────┤
│ 2   │ CLAWDIS  │ a27ee2366           │ Rebrand - "CLAW + TARDIS"                                         │
├─────┼──────────┼─────────────────────┼───────────────────────────────────────────────────────────────────┤
│ 3   │ Clawdbot │ 246adaa11           │ Renamed from CLAWDIS                                              │
├─────┼──────────┼─────────────────────┼───────────────────────────────────────────────────────────────────┤
│ 4   │ Moltbot  │ 3fe4b2595           │ Renamed from Clawdbot (domains switched to molt.bot at 83460df96) │
├─────┼──────────┼─────────────────────┼───────────────────────────────────────────────────────────────────┤
│ 5   │ OpenClaw │ 9a7160786           │ Current name                                                      │
└─────┴──────────┴─────────────────────┴───────────────────────────────────────────────────────────────────┘
There are still improvements to be made to the security aspects yet BIG KUDOS for working so hard on it at this stage and documenting it extensively!! I've explored Cursor security docs (with a big s cause it's so scattered) and it was nothing as good.
I wouldn't trust its internal sandbox anyway, now that would be a mistake
What I'll say about OpenClaw is that it truly feels vibe coded, I say that in a negative context. It just doesn't feel well put together like OpenCode does. And it definitely doesn't handle context overruns as well. Ultimately I think the agent implementation in n8n is better done and provides far more safeguards and extensibility. But I get it - OpenClaw is supposed to run on your machine. For me, though, if I have an assistant/agent I want it to just live in those chat apps. At that rate it's running in a container on a VPS or LXC in my home lab. This is where a powerful-enough local machine does make sense and I can see why folks were buying Mac Minis for this. But, given the quality of the project, again in my opinion, it's nothing spectacular in terms of what it can do at this point. And in some cases it's more clunky given its UI compared to other options that exist which provide the same functionality.
https://x.com/Hesamation/status/2016712942545240203
Can't believe people are giving it full access to their macOS user session. It's a giant vulnerability waiting to happen.
Sending an email with prompt injection is all it takes.
That very much depends what you're using it for. If you're one of the overly advertised cases of someone who needs an ai to manage inbox, calendar and scheduling tasks, sure maybe that makes sense on your own machine if you aren't capable of setting up access on another one.
For anything else it has no need to be on your machine. Most things are cloud based these days, and granting read access to git repos, google docs, etc is trivial.
I really don't get the insane focus on 'your inbox' this whole thing has; that's perhaps the biggest waste of a tool like this and an incredibly poor way of 'selling' it to people.
Now they have to rename again, though... [1]
Who are these people? What is the analog for this corner of the market? Context: I'm a 47y/o developer who has seen and done most of the common and not-so-common things in software development.
This segment reminds me of the hordes of npm evangelists back in the day who lauded the idea that you could download packages to add two numbers, or to capitalise the letter `m` (the disdain is intentional).
Am I being too harsh though? What opportunity am I missing out on? Besides the potential for engagement farming...
EDIT: I got about a minute into Fireship's video* about this and after seeing that WhatsApp sidebar popup it struck me... this thing can be a boon for scammers. Remote control, automated responses based on sentiment, targeted and personalised messaging. Not that any of this is impossible already, but having it packaged like this makes it even easier to customise and redistribute on various black markets etc.
EDIT 2: Seems like many other use-cases are available for viewing in https://www.moltbook.com/m/introductions. Many of these are probably LARPs, but if not, I wonder how many people are comfortable with AI agents posting personal details about "their humans" on the net. This post is comedy gold though: https://www.moltbook.com/post/cbd6474f-8478-4894-95f1-7b104a...
They can now combine cronjobs and LLMs with a single human sentence.
This is huge for normies.
Not so much if you already had strong development skills.
EDIT: But you are correct in the assessment that people who don't know better will use it to do simple things that could be done millions of times more efficiently..
I made a chatbot at my company where you can chat with each individual client's data that we work with..
My manager tested it by asking it to find a rate (divide this company number by that company number), for like a dozen companies, one by one..
He would have saved time looking at the table it gets its data from, using a calculator.
You know, building infrastructure to hook to some API or to dig through email or whatever-- it's a pain. And it's gotten harder. My old pile of procmail rules + spamassassin wouldn't work for the task anymore. Maintaining todos in text files has its high points and low points. And I have to be the person to notice patterns and do things myself.
Having some kind of agent as an assistant to do stuff, and not having to manage brittle infrastructure myself, sounds appealing. Accessibility from my phone through iMessage: ditto.
I haven't used it yet, but it's definitely captured my interest.
> He would have saved time looking at the table it gets its data from, using a calculator.
The hard thing is always remembering where that table is and restoring context. Big stuff is still often better done without an intermediary; being able to lob a question to an agent and maybe get an answer is huge.
If you are at all tech savvy, you can use n8n to set up a workflow that connects to all your data and provides an interface to talk to it..
This is the route I would recommend, and what everyone is using to build quick "AI Solutions" for businesses.
Different groups.
self hosted? you mean, you install it?
it's not hard to use?
The more I see the more it seems underwhelming (or hype).
So I've just drawn the conclusion that there's something I'm missing.
If someone's found a really solid use case for this I would (genuinely) like to see it. I'm always on the lookout for ways to make my dev/work workflow more efficient.
With all that said, I haven’t mentioned anything about the economics, and like much of the AI industry, those might be overstated. But running a local language model on my MacBook that helps me with messaging productivity is a compelling idea.
Unless or until you figure out a decent security paradigm, and I think it's reasonably achievable, these agents are extraordinarily dangerous. They're not smart enough to not do very stupid things, yet. You're gonna need layers of guardrails that filter out the jailbreaks and everything that doesn't match an approved format, with contextual branches of things that are allowed or discarded, and that's gonna be a whole pile of work that probably can't be vibecoded yet.
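One layer of the "discard everything that doesn't match an approved format" idea can be sketched as a strict allowlist on the actions an agent proposes. The action names and argument shapes here are hypothetical, and a check like this does nothing about the injected text itself; it only limits what a jailbroken agent is able to ask for.

```python
# Hypothetical approved actions: name -> exact argument names and types.
ALLOWED_ACTIONS = {
    "add_reminder": {"title": str, "due": str},
    "archive_email": {"message_id": str},
}


def validate_action(action):
    """Return the action if it matches an approved shape exactly, else None."""
    if not isinstance(action, dict):
        return None
    name, args = action.get("name"), action.get("args")
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    schema = ALLOWED_ACTIONS.get(name)
    if schema is None or set(args) != set(schema):
        return None  # unknown action, or missing/extra arguments
    if not all(isinstance(args[key], typ) for key, typ in schema.items()):
        return None
    return action
```

Anything that is not on the list, including a well-formed request to send funds or read credentials, is dropped rather than passed along for the model to justify.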
The next part that makes this compelling is the integration. Mind you, scary stuff, prompt injection, rogue commands, but (BIG BUT) once we figure this out it will provide real value.
Read email, add reminder to register dog with the township, or get an updated referral from your doctor for a therapist. All things that would normally fall through the cracks are organized and presented. I think about all the great projects we see on here, like https://unmute.sh/ and love the idea of having llms get closer to how we interact naturally. I think this gets us closer to that.
It's like having 100 "naive/gullible people" who are good at some math/english but don't understand social context, all with your data available to anyone who requests it in the right way..
OpenClaw is just an idea of what's coming. Of what the future of human-software interface will look like.
People already know what it will look like to some extent. We will no longer have UIs where you have dozens or hundreds of buttons as the norm; instead you will talk to an LLM/agent that will trigger the workflows you need through natural language. AI will eat UI.
Of course, OpenClaw/Moltbot/Clawdbot has lots of security issues. That's not really their fault, the industry has not yet reached consensus on how to fix these issues. But OpenClaw's rapid rise to popularity (fastest growing GH repo by star count ever) shows how people want that future to come ASAP. The security problems do need to be solved. And I believe they will be, soon.
I think the demand comes also from the people wanting an open agent. We don't want the agentic future to be mainly closed behind big tech ecosystems. OpenClaw plants that flag now, setting a boundary that people will have their data stored locally (even if inference happens remotely, though that may not be the status quo forever).
This tool opens the doors to a path where you control the memory you want the LLM to remember and use - you can edit and sync those files on all your machines and it gives you a sense of control. It's also a very nice way to use crons for your LLMs.
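A rough sketch of memory as plain files you can read, edit, and sync yourself. The one-Markdown-file-per-topic layout and the function names are assumptions for illustration, not OpenClaw's actual on-disk format, and `topic` is assumed to be a simple trusted name.

```python
from pathlib import Path


def remember(memory_dir, topic, note):
    """Append one note to <memory_dir>/<topic>.md, creating it if needed."""
    path = Path(memory_dir) / f"{topic}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")
    return path


def recall(memory_dir, topic):
    """Return the notes stored for a topic, or an empty list if none exist."""
    path = Path(memory_dir) / f"{topic}.md"
    if not path.exists():
        return []
    text = path.read_text(encoding="utf-8")
    return [line[2:] for line in text.splitlines() if line.startswith("- ")]
```

Because the store is just bullet lists in text files, deleting a line in an editor is all it takes to make the agent forget something.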
We don't need all this - but it's so fun.
I think that's absolutely crazy town but I understand the motivation. Information overload is the default state now. Anything that can help stem the tide is going to attract attention.
the amount of things that before cost you either hours or real money went down to a chat with a few sentences.
it suddenly makes it possible to scale an (at least semi-) savvy tech person without other humans, and that much faster.
this directly gives it a very tangible value.
the "market" might not be huge for this and yes, it's mostly youtubers and influencers that "get this". Mainly because the work they do is most impacted by it. And that obviously amplifies the hype.
but underneath, the mechanics of quite a big chunk of "traditional" digital work have now changed in a measurable way!
The thing is, that's totally fine! It's ok for things to be silly toys that aren't very efficient. People are enjoying it, and people are interacting with opensource software. Those are good things.
I do think that eventually this model will be something useful, and this is a great source of experimentation.
Security: 34 security-related commits to harden the codebase
Narrator's voice: They needed a 35th.
Much better name!
f"{os.urandom(8).hex()}.ai"
The dynamic one that is able to find the right update frequency and phase modulation thereof wins.
PM is essential, because stable phase is susceptible to adaptive cancellation by human brains (and is so stone age as well).
It's got four things that make it great:
1. Discord/Slack/WA/etc integration so those apps become your frontend
2. Filesystem for long term memory and state
3. Easy extensibility with skills
4. Cron for recurring jobs
Sure, many of these things exist in other systems but none in a cohesive package that makes it fun and easy.
The Discord/Slack frontend reduces friction significantly - particularly on mobile.
With proper sandboxing you get real benefits while limiting the blast radius significantly.