Posted by Cyphase 1 day ago
Why are Karpathy and SimonW trying to push new terms on us all the time? What are they trying to gain from this weird ass hype cycle?
What could go wrong.
Disappointing. There is a Rust-based assistant that can run comfortably on a Raspberry Pi (or some very old computer you are not using) https://zeroclawlabs.ai/ https://github.com/zeroclaw-labs/zeroclaw (Built by Harvard and MIT students, looks like)
EDIT: sorry top Google result led to a fake ZeroClaw!
This is the official repo https://github.com/zeroclaw-labs/zeroclaw and its website: https://zeroclawlabs.ai/
> Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes).
I am one of those people and I work at a FANG.
And while I know it seems annoying, these teams are overwhelmed with not only innovators but lawyers asking so many variations of the same question it's pretty hard to get back to the innovators with a thumbs up or guidance.
Also there is a real threat here. The "wiped my hard drive" story is annoying but it's a toy problem. An agent with database access exfiltrating customer PII to a model endpoint is a horrific outcome for impacted customers and everyone in the blast radius.
That's the kind of thing keeping us up at night, not blocking people for fun.
I'm actively trying to find a way we can unblock innovators to move quickly at scale, but it's a bit of a "slow down to go fast" moment. The goal isn't roadblocks, it's guardrails that let you move without the policy team being a bottleneck on every request.
I work on commercial OSS. My fear is that it’s exfiltrated to public issues or code. It helpfully commits secrets or other BS like that. And that’s even ignoring prompt injection attacks from the public.
I get handed an application developed by my company for use by partner companies. It's a java application, shipped as a jar, nothing special. It gets signed by our company, but anybody with the wherewithal can pull the jar apart and mod the application however they wish. One of the partner companies has already done so, extensively, and come back to show us their work. Management at my company is impressed and asks me to add official plugin support to the application. Can you guess where this is going?
I add the plugin support, the application will now load custom jars that implement the plugin interface I had discussed with devs from that company that did the modding. They think it's great, management thinks it's great, everything works and everybody is happy. At the last minute some security policy wonk throws on the brakes. Will this load any plugin jar? Yes. Not good! It needs to only load plugins approved by the company. Why? Because! Never mind that the whole damn application can be unofficially modded with ease. I ask him how he wants that done, he says only load plugins signed by the company. Retarded, but fine. I do so. He approves it, then the partner company engineer who did the modding chimes in that he's just going to mod the signature check out, because he doesn't want to have to deal with this shit. Security asshat from my company has a meltdown and long story short the entire plugin feature, which was already complete, gets scrapped and the partner company just keeps modding the application as before. Months of my life down the drain. Thanks guys, great job protecting... something.
You seem to blame the person who is trying to save the company from security issues, rather than placing the blame on your boss, who made you do work that would never have gotten approved in the first place if they had just checked with the right person first?
Yes, management was ultimately at fault. They're at fault for not tard wrangling the security guys into doing their jobs up front. They're also at fault for not tard wrangling the security guys when they object to an inherently modifiable application being modified.
Why did the security team initially give the okay to checking signatures on plugin jars? They're supposed to be security experts, what kind of security expert doesn't know that a signature check like that could be modded out? I knew it when I implemented it, and the modder at the partner corp obviously knew it but lacked the tact to stay quiet about it. Management didn't realize it, but they aren't technical. So why didn't security realize it until it was brought to their attention? Because they were retarded.
By the way, this application is still publicly downloadable, still easily modded, and hasn't been updated in almost 10 years now. Security review is fine with that, apparently. They only get bent out of shape when somebody actually tries to make something more useful, not when old nominally vulnerable software is left to rot in public. They're not protecting the company from a damn thing.
They insist we can't let client data [0] "into the cloud" despite the fact that the client's data is already in "the cloud" and all I want to do is stick it back into the same "cloud", just a different tenant. Despite the fact that the vendor has certified their environment to be suitable for all but the most absolutely sensitive data (for which, if you really insist, you can call them for pricing), no, we can't accept that and have to do our own audit. How long is that going to take? "2 years and $2 million". There is no fucking way. No fucking way that is the real path. There is no way our competitors did that. There is no way any of the startups we're seeing in this market did that. Or! Or! If it's true, why the fuck didn't you start it back two years ago when we insisted this was necessary the first time? Hell, I'd be happy if you had started 18 months ago, or a year ago. Anything! You were told several times, by the president of our company, to make this happen, and it still hasn't happened?!?!
They say we can't just trust the service provider for a certain service X, despite the fact that literally all of our infrastructure is provided by same service provider, so if they were fundamentally untrustworthy then we are already completely fucked.
I have a project to build a new analytics platform thing. Trying to evaluate some existing solutions. Oh, none of them are approved to be installed on our machines. How do we get that approval? You can't, open source software is fundamentally untrustworthy. Which must be why it's at the core of literally every piece of software we use, right? Oh, but I can do it in our new cloud environment! The one that was supposedly provided by an untrustworthy vendor! I have a bought-and-paid-for laptop with fairly decent specs and they seriously expect me and my team to remote desktop into a VM to do our work, paying exorbitant monthly fees for equivalent hardware to what we will now have sitting basically idle on our desks! And yes, it will be "my" money. I have a project budget and I didn't expect to have to increase it 80% just because "security reasons". Oh yeah, I have to ask them to install the software and "burn it into the VM image" for me. What the fuck does that even mean!? You told me 6 months ago this system was going to be self-service!
We are entering our third year of new leadership in our IT department, yet this new leadership never guts the ranks of the middle managers who were the sticks in the mud. Two years ago we hired a new CIO. Last year we got a deputy CIO to assist him. This year, it's yet another new CIO, but the previous two guys aren't gone, they are staying in exactly their current duties, their titles have just changed and they report to the new guy. What. The. Fuck.
[0] To be clear, this is data the client has contracted us to do analysis on. It is also nothing to do with people's private data. It's very similar to corporate operations data. It's 100% owned by the client, they've asked us to do a job with it and we can't do that job.
Fine. The compliance catastrophe will be his company's, not yours.
So did "Move fast and break things" not work out? /i
"unlock innovators" is a very mild example; perhaps you shouldn't be a jailor in your metaphors?
A few things help a lot (for BOTH sides - which is weird to say as the two sides should be US vs Threat Actors, but anyway):
1. Detach your identity from your ideas or work. You're not your work. An idea is just a passerby thought that you grabbed out of thin air, you can let it go the same way you grabbed it.
2. Always look for opportunities to create a dialogue. Learn from anyone and anything. Elevate everyone around you.
3. Instead of constantly looking for reasons why you're right, go with "why am I wrong?" It breaks tunnel vision faster than anything else.
Asking questions isn't an attack. Criticizing a design or implementation isn't criticizing you.
Thank you,
One of the "security people".
I'm okay with the people in charge of building on top of my private information being jailed by very strict, mean sounding, actually-higher-than-you people whose only goal is protecting my information.
Quite frankly, if you changed any word of that, they'd probably be impotent and my data would be toast.
They will also burn other people, which is a big problem you can’t simply ignore.
https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...
But even if they only burned themselves, you’re talking as if that isn’t a problem. We shouldn’t be handing explosives to random people on the street because “they’ll only blow their own hands”.
Isn't the whole selling point of OpenClaw that you give it valuable (personal) data to work on, which would typically also be processed by 3rd party LLMs?
The security and privacy implications are massive. The only way to use it "safely" is by not giving it much of value.
For example, a bot account cannot initiate conversations, so everyone would need to message the bot first. Doesn't that defeat the entire purpose of giving openclaw access to it? I thought they were supposed to be your assistant and do outbound stuff too, not just react to incoming events.
If you mean it's not outbound as in it can't message arbitrary random users out of nowhere, well yeah, and that's a very desirable trait.
https://github.com/skorokithakis/stavrobot
At least I can run this whenever, and it's all entirely sandboxed, with an architecture that still means I get the features. I even have some security tradeoffs like "you can ask the bot to configure plugin secrets for convenience, or you can do it yourself so it can never see them".
You're not going to be able to prevent the bot from exfiltrating stuff, but at least you can make sure it can't mess with its permissions and give itself more privileges.
You don't need to store any credentials at all (aside from your provider key, unless you want to mod pi).
Your claw also shouldn't be able to talk to the open internet, it should be on a VPN with a filtering proxy and a webhook relay.
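For what it's worth, the "filtering proxy" part can be as simple as a default-deny allowlist on outbound requests. A minimal sketch of that decision, assuming an exact-match host allowlist; the hostnames and the `is_allowed` helper are hypothetical placeholders, not anything from a real claw or proxy product:

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: placeholder hosts, not taken from any real setup.
ALLOWED_HOSTS = {"api.example-llm-provider.test", "relay.internal.test"}


def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # hostname is already lowercased and excludes userinfo and port.
    host = (parts.hostname or "").rstrip(".")
    return host in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "https://api.example-llm-provider.test/v1/chat",
        "https://api.example-llm-provider.test.evil.test/",
        "http://relay.internal.test/hook",
    ):
        print(candidate, "->", "allow" if is_allowed(candidate) else "deny")
```

Exact matching is deliberate: suffix or substring checks are how lookalike hosts get through. This only covers the allow/deny decision; the VPN and webhook relay are separate pieces.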
The security concerns are valid: I can get anyone running one of these agents on their email inbox to dump a bunch of privileged information with a single email.
This is so relatable. I remember trying to set up an LLM gateway back in 2023. There were at least 3 different teams that blocked our rollout for months until they worked through their backlog. "We're blocking you, but you’ll have to chase and nag us for us to even consider unblocking you"
At the end of all that waiting, nothing changed. Each of those teams wrote a document saying they had a look and were presumably just happy to be involved somehow?
One of the lessons in that book is that the main reason things in IT are slow isn't that tickets take a long time to complete, but that they spend a long time waiting in a queue. The busier a resource is, the longer the queue gets, eventually leading to ~2% of the ticket's time spent with somebody doing actual work on it. The rest is just the ticket waiting for somebody to get through the backlog, do their part and then push the rest into somebody else's backlog, which is just as long.
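To put rough numbers on that, here's a minimal sketch assuming a single worker modeled as a textbook M/M/1 queue (random arrivals, random service times). That model is my assumption, not necessarily the book's, and the function names are made up for illustration. Under it, mean waiting time divided by mean working time is utilization / (1 - utilization), so the share of a ticket's life spent being worked on is simply 1 - utilization.

```python
def wait_to_work_ratio(utilization: float) -> float:
    """Mean queueing time divided by mean service time for an M/M/1 queue."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


def fraction_worked_on(utilization: float) -> float:
    """Share of a ticket's total time during which someone is working on it."""
    return 1 / (1 + wait_to_work_ratio(utilization))


if __name__ == "__main__":
    for busy in (0.5, 0.9, 0.98):
        print(
            f"{busy:.0%} busy: wait/work = {wait_to_work_ratio(busy):.1f}, "
            f"worked on {fraction_worked_on(busy):.0%} of the time"
        )
```

In this model a team that is 98% busy leaves each ticket being worked on for roughly 2% of its lifetime, and that is before it gets handed to the next equally busy team.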
I'm surprised FAANGs don't have that part figured out yet.
I do know the feeling you're talking about though, and probably a better balance is somewhere in the middle. Just wanted to add that the solution probably isn't "Let devs deploy their own services without review", just as the solution probably also isn't "Stop devs for 6 months to deploy services they need".
If you had advertised this as a "regular service which happens to use LLM for some specific functions" and the "output is rigorously validated and logged", I am pretty sure you would get a green-light.
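As a rough illustration of what "validated and logged" could mean in practice, here's a minimal sketch that treats model output as untrusted input: parse it, check it against a tiny fixed shape, log the decision, and reject everything else. The field names, the allowed actions and the `validate_llm_output` helper are all hypothetical.

```python
import json
import logging

logger = logging.getLogger("llm_gateway")

# Hypothetical contract: the model must return exactly these two fields.
EXPECTED_FIELDS = {"action", "text"}
ALLOWED_ACTIONS = {"summarize", "classify"}


def validate_llm_output(raw: str) -> dict:
    """Parse and validate model output; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        logger.warning("rejected non-JSON output: %r", raw[:200])
        raise ValueError("output is not valid JSON") from exc

    if not isinstance(data, dict) or set(data) != EXPECTED_FIELDS:
        logger.warning("rejected output with unexpected shape")
        raise ValueError("output does not have exactly the expected fields")

    action, text = data["action"], data["text"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        logger.warning("rejected output with disallowed action: %r", action)
        raise ValueError("action is not on the allowlist")
    if not isinstance(text, str):
        logger.warning("rejected output with non-string text")
        raise ValueError("text must be a string")

    logger.info("accepted action=%s text_length=%d", action, len(text))
    return data
```

The point of the design is that nothing downstream ever sees free-form model text as an instruction; it only sees a value that passed a check somebody can audit in the logs.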
This is because their concern is data privacy and security. Not because they care or the company actually cares, but because fines for non-compliance are quite high and have greater visibility if things go wrong.
1. The compliance box tickers and bean counters are in the way of innovation and it hurts companies.
2. Claws derive their usefulness mainly from having broad permissions, not only to your local system but also to your accounts via your real identity [1]. Carefulness is very much warranted.
[1] People correct me if I'm misguided, but that is how I see it. Run the bot in a sandbox with no data and a bunch of fake accounts and you'll see how useful that is.
2. Those that don't have much technical chops, but can get by with a surface level understanding of several areas and then perform "security shamanism" to intimidate others and pull out lots of jargon. They sound authoritative because information security is a fairly esoteric concept and because you can't argue against security like you can't argue against health and safety, the only response is "so you don't care about security?!"
It is my experience that the first are likely to work with you to help figure out how to get your application past the hurdles and challenges you face, viewing it as an exciting problem. The second view their job as being to "protect the organization", not to deliver value. They love playing dressup in security theater and the depth of their understanding doesn't even pose a drowning risk to infants, which they make up for with esoterica and jargon. They are also unfortunately the ones cooking up "standards" and "security policies", because it allows them to feel like they are doing real work without the burden of actually knowing what they are doing, while talented people are actually doing something.
Here's a good litmus test to distinguish them: ask their opinion on the CISSP. If it's positive they probably don't know what the heck they are talking about.
Source: A long career operating in multiple domains, quite a few of which have been in security having interacted with both types (and hoping I fall into the first camp rather than the latter)
This made me lol.
It's a good test, however, I wouldn't ask it in a public setting lol, you have to ask them in a more private chat - at least for me, I'm not gonna talk bad about a massive org (ISC2) knowing that tons of managers and execs swear by them, but if you ask for my personal opinion in a more relaxed setting (and I do trust you to some extent), then you'll get a more nuanced and different answer.
Same test works for CEH. If they felt insulted and angry, they get an A+ (joking...?).
Though with the recent layoffs and stuff, the security in Amazon was getting better. Even the best practices for IAM policies that were the norm in 2018 are only just getting enforced in 2025.
Since I had a background in infosec, it always confused me how normal it was to grant overly permissive policies to basically anything. Even opening ports to the whole world (0.0.0.0/0) only became a significant issue in 2024; still, you can easily get away with it until the scanner finds your host/policy/configuration...
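The check such a scanner runs is not complicated. A minimal sketch, assuming ingress rules are already available as plain dicts with a `cidr` and a `port`; that shape and the `world_open_rules` helper are hypothetical, not the AWS API format or any internal Amazon tool:

```python
import ipaddress


def world_open_rules(rules):
    """Yield rules whose source CIDR covers the entire IPv4 or IPv6 internet."""
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"], strict=False)
        # A prefix length of 0 means "every address", e.g. 0.0.0.0/0 or ::/0.
        if network.prefixlen == 0:
            yield rule


if __name__ == "__main__":
    # Made-up example rules for illustration.
    example_rules = [
        {"port": 443, "cidr": "0.0.0.0/0"},
        {"port": 22, "cidr": "10.0.0.0/8"},
        {"port": 5432, "cidr": "::/0"},
    ]
    for rule in world_open_rules(example_rules):
        print(f"port {rule['port']} is open to the world via {rule['cidr']}")
```

The hard part is what the comment describes: running it continuously and enforcing the result before someone has already gotten away with it.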
Although nearly all AWS accounts are managed by Conduit (the internal AWS account creation and management service), the "magic-team" had many "account-containers" to join all these child/service accounts into a parent "organization-account". By the time I left, the "organization-account" had no restrictive policies set; it is up to the developers to secure their resources (like S3 buckets and their policies).
So, I don't think the policy folks are overall wrong. In the best case scenario, they do not need to exist in the first place! As the enforcement should be done to ensure security. But that always has an exception somewhere in someone's workflow.
All these claws throw caution to the wind in enabling the LLM to be triggered by text coming from external sources, which is another step in recklessness.
then the heads changed and we were back to square one.
but for a moment it was glorious to see what was possible.
Now for the more reasonable point: instead of being adversarial and disparaging those trying to do their job why not realize that, just like you, they have a certain viewpoint and are trying to do the best they can. There is no simple answer to the issues we’re dealing with and it will require compromise. That won’t happen if you see policy and security folks as “climbing out of their holes”.
The only innovation I want to see coming out of this powerblock is how to dismantle it. Their potential to benefit humanity sailed many, many years ago.
What a surprise that someone working in Big Tech would find "pesky" policies to get in their way. These companies have obviously done so much good for the world; imagine what they could do without any guardrails!
I have been using and evolving my own personal agent for years but the difference is that models in the last year have suddenly become way more viable. Both frontier and local models. I had been holding back releasing my agents because the appetite has just not been there, and I was worried about large companies like X ripping off my work, while I was still focused on getting things like security and privacy right before releasing my agent kit.
It's been great seeing claws out in the wild delighting people, makes me think the time is finally right to release my agent kit and let people see what a real personal digital agent looks like in terms of presentation, utility and security. Claws are still thinking too small.
Nondeterministic execution doesn’t sound great for stringing together tool calls.