Posted by lharries 3/31/2025
It connects to your personal WhatsApp account via the WhatsApp Web multi-device API (using whatsmeow from the Beeper team), and doesn't rely on third-party APIs. All messages are stored locally in SQLite. Nothing is sent to the cloud unless you explicitly allow your LLM to access the data via tools – so you maintain full control and privacy.
The MCP server can:
- Search your messages, contacts, and groups
- Send WhatsApp messages to individuals or groups (a rough sketch of both tools follows)
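For the curious, here is roughly what those two tools could look like using the MCP Python SDK's FastMCP helper. This is a minimal hypothetical sketch, not the project's actual code: the tool names, the messages.db path, and the table schema are all assumptions on my part.

```python
# Hypothetical sketch of the two tool surfaces. Assumes a local SQLite
# table: messages(chat_jid, sender, timestamp, content). Illustrative only.
import sqlite3

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("whatsapp")

@mcp.tool()
def search_messages(query: str, limit: int = 20) -> list[dict]:
    """Search locally stored WhatsApp messages by substring."""
    con = sqlite3.connect("messages.db")
    rows = con.execute(
        "SELECT chat_jid, sender, timestamp, content FROM messages "
        "WHERE content LIKE ? ORDER BY timestamp DESC LIMIT ?",
        (f"%{query}%", limit),
    ).fetchall()
    con.close()
    return [
        {"chat": c, "sender": s, "time": t, "text": x}
        for (c, s, t, x) in rows
    ]

@mcp.tool()
def send_message(recipient_jid: str, text: str) -> str:
    """Hand a message off for delivery (stubbed; a real server would
    call into the whatsmeow bridge here)."""
    return f"queued message to {recipient_jid}"

if __name__ == "__main__":
    mcp.run()  # serves the tools over stdio by default
```

The point being: the LLM only ever sees what these tools return, and everything they read lives in the local database.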
Why build this?
99% of your life is stored in WhatsApp. By connecting an LLM to WhatsApp you get all this context, and your AI agent can execute tasks on your behalf by sending messages.
That's quite a presumption to make.
If you connect a non-self-hosted LLM to this, you're effectively uploading chat messages with other people to a third-party server. The people you chat with have an expectation of privacy, so this would probably be illegal in many jurisdictions.
It could be legal to record and use as evidence in court later, but that doesn't mean you're allowed to share it with some AI company.
Sure you can, but people can sue you if you paste it into something public. I don't know if you're making some deep philosophical comment, but this is something people have been sued over, and lost, before.
And it also doesn't even matter because WhatsApp claims to be E2E-encrypted.
It's up to you to trust Meta or not, but people who trust them do have an expectation of privacy.
It also misses the mark because you're talking about an eavesdropper intercepting messages, while the OP's scenario is the receiver sharing the messages with a third party themselves.
Name one
You have a "allgemeines Persönlichkeitsrecht" (general personal rights?) that prevents other people from publishing information that's supposed to be private.
Here's a case where someone published a Facebook DM, for example:
This scenario however is "I take my personal data and run it through tools to make my life easier" (heck, even backup could fit the bill here). If I'm allowed to do that... am I allowed to do that only with tools that are perfectly secure? Can I send data to the cloud? (subcases: I own the cloud service & hardware/it's a nextcloud instance; I own it, but it's very poorly secured; Proton owns it and their terms of use promise to not disclose it; OpenAI owns it and their terms of use say they can make use of my data)
> am I allowed to do that only with tools that are perfectly secure?
No, actual security doesn't matter at all, but you have to think that they are reasonably secure.
> Can I send data to the cloud?
Yes, if you can expect the data to stay private
> (subcases: I own the cloud service & hardware/it's a nextcloud instance;
Yes
> I own it, but it's very poorly secured;
No
> Proton owns it and their terms of use promise to not disclose it;
Yes, if Proton is generally considered trustworthy.
> OpenAI owns it and their terms of use say they can make use of my data)
No
I guess you can argue that "I should've known that OpenAI will use my conversations if I send them to ChatGPT", but I'm not convinced it'd be crystal clear in court that I'm liable. Like I said, I think until actually litigated, this is very much a gray area.
P.S. The distinction you make between a "properly secured" and an "improperly secured" Nextcloud instance would, again, be a legal nightmare. I guess there could be an example of "criminal negligence" in extreme cases, but given that companies get hacked all the time (more often than not with relatively minor consequences), and even Troy Hunt was hacked (https://www.troyhunt.com/a-sneaky-phish-just-grabbed-my-mail...) - I have a hard time believing the average Joe would face legal consequences for failing to secure their own Nextcloud instance.
Your initial "name one" comment sounded like you didn't believe there would be a jurisdiction where it is illegal.
Nope
My family members all back up our conversations to Google Drive, I doubt WhatsApp would provide that feature if it were illegal.
But if they use your input as training data, that would probably be enough.
My German isn't good enough to read the original text about this case, but if the sentiment behind https://natlawreview.com/article/data-mining-ai-systems-trai... is correct, I wouldn't be surprised if this would also fall under some kind of legal exception.
The biggest problem, of course, is that regardless of legality, this software will probably be used (and probably already is being used), because as the remote party it's almost impossible to prove or disprove its use.
That's something completely different. One is about copyright of material that was shared publicly, while the other is about sharing private communications, which violates personal rights (not copyright).
But of course, we'll have to see, I'm not a lawyer either.
my bad.
These are tools where the AI may tell you it's doing one thing and then accidentally do another. I once had an LLM tell me it would make a directory using mkdir, but it then called the shell command kdir (which thankfully didn't exist). Sandboxing MCP servers is also important!
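One cheap mitigation, assuming the server exposes a shell-executing tool at all: validate the model's proposed command against an allowlist before running anything, so a hallucinated kdir is rejected loudly instead of executed blindly. A toy sketch (the function and allowlist are made up for illustration):

```python
# Toy guard for a shell tool: only allowlisted binaries run, and unknown
# commands fail loudly. Not a substitute for real sandboxing (containers,
# seccomp, read-only mounts), just a first line of defense.
import shlex
import shutil
import subprocess

ALLOWED = {"mkdir", "ls", "cat", "grep"}

def run_guarded(command: str) -> str:
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED:
        # A hallucinated "kdir" ends up here instead of hitting the shell.
        raise PermissionError(f"refusing to run {argv[0]!r}")
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]!r} not found on PATH")
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```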
For some people it is a requirement to have a social life. It is not your choice to use it or not. Network effects are taking care of that. If you think Signal or whatever is a better choice, good on you. But if you don't want to cut ties with some of your friends, prepare to use multiple apps. Including WhatsApp.
Back when data plans were around 1GB or less, some network providers in Europe didn't charge you for WhatsApp traffic on specific plans. There were also WhatsApp-branded SIM cards, but I haven't seen those in a long time.
On-device processing is neither as objectionable, nor could it be very large.
I don't use WhatsApp myself because of who runs it, and there are plenty of better options out there, so I certainly agree with the sentiment of steering clear. But this claim does seem pretty far out there.
We do need to trust Meta that they really don't, to some extent, but people way smarter than me have researched WhatsApp's implementation of the Signal protocol and it seems solid. I.e., Meta appears to be simply unable to read what you chat and send (but to be clear: they do see with whom and when you do this, just not the contents).
Presumably they use proper HTTPS, so all the data is essentially encrypted twice. If they just concatenated some packets with keys, it would be extremely difficult to detect: you'd need to decrypt the HTTPS layer (which is possible if you can install your own certificates on a device), then dig through random message data to find a random value you don't even know.
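To make the "random value you don't even know" point concrete, here's a toy sketch using the cryptography package (purely illustrative): AEAD ciphertext and a smuggled key are both uniform-looking byte strings, so an inspector who has stripped off TLS has nothing to pattern-match on.

```python
# Toy illustration: ciphertext and a random key are indistinguishable blobs
# to an observer who holds neither key. Uses AES-GCM from `cryptography`.
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(key).encrypt(nonce, b"the actual message", None)

smuggled_key = os.urandom(32)  # stands in for an exfiltrated session key

# Both print as uniform-looking hex; there is no marker saying "key here".
print(ciphertext.hex())
print(smuggled_key.hex())
```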
People find exploits in proprietary code, or even SaaS (where researchers cannot even access the software) every day.
People at Meta might leak this information too.
"Information wants to be free"
My point is: the risk of this becoming known is real.
Reputation
Or what's the translation of "bank run", but generic for any service? "Leegloop" in Dutch; the translator gives only nonsense. Going the descriptive route: many people would leave because of the tarnished reputation.
The trick is to have Facebook continue to believe that this reputation/trust is more valuable than reading the messages of those who stay behind. That can partially be done by having realistic alternatives for people to switch to, so that there is no reason to stay when trust is broken. Which kind of means pre-emptively switching (at least to build up a decent network effect elsewhere), which is what I've chosen to do and encourage anyone else to do as well. But I'm not a conspiracy theorist who thinks that, at the present time, they'll try to roll out such an update in secret, at least not to everyone at once (intelligence agencies might send NSLs with specific targets).
The only way I can think of is by pushing an update that grabs all your keys and pushes them to their servers.
Otherwise, it's a pretty decent setup (if I am to believe Moxie, which I do).
A third reason besides privacy would be the purpose. Is the purpose generating automatic replies? Or automatic summaries because the recipient can't be bothered to read what I wrote? That would be a dick move and a good reason to object as well, in my opinion
The same thing that happens now, when 100% of power consumption is fed to other purposes. What's the problem with that?
Also don't forget it's just one of three aspects I can think of off the top of my head. This isn't the only issue with LLMs...
Edit: while typing this reply, I remembered a fourth: I've seen many people object morally/ethically to the training method in terms of taking other people's work for free and replicating it. I don't know how I stand on that one myself yet (it is awfully similar to a human learning and replicating creatively, but clearly on an inhuman scale, so idk) but that's yet another possible reason not to want this
I'm also happy to have them pay for the full cleanup cost rather than discourage useless consumption, but somehow people don't seem to think crazy energy prices are a great idea either
Also you're still very focused on this one specific issue rather than looking at the bigger picture. Not sure if the conversation is going anywhere like this
Citation needed.
It's a local LLM with access to an extraordinary amount of personal data. In the EU at least, that personal data is supposed to be handled with care. I don't see people freaking out, but simply pointing out the leap of handing it over to ANOTHER company.
That doesn't make the metadata private. Meta can use that as they want. But not the contents, nor the images, not even in group chats (as opposed to Telegram, where group chats aren't (weren't?) E2E encrypted).
What you say or send on WA is private. Meta cannot see that. Nor governments nor your ISP or your router. Only you and the person or people you sent it to can read that.
It's a d*ck move if they then publicize this. And, as others pointed out, even illegal in many jurisdictions: AFAIK it is in my country.
If they configure it to indicate a prefix, for instance answering a question like "when are you free to hang out" with "[AI] according to X's calendar and work schedule, they may be available on the following days", I might also consider that somewhat useful (I just wouldn't take it as something they actually said).
If they're using LLMs to reword themselves or because they're not really interested in conversing, that's a definite ick.
I would personally use such a system in a receive-only mode for adding things to calendars or searching. But I'd also stick to local LLMs, so the context window would probably be too small to get much out of it anyway.
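A receive-only setup like that is easy to sketch, by the way: just never register a send tool. Hypothetical example (the env flag and stubs are made up):

```python
# Receive-only variant: read tools are always available, the send tool only
# exists if the operator deliberately opts in. Illustrative stubs only.
import os

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("whatsapp-readonly")

@mcp.tool()
def search_messages(query: str) -> str:
    """Read-only lookup (stubbed for the sketch)."""
    return f"results for {query!r}"

if os.environ.get("WHATSAPP_ALLOW_SEND") == "1":
    @mcp.tool()
    def send_message(recipient_jid: str, text: str) -> str:
        """Opt-in send path (stubbed)."""
        return f"queued for {recipient_jid}"
```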
This seems sketchy to me.
Processing them (like compressing them to mp3 files or storing them in cloud storage) is probably legal in most cases.
The potential problem with LLMs is that the provider may use your input as training data.
As of right now, the legal status of AI is very much up in the air. It's looking like AI training will be exempt from things like copyright laws (because how else would you train an LLM without performing the biggest book piracy operation in history?), and if that happens things like personal rights may also fall to the AI training overlords.
I personally don't think using this is illegal. I'm pretty sure 100% of LinkedIn messages are being passed through AI already, as are all WhatsApp business accounts and any similar service. I suppose we'll have to wait for someone to get caught using these tools and the problem making it to a high enough court to form jurisprudence.
This might actually be helpful for people with poor memory or neurodivergent minds, helping to surface relevant context so they can continue their conversation.
Or for salespeople, to help with their customer relationship management.
Try Briar, I think it does not store metadata either?
It's a closed-source client. End-to-end encryption means nothing.
> so you maintain full control and privacy
Of course everybody is free to choose who they hand over data to.
One more comment on your writing:
> take their favorite apps away
There was no mention of taking anybody's app away. If people want to contact me, they will need to use something that is not owned by a big advertising company. One can install additional apps or use services that don't need any apps.