> Yeah, but only because the LLM bots simply don’t run JavaScript.
I don't think this is the case: when Anubis itself switched from proof-of-work to a different JavaScript-based challenge, my server got overloaded, and switching back to the PoW solution fixed it [0].
I also semi-hate Anubis since it required me to add JS to a website that used none before, but (1) it's the only thing that stopped the bot problem for me, (2) it's really easy to deploy, and (3) very few human visitors are incorrectly blocked by it (unlike Captchas or IP/ASN bans that have really high false-positive rates).
One thing I noticed, though, was that the DigitalOcean Marketplace image asks whether you want to install something called CrowdSec, described as a "multiplayer firewall". While it is a paid service, there appears to be a community offering that is well liked. I was really wondering what downsides it has (beyond the obvious one: you are definitely trading some user privacy in service of security), but at least in principle the idea seems like a nice middle ground between Cloudflare and nothing, if it works and the business model holds up.
What I realised recently is that, for non-user browsers, my demos are effectively zip bombs.
Why?
Because I stream each frame, and each frame is around 180 KB uncompressed (compressed frames can be as small as 13 bytes). This is fine because the user's browser doesn't hold onto the frames.
But a crawler will hold onto those frames. Very quickly, this ends up being a very bad time for it.
Of course, there's nothing of value to scrape, so it's mostly pointless. But I found it entertaining that some scummy crawler is getting nuked by checkboxes [1].
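For concreteness, here's a minimal sketch of how a stream like that behaves (hypothetical; the endpoint, frame contents, and loop count are my assumptions, not the commenter's actual code). Each frame is large uncompressed but compresses to almost nothing, so a client that buffers the whole decompressed body pays vastly more memory than the server pays bandwidth:

```go
package main

import (
	"compress/gzip"
	"net/http"
)

// frames streams highly compressible frames. A browser decodes and discards
// each one as it renders; a crawler that buffers the decompressed body ends
// up holding all of them in memory.
func frames(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Encoding", "gzip")
	gz := gzip.NewWriter(w)
	defer gz.Close()

	frame := make([]byte, 180*1024) // ~180 KB of zeros: gzip shrinks it to a few bytes
	for i := 0; i < 10000; i++ {    // ~1.8 GB decompressed, only a few MB on the wire
		if _, err := gz.Write(frame); err != nil {
			return // client went away; stop streaming
		}
		gz.Flush() // emit the compressed frame immediately
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/frames", frames)
	http.ListenAndServe(":8080", nil)
}
```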
It's kind of a self-fulfilling prophecy: you make the visitor experience worse, which hands the LLMs a ready-made justification for why serving the content directly is wanted and needed.
All of that because, in the current lambda/cloud computing world, it has become very expensive to process even a few requests.
A web forum I read regularly has been playing whack-a-mole with LLM scrapers for much of this year, with multiple weeks-long periods where the swarm-of-locusts would make the site inaccessible to actual users.
The admins tried all manner of blocks, including ultimately banning entire countries' IP ranges, all to no avail.
The forum's continued existence depends on being able to hold off abusive crawlers. Having to see half-a-second of the Anubis splashscreen occasionally is a small price to pay for keeping it alive.
> have to face a 3s stupid nagscreen like the one of Anubis, I'm very pissed off and pushed even more to bypass the website when possible to get the info I want directly from an LLM or search engine.
Most (freely accessible) LLMs will take more than 3s to "think". Why are you pissed off about Anubis, but not the slow LLM? And then you have to double-check the LLM anyway...
> All of that because in the current lambda/cloud computing world, it became very expensive to process only a few requests.
You're making some very arrogant assumptions here. FOSS repos and bugtrackers are generally not lambda/cloud hosted.
Most of them simply throw one of those tools on a VPS or similar, which is perfect for their community size, and then it falls over when LLM companies' botnets DDoS it.
Work functions make sense in password hashes because they exploit an asymmetry: attackers will guess millions of invalid passwords for every validated guess, so the attacker bears most (really almost all) of the cost.
Work functions make sense in antispam systems for the same reason: spam "attacks" rely on the cost of an attempt being so low that it's efficient to target millions of victims in the expectation of just one hit.
Work functions make sense in Bitcoin because they function as a synchronization mechanism. There's nothing actually valorous about solving a SHA2 puzzle, but the puzzles give the whole protocol a clock.
Work functions don't make sense as a token tax; the asymmetry there is actually the opposite of the antispam one. Every bot request to a web page yields tokens to the AI company. Legitimate users, who far outnumber the bots, end up paying more of the cost.
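To pin down what "the work" actually is, here is a minimal sketch of the hash-puzzle style of challenge (my reconstruction under assumptions: the challenge string and difficulty are made up, and this is not Anubis's actual code). The solver burns roughly 2^difficulty hash evaluations per page, and that cost falls on every visitor, human or bot, while only the scraper gets to amortize it against the token value of the page:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
	"strconv"
)

// leadingZeroBits counts the zero bits at the front of a SHA-256 digest.
func leadingZeroBits(sum [32]byte) int {
	n := 0
	for _, b := range sum {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// solve brute-forces a nonce whose hash clears the difficulty target.
// Expected work is about 2^difficulty hash evaluations.
func solve(challenge string, difficulty int) uint64 {
	for nonce := uint64(0); ; nonce++ {
		sum := sha256.Sum256([]byte(challenge + strconv.FormatUint(nonce, 10)))
		if leadingZeroBits(sum) >= difficulty {
			return nonce
		}
	}
}

func main() {
	// Difficulty 20 means ~one million hashes on average: a fraction of a
	// second of CPU for whoever loads the page, human and bot alike.
	fmt.Println("nonce:", solve("example-challenge", 20))
}
```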
None of this is to say that a serious anti-scraping firewall can't be built! I'm fond of pointing to how YouTube addressed this very similar problem, with a content protection system built in JavaScript that was deliberately expensive to reverse engineer and which could surreptitiously probe the precise browser configuration a request to create a new YouTube account was using.
The next thing Anubis builds should be that, and when they do that, they should chuck the proof of work thing.
"The work" is providing those better alternatives to anubis, that everyone in this thread except for Xe seem to know all about.
The humility is about accepting that the solution works for some people (the small site operators who get hammered by DDoSes and unethical LLM over-crawling) despite not being perfect. And if that inconveniences you as a user of those sites, which I imagine is what you mean by "user backlash", the solution for you is to stop going there, not to talk down at them for doing something about an issue that impacts them.
[1] https://sourcehut.org/blog/2025-04-15-you-cannot-have-our-us...
Also, a lower "server load" has nothing to do with the system being collectively "a good outcome" that justifies labeling criticism as supporting "the bad guys".
Agreed, residential proxies are far more expensive than compute, yet the bots seem to have no problem obtaining millions of residential IPs. So I'm not really sure why Anubis works; my best guess is that the bots have some sort of per-page time limit, and they haven't bothered to increase it for pages that use Anubis.
> with a content protection system built in JavaScript that was deliberately expensive to reverse engineer and which could surreptitiously probe the precise browser configuration a request to create a new YouTube account was using.
> The next thing Anubis builds should be that, and when they do that, they should chuck the proof of work thing.
They did [0], but it doesn't work [1]. Of course, the Anubis implementation is much simpler than YouTube's, but (1) Anubis doesn't have dozens of employees who can test hundreds of browser/OS/version combinations to make sure that it doesn't inadvertently block human users, and (2) it's much trickier to design an open-source program that resists reverse-engineering than a closed-source program, and I wouldn't want to use Anubis if it went closed-source.
[0]: https://anubis.techaro.lol/docs/admin/configuration/challeng...
Either way: what Anubis does now (just from a CS perspective, that's all) doesn't make sense.
It's time to start building our own walled gardens: overlay VPN networks for humans. Put services there, and if someone misbehaves? BAN their IP. They came back? BAN again. Came back again? wtf? BAN the VPN provider. Just clean up the mess. Different networks can peer and exchange. Look, the Internet is just a network of networks; it's not that hard.
Allowed
https://web.archive.org/web/20250419222331if_/https://anubis...
https://web.archive.org/web/20250419222331if_/https://anubis...
https://web.archive.org/web/20250420152651if_/https://anubis...
https://web.archive.org/web/20250420152651if_/https://anubis...
Blocked
https://web.archive.org/web/20250424235436if_/https://anubis...
https://web.archive.org/web/20250510230703if_/https://anubis...
https://web.archive.org/web/20250511110518if_/https://anubis...
https://web.archive.org/web/20250630101240if_/https://anubis...
https://web.archive.org/web/20250808051637if_/https://anubis...
https://web.archive.org/web/20250909160601if_/https://anubis...
Allowed
https://web.archive.org/web/20250921062513if_/https://anubis...