Posted by HypnoticOcelot 4 hours ago

Cloudflare Turnstile requiring fingerprintable WebGL (hacktivis.me)
260 points | 147 comments
denysvitali 3 hours ago|
Cloudflare is known to use fingerprinting to detect scrapers. For example, they use JA3 fingerprints and match them against the UA to block stuff like cURL while allowing OkHttp (Android clients) - but this can easily be spoofed with packages such as CycleTLS [1].

I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Cromite, a privacy-conscious fork of Chromium for Android, constantly has issues with Cloudflare Turnstile [2] because they (Cloudflare) try to fingerprint it in multiple ways before letting it pass the challenge. The only way to get it to work would be to join the Cloudflare Browser Developer program - which requires signing an NDA. Rightfully so, the project maintainer didn't want to do it.

If you want to see the extent of what Cloudflare does to fingerprint browsers, just have a look at the issue [2] and see which flags need to be disabled in order to pass the Cloudflare challenge.

I understand both sides, but at least Cloudflare could be flexible enough to fall back to PoW instead of just blocking people from sending forms or accessing websites...

[1]: https://github.com/Danny-Dasilva/CycleTLS

[2]: https://github.com/uazo/cromite/issues/2365
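
For readers unfamiliar with JA3: the fingerprint is an MD5 hash over five fields of the TLS ClientHello, which is why a client whose TLS stack does not match its claimed User-Agent stands out. A minimal sketch follows. The ClientHello values are made up, and `KNOWN_FINGERPRINTS` and `ua_matches_fingerprint` are hypothetical names for illustration; Cloudflare's actual matching logic is not public.

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string (decimal values, '-' inside a field, ',' between fields) and MD5 it."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode("ascii")).hexdigest()

# Illustrative ClientHello values only, not captured from any real client.
client_a = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
# Same ciphers in a different order: a different TLS stack, so a different hash.
client_b = ja3_hash(771, [4867, 4866, 4865], [0, 23, 65281], [29, 23, 24], [0])

# Hypothetical allowlist: which JA3 hashes are plausible for a claimed User-Agent family.
KNOWN_FINGERPRINTS = {"okhttp": {client_a}}

def ua_matches_fingerprint(ua_family, ja3):
    """True if the observed JA3 hash is one we expect for the claimed User-Agent family."""
    return ja3 in KNOWN_FINGERPRINTS.get(ua_family, set())
```

Spoofing tools get past a check like this by replaying the ClientHello of the client they claim to be, so that the hash and the UA agree.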

jwr 1 hour ago||
> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection"

They also gate away a good many people with their "bot protection". I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

binaryturtle 33 minutes ago|||
I can no longer access any website that's "protected" by Cloudflare. As soon as a website enables that stuff… "Shoot, another one bites the dust." I wonder if the website owners realise at all how many actual users they lose by this sort of "protection."
tardedmeme 10 minutes ago||
Cloudflare will just tell them that 70% traffic drop is because 70% of their traffic was bots, and everything is working fine, and hey, don't you want to upgrade to a paid plan to block 50% of the remainder? Think about how many bots will be blocked with that upgrade!
denysvitali 1 hour ago||||
They sometimes have to comply with legal requests (which I understand), but at the same time they have a huge market share - which means that the internet is becoming less and less decentralized and more in their control. We've seen the effects of that in previous outages...
tardedmeme 11 minutes ago||||
It's just one more facet of the enshittoscene, the era where actual product quality is completely irrelevant. Put it in the same bucket as websites that lag when you scroll, apps that refuse to show you video without a huge play/pause button overlaid in the middle of it that never goes away, and the movie Melania. My hypothesis is that billion-dollar businesses no longer exist to sell things to customers, but only to impress other billionaires to get their investment money.
stackghost 1 hour ago|||
>I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

I think the Web is on its last legs, anyway. Generative AI and LLM-instead-of-search has destroyed what little value remained.

sandeepkd 59 minutes ago|||
> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person. Fingerprinting is just a way to consolidate the market for the advertising business. Assigning reputation to residential IP addresses and commercial blocks is another approach to achieve the desired result. Providers would be a lot more careful about letting their IP addresses be misused; however, it turns out that this would bring down the DDoS business on both sides, attackers and protectors.

Ironically, more often than not it's the same companies that invest in building their own bots and in finding ways to stop bots from other companies.

esrauch 19 minutes ago||
> Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person.

At the upper bound, fraud can always be committed by paying real people with real accounts to perform the desired action in a way that is 100% truly indistinguishable from organic. There's fundamentally no actual prevention technique at the limit.

So the entire game is only "increasing the costs until it's not viable ROI", not "holistically prevent", which is why fingerprinting is a relevant technique here.

b65e8bee43c2ed0 2 hours ago|||
it's all for nothing, because Cloudflare's scraping protection works about as well as a $5 padlock - good enough to dissuade bored teens, not good enough to dissuade even an amateur burglar. if someone wants to scrape your publicly visible data, they will. there's nothing you can do.
ACCount37 2 hours ago|||
At the same time: it sure works well enough to annoy anyone with a "bad ASN" IP with 80 captchas a day.
shideneyu 2 hours ago||
exactly that's what I was thinking... like the day they provided a solution to the issue they posed
mootothemax 59 minutes ago|||
Exactly. I’m constantly amazed at how little you actually need to bypass CF, Amazon, Azure WAFs and so on (Incapsula springs to mind too). When you look at the code you’ve come up with, it’s actually quite small and compact.

More to the point, these systems actually help scraping because proof of work unlocks essentially unlimited scraping, in my experience.

That said - from my experience on the other side, sure you can’t stop people like me or you, but you can stop 99% of the others. That’s more than worth it operationally.

petu 2 hours ago|||
> but unless you do PoW (which is also ecologically a nightmare)

Can you expand? I don't see a problem with some napkin math. A 5W load for 2 seconds is about 0.0028Wh (we have to let smartphones pass, and not make them do PoW for tens of seconds). 8 billion checks a day for a year = 8GWh.
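
The napkin math can be reproduced in a few lines. The inputs are the comment's own assumptions (5W of extra load, 2 seconds per challenge, 8 billion checks per day), not measurements:

```python
WATTS = 5.0            # assumed extra load while solving a challenge
SECONDS = 2.0          # assumed solve time per challenge
CHECKS_PER_DAY = 8e9   # roughly one check per person per day
DAYS = 365

# Watt-seconds to watt-hours: divide by 3600.
wh_per_check = WATTS * SECONDS / 3600
gwh_per_year = wh_per_check * CHECKS_PER_DAY * DAYS / 1e9

print(f"{wh_per_check:.4f} Wh per check")   # 0.0028 Wh per check
print(f"{gwh_per_year:.1f} GWh per year")   # 8.1 GWh per year
```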

denysvitali 1 hour ago||
I stand corrected. It's not a nightmare scenario (as with Bitcoin) - but I still think that "useless" computations should be avoided (just as we should avoid having 10MB websites).

In any case, according to some napkin math done by Kimi 2.6 (which by itself is probably already consuming more than all of my PoW challenges for the upcoming 5 years) - the situation looks incredibly in favor of PoW: https://www.kimi.com/share/19e7ef40-a432-8912-8000-0000b4a71...

Which makes me wonder why CloudFlare isn't switching to this already

dcrazy 1 hour ago||
Because it doesn’t solve the problem of residential botnets.
PearlRiver 3 hours ago||
This is why I have two separate browsers. If you want to do official stuff like paying for things you need to get through cloudflare.
notafox 2 hours ago|||
You can use Firefox with different profiles and configure it to launch a particular profile directly, without launching the default profile and using about:profiles.

A non-default Firefox profile can be created like this:

  ./firefox -CreateProfile "profile-name /home/user/.mozilla/firefox/profile-dir/"
  # For, say, cloudflare that would be:
  ./firefox -CreateProfile "cloudflare /home/user/.mozilla/firefox/cloudflare/"
And you can launch it like this:

  ./firefox -profile "/home/user/.mozilla/firefox/profile-dir/"
  # For cloudflare that would be:
  ./firefox -profile "/home/user/.mozilla/firefox/cloudflare/"
So, given that /usr/bin/firefox is just a shell script, you can

    - create a copy of it, say, /usr/bin/firefox-cloudflare
    - adjust the relevant line, adding the -profile argument
If you use an icon to run Firefox (say, /usr/share/applications/firefox.desktop), you'll need to make the same copy and adjust the relevant line for the icon.

Of course, "./firefox" in the examples above should be replaced with the actual path to the executable. For a default installation of Firefox, that path can be found in the /usr/bin/firefox script.

So, you can have separate profiles for anything sensitive/invasive (linkedin, cloudflare, shops, banks, etc.) and another profile for everything else.

And each profile can have its own set of extensions.

tardedmeme 9 minutes ago|||
They're blocking Firefox quite often. Stripe does something that makes Firefox hang. I use Chrome for those sites and then go back to Firefox...
t_mahmood 58 minutes ago||||
You can now do this from the `Profiles` menu too, without going down the CLI path. It's extremely simple now.
ferfumarma 1 hour ago|||
Except that fingerprinting means that both profiles are actually tied together by cloudflare (and other tech companies)
VoidWhisperer 33 minutes ago||
I think the idea is that they have the functionality that cloudflare is using to generate the fingerprint (like webGL in this case) disabled in their non-cloudflare profile and only use the cloudflare profile to do things they have to that are behind cloudflare
helterskelter 2 hours ago|||
Firefox added profile switching recently. Works good.

(That said, I still keep separate machines. One for doing "official" things, the other for everything else)

notafox 2 hours ago|||
> Firefox added profile switching recently.

I think this was as recent as 25 years ago?

Recently they added some new UI. There was and still is (I think) the classic Profile Manager UI, which you can launch with

  ./firefox -ProfileManager
or access the UI in about:profiles.

But you don't have to use any of those anyway - see my comment above (a response to parent).

opem 1 hour ago|||
They actually have at least 3 kinds of profile: 1. containers - as they say, it's some kind of sandbox, technically a profile 2. profiles that are accessible through about:profiles, which they have had for years, and probably the one you are talking about... 3. New profiles that come with a pop-up, much like how Chromium browsers show it
thayne 52 minutes ago|||
The old UI was pretty difficult to use, and hard to discover unless you knew where to look though.
ajb 2 hours ago||||
Odd - they've had that for years, but only on the command line. Wonder if it's different under the hood? They also have firefox containers which also never quite became a first-class feature (you have to install a plugin).
b65e8bee43c2ed0 2 hours ago|||
>Works good.

does it? same binary, same machine, same display, same 781 other heuristics.

jeroenhd 1 hour ago||
> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

For good reason. I've run that setting for ages but I kept having to disable it and add workarounds because websites would break in weird ways. Timezones in scheduling websites being messed up nearly made me miss a couple of appointments. There's no way to tell the user Firefox isn't broken without displaying a permanent banner like "if websites are broken in any way or you see weird glitches or your computer's time is wrong or fonts look weird or videos don't always work right, click here to disable fingerprinting protection".

Interestingly, Turnstile breaks with resistfingerprinting but works with fingerprintingProtection, I guess the latter takes this crap into account.

croes 1 hour ago|
Maybe a good reason for not enabling it by default, but a bad reason for not enabling it in strict settings.

I somewhat expect sites to break with strict settings; I don’t expect a still wide-open tracking path.

That’s deceiving.

konform 13 minutes ago||
I'm maintaining a minority browser[0] and for the past couple of weeks this has been affecting several of our users[1]. While I'm currently not considering this a browser bug (one could be involved, of course), more eyes are better and any help or ideas on improving or mitigating the situation would be appreciated.

[0]: https://konform-browser.codeberg.page/

[1]: Most? All? Without any telemetry, relying on user reports and our own testing here.

Animats 49 minutes ago||
Is there a deal between Google and Cloudflare to make non-Chrome browsers harder to use? The pressure to use Chrome keeps increasing, and the amount of ad filtering you can do in Chrome keeps decreasing.
wnevets 9 minutes ago||
I would wager it's one of the natural consequences of Chrome being the most popular browser on the web. Most legit traffic will be from Chrome.
tardedmeme 8 minutes ago||
Yes
malka1986 3 hours ago||
Thanks, i did not know about `privacy.resistfingerprinting`

I'll make sure to fail all cloudflare turnshit in the future.

gruez 3 hours ago|
I have it enabled and turnstile works fine.
jeroenhd 1 hour ago||
It breaks Turnstile for me on Android. Had to restart the browser for it to take effect of course.
adamtaylor_13 3 hours ago||
So if you need to prevent bot abuse, but also don't want an ugly captcha every time someone goes to sign up, is there a better option?
keynha 5 minutes ago||
Behavioral signals are the usual answer: risk-scored, invisible challenges; proof-of-work (cost without identity, though it taxes mobile); and signup-velocity/rate limits that stop cheap abuse before any challenge fires. The reason fingerprinting wins anyway is that it requires less operator effort, not that it is the only thing that works.
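
As an illustration of the signup-velocity idea, here is a minimal sliding-window limiter. The class name, the default limit, the window length and the choice of key (an IP address or a subnet, say) are all assumptions made for the sketch, not anyone's production values:

```python
import time
from collections import defaultdict, deque

class SignupVelocityLimiter:
    """Allow at most `limit` signups per key within the last `window` seconds."""

    def __init__(self, limit=3, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                  # injectable so it can be tested
        self.events = defaultdict(deque)    # key -> timestamps of recent signups

    def allow(self, key):
        now = self.clock()
        recent = self.events[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False                    # too many recent signups: reject or escalate
        recent.append(now)
        return True
```

A rejected request does not have to be blocked outright; it could be the point where a challenge is shown, so that ordinary users never see one.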
ribtoks 3 hours ago|||
Use proof-of-work captchas, many are private by default. Look into Private Captcha or Cap captcha.
mootothemax 55 minutes ago|||
Speaking from the scraper’s perspective, I like proof of work; a ten year old 96-core server will cost a couple of quid to run for a few hours and will grab an absurd number of pages thanks to the access granted by repeatedly solving proofs of work. Small slick codebases too!
tardedmeme 8 minutes ago||
There's also the Anubis idea where your PoW is persistent until your IP address or session cookie changes, so you get to skip PoW in exchange for making yourself identifiable, which means the PoW can then be ramped up to take a couple of minutes.

I don't use Anubis though. I just make my site not take five seconds to render a page, so bots can't easily overload it? It's not actually that hard?

phoronixrly 2 hours ago|||
How does proof of work stop bots?
stephantul 2 hours ago|||
Because it destroys the economics of scraping. It’s too expensive with proof of work, or at least not as economically viable
gruez 2 hours ago||
Depends on what type of scraping you're trying to stop. For the dumb scrapers that would try to scrape every page on a git forge (for which there are a bazillion pages for a modest project, because of how the site works), yeah it might deter them enough to stop. For anything high value (eg. reddit comments or retail prices), 10s of cpu time isn't going to stop them.
thayne 43 minutes ago|||
If it's high value, there isn't really much you can do that will be completely effective. Traditional captchas can often be beaten by AI, or by "captcha farms" where impoverished people are paid pennies to complete captchas. Fingerprinting can be beaten by using a full browser to make the requests. Basically anything you do is just a matter of making it more expensive for bots to access it.
stephantul 1 hour ago||||
Sure, the whole premise is exactly that proof of work reduces the value of scraping, while having negligible impact on users. If the data is so valuable that bot operators are willing to pay 10s of cpu, then other measures are necessary.

Nevertheless even for these high value cases, you can still argue that it disincentivizes the business model, it becomes less efficient.

pmontra 2 hours ago|||
It will not scare away bots but 10 seconds of wait (CPU or only a sleep) will turn away many real users. "This site is so slow, I'll use something else." A kind of reverse captcha.
Hnrobert42 1 hour ago|||
Maybe the proof of work can run in the background.
btown 1 hour ago||
Or it can run as part of a checkout wizard's "verifying your browser and processing your payment, don't close your tab" step.
ray_v 2 hours ago|||
If it gets too expensive/time-consuming to scrape then it won't happen at scale (as much)?
ImPostingOnHN 3 hours ago||
The tool "Anubis" uses proof of work instead
BetterThanSober 2 hours ago|||
With a tuned cool down period this isn't a problem, especially if you frequent the sites. OpenWRT uses Anubis and usually when I need to peruse their site I'm on a very low-end device. I much prefer waiting over finding Waldos.

But in principle I agree that there's no good answer to this. Scraping _is_ useful, and I bet most of us here have scraped something; it is AI companies and their use of people's material for training, without consent or anything in return, that led us to this. (I know botting has existed on forums for as long as forums have been a thing, but it is easily solved by human moderators and keyword filters.)

timpera 2 hours ago||||
Anubis often takes more than 60 seconds to complete on low-end devices (especially old smartphones). It seems like there's no good solution.
QuantumNomad_ 2 hours ago|||
But after you’ve completed the Anubis PoW challenge for a site, it remains valid for some amount of time.

So it’s not quite as horrible as it sounds.

I have setting up Anubis for my own sites on my todo list. And I wish more people did it too. I don’t really mind waiting a little bit extra every now and then before the page loads. What I do mind is ReCaptcha asking me to click all the pictures with buses in them etc. And especially when I have to do it several times over before it’s happy. I’d rather wait a minute for a page to load than to ever solve a ReCaptcha again, if given the choice.

dangus 2 hours ago||||
That must be really low end then. I’ve never seen it complete in a timeframe that was slower than “I can’t even read the page before it redirects”
titularcomment 1 hour ago||
My guess is it's an implementation error, not a hardware limitation. I have two 10-year-old devices and one passes instantaneously while the other halts for a good half minute every time.
ImPostingOnHN 2 hours ago|||
There's not an easy, perfect solution, for sure. Newer phones get faster, but spammer compute gets cheaper.

Some sort of decentralized trust web seems like another option, though less viable.

WesolyKubeczek 2 hours ago||
One of the unexpected outcomes of the AI-induced hardware shortage may be that compute won’t be getting cheaper and may in fact get more expensive…
phoronixrly 2 hours ago|||
How does Anubis stop bots?
redwall_hp 39 minutes ago|||
Anubis is designed to stop a certain class of badly behaved bots. It intentionally doesn't run if a bot identifies itself with a UA, such as Googlebot, because then you can rate limit it or block by UA and with other tools.

Anubis is active when a user agent looks like a web browser (e.g. contains the "Mozilla" substring every major browser uses). The reverse proxy serves an interstitial page that does a proof-of-work check, validated server side, setting a cookie if it passes.

This means a legitimate user won't constantly get the proof of work check, because they already passed it. But AI bots rotating through tons of residential IPs to scrape your forum or git forge or whatever will be slowed down.

Overall, I like the idea. It's unobtrusive, privacy preserving, and seems to be working out well for a lot of sites.
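
The flow described above can be sketched as a hash-based proof of work plus a signed cookie. This is a generic illustration, not Anubis's actual code: the SHA-256 leading-zero-bits scheme, the difficulty value, `SERVER_KEY` and all function names are assumptions.

```python
import hashlib
import hmac
import os

SERVER_KEY = os.urandom(32)   # hypothetical per-deployment secret
DIFFICULTY_BITS = 16          # assumed difficulty; a real deployment would tune this

def leading_zero_bits(digest):
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def solve(challenge, bits=DIFFICULTY_BITS):
    """Client side: brute-force a nonce. This is the expensive part."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce
        nonce += 1

def verify(challenge, nonce, bits=DIFFICULTY_BITS):
    """Server side: a single hash, so checking is cheap."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= bits

def issue_cookie(challenge):
    """Signed token, so a client that already passed is not challenged again."""
    return hmac.new(SERVER_KEY, challenge, hashlib.sha256).hexdigest()

def cookie_valid(challenge, cookie):
    return hmac.compare_digest(issue_cookie(challenge), cookie)
```

The asymmetry is the point: solving takes about 2^bits hash attempts on average, while verifying takes one. The sketch leaves out cookie expiry and how the challenge is bound to a client.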

basilikum 1 hour ago||||
The real answer is that it makes sites behave differently, requiring the bots to make slight adjustments.

And there are just not enough sites using Anubis for the people and companies running the bots to care to do that.

If you do care, bypassing Anubis is trivial.

xena 2 hours ago|||
Bots don't execute JavaScript or follow complicated redirects.
pwg 2 hours ago||
Bots don't [currently] execute JavaScript or follow complicated redirects.

They don't now, but enough "high value to the bots" pages turning on JS or complicated redirects will simply result in the bot authors adding JS execution or redirect following so they can continue "botting" the sites they want to scrape.

It's a hole with no bottom. Each one-up on the anti-bot side will eventually be handled on the bot side.

4oo4 2 hours ago||
I tested this extension that I've been using for a long time on the turnstile page and it got through, fwiw. I think it's a bit more subtle than how resistfingerprinting works but not sure what the privacy tradeoff is.

https://github.com/kkapsner/CanvasBlocker

tosti 46 minutes ago||
Looks cool. And I wonder why I'd run this over JSshelter. It appears to do the same thing, no?
BoingBoomTschak 32 minutes ago||
Thanks for the report, I've been running this for a long time.
dblohm7 1 hour ago||
> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

That pref is there for the Tor Browser.

konform 49 minutes ago|
It's enabled by default in Tor Browser and I'm not sure it can even be disabled?

Also enabled by default for Konform Browser and Mullvad Browser, which borrow many of the privacy- and security-related patches from Tor Browser.

avallach 3 hours ago||
Doesn't this mean we just need to make the WebGL fingerprint resistance implementation smarter? Instead of explicitly rejecting WebGL access or responding with dummy data, respond with data that is random within a space of N common and reproducible patterns. E.g. emulate the WebGL implementation of some low-spec but actually popular devices.
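
A rough sketch of the "pick one of N common patterns" idea, under loud assumptions: the profile values are made-up placeholders (a real list would have to come from measured popular devices), and `pick_profile` with its session-secret keying is a hypothetical design, not how any browser does it.

```python
import hashlib

# Made-up placeholder profiles, for illustration only.
COMMON_PROFILES = [
    {"vendor": "ExampleVendor A", "renderer": "Example GPU 1", "max_texture_size": 4096},
    {"vendor": "ExampleVendor B", "renderer": "Example GPU 2", "max_texture_size": 8192},
    {"vendor": "ExampleVendor C", "renderer": "Example GPU 3", "max_texture_size": 16384},
]

def pick_profile(session_secret: bytes, origin: str):
    """Stable for a given (session, site) pair, so reloads look consistent,
    while the choice can differ between sites and between sessions."""
    digest = hashlib.sha256(session_secret + origin.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(COMMON_PROFILES)
    return COMMON_PROFILES[index]
```

Reporting a common profile is probably not enough on its own: the rendered output would also have to match what the claimed device really produces, which is the hard part.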
btown 50 minutes ago||
The last screenshot in the OP article mentions that "a browser extension... adding random noise to canvas data" can be detected. Which isn't to say this perfectly detects all such randomization, but it's certainly an active part of the arms race.
bflesch 2 hours ago||
All of those advanced features should be enabled on a per-website basis but unfortunately even browsers whose marketing focuses on privacy don't allow you to do that. Same with TLS root CA certificates, there is no way to configure that a certain CA can only create certificates for certain domains.