Cloudflare Turnstile requiring fingerprintable WebGL

Posted by HypnoticOcelot 5 hours ago

Cloudflare Turnstile requiring fingerprintable WebGL (hacktivis.me)

260 points | 147 comments | page 2

Kiboneu 2 hours ago|

In other words, Cloudflare requires you to substantially increase your browser’s attack surface in order to visit websites.

kordlessagain 3 hours ago||

I did warmups in Grub Crawler to fight this: https://deepbluedynamics.com/grub

JoshTriplett 4 hours ago||

"This makes your browser appear suspicious because it looks like you're trying to hide your identity."

Yeah, this needs to be burned to the ground.

gruez 3 hours ago|

Bad optics aside, it doesn't actually reflect reality. See my other comment. You can enable basically all the privacy settings and still pass turnstile. Tor browser in a VM passes it, of all things.

https://litter.catbox.moe/gaizpk692bhhs6b7.png

JoshTriplett 3 hours ago||

Any idea what the difference is between your setup and the one in the article that failed with fingerprint-resistance enabled?

gruez 3 hours ago||

He's using a custom browser, apparently: https://hacktivis.me/projects/badwolf

JoshTriplett 3 hours ago||

I'm talking about the screenshot from Firefox.

gruez 2 hours ago||

It didn't fail for him in firefox, even with privacy settings enabled.

JoshTriplett 2 hours ago||

It tripped "Canvas Randomization Detected". See the last screenshot.

Cloudflare's demo page still treats that as a pass, but complains about it. As is often the case with Cloudflare, I expect that they'll then take no responsibility for sites that use more aggressive settings.

Dwedit 3 hours ago||

Adding noise to a canvas element is a mistake anyway. It means you can't develop a proper paint program using web technologies because your browser will mess with the image.

tosti 2 hours ago|

You can still do that, but it may not be rendered correctly in a screenshot.
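
To make the distinction concrete, here is a toy model of readback-time noise. It assumes the noise is applied only when a script reads pixels out of the canvas, not when they are drawn; the function names are made up, and the check is just one way a script could notice the noise, not a claim about what Turnstile actually does.

```python
import random

def read_back(pixels: bytes, randomize: bool, rng: random.Random) -> bytes:
    """Simulate reading pixels out of a canvas (hypothetical model)."""
    if not randomize:
        return pixels
    # Flip the lowest bit of each byte with probability 1/2.
    return bytes(b ^ rng.getrandbits(1) for b in pixels)

def looks_randomized(read) -> bool:
    """Draw a known solid colour and see whether the readback matches."""
    expected = bytes([200, 100, 50, 255]) * 16  # 16 RGBA pixels
    return read(expected) != expected

rng = random.Random(0)
print(looks_randomized(lambda px: read_back(px, False, rng)))  # False
print(looks_randomized(lambda px: read_back(px, True, rng)))   # True
```

In this model the drawn pixels are never touched, so what the user sees stays intact; only code that reads the pixels back gets a perturbed copy.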

SilverElfin 1 hour ago||

This company makes the internet unusable if you value privacy and use VPNs or whatever. Evil.

nulledy 5 hours ago||

We use Turnstile on several of our sites, and I think we need to revisit that decision.

sammy2255 4 hours ago|

Out of curiosity, why did you have it on in the first place?

nulledy 3 hours ago||

Bot rejection for contact forms. Better UX than reCaptcha.

nlitened 16 minutes ago||

Did you think it rejects bots by using some kind of magic?

Wowfunhappy 4 hours ago||

...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

Obviously this is terrible, but I think there's a possibility it's the least terrible option? Another option is IP reputation, which I think is worse. Or scanning a code with a non-rooted phone, which I think is even worse than that!

fidotron 4 hours ago||

> ...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

There isn't one, and pretending otherwise is nonsense because humans will always provide their credentials to something to act on their behalf.

In the limit you end up with Chinese phone farms.

tardedmeme 4 hours ago||

Right. Botnet operators love cloudflare because they make so much money renting out compromised machines to pass their tests.

thisislife2 4 hours ago|||

The only solution is regulation. If all content created by anyone has a copyright, how does an implicit opt-in (which is what happens if you don't create a robots.txt file for your website) for scraping make any sense? Moreover, even if you have a robots.txt, AI (or whatever) bots often don't respect it (or use workarounds - they outsource scraping of such "restricted" sites to unethical third-parties to get the data; Meta has even resorted to piracy, openly!). So clearly, the logic and the "honour system" have failed.

Cloudflare, Google Captcha, HCaptcha etc. are all shitty technical solutions because, as we are all discovering, they come at the cost of our privacy (i.e. these services may monetise our personal data) and / or our computing resources and time. If current copyright laws aren't sufficient to prevent this, we have to acknowledge the system is broken. The answer could be enhancing it with some kind of Digital Millennium Copyright Act (DMCA) -like laws, but in favour of the creators against BigTech or rogue actors.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law

- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...

oceanplexian 3 hours ago|||

Or you could let information be free, at least the stuff that’s on the public net.

As for issues like bots overloading websites or using too many resources, scaling laws will take care of it quickly; it’s not like you can’t serve thousands of RPS from a Raspberry Pi these days.

mschuster91 23 minutes ago||||

> The only solution is regulation.

Cloudflare wasn't invented because of AI scrapers. Those are just the latest development... the original reason Cloudflare got created, and why it experienced such meteoric growth, is DDoS and botnets.

Yes. We need regulation in the AI space. But it will be useless as long as bad actors aren't held accountable - and a lot of the bad actors aren't in our jurisdictions. You got hacked devices all over the world in giant botnets, controlled by Russian, Chinese, Iranian and North Korean actors. You got Chinese AI scraper bots as China is heavily investing in training their own models. You got Indian, Filipino and Myanmar-based scammers.

And frankly I have no idea how to get all of that under control. As much as I'd like to see sanctions against both domestic and foreign enablers of abuse (which includes residential ISPs) - it's going to be one giant ass whack-a-mole game.

ImPostingOnHN 4 hours ago|||

I don't think regulation will stop web scraping, not least because it can be done from locations outside the jurisdiction of the regulations.

> we have to acknowledge the system is broken

The system is broken. It probably takes, what, 10 seconds or less to use a residential or foreign proxy, 6+ months to internationally track and prosecute a single offender? So like a million times more effort going the regulatory route.
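
As a rough sanity check on that ratio, here is the arithmetic with "6+ months" taken as half a year of 182.5 days (my assumption) and the 10-second figure taken at face value. Both inputs are the comment's estimates, not measurements.

```python
# Back-of-envelope: how much longer prosecution takes than setting up a proxy.
SECONDS_PER_DAY = 24 * 60 * 60

proxy_setup_seconds = 10                       # "10 seconds or less"
prosecution_seconds = 182.5 * SECONDS_PER_DAY  # "6+ months", taken as half a year

ratio = prosecution_seconds / proxy_setup_seconds
print(f"{ratio:,.0f}x")  # 1,576,800x
```

So "a million times" is the right order of magnitude under these assumptions.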

thisislife2 4 hours ago||

Just as criminal laws don't end all crimes, copyright laws and anti-scraping regulation won't end all scraping. But it will greatly reduce it and limit it to rogue actors. Two examples I can cite here are the laws against email spam and laws against unsolicited marketing calls - they had a definite impact in reducing both (even in India, from where I am, where implementation of laws is often lax).

JoshTriplett 3 hours ago||

Exactly. Bot activity is a problem of volume, not all-or-nothing. Solving 95% of it would be a win.

cr125rider 4 hours ago|||

And identifying a bot that is acting on my behalf. "Claude, go search this topic" is basically the same as Googling something and clicking on the results. Human-driven AI searching needs to be in a different box than AI scraping for training data.

Which sounds extremely difficult to differentiate.

JoshTriplett 4 hours ago||

Hopefully it stays that way; "a bot acting on my behalf" is still a bot. At least it's often a well-behaved bot and uses a user-agent that can be detected and blocked.

jeroenhd 2 hours ago|||

Remote attestation should still be possible with a rooted phone if phone manufacturers weren't so shit. If the attestation happens at hardware level, it doesn't matter what programs or kernels you're running.

ravenstine 2 hours ago|||

Or maybe we can actually start paying for all the things we use on the Web, making it prohibitively expensive to deploy fleets of bots.

Gander5739 4 hours ago|||

You don't need a non-rooted phone to pass captcha checks; I have a rooted phone and can pass the captchas that ask you to scan a QR code. But I doubt phones without Google services would manage.

HWR_14 1 hour ago||

How does scanning a QR code prove any kind of captcha?

Gander5739 1 hour ago||

https://support.google.com/recaptcha/answer/16609652 - it just launches the verification service.

spacedoutman 4 hours ago|||

Private invite only internets

csomar 4 hours ago|||

They are not a problem unless you "believe" it is a problem. I estimate around 20-25K hits to my website from bots per day and I have all cloudflare protections disabled. Any decently optimized server should be able to easily handle that. (it's roughly 1 request every 3 seconds).
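
For anyone who wants to check the arithmetic, a quick sketch. It assumes the hits are spread evenly over the day, which real bot traffic is not, so treat it as an average rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def seconds_between_requests(hits_per_day: int) -> float:
    """Average gap between requests if the hits were spread evenly over a day."""
    return SECONDS_PER_DAY / hits_per_day

for hits in (20_000, 25_000):
    gap = seconds_between_requests(hits)
    print(f"{hits} hits/day -> one request every {gap:.2f} s ({1 / gap:.2f} req/s)")
```

That works out to one request every 3.5 to 4.3 seconds on average, well under one request per second.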

specialp 4 hours ago|||

Yes, and that is just the bot background radiation of the internet. I run a primary-source information site, and these botnets are aggressive to a DDoS level, all to do some sort of scraping. They have sophisticated enough tactics to DDoS us if they wanted to. However, I am not sure what their objective is, as they have wasted enough of our resources to have scraped all our content thousands of times over. That 25k traffic is a couple of minutes for us, and that adds up. 80-90% of our traffic is this.

HWR_14 1 hour ago||||

Assuming that the bots aren't repackaging your content and preventing users from seeing your blog by serving that content to them first.

thisislife2 4 hours ago|||

True. But it still wastes your server resources, right? And it's sad that you have to accept that as part of the "cost" of hosting a site ...

ndriscoll 3 hours ago||

What resources are you concerned about? An n100 minipc should be capable of serving something like a blog at 20k+ requests/second (or saturating its network).

doctorpangloss 4 hours ago|||

web environment integrity

malka1986 4 hours ago|||

> keeping out bot

You can forget about it. It is not possible. Simple as that.

Wowfunhappy 4 hours ago||

Let's say I'm selling concert tickets. How do I prevent bots from buying up all the tickets and scalping them?

ranguna 3 hours ago|||

Do it like plane tickets do, tie a ticket to an identity + buyback up to a week or so before the concert in case someone wants to cancel (or authorize the transfer and capture only a week before). Ask for ID and ticket at the entrance.

MyMemoryfails 4 hours ago||||

I'd simply check form-filling speed: even with the browser's autocomplete, humans are slow because they need to click submit.

Then, when it's "processing", handle the orders in bulk and prioritize slower users. There's a huge opportunity to do bot checks after checkout without affecting the user experience.

Also, on product launches you could add a unique field that requires user input; that way bots can't prepare for launches.
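
A minimal sketch of the fill-speed idea, assuming the server records when the form was served and when it came back. The 3-second threshold and all names here are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: forms completed faster than this are treated as automated.
MIN_HUMAN_FILL_SECONDS = 3.0

@dataclass
class Submission:
    user_id: str
    form_rendered_at: float  # server-side timestamp when the form was served
    submitted_at: float      # server-side timestamp when the POST arrived

    @property
    def fill_seconds(self) -> float:
        return self.submitted_at - self.form_rendered_at

def order_batch(batch: list[Submission]) -> tuple[list[Submission], list[Submission]]:
    """Split a batch into (to_process, held_for_review).

    Submissions under the threshold are held back for extra checks; the rest
    are processed slowest-first.
    """
    held = [s for s in batch if s.fill_seconds < MIN_HUMAN_FILL_SECONDS]
    ok = [s for s in batch if s.fill_seconds >= MIN_HUMAN_FILL_SECONDS]
    ok.sort(key=lambda s: s.fill_seconds, reverse=True)
    return ok, held

batch = [
    Submission("bot", 100.0, 100.4),
    Submission("autofill", 100.0, 106.0),
    Submission("typist", 100.0, 131.0),
]
to_process, held = order_batch(batch)
print([s.user_id for s in to_process])  # ['typist', 'autofill']
print([s.user_id for s in held])        # ['bot']
```

A hard cutoff like this will also catch fast legitimate tools, so holding submissions for a second check is probably safer than rejecting them outright.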

fragmede 3 hours ago||

huh. no wonder my password manager's auto submit triggers bot detection (it's a fairly popular one).

ndriscoll 3 hours ago||||

Sell them via a Dutch auction. Eliminate the arbitrage opportunity for scalpers and make more money in the process.
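
Roughly how that plays out, as a simplified descending-price model. It assumes each buyer has a fixed limit and buys as soon as the price reaches it; the buyers, prices and step size are invented for the example.

```python
def dutch_auction(start_price: int, floor_price: int, step: int,
                  tickets: int, limits: dict[str, int]) -> list[tuple[str, int]]:
    """Lower the price step by step; buyers purchase once it reaches their limit.

    `limits` maps each buyer to the most they would pay (hypothetical data).
    Returns (buyer, price_paid) pairs in order of sale.
    """
    sales = []
    remaining = dict(limits)
    price = start_price
    while price >= floor_price and tickets > 0:
        takers = sorted(b for b, limit in remaining.items() if limit >= price)
        for buyer in takers:
            if tickets == 0:
                break
            sales.append((buyer, price))
            del remaining[buyer]
            tickets -= 1
        price -= step
    return sales

sales = dutch_auction(
    start_price=500, floor_price=50, step=50, tickets=2,
    limits={"fan_a": 120, "fan_b": 310, "scalper": 260},
)
print(sales)  # [('fan_b', 300), ('scalper', 250)]
```

In this toy run the scalper pays close to the most they were willing to pay, so there is little margin left to resell, and the buyer with the lowest limit gets nothing.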

dcrazy 2 hours ago||

That’s how you wind up with only kids of millionaires at your Taylor Swift concert.

queenkjuul 1 hour ago||

So a Taylor Swift concert

luckylion 4 hours ago|||

Tie them to the buyer's identity, offer at-value buy-backs until X weeks before event, disallow resale.
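
Something like the following, as a sketch only. The two-week cutoff stands in for "X weeks", and the ID string is a placeholder for whatever document the venue checks at the door.

```python
from dataclasses import dataclass
from datetime import date, timedelta

BUYBACK_CUTOFF = timedelta(weeks=2)  # hypothetical value for "X weeks"

@dataclass
class Ticket:
    holder_id: str    # identity the ticket is bound to at purchase
    price_paid: int   # in cents
    event_date: date
    refunded: bool = False

def buy_back(ticket: Ticket, today: date) -> int:
    """Refund at the price paid if still before the cutoff; raise otherwise."""
    if ticket.refunded:
        raise ValueError("ticket already refunded")
    if today > ticket.event_date - BUYBACK_CUTOFF:
        raise ValueError("buy-back window has closed")
    ticket.refunded = True
    return ticket.price_paid

def admit(ticket: Ticket, presented_id: str) -> bool:
    """Entrance check: the ticket is valid only for its original holder."""
    return not ticket.refunded and presented_id == ticket.holder_id

t = Ticket("ID-123", 8500, date(2030, 6, 20))
print(admit(t, "ID-123"))             # True
print(admit(t, "ID-999"))             # False
print(buy_back(t, date(2030, 5, 1)))  # 8500
print(admit(t, "ID-123"))             # False
```

There is no transfer function at all, which is the point: the only way out of a ticket is selling it back to the issuer at the price paid.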

megous 1 hour ago||

They use all kinds of obscure APIs, which you'll learn if you're privacy/security conscious and disable random web APIs that are of no use to YOU as a web user, but only can ever serve the people who serve you stuff or want to hack you or track you.

Normally websites feature test and just skip using obscure disabled APIs, or more likely, websites don't use those APIs at all or only tracking scripts use them, which are usually optional already.

Problem with CF is that if you want increased security they'll prevent you from gaining it everywhere, even on sites they don't protect, or prevent you from accessing services even the ones you paid for. Browsers don't allow disabling APIs per domain, so you're either at risk everywhere or you're blocked from accessing a lot of things for no particular reason.

CF can't be bothered to feature test.

zuzululu 1 hour ago||

Don't like it, but it's a reality due to bots.

gruez 4 hours ago|

This blog post is filled with false assumptions.

>Turns out it's because Cloudflare wants to have a fingerprint of your device via WebGL, the only reason for doing this would be tracking.

> So Cloudflare just banned all WebKitGTK browsers as I guess they put an exception for Safari.

This is false. I ran firefox with:

* hardware acceleration disabled (so software renderer, nothing to fingerprint)

* resistfingerprinting enabled, including letterboxing with default window size

* webgl disabled

* VPN enabled

* In a Windows VM

By all accounts this should be the most suspicious fingerprint ever, but turnstile happily lets me through. If they want to track people, they're doing a pretty bad job. My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

> Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

konform 1 hour ago||

I think your comment is also making plenty of assumptions.

Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled still? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

My guess is a different flavor of the same: Not matching an expected fingerprint (simplified: whitelist vs blacklist approach) combined with other factors.
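
To spell out the whitelist vs blacklist difference, a toy sketch. The fingerprint strings are invented and this is not a claim about how Turnstile is actually implemented.

```python
# All fingerprint strings below are made up for illustration.
KNOWN_GOOD = {"firefox/linux", "chrome/windows", "safari/macos"}
KNOWN_BAD = {"headless-chrome/linux", "curl"}

def blocklist_allows(fingerprint: str) -> bool:
    """Blacklist approach: anything not known to be bad gets through."""
    return fingerprint not in KNOWN_BAD

def allowlist_allows(fingerprint: str) -> bool:
    """Whitelist approach: anything not known to be good is rejected."""
    return fingerprint in KNOWN_GOOD

for fp in ("chrome/windows", "curl", "badwolf-webkitgtk/linux"):
    print(f"{fp}: blocklist={blocklist_allows(fp)} allowlist={allowlist_allows(fp)}")
```

The unfamiliar browser passes the blocklist check and fails the allowlist check, even though nothing about it is known to be bad.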

[0]: I'm currently aware of Tor Browser, Konform Browser (am dev), Mullvad Browser, and to a certain extent Waterfox, LibreWolf, and r3df0x doing that.

gruez 1 hour ago||

>Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled still? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

See my other comment, tor browser works fine too: https://news.ycombinator.com/item?id=48346659

jeroenhd 2 hours ago|||

Enabling resistfingerprinting on my Android phone shows me the same error screen. It's not just webkit.

fingerprintingProtection works fine on the other hand, but then again that's intentionally less intrusive.

shiomiru 4 hours ago|||

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

So why is Cloudflare saying the author got blocked because of WebGL?

> > Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

> This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

While I don't have an iDevice to try, the assumption that they are special cased is fair... because they are: https://blog.cloudflare.com/eliminating-captchas-on-iphones-...

(Yes, this is basically WEI in a shinier package.)

gruez 3 hours ago||

>So why is Cloudflare saying the author got blocked because of WebGL?

No idea. I can't even reproduce the error OP got with webgl disabled.

https://litter.catbox.moe/y42l22k97tgv96nx.png

superkuh 4 hours ago||

Yep. Cloudflare and cloudflare's customers don't care about blocking people that use non-standard browsers (or accessible browsers, or feed readers, or whatever). Using cloudflare defaults is basically saying, "Only major corporate browsers released in the last year or two can access this site."
