CAPTCHAs have failed for 20 years

Posted by harsehaj 3 hours ago

CAPTCHAs have failed for 20 years (www.browserbase.com)

58 points | 46 comments

netik 1 hour ago|

So this is basically a shill advertisement ending in "Your AI agents can avoid captchas if you pay us."

The last example is a false narrative, that captchas will only happen if the "browser looks suspicious". Systems like Altcha put an end to this argument. They don't care if the browser looks suspicious, only that the browser can perform a proof-of-work to get past a captcha designed to slow down the request rate.

When applied consistently, it will effectively block and slow down AI crawlers, which is what this company wants to promote.
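
For readers unfamiliar with the idea, here is a minimal sketch of a hashcash-style proof-of-work challenge of the kind described above. It is not Altcha's actual protocol: the function names, the 8-byte nonce encoding, and the leading-zero-bits target are all assumptions chosen for illustration. The server issues a random challenge, the client searches for a nonce, and the server verifies with a single hash.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check a submitted nonce."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force nonces until one meets the target."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = os.urandom(16)   # hypothetical per-request challenge
    difficulty = 16              # illustrative value, not a recommendation
    nonce = solve(challenge, difficulty)
    print(nonce, verify(challenge, nonce, difficulty))
```

The cost is asymmetric by design: solving takes about 2^difficulty hashes on average, while verifying takes one. Nothing in the check looks at browser fingerprints, which is the point being made above.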

chrismorgan 40 minutes ago||

Proof-of-work is bad rate limiting: https://news.ycombinator.com/item?id=44093918. The playing field is wildly unbalanced. Even naive attackers tend to have a lot more computing power available than a lot of your normal users, and where it’s SHA-256 (which is almost the worst choice imaginable for a proof of work scheme, yet which every single service that I know of has used), an intelligent attacker goes from being hundreds of times as powerful to millions of times as powerful.
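
To make the imbalance concrete, here is a rough back-of-the-envelope sketch. The hash rate for a slow user device is an assumed placeholder, not a measurement; the speedup factors of 100 and 1,000,000 simply stand in for the "hundreds of times" and "millions of times" figures in the comment above. The only fact relied on is that a target of k leading zero bits takes 2^k hashes on average.

```python
import math


def expected_solve_seconds(difficulty_bits: int, hashes_per_second: float) -> float:
    """Expected time to find a hash with `difficulty_bits` leading zero bits."""
    return (2 ** difficulty_bits) / hashes_per_second


def max_difficulty_for_budget(hashes_per_second: float, budget_seconds: float) -> int:
    """Largest difficulty whose expected solve time fits within the budget."""
    return max(0, math.floor(math.log2(hashes_per_second * budget_seconds)))


if __name__ == "__main__":
    slow_user_rate = 100_000.0   # ASSUMED placeholder hash rate, hashes/second
    budget = 5.0                 # seconds a human is assumed willing to wait

    # Difficulty has to be tuned so the slowest legitimate user still gets through.
    bits = max_difficulty_for_budget(slow_user_rate, budget)

    for label, speedup in [
        ("slow user", 1),
        ("naive attacker", 100),             # "hundreds of times"
        ("optimised attacker", 1_000_000),   # "millions of times"
    ]:
        seconds = expected_solve_seconds(bits, slow_user_rate * speedup)
        print(f"{label:>20}: {seconds:.6f} s per challenge at {bits} bits")
```

Whatever the real numbers are, the structure is the same: the difficulty is capped by the weakest legitimate device, so the attacker's cost per request shrinks in direct proportion to their speed advantage.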

peeet 32 minutes ago|||

More advanced and targeted bots can "bypass" Proof of work as well though, e.g. using something like https://github.com/toman-tom/Incapsula-PoW

gruez 1 hour ago||

>Systems like Altcha put an end to this argument. They don't care if the browser looks suspicious, only that the browser can perform a proof-of-work to get past a captcha designed to slow down the request rate.

That doesn't really work out in reality because bots are happy to wait 5 seconds or even 5 minutes for a PoW challenge to complete. Humans on the other hand will not, especially if they're on a mobile device with limited compute and energy.

CM30 1 hour ago||

The issue is that anything that becomes a standard here automatically becomes a target. If the same sort of captcha protects everything from Gmail to Twitter to Cloudflare and Facebook, then bot creators and spammers have a huge incentive to bypass it no matter what. And if we've learnt anything about spam, it's that pretty much every system we can think of can be bypassed or automated away.

The solution is really a ton of different captcha-like systems and anti-spam solutions, all unpopular enough that an attacker may not even bother targeting them. If an attacker needs to target a few thousand different captcha-style setups to get their spam through, then many of them won't bother.

It's like centralised vs decentralised communication systems. If everything is centralised, a bad actor (like a government, corporation, criminal group, etc) can go after one target to control the narrative. If it's decentralised, then suddenly they have to go after dozens or hundreds of different targets, many of which won't cooperate with them.

epgui 2 hours ago||

I thought half the point of captchas was to train vision models?

ben_w 1 hour ago|

This is in the article.

Indeed, half the point for reCAPTCHA: That's how Google could justify supplying reCAPTCHA for free, but not why people wanted to use it.

chinathrow 43 minutes ago||

> That's how Google could justify supplying reCAPTCHA for free, but not why people wanted to use it

This and Pokemon Go for collecting videos: are there other examples of users doing the free work for $large_co?

curtisboortz 46 minutes ago||

The Chrome extension angle is interesting here. We ship an extension that interacts with Gmail and have seen how much variance there is in what Google considers "bot-like" behavior from extensions vs. the browser tab. The line between "automated" and "assisted" is not well defined at the API level, which ends up being a similar underlying problem: distinguishing intent rather than pattern.

hombre_fatal 1 hour ago||

As TFA points out, a major change is that bot traffic now comes from honest users via their LLM sessions, so you don't even necessarily want to block automated bots anymore.

The game is shifting to a better ideal: how do you design a service knowing that any user/request might be automated?

Especially in place of the historical, easy solution/hack where you have some sort of gate that, once passed, puts the user in some trusted low-scrutiny tier, like a forum's registration page.

It's a similar question to designing a system so that it's resilient to account take-overs. (i.e. The user was a trusted human until now, and now it's a spammer)

Example: on a forum, run new posts through an LLM to classify them as spam. That is the magic solution we always wished we had (remember Akismet?), but what we had back then was too rudimentary.
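
A minimal sketch of that idea, assuming a hypothetical `score_spam` callable that wraps whatever classifier is available (an LLM call, Akismet, anything else). No real LLM API is shown here; `keyword_stub` is only a stand-in so the example runs. The design point is that every post is scored, and the author's trust status is deliberately ignored, so a taken-over account gets no free pass.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: takes post text, returns a spam probability in [0, 1].
SpamScorer = Callable[[str], float]


@dataclass
class Post:
    author: str
    body: str
    author_is_established: bool


def moderate(post: Post, score_spam: SpamScorer, threshold: float = 0.8) -> str:
    """Decide per post, not per account.

    `author_is_established` is intentionally unused: passing a registration
    gate in the past says nothing about who is posting right now.
    """
    score = score_spam(post.body)
    return "hold_for_review" if score >= threshold else "publish"


def keyword_stub(text: str) -> float:
    """Stand-in scorer for demonstration only; NOT an LLM."""
    return 1.0 if "buy cheap" in text.lower() else 0.0


if __name__ == "__main__":
    veteran = Post("longtime_user", "Buy cheap watches at example.invalid", True)
    newcomer = Post("new_user", "Has anyone benchmarked this on ARM?", False)
    print(moderate(veteran, keyword_stub))
    print(moderate(newcomer, keyword_stub))
```

The threshold of 0.8 is an arbitrary illustrative value; in practice it would be tuned against false positives on real posts.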

wildzzz 1 hour ago|

You use API tokens for things intended to be machine-to-machine communication and captchas for things intended to be filled out by humans. Not every site or service wants automated input, even if it's being directed by a human. I don't want forums like HN just filled with a bunch of agents talking to each other. Where's the human connection?

giancarlostoro 58 minutes ago||

I remember at one point in my teens, someone had made a web app that would snag the captcha and show you only the captcha, and you would just endlessly solve captchas while the application tried different passwords on a backend and logged any successful logins.

yieldcrv 53 minutes ago|

Some of the first bitcoin faucets in 2011, 2012 were bots doing that

Users thought the captcha was antispam prevention for them to receive bitcoin

It was really just the bot forwarding a captcha so it could continue its spam once solved, paying the user in bitcoin

giancarlostoro 13 minutes ago||

LOL I don't remember doing captcha, but I remember receiving bitcoin from a faucet, thought it was strange.

ra0x3 2 hours ago||

TLDR: They're promoting a product they're working on with Cloudflare under the guise of it being an "open standard" [1]. Of course, in the docs, Step 1 is "Sign in with your Cloudflare account". Comes across a bit land-grabby.

[1] https://www.browserbase.com/blog/cloudflare-browserbase-pion...

joehabeebs 1 hour ago||

The most recent variations that force you to click the boxes containing a certain artifact are incredibly frustrating and fail half the time. The large influx of AI-SEO-optimized content being created makes me question CAPTCHAs' efficacy today.

SirMaster 1 hour ago||

What about those ones where you need to slide some piece of a puzzle in? I don't see those mentioned at all. Are they effective?

matteo8p 1 hour ago|

Really nice read Harsehaj!

I haven't looked deeply into Web Bot Auth, but is identification tied to the agent (one identity per agent) or is it tied to the underlying person using the agent (the user)?

Hope that question makes sense, lmk if you need clarification

peytoncasper 48 minutes ago|

Hey Matt,

I would say everyone is leaning towards organization/individual right now, but I would imagine that flips as the number of agents grows.