Posted by eleye 4 days ago
If a computer (or “agent” in modern terms) wants to order you a pizza it can technically already do so.
The reason computers currently can’t order us pizza or book us flights isn’t because of a technical limitation, it’s because the pizza place doesn’t want to just sell you a pizza and the airline doesn’t want to just sell you a flight. Instead they have an entire payroll of people whose salaries are derived from wasting human time, more commonly known as “engagement”. In fact those people will get paid regardless of whether you actually buy anything, so their incentive is often to waste more of your time even if it means trading off an actual purchase.
The “malicious” uses of AI that this very article refers to are mostly just that - computers/AI agents acting on behalf of humans to sidestep the “wasting human time” issue. The fact that agents may issue more requests than a human user is because information is intentionally not being presented to them in a concise, structured manner. If Domino’s or Pizza Hut wanted to sell just pizzas tomorrow, they could trivially publish an OpenAPI spec for agents to consume, or even collaborate on an HPOP protocol (Hypertext Pizza Ordering Protocol) to which HPOP clients can connect (no LLMs needed even). But they don’t, because wasting human time is the whole point.
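To make the structured-ordering point concrete, here’s a sketch of the kind of payload an agent could build and POST if such a spec existed. The endpoint, field names, and the “HPOP” framing are all hypothetical - the point is only that a concise machine-readable order replaces an entire engagement funnel:

```python
import json

# Hypothetical sketch: the endpoint, schema, and payment token format are
# invented for illustration; no such API exists today.

def build_order(size: str, toppings: list[str], address: str) -> str:
    """Serialize an order into the kind of concise, structured payload
    an agent could POST directly -- no scraping or UI automation needed."""
    order = {
        "item": "pizza",
        "size": size,
        "toppings": toppings,
        "delivery_address": address,
        # Tokenized card details, so the agent never handles the raw number.
        "payment": {"method": "card_token", "token": "<tokenized-card>"},
    }
    return json.dumps(order)

payload = build_order("large", ["mushroom", "olive"], "123 Main St")
# An agent would then issue a single request like:
#   POST https://api.example-pizza.com/v1/orders
```

One request, one purchase - which is exactly the trade-off against “engagement” described above.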
So why would any of these companies suddenly opt into this system? Companies that are after actual money and don’t profit from wasting human time are already ready and don’t have to do anything (if an AI agent is already throwing Bitcoin or valid credit card details at you to buy your pizzas, you are fine). Those that do profit from it have zero incentive to opt in, since they’d be trading “engagement” for old-school, boring money (who needs that nowadays, right?).
I know that phrasing it like "large company cloudflare wants to increase internet accountability" will make many people uncomfortable. I think caution is good here. However, I also think that the internet has a real accountability problem that deserves attention. I think that the accountability problem is so bad, that some solution is going to end up getting implemented. That might mean that the most pro-freedom approach is to help design the solution, rather than avoiding the conversation.
Bad ideas:
You're getting lots of bot requests, so you start demanding clients login to view your blog. It's anti-user, anti-privacy, very annoying, readership drops, everyone is sad.
Instead, what if your browser included your government id in every request automatically? Anti-user, anti-privacy, no browser would implement it.
This idea:
But ARC is a middle ground. Subsets of the internet band together (in this case, via cloudflare) and strike a compromise with users. Individual users need to register with cloudflare, and then cloudflare gives you a million tokens per month to request websites. Or some scheme like this. I assume that it would be sufficiently pro-social that the IETF and browsers all agree to it and it's transparent & completely privacy-respecting to normal users.
We already sort of have some accountability: it's "proof of bandwidth" and "proof of multiple unique ip addresses", but that's not well tuned. In fact, IP addresses destroy privacy for most people, while doing very little to stop bot-nets.
This seems like it would just cause the tokens to become a commodity.
The premise is that you're giving out enough for the usage of the large majority of people, but how many do you give out? If you give out enough for the 95th percentile of usage then 5% of people -- i.e. hundreds of millions of people in the world -- won't have enough for their normal usage. Which is the first problem.
Meanwhile 95% of people would then have more tokens than they need, and the tokens would be scarce, so then they would sell the ones they're not using. Which is the second problem. The people who are the most strapped for cash sell all their tokens for a couple bucks but then get locked out of the internet.
The third problem is that the AI companies would be the ones buying them, and since the large majority of people would have more than they need, they wouldn't be that expensive, and then that wouldn't prevent scraping. Unless you turn the scarcity way up and make the first and second problems really bad.
Oh, and it also turns out that if the data you share is easily collected, it can be analyzed and tracked to prove unlawful acts like price gouging and IP infringement - that’s not good for business either!
Wait I thought web 2.0 was DHTML / client-side scripting and XmlHttpRequest?
Part of this is the friction required to implement a client for a bespoke API that only one vendor offers, and the even bigger friction of building a standard.
AI and MCP servers might be able to fix this. In turn, companies will have a motivation to offer AI-compatible interfaces because if the only way to order a pizza is through an engagement farm, the AI agent is just going to order the pizza somewhere else.
Really, they could each do their own bespoke thing as long as they didn't go out of their way to shut out other implementers.
Instant messaging used to work like this, until everyone wanted to own their customer bases and lock them in - the time-wasting aspect again.
With AI browsers, all they have to do initially is not break them, and long term, each of them can individually choose to offer their API - no coordination required - and gain a slight advantage.
I wonder how long it will take for sellers to realize the war against agents cannot be won and that their compute resources are better spent giving agents a fast path to task completion.
Even if Pizza Hut wanted people to order pizza as efficiently as possible, with no time wasted, it would still want that to happen on its own platforms.
Because if people went to all-pizzas.com for their pizza needs, then each restaurant and chain would depend on all-pizzas.com not to screw them over.
This is precisely what makes food delivery ordering services (GrubHub, UberEats, Deliveroo, etc.) so challenging to operate and maintain. Practically every restaurant accepts orders in a different way, and maintaining custom mechanisms for each one is costly. Restaurant front-of-house technology companies like Toast are helping make them operate alike, but adoption is slow and there are many, many restaurants to tackle.
People were searching AOL keywords for things, and will again.
Only now: by asking OpenAI, Anthropic, or a competitor’s agent.
Also I wonder if credit card chargebacks are a concern. They might worry that allowing a single user to make a million orders would be a problem, so they might want to rate limit users.
If they end up as just a pizza API, they have no moat and are trivially replaced by another API and pizzeria, and will make less money.
Another contradiction at play here is that of innovation vs standardisation. Indeed, you could argue that Domino’s website is also a place where they can innovate (bring your own recipes! delivery by drone! pay with tokens! whatever!) whereas a pizza protocol would slow down or prevent some innovation. And that LLMs are used to circumvent and therefore standardize the process of ordering a pizza (like the user-maintained APIs that used to query various incompatible bank websites; these days they probably use LLMs as well).
The big national pizza chains don't offer good prices on pizza. They offer bad prices on pizza, and then offer 'deals' that bring prices back down. These deals, generally, exist to steer customers towards buying more or buying higher-margin items (bread sticks, soda, etc).
If you could order pizza through an API, they wouldn't get the chance to upsell you. If it worked for multiple pizza places, it would advantage places who offer better value with their list prices.
Services need the ability to obtain an identifier that:
- Belongs to exactly one real person.
- That a person cannot own more than one of.
- That is unique per-service.
- That cannot be tied to a real-world identity.
- That can be used by the person to optionally disclose attributes like whether they are an adult or not.
Services generally don’t care about knowing your exact identity; what matters is being able to ban a person without them simply registering a new account. Being able to stop people from registering thousands of accounts would go a long way towards wiping out inauthentic and abusive behaviour.
The ability to “reset” your identity is the underlying hole that enables a vast amount of abuse. It’s possible to have persistent, pseudonymous access to the Internet without disclosing real-world identity. Being able to permanently ban abusers from a service would have a hugely positive effect on the Internet.
Exactly one seems hard to implement (some kind of global registry?). I think relaxing this requirement slightly, such that a user could for instance get a small number of different identities by going to different attestors, would be easier to implement while also making for a better balance. That is, I don't want users to be able to trivially make thousands of accounts, but I also don't want websites to be able to entirely prevent privacy throwaway accounts, for a false ban from Google's services to be bound to your soul for life, for you to be permanently locked out of anything digital because your identifier was compromised by malware and can't be "reset", or so on.
Governments. Make it a digital passport.
> I also don't want websites to be able to entirely prevent privacy throwaway accounts, for a false ban from Google's services to be bound to your soul for life
People should be free to refuse to interact with you.
> to be permanently locked out using anything digital because your identifier was compromised by malware and can't be "reset", or so on.
Make it as difficult to reset as a passport. Not impossible, but enough friction that you wouldn’t want to keep doing it every time you get banned for spamming.
Some places don't have a sufficiently functional/digitally-competent government to manage it securely, and others would likely withhold/invalidate identifiers from groups they disfavor (like an ethnic/religious/political minority) - which would be fairly consequential if this is to dictate ability to communicate online. It's not the only way a government can do that, but it would be one that's alarmingly easy (requiring just inaction) and effective (to whatever extent the system works "as intended" in thwarting workarounds).
Presumably there also needs to be recourse against a corrupt government accepting bribes in exchange for giving out identifiers to spammers/etc., which to my understanding of the proposal would cut off all legitimate citizens of that country too if there's no redundancy.
Relaxing the requirement to allow for fallbacks (such that you can also apply to ICANN or some other international organization to get an identifier) should help, and if anything gives you more room to be picky about which organizations are accepted as attestors.
> People should be free to refuse to interact with you.
I think this conflates negative/passive rights (like the right to bear arms) with positive/active rights (like the right to counsel). Someone is free to refuse to interact with anyone who has worn fur if they can make that distinction, but that doesn't obligate me/society/governments to implement infrastructure to ensure that they can distinguish people who have worn fur - and people are (in general, not under oath/etc.) also free to lie about whether they have.
> - That a person cannot own more than one of.
These are mutually exclusive. Especially if you add 'cannot be tied to a real-world identity'.
I don't see how you can prevent multiple people sharing access to one HSM. Also, if the key is the same in hundreds of HSMs, this isn't fulfilled to begin with? Is this assuming the HSM holds multiple keys?
btw: "usually". Can you cite an implementation?
u2f has it: https://security.stackexchange.com/questions/224692/how-does...
>I don't see how you can prevent multiple people sharing access to one HSM.
Obviously that's out of scope unless the HSM has a retina scanner or whatever, but even then there's nothing preventing someone from consensually using their cousin's government issued id (i.e. HSM) to access an 18+ site.
> Also, if the key is the same in hundreds of HSMs, this isn't fulfilled to begin with? Is this assuming the HSM holds multiple keys?
The idea is that the HSM will sign arbitrary proofs to give to relying parties. The relying parties can validate the key used to sign the proof is valid through some sort of certificate chain that is ultimately rooted at some government CA. However because the key is shared among hundreds/thousands/tens of thousands of HSMs/ids, it's impossible to tie that to a specific person/id/HSM.
> Is this assuming the HSM holds multiple keys?
Yeah, you'd need a separate device-specific key to sign/generate an identifier that's unique per-service. To summarize:
each HSM contains two keys:
1. K1: device-specific key, specific to the given HSM
2. K2: shared across some large number of HSMs
Both keys are resistant to extraction from the HSM, and the HSM will only use them for signing.
To authenticate to a website (relying party):
1. HSM generates id, using something like hmac(site domain name, K1)
2. HSM generates a signed blob containing the above id and whatever additional attributes the user wants to disclose (eg. their name or whether they're 18+), plus a timestamp/anti-replay token (or similar), signs it with K2, and returns it to the site. The HSM also returns a certificate certifying that K2 is issued by some national government.
The site can verify the response comes from a genuine HSM because the certificate chains to some national government's CA. The site can also be sure that users can't create multiple accounts, because each HSM will generate the same id given the same site. However two sites can't correlate identities because the id changes depending on the site, and the signing key/certificate is shared among a large number of users. Governments can still theoretically deanonymize users if they retain K1 and work with site operators.
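A toy Python model of the two-key flow above. HMAC stands in for both the per-site id derivation (as the comment suggests) and, purely as a placeholder, for the K2 signature - in the real design K2 would be an asymmetric key in a certificate chain rooted at a government CA, which HMAC cannot model:

```python
import hashlib
import hmac

# Toy model only. K1/K2 are stand-in secrets; a real HSM would never expose
# them, and the K2 "signature" would be a CA-chained asymmetric signature.
K1 = b"device-specific-secret"   # unique to this HSM
K2 = b"shared-batch-secret"      # shared across a large batch of HSMs

def site_id(domain: str) -> str:
    """Per-service identifier: stable for one site, unlinkable across sites."""
    return hmac.new(K1, domain.encode(), hashlib.sha256).hexdigest()

def sign_assertion(domain: str, attributes: dict) -> tuple[dict, str]:
    """The blob the HSM returns: per-site id plus disclosed attributes,
    tagged with the shared key K2 (placeholder for a real signature)."""
    blob = {"id": site_id(domain), "attrs": attributes}
    tag = hmac.new(K2, repr(sorted(blob.items())).encode(),
                   hashlib.sha256).hexdigest()
    return blob, tag

# Same site, same HSM -> same id (no duplicate accounts); different site ->
# different id (no cross-site correlation from the id alone).
```

The unlinkability claim then rests on K2 being shared: any of the thousands of HSMs in the batch could have produced the tag, so the site learns only "some genuine HSM" plus the per-site id.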
This is generally considered an unsolvable problem when trying to fulfill all of these requirements (cf. sibling post). Most subsets are easy, but not the full list.
Another issue is that people will hire (or enslave) others to effectively lend their identifiers, and it's very hard to distinguish between someone "lending" their identifier vs using it for themselves.
I've been thinking about hierarchical management. Roughly, your identifier is managed by your town, which has its own identifier managed by your state, which has its own identifier managed by your government, which has its own identifier managed by a bloc of governments, which has its own identifier managed by an international organization. When you interact with a foreign website and it requests your identity, you forward the request to your town with your personal identifier, your town forwards the request to your state with the town's identifier, and so on. Town "management" means that towns generate, assign, and revoke stolen personal identifiers, and authenticate requests; state "management" means that states generate, assign, and revoke town identifiers, and authenticate requests (not knowing who in the town sent the request); etc.
The idea is to prevent a much more powerful organization, like a state, from persecuting a much less powerful one, like an individual. In the hierarchical system, your town can persecute you: they can refuse to give you an identifier, give yours to someone else, track what sites you visit, etc. But then, especially if you can convince other town members (which ideally happens if you're unjustly persecuted), it's easier for you to confront the town and convince them to change, than it is to confront and convince a large government. Likewise, states can persecute entire towns, but an entire town is better at resisting than an individual, especially if that town allies with other towns. And governments can persecute entire states, and blocs can persecute entire governments, and the international organization can persecute entire blocs, but not the layer below.
In practice, the hierarchy probably needs many more layers; today's "towns" are sometimes big cities, states are much larger than towns, governments are much more powerful than states, etc. so there must be layers in-between for the layer below to effectively challenge the layer above. Assigning layers may be particularly hard because it requires balance, to enable most justified persecutions, e.g. a bloc punishing a government for not taking care of its scam centers, while preventing most unjustified persecutions. And there will inevitably be towns, states, governments, etc. where the majority of citizens are "unjust", and the layer above can only punish them entirely. So yes, hierarchical management still has many flaws, but is there a better alternative?
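The forwarding step in this scheme can be sketched as each layer wrapping the request with its own identifier before passing it up. This is a toy illustration with invented names; in a real design the inner payload would be encrypted so each layer genuinely cannot read the layers below it:

```python
# Toy sketch of hierarchical identifier forwarding. All names are invented.
# Each layer wraps the request with its own id; the layer above is meant to
# authenticate only the id of the member directly below it. (Here the inner
# ids are still readable -- real unlinkability would need layered encryption.)

def forward(request: dict, layer_name: str, layer_id: str) -> dict:
    """Wrap a request with this layer's identifier."""
    return {"from": layer_name, "id": layer_id, "payload": request}

# A site request climbs the hierarchy:
personal = {"from": "person", "id": "alice-42", "payload": "GET example.com"}
town = forward(personal, "town", "springfield-7")
state = forward(town, "state", "state-3")
# The state authenticates "springfield-7"; only the town can map the request
# back to "alice-42".
```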
You don’t; you invalidate them. Let the real owner explain to the issuing authority what happened.
> Another issue is that people will hire (or enslave) others to effectively lend their identifiers, and it's very hard to distinguish between someone "lending" their identifier vs using it for themselves.
It doesn’t matter. If somebody uses your Facebook account to hurl abuse at people, you can expect your Facebook account to be banned. If somebody uses your email account to spam people, you can expect your email account to be added to spam filters.
I first thought this was just a crypto play with 1 wallet per real person (wasn't a huge fan), but with the proliferation of AI, it makes sense we'll eventually need safeguards to ensure a user's humanity, ideally without any other identifiers needed.
The flak should be because it's from Sam Altman. A billionaire tech bro giving us both the disease and the cure, and profiting massively along the way, is what's truly dystopian.
Cloudflare is helping to develop & eager to implement an open protocol called ARC (Anonymous Rate-Limited Credentials)
What is ARC? You can read the proposal here: https://www.ietf.org/archive/id/draft-yun-cfrg-arc-01.html#n...
But my summary is:
1. You convince a server that you deserve to have 100 tokens (probably by presenting some non-anonymous credentials)
2. You handshake with the server and walk away with 100 untraceable tokens
3. At any time, you can present the server with a token. The server only knows
a. The token is valid
b. The token has not been previously used
Other details (disclaimer, I am not a cryptographer):
- The server has a public + private key pair for ARC, which is how it knows that it was the one to issue the tokens. It's also how you know that your tokens are in the same pool as everyone else's tokens.
- It seems like there's an option for your 100 tokens to all be 'branded' with some public information. I assume this would be information like "Expires June 2026" or "Token Value: 1 USD", not "User ID 209385"
- The client actually ends up with a key which will generate the 100 tokens in sequence.
- Obviously the number 100 is configurable.
- It seems like there were already schemes to do this, but they provide only one token at a time (RFC 9497, RFC 9474); I'm not sure how popular those were.
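Setting the cryptography aside, the redeem-once bookkeeping from steps 1-3 can be modeled like this. This is deliberately a toy: real ARC issues tokens blindly, so the server can verify validity and freshness without being able to link a redeemed token back to the issuance handshake, which this sketch cannot do:

```python
import secrets

# Toy model of the issue/redeem lifecycle only. Unlike real ARC, the server
# here sees the tokens at issuance, so it could trivially link them to the
# user -- the blind-issuance step is exactly what this sketch omits.

class TokenServer:
    def __init__(self) -> None:
        self.issued: set[str] = set()
        self.spent: set[str] = set()

    def issue(self, n: int) -> list[str]:
        """Step 1-2: hand the (authenticated) client a batch of n tokens."""
        batch = [secrets.token_hex(16) for _ in range(n)]
        self.issued.update(batch)
        return batch

    def redeem(self, token: str) -> bool:
        """Step 3: accept a token only if it is (a) valid and (b) unspent."""
        if token in self.issued and token not in self.spent:
            self.spent.add(token)
            return True
        return False
```

The two checks in `redeem` are exactly the two facts the summary says the server learns: the token is valid, and it has not been previously used.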
If it's up to the AI platform to issue limited tokens to users, and it's also the AI platform making the web requests, I'm not understanding the purpose of the cryptography/tokens. Couldn't the platform already limit a user to 100 web requests per hour just with an internal counter?
2. Cloudflare is one of the largest DDoS prevention providers.
3. Cloudflare is now, or soon will be, providing AI scraping services, per the linked article.
I would add that other large tech companies in the same problem spaces aren't innocent here, but given 1-3 it does seem like there is potential for monopolistic behavior here.
I'll explain my understanding.
Consider what problem CAPTCHA aims to solve (abuse) and how that's ineffective in an age of AI agents: it cannot distinguish "bot that is trying to buy a pizza" vs "bot that is trying to spider my site".
I don't understand Cloudflare's solution enough to explain that part.
I'm glad to see research here, because if we don't have innovative solutions, we might end up with microtransactions for browsing.
Think SMS verification but with cryptographic sorcery to make it private.
Depending on the level of hassle the service may even use SMS verification at setup. SMS verification is typically easy to acquire for as little as a few cents, but if the goal is to prevent millions of rate limited requests a few cents can add up.