Posted by queenelvis 19 hours ago
A Roblox cheat and one AI tool brought down Vercel's platform - https://news.ycombinator.com/item?id=47844431 - April 2026 (145 comments)
The only way to defend against these types of issues is to encrypt your environment with your own keys, with secrets possibly baked into source as there are no other facilities to separate them. An attacker would need to not only read the environments but also download the compiled functions and find the decryption keys.
It is not ideal but it could work as a workaround.
Please don't suggest this. The right way is to have the creds fetched from a vault, which is programmed to release the creds to your VM without a separate login (with machine-level identity managed by the parent platform).
This is how Google Secret Manager or AWS Secrets Manager work.
Or have whatever deployment tool that currently populates the env vars instead use the same information to populate files on the filesystem (like mounting creds).
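To make the file-mount idea concrete, here is a minimal sketch of an app reading a credential from a mounted directory instead of from its environment. The directory, the `db_password` file name, and the `read_secret` helper are all made up for illustration; a real platform would decide the mount path and permissions.

```python
import os
import tempfile
from pathlib import Path


def read_secret(name: str, secrets_dir: Path) -> str:
    """Read one secret from a file placed there by the deployment tool."""
    path = secrets_dir / name
    # Refuse names that would escape the mount directory (e.g. "../x").
    if path.resolve().parent != secrets_dir.resolve():
        raise ValueError(f"invalid secret name: {name!r}")
    return path.read_text(encoding="utf-8").strip()


def demo() -> str:
    # A temporary directory stands in for the real mount (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        secrets_dir = Path(tmp)
        secret_file = secrets_dir / "db_password"
        secret_file.write_text("s3cr3t\n", encoding="utf-8")
        os.chmod(secret_file, 0o400)  # owner read-only
        return read_secret("db_password", secrets_dir)
```

The point is only that the secret never appears in the process environment, so anything that dumps env vars does not get it for free.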
For example, it is possible to create a vault lease for exactly one CI build and tie the lifetime of secrets the CI build needs to the lifetime of this build. Practically, this would mean that e.g. a token, some oauth client-id/client-secret or a username/password credential to publish an artifact is only valid while the build runs plus a few seconds. Once the build is done, it's invalidated and deleted, so exfiltration is close to meaningless.
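A rough sketch of the lease idea, using an in-memory stand-in rather than a real vault client. `ToyVault`, `Lease`, and the five-second grace period are assumptions for the example; the clock is passed in explicitly so the behaviour is easy to follow.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Lease:
    build_id: str
    token: str
    expires_at: float


@dataclass
class ToyVault:
    """In-memory stand-in for a secrets manager; not a real vault API."""
    grace_seconds: float = 5.0
    _leases: dict = field(default_factory=dict)

    def issue(self, build_id: str, max_build_seconds: float, now: float) -> Lease:
        # The credential lives for the build's maximum duration plus a few seconds.
        lease = Lease(build_id, secrets.token_urlsafe(32),
                      now + max_build_seconds + self.grace_seconds)
        self._leases[lease.token] = lease
        return lease

    def is_valid(self, token: str, now: float) -> bool:
        lease = self._leases.get(token)
        return lease is not None and now < lease.expires_at

    def revoke_build(self, build_id: str) -> None:
        # Called when the build finishes: drop every lease it held.
        self._leases = {t: l for t, l in self._leases.items()
                        if l.build_id != build_id}
```

A token exfiltrated during the build stops working either when the CI system reports completion (`revoke_build`) or when the lease times out, whichever comes first.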
There are two things to note about this though:
This means the secret management has to have access to powerful secrets, which are capable of generating other secrets. So technically we are just moving goal posts from one level to another. That is fine usually though - I have 5 vault clusters to secure, and 5 different CI builds every 10 minutes or so, or couple thousand application instances in prod. I can pay more attention to the vault clusters.
But this is also not easy to implement. It needs a vault cluster, dynamic PostgreSQL users take years to get right, and every month we discover new ways applications can be terrible at handling short-lived certificates (some even regress; Grafana seems to have with PostgreSQL client certs in v11/v12). We've found quite a few applications whose authors never considered that certs with less than a year of lifetime even exist. Oh, and if your application is a single-instance monolith, restarting to reload new short-lived DB certs is also terrible.
Automated, aggressive secret management and revocation is, imo, a huge obstacle to many secret exfiltration attacks, but it is hard to do and a lot of software resists it very heavily on many layers.
Like, sure, you can go HAM here and use network proxy services to do secret decryption, and only talk from the app to those proxies via short-lived tokens; that's arguably a qualitative shift from app-uses-secret-directly, and it has some real benefits (and costs, namely significant complexity/fragility).
Instead, my favored option is to scope secret use to network locations. If, for example, a given NPM token can only be used for API calls issued from the public IP endpoint of the user's infrastructure, that's a significant added layer of security. People don't agree on whether or not this counts as a "token ACL", but it's certainly ACL-like in its functionality--just controlled by location, rather than identity.
This approach can also be adopted gradually and with less added fragility than the proxy-all-the-things approach: token holders can initially allowlist broad or shared network location ranges, and narrow allowed access sources over time as their networks are improved.
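What the provider-side check could look like, as a sketch only: the token carries a list of CIDR ranges and a request is accepted only if its source address falls inside one of them. `is_allowed` and the address ranges (taken from the documentation-only 203.0.113.0/24 block) are illustrative, not any provider's actual API.

```python
import ipaddress


def is_allowed(source_ip: str, allowlist: list[str]) -> bool:
    """True if source_ip falls inside any CIDR range attached to the token."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowlist)


# Gradual adoption: start with a broad shared range, narrow it later.
broad = ["203.0.113.0/24"]
narrow = ["203.0.113.16/28"]
```

With the broad list, a request from 203.0.113.200 is accepted; after narrowing to the /28, only addresses 203.0.113.16 through 203.0.113.31 pass, so a token stolen and replayed from elsewhere is rejected.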
Of course, that's a fantasy. API providers would have to support network-scoped API access credentials, and almost none of them do.
Security researchers always need to give an answer whenever there's a security incident and the answer can never be "too much centralization risk" even when that is the only reasonable answer. You can't remove centralization risk.
IMO, the future is: every major centralized platform will be insecure in perpetuity and nothing can be done about it.
The service that encrypts the data should be the ONLY service that holds the private key to decrypt, and therefore the only service that can process the decrypted data.
It's easy to see how this would work with sufficiently sophisticated clients in some use-cases, say via a vault plugin, but posing this as a universal necessity feels like a big departure from typical oauth flows, and the added complexity could be harmful depending on what home-grown solutions are used to implement it.
As far as I’m concerned, the only sane way is to dump credentials in a well-known path and let the environment decide what to bind them with at runtime (which is how Kubernetes does it, at least the EKS version I’ve had to work with).
IOW, JEE variable binding (JNDI) did it right 20 years or so ago.
It might be worthwhile for architecture designers to look back at that engineering monument (in all its possible meanings; it felt complicated at times) and study its solutions before coming up with a different solution to a problem it already solved.
Attributed without evidence from what I could tell. So it doesn't reveal much at all.
Vercel is understandably trying to shift all the blame onto the third party, but the fact that their admin panel can be accessed with gmail/drive/whatever OAuth scopes is irresponsible.
If you can only fix one thing (ideally you'd do both, but working in infosec has taught me that you can usually do one thing at most before the breach urgency political capital evaporates), fix the Google token scope/expiry, or fix the environment variable storage system.
IMO it's probably a bad idea to have an LLM/agent managing your email inbox. Even if it's readonly and the LLM behaves perfectly, supply chain attacks have an especially large blast radius (even more so if it's your work email).
It's "AI-enabled tradecraft" as in let's take a guess at Vercel leadership's pressure to install and test AI across the company, regardless of vendor risk? Speed speed speed.
This is an extremely vanilla exploit that every company operating without a strictly enforceable AI install allowlist is exposed to. How many AI tools like Context are installed across your scope of local and SaaS AI? Odds are quite a few; ask your IT guy/gal for estimates.
These tools have access to... everything! And with a security vendor and RBAC mechanism space that'll exist in about... 18-24 months.
Vercel is the canary. It's going to get interesting here; no way in heck that Context is the only target. This is a well-established, much-worried-about but widely ignored threat vector, and when one breaks open the others start too.
Implies a very challenging 6 months ahead if these exploits are kicking off, as everyone is auditing their AI installs now (or should be), and TAs will fire off with the access they have before it is cut.
Source - am a head of sec in tech
Designing for provider-side compromise is very hard because that's the whole point of trust...
Do any marketplaces have a good approach here? I know Cloudflare, after their similar Salesloft issue, has proposed proxying all 3rd party OAuth and API traffic through them. But that feels a little bit like trading one threat vector for another.
Other than standard good practices like narrow scopes, shorter expirations, maybe OAuth Client secret rotation, etc, I'm not sure what else can be done. Maybe allowlisting IP addresses that the requests associated with a given client can come from?
OAuth 2.1[0] (a draft that has been around longer than I've been at my employer) recommends some protections around refresh tokens: either making them sender-constrained (tied to the client application by public/private key cryptography) or making them one-time use, with revocation if a token is used multiple times.
This is recommended for public clients, but I think makes sense for all clients.
The first option is more difficult to implement, but is similar to the IP address solution you suggest. More robust though.
The second option would have made this attack more difficult because the refresh token held by the legit client, context.ai, would have stopped working, presumably triggering someone to look into why and wonder if the tokens had been stolen.
0: https://datatracker.ietf.org/doc/html/draft-ietf-oauth-v2-1
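For the second option, a toy sketch of one-time-use refresh tokens with reuse detection. `RefreshTokenStore` and its "family" bookkeeping are assumptions made for the example, not code from any authorization server; the idea is just that presenting an already-used token revokes everything descended from the same grant.

```python
import secrets


class RefreshTokenStore:
    """In-memory sketch of refresh token rotation with reuse detection."""

    def __init__(self):
        self._active = {}  # current token -> family id
        self._used = {}    # already-rotated token -> family id

    def issue(self, family=None) -> str:
        family = family or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self._active[token] = family
        return token

    def rotate(self, token: str) -> str:
        if token in self._used:
            # Replay of a spent token: someone holds a stolen copy.
            # Revoke every active token in the same family.
            family = self._used[token]
            self._active = {t: f for t, f in self._active.items()
                            if f != family}
            raise PermissionError("refresh token reuse detected")
        family = self._active.pop(token, None)
        if family is None:
            raise PermissionError("unknown or revoked refresh token")
        self._used[token] = family
        return self.issue(family)
```

Whoever rotates second, attacker or legitimate client, gets an error, and the surviving token dies with it. That failure is the signal that should prompt someone to investigate.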
That's standard in OIDC, I believe.
Next.js apps bake all env vars into the client-side code!! It's all public, unless you prefix the name with private_ or something.
It's the other way around: you prefix a variable with NEXT_PUBLIC_ to expose it in client-side code.
EDIT: the writeup from context.ai themselves seems quite informative: https://context.ai/security-update, it seems like it was a personal choice of one of the Vercel employees to grant full access to their Google workspace.
"The attacker compromised this OAuth application — the compromise has since been traced to a Lumma Stealer malware infection of a Context.ai employee in approximately February 2026, reportedly after the employee downloaded Roblox game exploit scripts"
Also worth checking your Google Workspace OAuth authorizations. Admin Console > Security > API Controls > Third-party app access. Guarantee there are apps in there you authorized for a demo two years ago that are still sitting with full email/drive access.
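If you can get the grant list into a script, the check is easy to automate. A sketch, assuming a made-up record shape (`app`, `scopes`, `last_used`) rather than any real Admin Console export format; the two scope strings are the full-access Gmail and Drive scopes.

```python
from datetime import date

# Scopes that grant full mailbox or full Drive access.
BROAD_SCOPES = {
    "https://mail.google.com/",
    "https://www.googleapis.com/auth/drive",
}


def stale_broad_grants(grants, today, max_age_days=180):
    """Return app names holding a broad scope but unused for max_age_days."""
    flagged = []
    for grant in grants:
        has_broad = bool(BROAD_SCOPES & set(grant["scopes"]))
        age_days = (today - grant["last_used"]).days
        if has_broad and age_days > max_age_days:
            flagged.append(grant["app"])
    return flagged


# Hypothetical records for illustration only.
example_grants = [
    {"app": "old-demo-tool", "scopes": ["https://mail.google.com/"],
     "last_used": date(2024, 3, 1)},
    {"app": "calendar-sync",
     "scopes": ["https://www.googleapis.com/auth/calendar"],
     "last_used": date(2024, 3, 1)},
]
```

Anything this flags is a candidate for revocation: broad access that nobody has exercised in months.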
Then you remove the old credential from the endpoint.
That statement in the report really confuses me; feels illogical and LLM generated.
An old deployment using an older env var doesn't do anything to control whether or not the credential is still valid. This is a footgun that affects availability, not confidentiality as implied.
Another section in the report is confusing, "Environment variable enumeration (Stage 4)". The described mechanics of env var access are bizarre to me -
> Pay particular attention to any environment variable access originating from user accounts rather than service accounts, or from accounts that do not normally interact with the projects being accessed.
Are people really reading credentials out of vercel env vars for use in other systems?
> The CEO publicly attributed the attacker's unusual velocity to AI
> questions about detection-to-disclosure latency in platform breaches
Typical! The main failures in my mind are:
1. A user account with far too many privileges - possibly many others like it
2. No or limited 2FA or any form of ZeroTrust architecture
3. Bad cyber security hygiene
"Vercel CEO says AI accelerated attack on critical infrastructure"
Ironically, if the timeline is true that the attackers had been inside for months, the AIs they had access to are substantially weaker than today's frontier models. How much faster would they have achieved their goals with GLM 5.1?
Anyone know where these dates are being sourced from? E.g.,
> Late 2024 – Early 2025: Attacker pivots from Context.ai OAuth access to a Vercel employee's Google Workspace account -- CONFIRMED — Rauch statement
> Early - mid-2025: Internal Vercel systems accessed; customer environment variable enumeration begins -- CONFIRMED — Vercel bulletin