
Posted by xnx 10 hours ago

Chrome DevTools MCP (2025)(developer.chrome.com)
402 points | 180 comments
jerrygoyal 3 hours ago||
It's from 2025. The post should have a year tag.
tomhow 1 hour ago|
Done, thanks!
David-Brug-Ai 9 hours ago||
This is the exact problem that pushed me to build a security proxy for MCP tool calls. The permission model in most MCP setups is basically binary: either the agent can use the tool or it can't. There's nothing watching what it does with that access once it's granted.

The approach I landed on was a deterministic enforcement pipeline that sits between the agent and the MCP server, so every tool call gets checked for things like SSRF (DNS resolve + private IP blocking), credential leakage in outbound params, and path traversal, before the call hits the real server. No LLM in that path, just pattern matching and policy rules, so it adds single-digit ms overhead.

The DevTools case is interesting because the attack surface is the page content itself. A crafted page could inject tool calls via prompt injection. Having the proxy there means even if the agent gets tricked, the exfiltration attempt gets caught at the egress layer.
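The SSRF check described above (DNS resolve + private IP blocking) can be sketched in a few lines. This is my own illustration of the general technique, not the actual proxy's code: resolve the hostname and block anything that lands on a private, loopback, link-local, or reserved address.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_ssrf_risk(url: str) -> bool:
    """Return True if the URL should be blocked at the egress layer."""
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable URL -> fail closed
    try:
        # Resolve first, then check every returned address, so a hostname
        # that maps to an internal IP can't slip through.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # unresolvable -> fail closed
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            return True
    return False
```

Note this is deterministic pattern matching with no LLM in the path, which is what keeps the overhead at single-digit milliseconds; a real implementation would also need to handle redirects and DNS rebinding.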

rob 10 hours ago||
Someone left their bot on default settings.
Bengalilol 8 hours ago||
The other reply to this 'bot' looks like another default thing: <https://news.ycombinator.com/threads?id=David-Brug-Ai>
Sonofg0tham 10 hours ago||
[flagged]
simianwords 9 hours ago|
AI
rzmmm 9 hours ago||
Yes. Can someone tell me why even HN has bots? To sell upvotes for advertising purposes?
Sonofg0tham 8 hours ago||
I'm not a bot and definitely not advertising - I'm new on HN and trying to contribute with a few comments where I can.
paseante 7 hours ago|
The real problem this thread exposes is that we're duct-taping browser automation (Playwright, CDP, MCP wrappers) onto an interface designed for humans — the DOM. Every approach discussed here is fighting the same battle: too many tokens to represent page state, flaky selectors, hallucinated DOM structures, massive context cost.

What we actually need is a standard for websites to expose a machine-readable interaction layer alongside the human one. Something like robots.txt but for agent capabilities — declaring available actions, their parameters, authentication requirements, and response schemas. Not scraping the DOM and hoping the AI figures out which button to click.

The web already went through this evolution once: we went from screen-scraping HTML to structured APIs. Now we're regressing back to scraping because agents need to interact with sites that only have human interfaces. A lightweight standard — call it agents.json or whatever — where sites declare "here are the actions you can take, here are the endpoints, here's the auth flow" would eliminate 90% of the token waste, security concerns, and fragility discussed in this thread.

Until that exists, we'll keep building increasingly clever hacks on top of a 30-year-old document format that was never designed for machine consumption.
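To make this concrete, a hypothetical agents.json along those lines might look like the sketch below. No such standard exists today; every field name here is invented for illustration, in the spirit of robots.txt-style capability declaration:

```json
{
  "version": "0.1",
  "auth": {
    "type": "oauth2",
    "token_url": "https://example.com/oauth/token"
  },
  "actions": [
    {
      "name": "search_products",
      "method": "GET",
      "endpoint": "/api/search",
      "params": {
        "q": { "type": "string", "required": true }
      },
      "response_schema": { "type": "array", "items": { "$ref": "#/schemas/product" } }
    }
  ]
}
```

An agent could fetch this once, then call declared endpoints directly instead of rendering the page and scraping the DOM.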

raincole 6 hours ago||
The ultimate conflict of interest here is that the sites people want to crawl the most are the ones that want to be crawled the least (e.g. YouTube).
Lucasoato 6 hours ago|||
They’re trying to solve it by making it easier to get Markdown versions of websites.

For example, you can get a markdown out of most OpenAI documentation by appending .md like this: https://developers.openai.com/api/docs/libraries.md

Not definitive, but still useful.

maxaw 6 hours ago|||
Fully agree. It will take some time, though, since the immediate incentive isn't clear: consumer-facing companies have to do extra work to help people bypass the website layer. But I think consumers will begin to demand it once they experience it through their agent. E.g. pizza company A exposes an API alongside its website and pizza company B doesn't, and consumers notice their agent is 10x+ faster interacting with company A and begin to question why.
codybontecou 7 hours ago|||
Is this just a well-documented API?
ElectricalUnion 7 hours ago|||
> interface designed for humans — the DOM.

Citation needed.

> The web already went through this evolution once: we went from screen-scraping HTML to structured APIs. Now we're regressing back to scraping because agents need to interact with sites that only have human interfaces.

To me, sites that "only have human interfaces" are more likely than not that way on purpose, attempting to maximize human retention/engagement, and are more likely to require strict anti-bot measures like proof-of-work to be usable at all.

imiric 6 hours ago|||
> What we actually need is a standard for websites to expose a machine-readable interaction layer alongside the human one.

We had this 20 years ago with the Semantic Web movement, XHTML, and microformats. Sadly, it didn't pan out, for reasons that were mostly non-technical. There are remnants of it today in RSS feeds, which are either unsupported or badly supported by most websites.

Once advertising became the dominant business model on the web, it wasn't in publishers' interest to provide a machine-readable format of their content. Adtech corporations took control of the web, and here we are. Nowadays even API access is tightly controlled (see Reddit, Twitter, etc.).

So your idea will never pan out in practice. We'll have to continue to rely on hacks and scraping will continue to be a gray area. These new tools make automated scraping easier, for better or worse, but publishers will find new ways to mitigate it. And so it goes.

Besides, if these new tools are "superintelligent", surely they're able to navigate a web site. Captchas are broken and bot detection algorithms (or "AI" themselves) are unreliable. So I'd say the leverage is on the consumer side, for now.

quotemstr 7 hours ago||
> expose a machine-readable interaction layer alongside the human one

Which is called ARIA and has been a thing forever.
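For context, ARIA annotates existing markup with machine-readable roles and state. A generic example (not from the linked article) of what an agent or assistive technology can read off the DOM:

```html
<!-- role and aria-* attributes describe the control's purpose and
     current state to any machine consumer, independent of styling -->
<div role="button" aria-pressed="false" aria-label="Add to cart">
  Add to cart
</div>
```

It declares what a control *is*, though, not what endpoint it hits or what parameters it takes, so it covers only part of what the parent comment is asking for.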