Show HN: Browser MCP – Automate your browser using Cursor, Claude, VS Code (browsermcp.io)

Posted by namukang 4/7/2025 | 616 points | 217 comments | page 2
amendegree 4/7/2025|
So is MCP the new RPA (Robotic Process Automation)? Like generic Yahoo Pipes?
spmurrayzzz 4/7/2025||
I just view it as a relatively minor convenience; it's not some game-changer IMO.

The tool use / function calling thing far predates Anthropic releasing the MCP specification, and it really wasn't that onerous to do before either. You could provide a JSON Schema spec and tell the model to generate compliant JSON to pass to the API in question. MCP doesn't inherently solve any of the problems that come up in that sort of workflow, but it does provide an idiomatic approach for it (so there's a non-zero value there, but not much).
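For anyone who hasn't done the pre-MCP version of this: a minimal sketch of what spmurrayzzz describes, with a hand-rolled JSON Schema tool definition and a validation/dispatch step (the get_weather tool, its fields, and the api.example.com endpoint are all hypothetical; the exact request shape varies by provider):

  // A hand-rolled tool definition: just a JSON Schema describing the arguments
  // the model is allowed to produce. You embed this in the prompt (or the
  // provider's "tools" field) and ask for compliant JSON back.
  const getWeatherTool = {
    name: "get_weather",
    description: "Look up the current weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. 'Berlin'" },
        unit: { type: "string", enum: ["celsius", "fahrenheit"] },
      },
      required: ["city"],
    },
  };

  // Validate the model's JSON before trusting it, then call the real API yourself.
  function dispatchToolCall(raw: string) {
    const call = JSON.parse(raw) as { name: string; arguments: { city: string; unit?: string } };
    if (call.name !== getWeatherTool.name || typeof call.arguments?.city !== "string") {
      throw new Error("Model output does not match the tool schema");
    }
    return fetch("https://api.example.com/weather?city=" + encodeURIComponent(call.arguments.city));
  }

MCP standardizes roughly this loop (discovery, schemas, transport) rather than adding new capability, which is the point being made above.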

PantaloonFlames 4/7/2025|||
It seems the benefit of MCP is for Anthropic to enlist the community in building integrations for Claude Desktop, no?

And if other vendors sign on to support MCP, then it becomes a self reinforcing cycle of adoption.

spmurrayzzz 4/7/2025|||
Yea it certainly does benefit Claude Desktop to some degree, but most MCP servers are a few hundred SLOC and the protocol schema itself is only ~400 SLOC. If that was the only major obstacle standing in the way of adoption, I'd be very surprised.

Coupled with the fact that any LLM trained for tool use can utilize the protocol, it doesn't feel like much of a moat that uniquely positions Claude Desktop in a meaningful way.

asabla 4/7/2025||||
> And if other vendors sign on to support MCP, then it becomes a self reinforcing cycle of adoption

This is exactly what's happening now. A good portion of applications, frameworks and actors are starting to support it.

I've been reluctant to adopt MCP in applications until there was enough adoption.

However, depending on what you're building, it may also be too complex for your use case.

JackYoustra 4/7/2025|||
MCP is useful because Anthropic has a disproportionate share of API traffic relative to its valuation and a tiny share of first-party client traffic. The best way around this is to shift as much traffic to the API as possible.
PantaloonFlames 4/7/2025||
First-party client, meaning browser? User agent, or… an Electron app, or any mobile app?
JackYoustra 4/8/2025||
First-party client as in what a Claude subscription gives you access to (mostly app + web).
kmangutov 4/7/2025|||
The interesting thing about MCP as a tool use protocol is the traction that it has garnered in terms of clients and servers supporting it.
wonderwhyer 4/8/2025|||
I would probably call it shipping containers for LLM tool integrations.

Containers are not a big deal when viewed in isolation. But when there's a common size/standard for all kinds of ships, cranes, and trucks, it becomes a big deal.

In that sense it's more about gathering the community around one way to do things.

In theory there are REST APIs and the OpenAPI standard, but those were made for code, not LLMs. So you usually need some kind of friendly wrapper (like for candy) on top of a REST API.
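To make the "friendly wrapper" point concrete, here's a minimal sketch of wrapping one REST endpoint as an MCP tool, assuming the McpServer/tool API from the @modelcontextprotocol/sdk TypeScript package (the tool name and the api.example.com endpoint are made up):

  import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
  import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
  import { z } from "zod";

  const server = new McpServer({ name: "weather-wrapper", version: "0.0.1" });

  // One MCP tool acting as the candy wrapper around a plain REST endpoint.
  server.tool(
    "get_weather",
    { city: z.string().describe("City name") },
    async ({ city }) => {
      // The LLM only ever sees the tool and its schema, not the HTTP details.
      const res = await fetch("https://api.example.com/weather?city=" + encodeURIComponent(city));
      return { content: [{ type: "text" as const, text: await res.text() }] };
    }
  );

  await server.connect(new StdioServerTransport());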

It really starts to feel like a big deal when you work on integrating LLMs with tools.

tmvphil 4/8/2025||
I'm a bit stuck on this, maybe you can explain why an LLM would have any difficulty writing REST API calls? Seems like it should be no problem.
ajcp 4/7/2025||
No. Since MCP is just an interface layer, it is to AI what REST APIs are to DPA and what COM/app DLLs are to RPA.

APA (Agentic Process Automation) is the new RPA, and this is definitely one example of it.

XCSme 4/7/2025||
But AI already supported function calling, and you could describe functions in various ways. Isn't this just a different way to define function calling?
cadence- 4/7/2025||
Doesn't work on Windows:

  2025-04-07T18:43:26.537Z [browsermcp] [info] Initializing server...
  2025-04-07T18:43:26.603Z [browsermcp] [info] Server started and connected successfully
  2025-04-07T18:43:26.610Z [browsermcp] [info] Message from client: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"claude-ai","version":"0.1.0"}},"jsonrpc":"2.0","id":0}

  node:internal/errors:983
    const err = new Error(message);
          ^

  Error: Command failed: FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :9009') do taskkill /F /PID %a
      at genericNodeError (node:internal/errors:983:15)
      at wrappedFn (node:internal/errors:537:14)
      at checkExecSyncError (node:child_process:882:11)
      at execSync (node:child_process:954:15)

namukang 4/7/2025||
Can you try again?

There was another comment that mentioned that there's an issue with port killing code on Windows: https://news.ycombinator.com/item?id=43614145

I just published a new version of the @browsermcp/mcp library (version 0.1.1) that handles the error better until I can investigate further, so it should hopefully work now if you're using @browsermcp/mcp@latest.

FWIW, Claude Desktop currently has a bug where it tries to start the server twice, which is why the MCP server tries to kill the process from a previous invocation: https://github.com/modelcontextprotocol/servers/issues/812
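For the curious, the crash in the log above is just execSync throwing when netstat/findstr finds nothing to kill. This isn't the actual @browsermcp/mcp code, but a sketch of the kind of guard that avoids it (port 9009 taken from the log; killProcessOnPort is a hypothetical helper name):

  import { execSync } from "node:child_process";

  function killProcessOnPort(port: number): void {
    const cmd =
      process.platform === "win32"
        ? `FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :${port}') do taskkill /F /PID %a`
        : `lsof -ti:${port} | xargs kill -9`;
    try {
      execSync(cmd, { stdio: "ignore" });
    } catch {
      // findstr/lsof exit non-zero when nothing is listening on the port;
      // treat that as "nothing to kill" instead of letting the MCP server die.
    }
  }

  killProcessOnPort(9009);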

cadence- 4/7/2025||
It's working now with 0.1.0 for me. But I will let you know if I experience any issues once I update to 0.1.1.

Thanks, great job! I like it overall, but I noticed it has some issues entering text into forms, even on google.com. It's able to find a workaround and put the search text in the URL instead, but it would be nice if entering text into forms worked reliably for UI testing.

cadence- 4/7/2025||
I was able to make it work like this:

1. Kill your Claude Desktop app

2. Click "Connect" in the browser extension.

3. Quickly start your Claude Desktop app.

It will work 50% of the time - I guess the timing must be just right for it to work. Hopefully, the developers can improve this.

Now on to testing :)

josefrichter 4/8/2025||
What I used this for:

"Go to https://news.ycombinator.com/upvoted?id=josefrichter, summarize what topics I am interested in, and then from the homepage pick articles I might be interested in."

Works like a charm.

washedDeveloper 4/7/2025||
Can you add a license to your code, along with open-sourcing the Chrome extension?
makingstuffs 4/8/2025||
I don't see how an MCP can be useful for browsing the net and doing things like shopping, as has been suggested. Large companies such as Cloudflare have spent millions on, and made a business from, bot detection and blocking.

Do we suppose they will just create a backdoor to allow _some_ bots in? If they do that how long will it be before other bots impersonate them? It seems like a bit of a fad from my small mind.

Suppose it does become a thing, what then? We end up with an internet which is heavily optimised for bots (arguably it already is to an extent) and unusable for humans?

Wild.

kraftman 4/8/2025||
There are already plenty of services that provide residential proxies and captcha bypass pretty cheaply.

https://brightdata.com/pricing/web-unlocker
https://2captcha.com/pricing

TeMPOraL 4/8/2025|||
> Suppose it does become a thing, what then? We end up with an internet which is heavily optimised for bots (arguably it already is to an extent) and unusable for humans?

As opposed to the Web we now have, which is heavily optimized for... wasting human life.

What you're asking for, what "large companies such as CloudFlare have spent millions on", is verifying that on the other end of the connection is a web browser, and behind that web browser there is a human being that's being made to needlessly suffer and waste their limited lifespans, as they tediously work their way through the UI maze like a good little lab rat, watching ads at every turn of the corridor, while being constantly surveilled.

Or do you believe there is some other reason why you should care about whether you're interacting with a "human" (really: a user agent called "web browser") vs. "not human" (really: any other user agent)?

The relationship between the commercial web and its users is antagonistic - businesses make money through friction, by making it more difficult for users to accomplish their goals. That's why we never got the era of APIs and web automation for users. That's why we're dealing with tons of bespoke shitty SPAs instead of consistent interfaces - because no store wants to make it easy for you to comparison-shop, or skip their upsells, or efficiently search through the stock; no news service wants you to skip ads or make focused searches, etc.

As users, we've lost the battle for APIs and continue to be forced to use the "manual web" (with active cooperation of the browser vendors, too). MCP feels promising because we're in a moment in time, however brief, where LLMs can navigate the "manual web" for us, shielding us from all the malicious bullshit (ads, marketing copy, funneling, calls to action, confusing design, dark patterns, less dark patterns, the fact that your store is a bloated SPA instead of an endpoint for a generic database-querying frontend, and so on) while remaining mostly impervious to it. This will not last long - the vendors de facto ruling the web have every reason to shut it down (or turn it around and use LLMs against us). But for now, it works.

Adversarial interoperability is the name of the game. LLMs, especially combined with tool use (and right tools), make it much easier and much more accessible than ever before. For however brief a moment.

makingstuffs 4/8/2025||
Sorry, it wasn't entirely clear: I was by no means saying the web in its current form is anything close to what it could/should be. My main point was that creating backdoors for MCPs opens a new entry point for bad actors to exploit.

As for the optimisation to _waste human life_, I do agree, but the reality is that the sites which waste the majority of human life/time are the ones which would not be automated by the MCP and would, ultimately, see more 'real' usage by virtue of the fact that your average human will have more time to mindlessly scroll their favourite echo chamber.

Then there's the whole other debate of whether we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something which would hurt their bottom line from another angle.

TeMPOraL 4/9/2025||
Fair enough. Thanks for clarifying. I agree with what you're saying in this comment.

On the topic of:

> whether we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something which would hurt their bottom line from another angle

No, I don't believe that at all, which is why I keep saying the current situation is an anomaly, a brief moment in time. LLMs deployed in the form of general-purpose chatbots/agents are giving too much power to the people, which is already becoming disruptive to many businesses, so that power will be gradually taken away. Expect fewer general-purpose AI agents and more "AI-powered features" that shackle LLMs behind some limited UI, to ensure you can only get as much benefit from AI as fits the vendors' business strategies.

jedimastert 4/8/2025|||
Most things that do this kind of fingerprinting bot detection aren't looking for a browser that's pretending to be a human; they're looking for other programs that are pretending to be a browser.
m11a 4/8/2025||
> Do we suppose they will just create a backdoor to allow _some_ bots in?

That, and maybe they will, as CF seems quite big on MCP.[0] Or people will just bypass the bot detection. It's already not terribly difficult to do; people in the sneaker bot and ticket scalping communities have long had bypasses for all the major companies.

I mean, we can all imagine bad use cases for bots, but there are also pros: the internet wastes loads of human time. I still remember needing to browse marketplace real estate listings with terrible search and notification functionality to find a flat... shudders. An unbelievable number of hours wasted.

If it takes fewer people to build bots that can index a larger number of sites and offer better search capabilities where the sites are unable to provide them, I'm personally all for it. For many sites, it's simply that they lack the in-house development expertise, and they probably wouldn't even mind.

[0]: https://developers.cloudflare.com/agents/model-context-proto... etc

hliyan 4/8/2025||
Ideally, shouldn't this be the native experience of most "sites" on the internet? We've built an entire user experience around serving users rich, two-dimensional visual content that is not machine-readable, and we are now building a natural-language command-line layer on top of it. Why not get rid of the middleware and present users with a direct natural-language interface to the application layer?
buttofthejoke 4/7/2025||
Why use this over Puppeteer or Playwright extensions?
namukang 4/7/2025|
The Puppeteer MCP server doesn't work well because it requires CSS selectors to interact with elements. It makes up CSS selectors rather than reading the page and generating working selectors.

The Playwright MCP server is great! Currently Browser MCP is largely an adaptation of the Playwright MCP server to use with your actual browser rather than creating a new one each time. This allows you to reuse your existing Chrome profile so that you don't need to log in to each service all over again, and it avoids the bot detection that often triggers when using the fresh browser instances created by Playwright.

I also plan to add other useful tools (e.g. Browser MCP currently supports a tool to get the console logs which is useful for automated debugging) which will likely diverge from the Playwright MCP server features.

cAtte_ 4/7/2025|||
By the way, you can indeed access your personal context with Playwright: just use `launchPersistentContext()` and set the userDataDir to that of your existing Chrome install:

https://playwright.dev/docs/api/class-browsertype#browser-ty...
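For reference, a minimal sketch of that setup (the profile path is hypothetical; point it at your real Chrome user data directory, close Chrome first since two processes can't share the profile lock, and note that whether your Chrome version allows automating your default profile may vary):

  import { chromium } from "playwright";

  // Hypothetical path; use your real Chrome user data directory.
  const userDataDir = "/home/me/.config/google-chrome";

  const context = await chromium.launchPersistentContext(userDataDir, {
    channel: "chrome", // drive the installed Chrome rather than Playwright's bundled Chromium
    headless: false,
  });

  const page = context.pages()[0] ?? (await context.newPage());
  await page.goto("https://news.ycombinator.com");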

buttofthejoke 4/7/2025|||
Ooo, I like that. One of the most annoying points has been 'not sharing' the browser context. I'll def check it out.
Fernicia 4/7/2025||
Any plans to make a Firefox version?
namukang 4/7/2025|
Browser MCP uses the Chrome DevTools Protocol (CDP) to automate the browser so it currently only works for Chromium-based browsers.

Unfortunately, Firefox doesn't expose WebDriver BiDi (the standardized version of CDP) to browser extensions AFAIK (someone please correct me if I'm mistaken!), so I don't think I can support it even if I tried.
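To make the extension-plus-CDP angle concrete (this is not Browser MCP's actual source, just a minimal sketch of how a Chromium extension with the "debugger" permission can issue CDP commands via the chrome.debugger API; clickViaCdp and the coordinates are made up):

  // Runs in the extension's background service worker.
  async function clickViaCdp(tabId: number, x: number, y: number): Promise<void> {
    await chrome.debugger.attach({ tabId }, "1.3");
    const base = { x, y, button: "left", clickCount: 1 };
    await chrome.debugger.sendCommand({ tabId }, "Input.dispatchMouseEvent", { ...base, type: "mousePressed" });
    await chrome.debugger.sendCommand({ tabId }, "Input.dispatchMouseEvent", { ...base, type: "mouseReleased" });
    await chrome.debugger.detach({ tabId });
  }

  // Example: click at (100, 200) in the currently active tab.
  chrome.tabs.query({ active: true, currentWindow: true }, ([tab]) => {
    if (tab?.id !== undefined) void clickViaCdp(tab.id, 100, 200);
  });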

krono 4/7/2025||
Just found this[0] implementation roadmap on Mozilla's wiki, recently updated too! At least it's actively being worked on.

Not going to lie, this makes me happy.

[0]: https://wiki.mozilla.org/WebDriver/RemoteProtocol/WebDriver_...

DebtDeflation 4/7/2025|
In the Task Automation demo, how does it know all of the attributes of the motorcycle he is trying to sell? Is it relying on the underlying LLM's embedded knowledge? But then how would it know the price and mileage? Is there some underlying document not referenced in the demo? Because that information is not in the prompt.