Posted by namukang 4/7/2025
The tool use / function calling thing far predates Anthropic's release of the MCP specification, and it really wasn't that onerous to do before either. You could provide a JSON Schema spec and tell the model to generate compliant JSON to pass to the API in question. MCP doesn't inherently solve any of the problems that come up in that sort of workflow, but it does provide an idiomatic approach for it (so there's non-zero value there, but not much).
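For context, a minimal sketch of that pre-MCP workflow in TypeScript. Everything here is hand-rolled and illustrative: callModel is a placeholder for whatever completion API you use, and the weather endpoint is made up.

    // Placeholder for your LLM completion call (OpenAI, Anthropic, etc.).
    declare function callModel(prompt: string): Promise<string>;

    // Hand-written JSON Schema describing the one tool the model may "call".
    const getWeatherTool = {
      name: "get_weather",
      description: "Fetch current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name, e.g. 'Berlin'" },
        },
        required: ["city"],
      },
    };

    async function answerWithTool(question: string): Promise<unknown> {
      // Tell the model to reply with JSON conforming to the schema.
      const raw = await callModel(
        "Reply ONLY with JSON matching this tool schema:\n" +
          JSON.stringify(getWeatherTool) +
          "\n\nUser: " + question
      );
      // Parse (and, in real code, validate) the model's output,
      // then make the actual API call yourself.
      const args = JSON.parse(raw) as { city: string };
      const res = await fetch(
        "https://api.example.com/weather?city=" + encodeURIComponent(args.city)
      );
      return res.json();
    }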
And if other vendors sign on to support MCP, then it becomes a self-reinforcing cycle of adoption.
Coupled with the fact that any LLM trained for tool use can utilize the protocol, it doesn't feel like much of a moat that uniquely positions Claude Desktop in a meaningful way.
This is exactly what's happening now. A good portion of applications, frameworks and actors are starting to support it.
I've been reluctant to adopt MCP in applications until there was enough adoption.
However, depending on your use case, it may also be more complexity than you need.
Containers are not a big deal when viewed in isolation. But when the same size/standard is shared by all kinds of ships, cranes, and trucks, then it is a big deal.
In that sense it's more about gathering the community around one way to do things.
In theory there are REST APIs and the OpenAPI standard, but those were made for code, not for LLMs. So you usually need some kind of friendly wrapper (like for candy) on top of the REST API.
It really starts to feel like a big deal when you work on integrating LLMs with tools.
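To make the "friendly wrapper" idea concrete, here's a rough sketch of exposing one REST endpoint as an MCP tool using the official TypeScript SDK (@modelcontextprotocol/sdk). The forecast URL is made up; the server/transport shapes follow the SDK's documented usage.

    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    // An MCP server that wraps a plain REST endpoint so any MCP-capable
    // client can call it as a tool.
    const server = new McpServer({ name: "forecast-wrapper", version: "0.1.0" });

    server.tool(
      "get_forecast",
      // The Zod schema doubles as the tool's input contract for the model.
      { city: z.string().describe("City to fetch the forecast for") },
      async ({ city }) => {
        // The REST API itself is unchanged; the wrapper just translates.
        const res = await fetch(
          `https://api.example.com/forecast?city=${encodeURIComponent(city)}`
        );
        return { content: [{ type: "text" as const, text: await res.text() }] };
      }
    );

    // Speak MCP over stdio so clients like Claude Desktop can launch it.
    await server.connect(new StdioServerTransport());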
APA (Agentic Process Automation) is the new RPA, and this is definitely one example of it.
2025-04-07T18:43:26.537Z [browsermcp] [info] Initializing server...
2025-04-07T18:43:26.603Z [browsermcp] [info] Server started and connected successfully
2025-04-07T18:43:26.610Z [browsermcp] [info] Message from client: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"claude-ai","version":"0.1.0"}},"jsonrpc":"2.0","id":0}

node:internal/errors:983
  const err = new Error(message);
  ^

Error: Command failed: FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :9009') do taskkill /F /PID %a
    at genericNodeError (node:internal/errors:983:15)
    at wrappedFn (node:internal/errors:537:14)
    at checkExecSyncError (node:child_process:882:11)
    at execSync (node:child_process:954:15)
There was another comment that mentioned that there's an issue with port killing code on Windows: https://news.ycombinator.com/item?id=43614145
I just published a new version of the @browsermcp/mcp library (version 0.1.1) that handles the error better until I can investigate further, so it should hopefully work now if you're using @browsermcp/mcp@latest.
FWIW, Claude Desktop currently has a bug where it tries to start the server twice, which is why the MCP server tries to kill the process from a previous invocation: https://github.com/modelcontextprotocol/servers/issues/812
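I don't know what the actual 0.1.1 fix looks like, but the failure mode in the stack trace above is clear enough for a hypothetical sketch: findstr exits non-zero when nothing is listening on the port, which makes execSync throw and crash startup, so the cleanup needs to tolerate that.

    import { execSync } from "node:child_process";

    // Hypothetical sketch, not the actual @browsermcp/mcp code: free the
    // port left behind by a previous invocation, without crashing when the
    // port is already free.
    function killProcessOnPort(port: number): void {
      const cmd =
        process.platform === "win32"
          ? // findstr exits 1 on no match, which is what made execSync throw.
            `FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :${port}') do taskkill /F /PID %a`
          : `lsof -ti tcp:${port} | xargs kill -9`;
      try {
        execSync(cmd, { stdio: "ignore" });
      } catch {
        // No process on the port (or no permission); either way, keep starting up.
      }
    }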
Thanks, great job! I like it overall, but I noticed it has some issues entering text into forms, even on google.com. It's able to find a workaround and insert the search text into the URL, but it would be nice if form entry worked reliably for UI testing.
1. Kill your Claude Desktop app
2. Click "Connect" in the browser extension.
3. Quickly start your Claude Desktop app.
It works about 50% of the time - I guess the timing has to be just right. Hopefully the developers can improve this.
Now on to testing :)
"Go to https://news.ycombinator.com/upvoted?id=josefrichter, summarize what topics I am interested in, and then from the homepage pick articles I might be interested in."
Works like a charm.
Do we suppose they will just create a backdoor to allow _some_ bots in? If they do that, how long will it be before other bots impersonate them? It seems like a bit of a fad, to my small mind.
Suppose it does become a thing, what then? We end up with an internet which is heavily optimised for bots (arguably it already is to an extent) and unusable for humans?
Wild.
https://brightdata.com/pricing/web-unlocker
https://2captcha.com/pricing
As opposed to the Web we now have, which is heavily optimized for... wasting human life.
What you're asking for, what "large companies such as CloudFlare have spent millions on", is verifying that there's a web browser on the other end of the connection, and behind that web browser a human being who's being made to needlessly suffer and waste their limited lifespan as they tediously work their way through the UI maze like a good little lab rat, watching ads at every turn of the corridor, while being constantly surveilled.
Or do you believe there is some other reason to care whether you're interacting with a "human" (really: a user agent called "web browser") vs. "not human" (really: any other user agent)?
The relationship between the commercial web and its users is antagonistic - businesses make money through friction, by making it more difficult for users to accomplish their goals. That's why we never got the era of APIs and web automation for users. That's why we're dealing with tons of bespoke shitty SPAs instead of consistent interfaces - because no store wants to make it easy for you to comparison-shop, or skip their upsells, or efficiently search through the stock; no news service wants you to skip ads or make focused searches, etc.
As users, we've lost the battle for APIs and continue to be forced to use the "manual web" (with active cooperation of the browser vendors, too). MCP feels promising because we're in a moment in time, however brief, where LLMs can navigate the "manual web" for us, shielding us from all the malicious bullshit (ads, marketing copy, funneling, calls to action, confusing design, dark patterns, less dark patterns, the fact that your store is a bloated SPA instead of an endpoint for a generic database querying frontend, and so on) while remaining mostly impervious to it. This will not last long - the vendors de-facto ruling the web have every reason to shut it down (or turn it around and use LLMs against us). But for now, it works.
Adversarial interoperability is the name of the game. LLMs, especially combined with tool use (and right tools), make it much easier and much more accessible than ever before. For however brief a moment.
As for the optimisation to _waste human life_, I do agree. But the reality is that the sites which waste the majority of human time are the ones that would not be automated via MCP, and they would ultimately see more 'real' usage, since the average human would have more time to mindlessly scroll their favourite echo chamber.
Then there's the whole other debate: do we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something that hurts their bottom line from another angle?
On the topic of:
> do we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something that hurts their bottom line from another angle?
No, I don't believe that at all - which is why I keep saying the current situation is an anomaly, a brief moment in time. LLMs deployed in the form of general-purpose chatbots/agents are giving too much power to the people, which is already becoming disruptive to many businesses, so that power will be gradually taken away. Expect fewer general-purpose AI agents, and more "AI-powered features" that shackle LLMs behind some limited UI, to ensure you can only get as much benefit from AI as fits the vendors' business strategies.
That, and maybe they will, as CF seems quite big on MCP.[0] Or people will just bypass the bot detection. It's already not terribly difficult to do; people in the sneaker bot and ticket scalping communities have long had bypasses for all the major companies.
I mean, we can all imagine bad use cases for bots, but there are also pros: the internet wastes loads of human time. I still remember having to browse real estate listings on marketplaces with terrible search and notification functionality to find a flat... shudders. An unbelievable number of hours wasted.
If a few people are able to build bots that index a large number of sites and provide better search where the sites themselves can't, I'm personally all for it. Many sites simply lack the in-house development expertise, and they probably wouldn't even mind.
[0]: https://developers.cloudflare.com/agents/model-context-proto... etc
The Playwright MCP server is great! Currently Browser MCP is largely an adaptation of the Playwright MCP server to use with your actual browser rather than creating a new one each time. This allows you to reuse your existing Chrome profile so that you don't need to log in to each service all over again and avoids bot detection which often triggers when using the fresh browser instances created by Playwright.
I also plan to add other useful tools (e.g., Browser MCP currently supports a tool to get the console logs, which is useful for automated debugging) that will likely diverge from the Playwright MCP server's feature set.
https://playwright.dev/docs/api/class-browsertype#browser-ty...
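For comparison, the Playwright-level way to reuse a running browser (as opposed to the extension approach Browser MCP uses) is connecting over CDP. A sketch, assuming Chrome was started with --remote-debugging-port=9222:

    import { chromium } from "playwright";

    // Attach to an already-running Chrome instead of launching a fresh one,
    // so existing cookies, sessions, and logins stay available.
    async function attachToRunningChrome() {
      const browser = await chromium.connectOverCDP("http://127.0.0.1:9222");
      const context = browser.contexts()[0]; // the real profile, not a blank one
      const page = context.pages()[0] ?? (await context.newPage());
      console.log("Attached to:", await page.title());
      return { browser, context, page };
    }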
Unfortunately, Firefox doesn't expose WebDriver BiDi (the standardized version of CDP) to browser extensions AFAIK [0] (someone please correct me if I'm mistaken!), so I don't think I can support it even if I tried.
Not going to lie, this makes me happy.
[0]: https://wiki.mozilla.org/WebDriver/RemoteProtocol/WebDriver_...