Posted by simonw 7 hours ago

Show HN: Showboat and Rodney, so agents can demo what they've built (simonwillison.net)
80 points | 47 comments
tardismechanic 6 hours ago|
See also (the confusingly named) playwright-cli

https://github.com/microsoft/playwright-cli

Different from the CLI for running tests etc. that comes bundled with Playwright

Sample use:

  playwright-cli open https://demo.playwright.dev/todomvc/ --headed
  playwright-cli type "Buy groceries"
  playwright-cli press Enter
  playwright-cli type "Water flowers"
  playwright-cli press Enter
  playwright-cli check e21
  playwright-cli check e35
  playwright-cli screenshot
simonw 6 hours ago||
Yeah that's an excellent option for this kind of thing too.
markusw 6 hours ago||
Oh, I hadn't seen that one either, thanks for sharing. Here I am still using the Chrome DevTools MCP like a caveman. :D
nzoschke 5 hours ago||
go-rod has been instrumental to my agentic coding loops too. Some uses:

- E2E testing of browser components

- Taking screenshots before and after and having Claude look at them to double check things (rough sketch below)

- Driving it with an API and CLI as a headless browser
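
For the before/after screenshots, the go-rod side is only a few lines. A rough sketch, with the URL and file names as placeholders rather than anything from a real project:

  package main

  import "github.com/go-rod/rod"

  func main() {
    // Connect to a Chromium instance (headless by default).
    browser := rod.New().MustConnect()
    defer browser.MustClose()

    // Capture the page before the change under test...
    page := browser.MustPage("http://localhost:3000") // placeholder URL
    page.MustWaitLoad()
    page.MustScreenshot("before.png")

    // ...then reload after the change and capture again, so Claude can
    // compare the two images.
    page.MustReload()
    page.MustWaitLoad()
    page.MustScreenshot("after.png")
  }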

Will definitely give Rodney a look.

water-drummer 5 hours ago||
Wait, why shouldn't an LLM just write directly to the markdown file instead of going through the extra step of using a CLI tool that is basically `echo 'something' >> file.md` with templates, templates that should really live in a prompt rather than in a compiled binary? Did Claude come up with the idea for this as well?

Also, I am sure you already know about the Playwright MCP, so why this? If your goal isn't to make the CLI human-friendly, which is the only advantage CLIs have over MCPs doing the same thing, then why not just use the MCP? It doesn't even handle multiple sessions and has a single global state file; this is slop.

simonw 5 hours ago|
Because I don't want it to write to the markdown file directly. I want it to tell the tool which command to run; the tool then runs that command and writes both the command and the output to the file.

Otherwise it's just writing a document, not building a demo you can review.
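
The loop is roughly this shape - an illustrative sketch (in Go here), not Showboat's actual code, with the demo path and the example command as placeholders:

  package main

  import (
    "fmt"
    "os"
    "os/exec"
  )

  // appendStep runs the given shell command and appends both the command and
  // its output to the demo file, so the document only ever contains output
  // that was actually produced by running the command.
  func appendStep(demoPath, command string) error {
    // Record the output even if the command fails.
    out, _ := exec.Command("sh", "-c", command).CombinedOutput()
    f, err := os.OpenFile(demoPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
    if err != nil {
      return err
    }
    defer f.Close()
    _, err = fmt.Fprintf(f, "```\n$ %s\n%s```\n\n", command, out)
    return err
  }

  func main() {
    // Placeholder example command.
    if err := appendStep("demo.md", "uvx rodney --help"); err != nil {
      fmt.Fprintln(os.Stderr, err)
    }
  }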

As far as I can tell you can't hook MCPs up to Claude Code for web.

I originally planned to support separate sessions but decided to leave that out for the initial release. I've opened an issue for that here: https://github.com/simonw/rodney/issues/6

measurablefunc 5 hours ago||
Google's Antigravity does this automatically by creating Task & Walkthrough artifacts.
saberience 6 hours ago||
Sounds like both of these tools could be one-shotted by either Claude or Codex.

Or alternatively, each could just be a skill rather than a tool.

My “agents” already demo stuff all the time just by being prompted to do so. I have notes in my standard Agents.md for how I want my documentation, testing, etc.

simonw 6 hours ago|
They kind of were one-shotted by Claude. The value is in coming up with a consistent design and good-enough --help output that you can prompt:

  Run uvx showboat --help and uvx rodney --help
  and use those tools to demo the feature you built
The help text effectively doubles as a skill.
markusw 6 hours ago||
I guess it would still make sense to have "demo" and "browser-use" skills, so that the agent can reach for them proactively? I always try to remove as much friction as possible for myself, one little bit at a time.
simonw 5 hours ago||
My problem is that I work in dozens of different repos generally using Claude Code for web, which doesn't have a way to install extra global skills yet.

I don't want to duplicate my skills into all those repos (and keep them updated) so I prefer the "uvx tool --help" pattern.

markusw 5 hours ago||
That's actually one of the things that has kept me from using Claude Code web (that, and I often need a Chrome browser for the agent). But they must be working on it.

I saw an MCP I'd set up on claude.ai show up in my local Claude Code MCP list the other day; it seems inevitable that there will be skills integration across environments at some point as well.

simonw 5 hours ago||
In working on Rodney I found out that the Claude Code for web environment has a Chrome browser installed already. It's a shame you can't see its output directly: even if it takes a screenshot, there's no easy way to view it other than having it commit and push that to a branch on GitHub.
brian200 6 hours ago||
[flagged]
usefulposter 5 hours ago||
This comment is regurgitating Simon's post with too much adherence to the input tokens. The unnatural, promotional restating of proper nouns in constrained output is a notable LLM tell.

Please respect the Hacker News community and read https://news.ycombinator.com/item?id=46747998.

yodon 5 hours ago||
If you could actually detect AI content with high accuracy, you would sell it as a service and print money. But you can't, so you force the rest of us to wade through posts like yours claiming to tell us what is and isn't AI, which are FAR more annoying, disruptive, and low signal than the post you're commenting on: a post that is intelligent, adds to the conversation, and is, by my read, almost certainly human authored, just written by someone who knows how to write.
usefulposter 5 hours ago||
I'm not affiliated with it in any way, but you might want to check out the leading model https://arxiv.org/abs/2402.14873 and the mixed authorship detection https://www.pangram.com/blog/pangram-3-0-technical. LLM detection has come a long way since the first GPTZero.

Human heuristics - I've prompted millions of tokens across every frontier model iteration, for all manner of writing styles and purposes - also help greatly.

Concerning to me are long-time posters who (perhaps unknowingly) advance the decline of this human community by encouraging the people breaking HN guidelines. Perhaps spending a few hours on Moltbook might help develop such a heuristic, since "someone who knows how to write" is just a Claude model with a link to the blogpost.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

Thanks for your comment!

toastal 6 hours ago|
If agents can generate text so easily, why would they be limited to Markdown instead of reStructuredText, AsciiDoc, or LaTeX, which have rich features that help users understand text? I can understand developers refusing to adopt proper formats for documentation, but this seems odd for the bots. It doesn't even generate the correct syntax block in Markdown, using “bash” instead of “sh-session”.
bee_rider 5 hours ago||
I dunno. I’ve written a bit of LaTeX but does it really shine in this context? IMO the real advantage it has is that it can allow the user to express more complicated intents than Markdown (weird phrasing—my natural instinct was to call LaTeX more precise than Markdown, but Markdown is pretty precise for describing the type of file that it is good at…).

Anyway LLMs don’t have underlying intent so maybe it is fine to just let them express what they can in Markdown?

simonw 5 hours ago|||
Markdown has the widest tool compatibility - GitHub renders it, as do VS Code and many other editors and file hosts.

I didn't know about sh-session - is that documented anywhere?

giancarlostoro 5 hours ago||
I think it's primarily because that is the most common format in every editor now? I could be wrong. Markdown has been the standard for README files for over a decade now.
toastal 5 hours ago||
Winning a popularity contest doesn't mean it's good. That is the worst part about these things: they just generate the lowest-common-denominator code/tooling while also repeating anti-patterns/mistakes like the bash vs. sh-session/console issue I pointed out. Garbage in has been so much garbage out, unfortunately.
giancarlostoro 5 hours ago||
Never said it was good, just making an observation that Markdown is most likely to be available to render OOTB in more editors. I don't think Markdown is bad necessarily either. It's “good enough” for simple documents.