Top
Best
New

Posted by embedding-shape 8 hours ago

Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC(emsh.cat)
Related: https://simonwillison.net/2026/Jan/27/one-human-one-agent-on...
93 points | 54 comments
simonw 5 hours ago|
This is a notably better demonstration of a coding agent generated browser than Cursor's FastRender - it's a fraction of the size (20,000 lines of Rust compared to ~1.6m), uses way fewer dependencies (just system libraries for rendering images and text) and the code is actually quite readable - here's the flexbox implementation, for example: https://github.com/embedding-shapes/one-agent-one-browser/bl...

Here's my own screenshot of it rendering my blog - https://bsky.app/profile/simonwillison.net/post/3mdg2oo6bms2... - it handles the layout and CSS gradiants really well, renders the SVG feed icon but fails to render a PNG image.

I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!

vidarh 4 hours ago||
I think the human + agent thing absolutely will make a huge difference. I see regularly that Claude can totally off piste and eventually claw itself back with a proper agent setup but it will take a lot of time if I don't spot it and get it back on track.

I have one project Claude is working on right now where I'm testing a setup to attempt to take myself more out of the loop, because that is the hard part. It's "easy" to get an agent to multiply your output. It's hard to make that scale with your willingness to spend on tokens rather than with your ability to read and review and direct.

I've ended up with roughly this (it's nothing particularly special):

- Runs a evaluator that evaluates the current state and assigns scores across multiple metrics.

- If a given score is above a given threshold, expand the test suite automatically.

- If the score is below a given threshold, spawn a "research agent" that investgates why the scores don't meet expectations.

- The research agent delivers a report, that is passed to an implementation agent.

- The main agent re-runs the scoring, and if it doesn't show an improvement on one or more of the metrics, the commit is discarded, and notes made of what was tried, and why it failed.

It takes a bit of trial and error to get it right (e.g. "it's the test suite that is wrong" came up early, and the main agent was almost talked into revising the test suite to remove the "problematic" tests) but a division sort of like this lets Claude do more sensible stuff for me. Throwing away commits feels drastic - an option is to let it run a little cycle of commit -> evaluate -> redo a few times before the final judgement, maybe - but it so far it feels like it'll scale better. Less crap makes it into the project.

And I think this will work better than to treat these agents as if they are developers whose output costs 100x as much.

Code so cheap it is disposable should change the workflows.

So while I agree this is a better demonstration of a good way to build a browser, it's a less interesting demonstration as well. Now that we've seen people show that something like FastRender is possible, expect people to experiment with similarly ambitious projects but with more thought put into scoring/evaluation, including on code size and dependencies.

embedding-shape 4 hours ago||
> I think the human + agent thing absolutely will make a huge difference.

Just the day(s) before, I was thinking about this too, and I think what will make the biggest difference is humans who posses "Good Taste". I wrote a bunch about it here: https://emsh.cat/good-taste/

I think the ending is most apt, and where I think we're going wrong right now:

> I feel like we're building the wrong things. The whole vibe right now is "replace the human part" instead of "make better tools for the human part". I don't want a machine that replaces my taste, I want tools that help me use my taste better; see the cut faster, compare directions, compare architectural choices, find where I've missed things, catch when we're going into generics, and help me make sharper intentional choices.

vidarh 4 hours ago||
For some projects, "better tools for the human part" is sufficient and awesome.

But for other projects, being able to scale with little or no human involvement suddenly turns some things that were borderline profitable or not possible to make profitable at all with current salaries vs. token costs into viable businesses.

Where it works, it's a paradigm shift - for both good and bad.

So it depends what you're trying to solve for. I have projects in both categories.

embedding-shape 4 hours ago||
Personally I think the part where you try to eliminate humans from involvement, is gonna lead to too much trouble, being too inflexible and the results will be bad. It's what I've seen so far, haven't seen anything pointing to it being feasible, but I'd be happy to be corrected.
vidarh 4 hours ago||
It really depends on the type of tasks. There are many tasks LLMs do for me entirely autonomously already, because they do it well enough that it's no longer worth my time.
Imustaskforhelp 49 minutes ago|||
To me I really like how embedding shapes took things in his own hands and actually built it. It really proved a point at such a scale where I don't think any recent example can point to.

It's great to see hackernews be so core part of it haha.

> I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!

I do wonder if tech people from future/present are gonna witness this as a goliath vs david story. 20k 1 human 1 agent beats 5 million$ 1.6 millions loc browser changing how even the massive AI users/pioneers at the time thought about the use of AI

Looks like I have watched some documentaries recently but why do I feel like a documentary about this whole thing can be created in future.

But also, More and more I am feeling like AI is an absolute black box, nobody knows how to do things but we are all kind of doing experiments with it and seeing what sticks (like how we now have definitive proof that 1 human 1 agent > many agents no human in the loop)

And this is when we are 1 month in 2026, who knows what other experiments and proofs happen this year to find more about this black box, and about its usefulness or not.

Simon, it would be interesting if you could read the thread of predictions of 2026 thread in hn each month or quaterly to see how many people were wrong or right about AI as we figure out more things perhaps.

rananajndjs 3 hours ago||
[dead]
happytoexplain 30 minutes ago||
What kind of time frame do you ballpark this would have taken you on your own?

I know it's a little apples-and-oranges (you and the agent wouldn't produce the exact same thing), but I'm not asking because I'm interested in the man-hour savings. Rather, I want to get a perspective on what kind of expertise went into the guidance (without having to read all the guidance and be familiar with browser implementation myself). "How long this would have taken the author" seems like one possible proxy for "how much pre-existing experience went into this agent's guidance".

simonw 23 minutes ago||
I have a fun little tool which runs the year-2000-era sloccount algorithm (which is Perl and C so I run it in WebAssembly) to estimate the time and cost of a project here: https://tools.simonwillison.net/sloccount

If you paste https://github.com/embedding-shapes/one-agent-one-browser into the "GitHub Repository" tab it estimates 4.58 person-years and $618,599 by year-2000 standards, or 5.61 years and $1,381,079 according to my very non-trustworthy 2025 estimate upgrade.

embedding-shape 17 minutes ago||
> What kind of time frame do you ballpark this would have taken you on your own?

I don't think I'd be able to do this on my own. Not that I don't know Rust, but because I don't know X11 (nor macOS or Windows) well enough to even know where to begin.

I've been a Linux user for almost two decades, so I know my way around my system, but never developed X11 applications or anything, I'm mostly a web developer who jumped around various roles through the years. Spent a lot of time caring deeply about testing, infrastructure, architecture/design and communication between humans, might have given me a slight edge in programming together with agents.

hedgehog 20 minutes ago||
This looks pretty solid. I think you can make this process more efficient by decomposing the problem into layers that are more easily testable, e.g. testing topological relationships of DOM elements after parse, then spatial after layout, then eventually pixels on things like ACID2 or whatever the modern equivalent is. The models can often come up with tests more accurately than they get the code right the first time. There are often also invariants that can be used to identify bugs without ground truth, e.g rendering the page with slightly different widths you can make some assertions about how far elements will move.
mwcampbell 1 hour ago||
Impressive work.

I wonder if you've looked into what it would take to implement accessibility while maintaining your no-Rust-dependencies rule. On Windows and macOS, it's straightforward enough to implement UI Automation and the Cocoa NSAccessibility protocols respectively. On Unix/X11, as I see it, your options are:

1. Implement AT-SPI with a new from-scratch D-Bus implementation.

2. Implement AT-SPI with one of the D-Bus C libraries (GLib, libdbus, or sdbus).

3. Use GTK, or maybe Qt.

QuadmasterXLII 1 hour ago||
The rendering is pretty chaotic when I tried it- not that far off from just the text in the html tags, in some size, color, and placement on the screen. This sounds like unfairness, but there is some motte-and-bailey where if you claim to be a browser, I get to evaluate on stuff like links being consistently blue and underlined ( as is, they are sometimes blue and sometimes underlined, without a clear pattern- if they were never formatted differently from standard text, I would just buy this as a feature not implemented yet). It may be that some of the rendering is not supported on windows- the back button certainly isn't. I guess if I want to make my criticism actually legitimate I should make a "one human and no agent browser" post that just regexes out stuff that looks like content and formats it at random. The binary I downloaded definitely overperforms at the hacker news homepage and simonw's blog.
embedding-shape 6 minutes ago|
It's a really basic browser. It's made less as an independent thing, and more as a reply to https://cursor.com/blog/scaling-agents, so as long as it does more or less the same as theirs, but is less LOC, it does what I set out for it to do :)

> I get to evaluate on stuff like links being consistently blue and underlined

Yeah, this browser doesn't have a "default stylesheet" like a regular browser. Probably should have added that, but was mostly just curious about rendering the websites from the web, rather than using what browsers think the web should look like.

> It may be that some of the rendering is not supported on windows- the back button certainly isn't.

Hmm, on Windows 11 the back button should definitively work, tried that just last night. Are you perhaps on Windows 10? I have not tried that myself, should work but might be why.

barredo 20 minutes ago||
The binaries are only around 1 MB for Linux, Mac and Windows. Very impressive https://github.com/embedding-shapes/one-agent-one-browser/re...
embedding-shape 10 minutes ago|
Fun fact, not until someone mentioned how small the binaries did I notice! Fun little side-effect from the various constraints and requirements I set in the REQUIREMENTS.md I suppose.
embedding-shape 7 hours ago||
I set some rules for myself: three days of total time, no 3rd party Rust crates, allowed to use commonly available OS libraries, has to support X11/Windows/macOS and can render some websites.

After three days, I have it working with around 20K LOC, whereas ~14K is the browser engine itself + X11, then 6K is just Windows+macOS support.

Source code + CI built binaries are available here if you wanna try it out: https://github.com/embedding-shapes/one-agent-one-browser

bhadass 22 minutes ago||
very impressive!

it's amazing how far we've come in 20 years. i was a (very minor) contributor to khtml/konqueror (before apple got involved w/ webkit) in the early 2000s, and back then it was such a labor intensive process to even create a halfway working engine. like, months of work just to get basic rendering somewhat correct on a very small portion of the web (which was obv much smaller)

in addition to agentic coding, i think for this specific task having css-spec/html-spec/web-platform-tests as machine readable test suites helps a LOT. the agent can actually validate against real specs.

back in the day, despite having gecko as an open source reference, in practice the "standards" were whatever IE was doing. so you'd spend weeks implementing something only to discover every site was coded for IE's quirks lmao. for all of their other faults, google/apple and other contributors helped bring in discipline to that.

embedding-shape 13 minutes ago||
> i think for this specific task having css-spec/html-spec/web-platform-tests as machine readable test suites helps a LOT

You know, I placed the specs in the repository with that goal (even sneaked in a repo that needs compiling before being usable), but as far as I can see, the agent never actually peeked into that directory nor read anything from them in the end.

It'll be easier to see once I made all the agent sessions public, and I might be wrong (I didn't observe the agent at all times), but seems the agent never used though.

chatmasta 1 hour ago|||
Did you use Claude code? How many tokens did you burn? What’d it cost? What model did you use?
embedding-shape 58 minutes ago||
Codex, no idea about tokens, I'll upload the session data probably tomorrow so you could see exactly what was done. I pay ~200 EUR/month for the ChatGPT Pro plan, prorating days I guess it'll be ~19 EUR for three days. Model used for everything was gpt-5.2 with reasoning effort set to xhigh.
forgotpwd16 25 minutes ago|||
>I'll upload the session data probably tomorrow so you could see exactly what was done.

That'll be dope. The tokens used (input,output,total) are actually saved within codex's jsonl files.

soiltype 28 minutes ago|||
Thank you in advance for that! I barely use AI to generate code so I feel pretty lost looking at projects like this.
jacquesm 5 hours ago||
Those are excellent constraints.
avmich 48 minutes ago||
Next thing would probably be an OS. With different APIs, the browser could be not constrained by existing standards. Generation of a good set of applications making working in the OS convenient - starting with GNU set? And then we can approach CPU architecture - again, without constraint to existing languages or instruction sets. That should be interesting to play with.
forgotpwd16 47 minutes ago|
Someone has already done this: https://github.com/viralcode/vib-OS

Also, someone made a similar comment not too long ago. So people surely are curious if this is possible. Kinda surprised this project's submission didn't got popular.

forgotpwd16 51 minutes ago||
Impressive. Very nice. (Let's see Paul Allen's browser. /s) Can say is Brooks's law in action. What one human and one agent can do in 3d, one human and hundreds of agents can do in few weeks. A modern retake of the old joke.

>without using any 3rd party libraries

Seems to be an easier for coding agents to implement from scratch over using libraries.

pulkas 1 hour ago|
The Mythical Man-Month, revisited
More comments...