Posted by embedding-shape 1/27/2026

Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC (emsh.cat)
Related: https://simonwillison.net/2026/Jan/27/one-human-one-agent-on...
322 points | 153 comments
barredo 1/27/2026|
The binaries are only around 1 MB for Linux, Mac and Windows. Very impressive https://github.com/embedding-shapes/one-agent-one-browser/re...
userbinator 1/28/2026||
"only around 1MB" is not particularly impressive in absolute terms... there are a few browsers which are the same or smaller, and yet more functional.

https://tinyapps.org/network.html

Of course, "AI-generated browser is 1MB" is neither here nor there.

embedding-shape 1/28/2026|||
Tried building with some other arguments/configs, went from 1.2 MB on X11 to 664 KB, which seems to place it under Lynx (text-only, 714 KB) but above OffByOne (full HTML 3.2, 409 KB). Of course, my experiment barely implements anything from a "real" browser, so it's an unfair comparison really.

Neat collection of apps nonetheless, some really impressive stuff in there.

simonw 1/28/2026|||
Do any of those handle CSS and SVG?
embedding-shape 1/27/2026||
Fun fact: not until someone mentioned how small the binaries were did I notice! Fun little side effect of the various constraints and requirements I set in the REQUIREMENTS.md, I suppose.
storystarling 1/27/2026||
How did you handle the context window for 20k lines? I assume you aren't feeding the whole codebase in every time given the API costs. I've struggled to keep agents coherent on larger projects without blowing the budget, so I'm curious if you used a specific scoping strategy here.
simonw 1/27/2026||
GPT-5.2 has a 400,000 token context window. Claude Opus 4.5 is just 200,000 tokens. To my surprise this doesn't seem to limit their ability to work with much larger codebases - the coding agent harnesses have got really good at grepping for just the code that they need to have in-context, similar to how a human engineer can make changes to a million lines of code without having to hold it all in their head at once.
storystarling 1/27/2026||
That explains the coherence, but I'm curious about the mechanics of the retrieval. Is it AST-based to map dependencies or are you just using vector search? I assume you still have to filter pretty aggressively to keep the token costs viable for a commercial tool.
simonw 1/27/2026||
No vector search, just grep.
embedding-shape 1/27/2026|||
I didn't; Codex (tui/cli) did, it does it all by itself. I have one REQUIREMENTS.md which is specific to the project and an AGENTS.md that I reuse across most projects. Then I give Codex (gpt-5.2 with reasoning effort set to xhigh) a prompt + screenshot, tell it to get it working somewhat similarly, wait until it completes, review that it worked, then continue.

Most of the time when I develop professionally, I restart the session after each successful change. For this project, I initially tried to let one session go as long as possible, but eventually I reverted to my old behavior of restarting from zero after successful changes.

To know which file it should read/write, it uses `ls`, `tree` and `ag` most commonly; there is no out-of-band indexing or anything, just a unix shell controlled by an LLM via tool calls.

nurettin 1/27/2026||
You don't load the entire project into the context. You let the agent work on a few 600-800 line files one feature at a time.
storystarling 1/27/2026||
Right, but how does it know which files to pick? I'm curious if you're using a dependency graph or embeddings for that discovery step, since getting the agent to self-select the right scope is usually the main bottleneck.
embedding-shape 1/27/2026|||
I gave you a more complete answer here: https://news.ycombinator.com/item?id=46787781

> since getting the agent to self-select the right scope is usually the main bottleneck

I haven't found this to ever be the bottleneck. What agent and model are you using?

nurettin 1/28/2026|||
If you don't trigger the discovery agents, claude cli uses a search tool and greps 50-100 lines at a go. If discovery is triggered, claude sends multiple agents to the code with different tasks which return with overall architecture notes.
polyglotfacto 1/30/2026||
This one's really nice.

- Clear code structure and good architecture (modular approach reminiscent of Blitz, but not as radical; like a Blitz-lite).

- Very easy to follow the code and understand how the main render loop works:

    - For Mac: main loop is at https://github.com/embedding-shapes/one-agent-one-browser/blob/master/src/platform/macos/windowed.rs#L74
   
    - You can see clearly how UI events are passed to the App to handle.

    - App::tick allows the app to handle internal events (Servoshell does something similar with `spin_event_loop` at https://github.com/servo/servo/blob/611f3ef1625f4972337c247521f3a1d65040bd56/components/servo/servo.rs#L176)

    - If a redraw is needed, the main render logic is at https://github.com/embedding-shapes/one-agent-one-browser/blob/master/src/platform/macos/windowed.rs#L221 and calls into `render` of App, which computes a display list (layout) and then translates it into commands to the generic painter, which internally turns those into platform-specific graphics operations.
- It's interesting how the painter for Mac uses Cocoa for graphics; very different from Servo, which uses WebRender, or Blitz, which (in some paths) uses Vello (itself using wgpu). I'd say using Cocoa like that might be closer to what React Native does (expert to confirm this pls?). Btw, this kind of platform-specific binding is a strength of AI coding (and a real pain to do by hand).

- Nice modularity between the platform and browser app parts achieved with the App and Painter traits.

How to improve it further? I'd say try to map how the architecture corresponds to Web standards, such as https://html.spec.whatwg.org/multipage/webappapis.html#event...

Wouldn't have to be precise and comprehensive, but for example parts of App::tick could be documented as an initial attempt to implement part of the web event loop, and `render` as an attempt at implementing the update-the-rendering task.

You could also split the web-engine part from the app embedding it, in a similar way to the current split between platform and app.

Far superior, and more cost-effective, than the attempt at scaling autonomous agent coding pursued by FastRender. Shows how the important part isn't how many agents you can run in parallel, but rather how good an idea the human overseeing the project has (or rather: develops).

embedding-shape 7 days ago|
Hey, thanks a bunch for the review of the code itself! Personally I hadn't really looked deeply at it yet myself, especially the Windows and macOS code; interesting to hear that it has a slightly different approach for the painter, to me that's slightly unexpected.

Agree with your conclusion :)

Would be interesting to see how the architecture/design would change if I had focused the agent on making the code as modular and reusable as possible; currently I had some constraints around that, but not too strict. You bring up lots of interesting points, thanks again!

polyglotfacto 5 days ago||
you're welcome.
madmaniak 1/28/2026||
But when I install Firefox or Chrome, it's much faster, much better, and also someone else's code. Also copied and pasted by machine. It's just that I don't claim it's mine.
deadbabe 1/27/2026||
This is not that impressive, there are numerous examples of browsers for training data to reference.
simonw 1/28/2026||
I don't buy this.

It implies that the agents could only do this because they could regurgitate previous browsers from their training data.

Anyone who's watched a coding agent work will see why that's unlikely to be what's happening. If that's all they were doing, why did it take three days and thousands of changes and tool calls to get to a working result?

I also know that AI labs treat regurgitation of training data as a bug and invest a lot of effort into making it unlikely to happen.

I recommend avoiding the temptation to look at things like this and say "yeah, that's not impressive, it saw that in the training data already". It's not a useful mental model to hold.

deadbabe 1/28/2026||
It took three days because... agents suck.

But yes, with enough prodding they will eventually build you something that's been built before. Don't see why that's particularly impressive. It's in the training data.

simonw 1/28/2026||
Not a useful mental model.
deadbabe 1/28/2026||
It is useful. If you can whip up something complex fairly quickly with an AI agent, it’s likely because it’s already been done before.

But if even the AI agent seems to struggle, you may be doing something unprecedented.

simonw 1/28/2026||
Except if you spend quality time with coding agents you realize that's not actually true.

They're equally useful for novel tasks because they don't work by copying large scale patterns from their training data - the recent models can break down virtually any programming task to a bunch of functions and components and cobble together working code.

If you can clearly define the task, they can work towards a solution with you.

The main benefit of concepts already in the training data is that it lets you slack off on clearly defining the task. At that point it's not the model "cheating", it's you.

deadbabe 1/28/2026|||
Good long lived software is not a bunch of functions and components cobbled together.

You need to see the big picture and visions of the future state in order to ensure what is being built will be able to grow and breathe into that. This requires an engineer. An agent doesn’t think much about the future, they think about right now.

This browser toy built by the agent, it has NO future. Once it has written the code, the story is over.

aix1 1/28/2026||||
Simon, do you happen to have some concrete examples of a model doing a great job at a clearly novel, clearly non-trivial coding task?

I'd find it very interesting to see some compelling examples along those lines.

simonw 1/28/2026||
I think datasette-transactions https://github.com/datasette/datasette-transactions is pretty novel. Here's the transcript where Claude Code built it: https://gisthost.github.io/?a41ce6304367e2ced59cd237c576b817...

That transcript viewer itself is a pretty fun novel piece of software, see https://github.com/simonw/claude-code-transcripts

Denobox https://github.com/simonw/denobox is another recent agent project which I consider novel: https://orphanhost.github.io/?simonw/denobox/transcripts/ses...

keybored 1/28/2026|||
> Except if you spend quality time with coding agents you realize that's not actually true.

Agent engineering seems to be (from the outside!) converging on quality lived experience. Compared to Stone Age manual coding it’s less about technical arguments and more about intuition.

Vibes in short.

You can’t explain sex to someone who has not had sex.

Any interaction with tools is partly about intuition. It’s a difference of degree.

embedding-shape 1/28/2026|||
Damn, ok, what should I attempt instead, that could impress even you?
anonymous908213 1/28/2026||
Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people. This is just yet another proof-of-concept. Something which LLMs obviously can do, and which never seems to translate to real-world software people use. Parsing and rendering text is really not the hard part of building a browser, and there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject.

That said, I think some credit is due. This is still a nice weekend project as far as LLMs go, and I respect that you had a specific goal in mind (showing a better approach than Cursor's nonsense, that gets better results in less time with less cost) and achieved it quickly and decisively. It has not really changed my priors on LLMs in any way, though. If anything it just confirms them, particularly that the "agent swarm" stuff is a complete non-starter and demonstrates how ridiculous that avenue of hype is.

embedding-shape 1/28/2026||
> Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people.

Yeah, that's obviously a lot harder, but doable. I've built it for clients, since they pay me, but haven't launched/made public something of my own where I could share the code. I guess that might be a useful next project now.

> This is just, yet another, proof-of-concept.

It's not even a PoC, it's a demonstration of how far off the mark Cursor is with their "experiment" where they were amazed by what "hundreds of agents" built over week(s).

> there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject

This is absolutely true, I tried to get some better answers on how one could even figure that out here: https://news.ycombinator.com/item?id=46784990

usef- 1/28/2026||
What would be impressive to you?
deadbabe 1/28/2026||
A browser so unique and strange it is literally unlike anything we've ever seen to date, using entirely new UI patterns and paradigms.
nenadg 1/28/2026||
>(no JS tho)

this is a feature

Imustaskforhelp 1/27/2026||
I feel like I have talked to embedding-shape on Hacker News enough that I recognize him. So it was a proud-like moment when I saw his Hacker News & GitHub comments on a YouTube video [0] about the recent Cursor thing.

It's great to see him make this. I didn't know that he had a blog, but it looks good to me. Bookmarked now.

I feel like although Cursor burned $5 million, we saw the result, and now we have embedding-shape's takeaway.

If one person with one agent can produce equal or better results than "hundreds of agents for weeks", then the question "Can we scale autonomous coding by throwing more agents at a problem?" probably has a more pessimistic answer than some expected.

Effectively, to me this feels like it answers the question of what happens if we have thousands of AI agents building a complex project autonomously with no human. That idea seems dead now. Humans being in the loop gives much higher productivity and a better end result.

I feel like the lure behind the Cursor project was to find out if it's able to replace humans completely in an extremely large project, and the answer right now is no (and I have a feeling [bias?] that the answer's gonna stay that way).

Emsh, I have a question though: can you tell me about your background, if possible? Have you been involved in browser development or any related endeavours, or was this a first for you? From what I can feel/have talked with you, I do feel like the answer's yes, that you have worked in the browser space, but I am still curious to know the answer.

A question coming to my mind is how much of a difference there would be between 1 expert human + 1 agent, 1 non-expert (say, junior dev) human + 1 agent, and 1 completely non-expert (say, a normal, less techie person) + 1 agent.

What are your predictions on it?

How would the economics of becoming an "expert", or becoming a jack of all trades (junior dev), in a field fare with this new technology/toy that we got?

How much productivity gain could there be from 1 non-expert -> junior dev, and the same question for junior -> senior dev, in this particular context?

[0] Cursor Is Lying To Developers… : https://www.youtube.com/watch?v=U7s_CaI93Mo

simonw 1/27/2026|
I don't think the Cursor thing was about replacing humans entirely.

(If it was that's bad news for them as a company that sells tools to human developers!)

It was about scaling coding agents up to much larger projects by coordinating and running them in parallel. They chose a web browser for that not because they wanted to build a web browser, but because it seemed like the ideal example of a well specified but enormous (million line+) project which multiple parallel agents could take on where a single agent wouldn't be able to make progress.

embedding-shape's project here disproves that last bit - that you need parallel agents to build a competent web renderer - by achieving a more impressive result with just one Codex agent in a few days.

Imustaskforhelp 1/27/2026||
> I don't think the Cursor thing was about replacing humans entirely.

I think how I saw things was that Cursor was/is still targeted very heavily at vibe coding, in a similar fashion to bolt.dev or Lovable. I even saw some vibe coders on YouTube try to see the difference, and honestly, at the end Cursor had preferable pricing to the other two; that's how I felt about Cursor.

Of course Cursor's for the more techie person as well, but I feel as if they would shift more and more towards Claude Code or similar, which are subsidized by the provider (Anthropic) itself, something not possible for Cursor to do without burning big B's, which it already has done.

So Cursor's growth was definitely towards the more vibe coders side.

Now coming to my main point: I had the feeling that what Cursor was trying to achieve wasn't replacing humans entirely, but removing humans from the loop, aka vibe coding. Instead of having engineers, if the Cursor experiment had been successful, the idea (which people felt instantly when it was first released) was that engineering itself would've been dead, and instead the jobs would've turned into management from a bird's-eye view (not managing agents individually, or being aware of what they did, or being in the loop in any capacity).

I feel like this might've been the reason they were willing to burn $5 million.

If you could've convinced engineers (considering browsers are taken as the holy grail of hardness) that they are better off being managers, then a vibe-coding product like Cursor would be really lucrative.

At least that's my understanding. I can be wrong, I usually am, and I don't have anything against Cursor. (I actually used to use Cursor earlier.)

But the embedding-shape project shows that engineering is very much still alive and a net benefit. He produced a better result, at very minimal cost, than the $5 million inference-cost project.

> embedding-shape's project here disproves that last bit - that you need parallel agents to build a competent web renderer - by achieving a more impressive result with just one Codex agent in a few days.

Simon, I think browsers got picked for this autonomous-agents idea partially because of your really famous post about how independent tests can lead to easier ports via agents. Browsers have a lot of independent tests.

So Simon, perhaps I may have over-generalized, but do you know of any areas where the idea of parallel agents is actually good, now that browsers are off? Personally, after this project, I can't really think of any. When the Cursor thing first launched, or when I first heard of it recently, I thought that browsers did make sense for some reason, but now that that's out of the window, I am not sure if there are any other projects where massively parallel agents might even be a net positive over 1 human + 1 agent like Emsh.

simonw 1/27/2026||
No, I'm still waiting to see concrete evidence that the "swarms of parallel agents" thing is worthwhile. I use sub-agents in Claude Code occasionally - for problems that are easily divided - and that works fine as a speed-up, but I'm still holding out for an example of a swarm of agents that's really compelling.

The reason I got excited about the Cursor FastRender example was that it seemed like the first genuine example of thousands of agents achieving something that couldn't be achieved in another way... and then embedding-shapes went and undermined it with 20,000 lines of single-agent Rust!

Imustaskforhelp 1/27/2026|||
Edit 2: looks like the project took literally the last token I had to create a big buggy implementation in golang haha!

I kind of left the agents to do what they wanted just asking for a port.

Your website does look rotated and the image is the only thing visible in my golang port.

Let me open source it & I will probably try to hammer it some more after I wake up to see how good Kimi is in real world tasks.

https://github.com/SerJaimeLannister/golang-browser

I must admit that it's not working right now; I'm even unable to replicate your website, which at first was able to display (even though really glitchy, with the image zoomed) but now shows only white. Although, oops, looks like I forgot the i in your name and wrote "willson" instead of "willison", as I wasn't wearing specs. Sorry about that.

Now let me see... yeah, now it's displaying something, which is extremely glitchy.

https://github.com/SerJaimeLannister/golang-browser/blob/mai...

I have a file to show how glitchy it is. If anything, I just want someone to tinker around with whether a golang project can reasonably be made out of this rust project.

Simon, I see that you were also interested in Go vibe coding, haha; this project has independent tests too! Perhaps you can try this out as well and see how it goes! It would be interesting to see stuff then!

Alright time for me to sleep now, good night!

Imustaskforhelp 1/27/2026|||
Haha yeah, emsh and I were actually talking about it on Bluesky (which I saw after seeing your Bluesky; I didn't know both you and emsh were on bsky, haha).

https://bsky.app/profile/emsh.cat/post/3mdgobfq4as2p

But basically I got curious, and you can see from my other comments to you how much I love golang, so I decided to port the project from rust to golang, and emsh predicts that the project's codebase can even shrink to 10k!

(Although one point is that I don't have CC; I am trying it out on the recently released Kimi k2.5 model and their code, because I decided to use that to see the real-world use case of an open-source model as well!)

Edit: I had written this comment just 2 minutes before you wrote, but then I decided to write the golang project.

I mean, I think I ate through all of my 200 queries in Kimi Code, and it now does display me a (browser?). I had the shell script as something to test against your website, but it only opens up blank.

I am gonna go sleep so that the 5-hour limits can get recharged, and I will continue this project.

I think it will be really interesting to see this project in golang, there must be good reason for emsh to say the project can be ~10k in golang.

embedding-shape 1/27/2026||
> I think it will be really interesting to see this project in golang, there must be good reason for emsh to say the project can be ~10k in golang.

Oh no, don't read too much into my wild guesses! Very hunch-based, and I'm only human after all.

tonyhart7 1/27/2026||
>one human

>one agent

>one browser

>one million nvidia gpu

embedding-shape 1/28/2026|
Next time I'll do it on my GPU, then it'll be using just a 10K GPU, that's fine right?
mdavid626 1/28/2026||
What’s the point of this?
embedding-shape 1/28/2026|
What's the point of anything really?

A more real answer: Read the first 6 words of the submission article.

mdavid626 1/28/2026||
It feels like this mentality is taking over the world. Porn instead of sex, short videos instead of real life interactions, AI generated code instead of software engineering, sugar/chemicals instead of food and so on...

All, just to have fun.

Very sad.

embedding-shape 1/28/2026||
> AI generated code instead of software engineering

This is exactly what's wrong with Cursor's approach, and why we need better tools for collaborating, so we don't lose the engineering part of software development.

I, just like you, am fucking tired of all the slop being constantly pushed as something great. This + my previous blog entries are all about pushing back on the slop.

mdavid626 1/28/2026||
This is exactly the problem. People use AI to generate projects and expect that other people will celebrate and value them as they value similar human-written projects.

They fail to see where the value really is. They try to cheat the system and get the admiration of others, but without putting in any real value.
