
Posted by samwillis 1 day ago

Scaling long-running autonomous coding (cursor.com)
259 points | 162 comments
tired_and_awake 1 day ago|
The moment all code is interacted with through agents, I cease to care about code quality. The only things that matter are the quality of the product, the cost of maintenance, etc.: exactly the things we measure software development orgs against. It could be handy to have these projects deployed to demonstrate their utility and efficacy. Looking at PRs of agents feels wrong-headed; who cares if an agent's code is hard to read if agents are managing the code base?
qingcharles 1 day ago||
We don't read the binary output of our C compilers because we trust it to be correct almost every time. ("It's a compiler bug" is more of a joke than a real issue)

If AI could reach the point where we actually trusted the output, then we might stop checking it.

LiamPowell 1 day ago|||
> "It's a compiler bug" is more of a joke than a real issue

It's a very real issue; people just seem to assume their code is wrong rather than the compiler. I've personally reported 12 GCC bugs over the last 2 years, and there are currently 1239 open wrong-code bugs.

Here's an example of a simple one in the C frontend that has existed since GCC 4.7: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105180

ares623 1 day ago|||
“If” doing a lot of work here
AlexCoventry 1 day ago|||
You should at least read the tests, to make sure they express your intent. Personally, I'm not going to take responsibility for a piece of code unless I've read every line of it and thought hard about whether it does what I think it does.

AI coding agents are still a huge force-multiplier if you take this approach, though.

visarga 1 day ago|||
> Looking at PRs of agents feels wrong-headed

It would be like walking the motorcycle.

icedchai 1 day ago|||
This is how we wound up with non-technical "engineering managers." Looks good to me.
tired_and_awake 1 day ago||
I think this misses the point, see the other comments. Fully scaled agentic coding replaces managers too :) cause for celebration all around
satvikpendem 1 day ago|||
No, it becomes only managers, because they are the ones who dictate the business needs (otherwise, what would the software the agents are making even be doing without such goals), and now it's even worse with non-technical ones.
icedchai 1 day ago|||
I don't believe that. If you go fully agentic and you don't understand the output, you become the manager. You're in no better position than the pointy-haired boss from Dilbert.
tired_and_awake 16 hours ago||
Hey, just wanted to thank you for the healthy back and forth! I respect your opinion and don't hold mine strongly. That said, I'm eager for this space to mature and for us all to figure out the best way to interact with fault-prone code generation tooling... especially at scale, where we all have the hardest time navigating complexity.
icedchai 13 hours ago||
Thanks. It's fun chatting about this stuff! I don't hold mine strongly, either, though I am dealing with lots of AI-generated slop code from others.

Interesting times ahead.

tired_and_awake 3 hours ago||
I feel for you. Hopefully your colleagues come around and realize that if they submit the code they are responsible for the slop.
flyinglizard 1 day ago||
You could look at agents as meta-compilers. The problem is that, unlike real compilers, they aren't verified in any way (neither formally nor informally); in fact, you never know which particular agent you're running against when you ask for something. And unlike compilers, you don't just throw away everything and start afresh on each run. I don't think you could test a reasonably complex system to a degree where it really wouldn't matter what runs underneath, and since you're (probably) going to use other agents to write THOSE tests, what makes you certain they offer real coverage? It's turtles all the way down.
tired_and_awake 1 day ago||
Completely agree, and great points. The conclusion that "agents are writing the tests" etc. is where I'm at as well. Moreover, code quality itself is also an agentic problem, as are compile time, reliability, portability... Turtles all the way down, as you say.

All code interactions happen through agents.

I suppose the question is whether agents only produce Swiss-cheese solutions at scale, with no way to fill in those gaps (at scale). If so, then yeah, fully agentic coding is probably a pipe dream.

On the other hand, if you can stand up a code generation machine where watts + GPUs + time => software products, then well... it's only a matter of time until app stores entirely disappear or get really weird. It's hard to fathom the change that's coming to our profession in that world.

Havoc 22 hours ago||
> long running

I really dislike this as a measure. An LLM on a CPU is also long-running because it's slow.

I get what it's meant to convey, but time is such a terrible measure of anything when tk/s isn't static.
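The objection can be made concrete with back-of-the-envelope arithmetic (the throughput and token counts below are made up for illustration): wall-clock duration conflates how much work was done with how fast it ran.

```python
def run_hours(total_tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock hours to generate total_tokens at a given throughput."""
    return total_tokens / tokens_per_sec / 3600

# A fast GPU deployment and a slow CPU one, both "long-running":
gpu_hours = run_hours(total_tokens=5_000_000, tokens_per_sec=150)
cpu_hours = run_hours(total_tokens=50_000, tokens_per_sec=1.5)

# Both come out to ~9.3 hours of wall-clock time, yet one produced
# 100x the tokens. "Ran for hours" says nothing about how much the
# agent actually did unless throughput is held constant.
```

So "long-running" only works as a proxy for agent capability if tokens/sec is roughly fixed across the systems being compared.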

throwaway63467 1 day ago||
I'm running Opus 4.5, which is arguably their best model, and while it's really good for a lot of work, it always introduces subtle errors or inconsistencies when left unsupervised, since prompts are never good enough to remove all ambiguity for complex asks. So I can't imagine what it will do to a code base when left alone with it for days or weeks.
tgtweak 20 hours ago||
Is it too much to expect companies to share some of this in the open, rather than just the results?
foota 1 day ago||
Slightly off topic, but they want to move from Solid to React? Isn't that the reverse of the newest trend? Would be interesting to know more.
luhego 1 day ago||
> We initially built an integrator role for quality control and conflict resolution, but found it created more bottlenecks than it solved

Of course it creates bottlenecks, since code quality takes time and people don’t get it right on the first try when the changes are complex. I could also be faster if I pushed directly to prod!

Don't get me wrong. I use these tools, and I can see the productivity gains. But I also believe the only way to achieve the results they show is to sacrifice quality, because no software engineer can review changes at the same speed the agent generates code. They may solve that problem, or maybe the industry will change so that only output and LOC matter, but until then I will keep cursing at the agent until I get the result I want.

matthewfcarlson 1 day ago||
It's fascinating that many of the issues they faced are ones I've seen in human software engineering teams.

Things like integration creating bottlenecks, or a lack of consistent top-down direction leading to small, risk-averse changes instead of bold redesigns. All things I've seen before.

2001zhaozhao 1 day ago|
At least the AI teams aren't politically competing against each other, unlike human teams.

(Or are they?)

laszlojamf 1 day ago||
They mention billions of tokens, but I'm left wondering how much this experiment actually cost them...
WOTERMEON 1 day ago||
A weird twist: the hiring call at the end, for a company that says

> Our mission is to automate coding

mdswanson 1 day ago|
Over the past year or so, I've built my own system of agents that behaves almost exactly like this. I can describe what I'd like built before I go to bed and have a fantastic foundation in place by the next day. For simpler projects, they'll be complete. Because of the reviews, the code continually improves until the agents are satisfied. I'm impressed every time.
z_zetetic_z 13 hours ago|
Any chance you would care to share more about this?