Posted by ingve 11 hours ago

The Coming Loop(lucumr.pocoo.org)
241 points | 190 comments | page 2
wolttam 7 hours ago|
I am 100% for fully agentic loops... for tasks other than engineering.

I'm not willing to outsource the part of myself that understands how things work. That part of myself is what got me into computing in the first place.

If this work becomes simply a matter of describing intent to a machine (probably through an Issue, like a user), and going to check on the result when you get the 'done' notification: I'm done.

It's possible to use the tools to do awesome things without letting go of full system understanding of the parts that you look after.

yanis_t 9 hours ago||
I keep thinking about the point at which I should stop forcing myself into the loop. As a developer I really like working on the code structure, making it clearer, thinking about good abstractions, breaking things into modules, etc. I really take pleasure in it. At the same time I understand that at some point I become the limiting factor.

If the point of the software is to benefit people, should I still care about how the code looks?

Right now, I still think that the answer is yes, but in 3 years? in 10 years?

wartywhoa23 2 hours ago||
> If the point of the software is to benefit people, should I still care about how the code looks?

The answer is yes you should, as long as you want to keep software benefiting people.

steezeburger 4 hours ago|||
It's tough if you're somewhere that isn't very meaningful to you beyond the technology. I think there will be an existential shift soon towards more fulfilling work. Maybe I'm naive or that's just what I feel I need for myself.
cadamsdotcom 8 hours ago||
You will always be able to ask the agent to do refactors for you - and it can do mega ones that exhaust you to think about!
yanis_t 7 hours ago|||
Two problems: I will not get my pleasure, and I will still not know how the code works.
datadrivenangel 8 hours ago||||
Agentic refactoring is very questionable if you want to maintain quality, as it will rewrite all your code to be more average.
zahlman 1 hour ago||
I've found that if you look over the code and notice and describe a specific problem and solution, the agent can apply a refactoring for you well enough; and that's often faster than editing the file yourself even if you already know exactly what to do.

The idea of setting up an agentic loop to review code and propose and implement refactorings still seems pretty awful to me, though, yeah. Maybe cut that off at the first green-bar revision, and then apply some actual taste and judgement.

goatlover 2 hours ago|||
Or you can do it yourself. Use AI as a prototyping tool or for boring throw away tasks. There's no reason everyone has to succumb to vibe coding.
dataviz1000 3 hours ago||
I’m having awesome success working with recursive agents. I discussed my experience with them. [0]

> Claude's attention doesn't distinguish between "instructions I'm writing" and "instructions I'm following" -- they're both just tokens in context.

It takes a little human help in the first iterations but after a while it will start to iterate and improve unsupervised.

[0] https://github.com/adam-s/agent-tuning

piker 9 hours ago||
We used a “loop” before it was called that to drive MS-DOC support into Tritium. Based on that experience, I take issue with this:

“There are already impressive examples of large automatic porting efforts, including the reported work around moving parts of Bun from Zig to Rust.” (Emphasis added.)

It will be impressive if/when the Bun team is able to pick up and continue extending and supporting Bun. For us, MS-DOC remains read-only and probably perpetually buggy until we reimplement with a better understanding. Until then, it’s definitely not “impressive”. Functional? Maybe. Impressive, no.

stuartaxelowen 2 hours ago||
This blog post points to the fact that you need information across scales to make really insightful products and software. You need to understand the fundamental mechanisms, strengths, and risks of your software to know where to make bets next. You need to know the "how" of your optimization system to know which customer asks to deny.

Using layers like the loops described here to abdicate your work is you decoupling from the joint market/engineering value you originally provided.

furyofantares 4 hours ago||
I have had some success with /goal for long tasks that can be set up in a way that the agent can do good work for an extended period of time.

A lot of tasks aren't amenable to that, and the ones that are still need a lot of care to be set up correctly. The default vibe coded codebase won't be.

I've come to think of the activity of choosing the right technology, the right architecture, the right testing setup, the right context, and the right /goals to use as programming the agent.

kakugawa 3 hours ago|
How much does /goal actually help? In auto mode, I've tried using and not using /goal and I haven't felt a difference.

https://code.claude.com/docs/en/goal#how-evaluation-works

> /goal is a wrapper around a session-scoped prompt-based Stop hook. Each time Claude finishes a turn, the condition and the conversation so far are sent to your configured small fast model, which defaults to Haiku. The model returns a yes-or-no decision and a short reason. A “no” tells Claude to keep working and includes the reason as guidance for the next turn. A “yes” clears the goal and records an achieved entry in the transcript.

> The evaluator runs on whichever provider your session is configured for. It does not call tools, so it can only judge what Claude has already surfaced in the conversation.

Apparently, it uses Haiku (by default) to evaluate every turn to determine if the goal has been achieved. However, it only relies on the transcript itself (including the reasoning of the main model). It can't independently verify if the goal has been achieved. So, if the main model thinks the goal is or isn't done, how often does Haiku disagree (in a productive way)? That's not clear to me.
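The evaluation loop quoted above can be sketched in a few lines. This is only an illustration of the described behavior, not Claude Code's implementation: `run_goal_loop`, `Verdict`, `fake_worker`, and `fake_evaluator` are hypothetical names, and the worker and evaluator are plain functions standing in for the main model and the small evaluator model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    done: bool    # the yes-or-no decision
    reason: str   # the short reason returned with it


def run_goal_loop(
    work_turn: Callable[[List[str]], str],
    evaluate: Callable[[str, List[str]], Verdict],
    goal: str,
    max_turns: int = 10,
) -> Tuple[bool, List[str]]:
    """Run turns until the evaluator says the goal is met or turns run out."""
    transcript: List[str] = []
    for _ in range(max_turns):
        transcript.append(work_turn(transcript))
        # The evaluator sees only the goal and the transcript. It calls no tools.
        verdict = evaluate(goal, transcript)
        if verdict.done:
            transcript.append(f"[goal achieved] {verdict.reason}")
            return True, transcript
        # A "no" goes back into the transcript as guidance for the next turn.
        transcript.append(f"[keep working] {verdict.reason}")
    return False, transcript


# Hypothetical stand-ins so the sketch runs without any model.
def fake_worker(transcript: List[str]) -> str:
    turn = sum(1 for line in transcript if line.startswith("turn")) + 1
    status = "tests passing" if turn >= 3 else "tests failing"
    return f"turn {turn}: {status}"


def fake_evaluator(goal: str, transcript: List[str]) -> Verdict:
    if "tests passing" in transcript[-1]:
        return Verdict(True, "transcript shows tests passing")
    return Verdict(False, "no passing test run in the transcript yet")


done, log = run_goal_loop(fake_worker, fake_evaluator, "all tests pass")
```

The limitation is visible in `fake_evaluator`: it can only pattern-match on what the worker wrote down, so if the worker misreports its own status, the evaluator has nothing independent to check against.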

furyofantares 2 hours ago||
I've mainly used the feature in codex, where I've been able to get it to work for 5 straight days (with breaks when rate limits are hit -- which was surprisingly only thrice) on a massive port.

I don't know how well it works in claude code, but I wouldn't be worried about Haiku getting it wrong and don't see a problem with it relying on the transcript. I always set these things up to maintain a checklist of subtasks to do in a file and check them off, and to always implement with red/green testing methodology, where it writes and commits failing tests, then writes the feature/fixes the bug and commits with passing tests and with an updated checklist file.

So the model should always know from the transcript whether the current task is done by whether it shows the tests passing, and it should always know whether there are more tasks left from the checklist file being updated before the commit.
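A minimal sketch of that done-check, assuming the checklist file uses Markdown-style `- [ ]` / `- [x]` task lines (the comment does not say which format is used). `parse_checklist` and `next_step` are hypothetical helpers, and `tests_passed` stands in for whatever the test runner reported in the transcript.

```python
import re
from typing import List, Tuple

# Matches lines such as "- [ ] port the CLI" or "* [x] port the parser".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def parse_checklist(text: str) -> List[Tuple[bool, str]]:
    """Return (checked, description) for every task line in the file."""
    items = []
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            items.append((match.group(1).lower() == "x", match.group(2).strip()))
    return items


def next_step(checklist_text: str, tests_passed: bool) -> str:
    """Decide what the agent should do next under a red/green workflow."""
    remaining = [name for checked, name in parse_checklist(checklist_text) if not checked]
    if not tests_passed:
        # Red: failing tests are committed, so the current subtask is unfinished.
        return "fix: make the failing tests pass"
    if remaining:
        # Green with work left: start the next subtask with new failing tests.
        return f"next: write failing tests for '{remaining[0]}'"
    return "done"


SAMPLE = """\
- [x] port the parser
- [ ] port the serializer
- [ ] port the CLI
"""

step = next_step(SAMPLE, tests_passed=True)
```

The point of the setup is that both inputs, the test status and the checklist, end up in the transcript, so a transcript-only evaluator has something concrete to judge.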

inline_always 2 hours ago||
The bottleneck has always been 'verification' and 'trust'. That's why we have senior engineers, the same way you need a head architect to sign off on a blueprint: when things go bad, you need a human agent to be the responsible party. Even if we manage to teach a herd of dumb AIs to produce massive amounts of code, who's going to trust that output with their life?
wahnfrieden 2 hours ago|
That's the entire topic - loops are not just infinite output, they require automated verification and progress evaluation steps.

The game is to find ways to automate that. Not fully but yes to reduce what's required from humans. Seems like you're questioning the entire premise rather than pondering how far it can be taken and how.

gcanyon 9 hours ago||
I'm a software developer from way back, using tools and languages that coding agents are far less familiar with.

So when I use an agent to write code, it's in languages I'm less familiar with, and often using libraries I know nothing about.

All to say, my part of the process often ends up being:

1. "Here's what I'm looking for, in detail"
2. "That's not right. Here's one way it's not right, and a specific example. Please fix that."
3. Sometimes I give suggestions for how what is going wrong might be happening, or conceptually how to work around the issue.
4. And iterate on 2-3 until the result is close enough.

That's a loop I'd love to automate.
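Steps 1, 2, and 4 of that process can be sketched as a loop that feeds one concrete failing example back per round, assuming the "what I'm looking for" can be written down as input/output examples. Everything here is hypothetical: `refine`, `first_failure`, and `fake_propose` are illustrative names, and `fake_propose` stands in for the agent. Step 3, suggesting why something goes wrong, is the part this sketch does not automate.

```python
from typing import Callable, List, Optional, Tuple

Example = Tuple[str, str]  # (input, expected output)
Candidate = Callable[[str], str]


def first_failure(candidate: Candidate, examples: List[Example]) -> Optional[str]:
    """Step 2: describe one specific way the result is wrong, or None if it passes."""
    for given, expected in examples:
        actual = candidate(given)
        if actual != expected:
            return f"For input {given!r} expected {expected!r} but got {actual!r}."
    return None


def refine(
    propose: Callable[[List[str]], Candidate],
    examples: List[Example],
    max_rounds: int = 5,
) -> Tuple[Optional[Candidate], List[str]]:
    """Step 4: iterate until every example passes or the rounds run out."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(feedback)  # the agent sees all feedback so far
        failure = first_failure(candidate, examples)
        if failure is None:
            return candidate, feedback
        feedback.append(failure)
    return None, feedback


# Hypothetical agent: each round of feedback nudges it to a better attempt.
def fake_propose(feedback: List[str]) -> Candidate:
    attempts = [str.upper, str.strip, lambda s: s.strip().upper()]
    return attempts[min(len(feedback), len(attempts) - 1)]


examples = [(" hi ", "HI"), ("ok", "OK")]
result, notes = refine(fake_propose, examples)
```

"Close enough" becomes "all examples pass" here, which is stricter than the human judgment described above and only as good as the examples themselves.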

timmytokyo 2 hours ago||
Sounds like a great way to avoid learning anything new about the languages you don't already know.
handfuloflight 3 hours ago||
Have you tried SKILL.MD files encoding your nuanced domain knowledge?
contagiousflow 8 hours ago||
> You Cannot Quite Opt Out

I am so over this. I cannot take seriously anyone who claims their ideas are inevitable and that you must adopt them or be "left behind". If these tools are so good and so capable, the results should be able to speak for themselves without this FOMO-inducing, emotional language.

dvogel 8 hours ago||
I couldn't agree more. Thus far I'm still objectively more productive than all of the AI enthusiasts I've worked with. I think a lot of the activity with these tools is coming from people who just enjoy using them more than they enjoyed coding. They feel more productive not because they are producing more but because they are producing somewhat less with much less effort. It takes them roughly the same amount of time even if it changes the distribution of time spent on each task.

> and in recent weeks it has started to dominate the Twitter discourse.

As a general rule, I don't waste my time with the advice of people who still think Twitter is a source of wisdom.

timmytokyo 2 hours ago|||
It's especially rich when you consider the shitty code the "loop" produces. The article quotes Boris Cherny as if he's someone to look up to. But then you look at the code he produces with "loops", and it's an unmaintainable ball of mud:

[1] https://neuromatch.social/@jonny/116324676116121930

[2] https://neuromatch.social/@jonny/116349873176941251

ElatedOwl 7 hours ago|||
The point of that section is that attackers and security researchers will use (or already are using) loops, and you as the maintainer cannot opt out of others doing this. You are an unwilling participant.
jcgrillo 6 hours ago||
No, but you always have the option to opt out of being a maintainer. If "the community" is going to behave badly and bury you under a barrage of vibeslop you can just leave.

EDIT: there's a version of this that could be positive--everything could get a hell of a lot more secure. But that doesn't seem to be what's happening.

unknownfuture 8 hours ago|||
You're being uncharitable. I don't read it as intentionally FOMO inducing. I read it as the exhausted sigh of resignation from someone who sees where the wind is blowing whether they like it or not. I see it as someone watching tech management and execs listening rapt as Boris pours the poison of AI maximalism in their ears. I read it as someone who sees developers around them either drinking from that same poisoned well or bowing under the pressure from those leaders to adopt AI or lose their livelihoods.

It is true that the author is incorrect: you can certainly opt out, but you won't be opting out of AI, you'll be opting out of the industry.

contagiousflow 7 hours ago||
"The industry" is not some monolith, and treating it as such is not productive. There are all types of software and many ways in which it is created. If the companies that are "AI enabled" are so much better, we should see some big changes soon. But I'm still waiting for products I use from "AI enabled" companies to start churning out features at unprecedented speed.
unknownfuture 7 hours ago||
So then what's your experience in your current place of work?
Lambdanaut 8 hours ago||
In my experience, some language like this is the result of witnessing it speak for itself.
agumonkey 5 hours ago|
I can't help but be tired of the LLM trend where people bang at loops and hope the model sculpts something. It feels so mentally empty to just have results without constructing them.

That said, the idea of a loop has always been there (iteration, V cycle, etc.), but I'd be glad to find people with more theory and fewer agents swinging blindly, so to speak.
