
Posted by bsuh 1 day ago

Agents need control flow, not more prompts(bsuh.bearblog.dev)
561 points | 277 comments
throwthrowuknow 19 hours ago|
Isn’t this basically what Palantir does?
try-working 1 day ago||
that's why you need a recursive workflow that creates its own artifacts per step that can later be used for verification.
Nizoss 23 hours ago|
Sounds interesting, can you elaborate on your thinking? Got me curious.
try-working 17 hours ago|||
how do you verify the work that was just done in the current stage? you verify it against the output artifacts from the previous stages. for example, if you have a requirements doc, you can analyse the codebase for its current state and store that as a doc, then generate the implementation plan from the delta between the requirements and the current state. after implementation, create an implementation summary doc. to verify the implementation in the next stage, compare the implementation summary against the implementation plan, the previous codebase analysis, and the original requirements doc, as well as the codebase diffs.

so, every stage outputs a source of truth for that stage, which can be used by later stages for verification, alone or together with other artifacts. if you want to read more, here's the recursive-mode development workflow I built: https://recursive-mode.dev/introduction
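
A minimal Python sketch of the artifact-per-stage idea described above. It is not the recursive-mode implementation: the `Artifacts` store, the stage names, and the set-based comparison are all hypothetical stand-ins, and real artifacts would be documents rather than sets of feature names.

```python
from dataclasses import dataclass, field


@dataclass
class Artifacts:
    """Per-stage outputs keyed by stage name (hypothetical in-memory store)."""
    docs: dict[str, set[str]] = field(default_factory=dict)

    def put(self, stage: str, items: set[str]) -> None:
        # Each stage writes its source of truth exactly once.
        if stage in self.docs:
            raise ValueError(f"artifact for stage {stage!r} already exists")
        self.docs[stage] = set(items)

    def get(self, stage: str) -> set[str]:
        # Raises KeyError if an earlier stage was skipped.
        return self.docs[stage]


def plan(store: Artifacts) -> None:
    # The plan is the delta between the requirements and the current state.
    store.put("plan", store.get("requirements") - store.get("current_state"))


def verify(store: Artifacts) -> list[str]:
    """Compare the implementation summary against the earlier artifacts."""
    problems = []
    summary = store.get("summary")
    for item in sorted(store.get("plan") - summary):
        problems.append(f"planned but not implemented: {item}")
    for item in sorted(summary - store.get("requirements")):
        problems.append(f"implemented but never required: {item}")
    return problems


store = Artifacts()
store.put("requirements", {"login", "logout", "password reset"})
store.put("current_state", {"login"})
plan(store)
store.put("summary", {"logout", "dark mode"})  # what the agent claims it did
print(verify(store))
```

The point of the design is that `verify` never asks the agent whether it is done; it only compares artifacts that earlier stages already committed to.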

terminalbraid 1 day ago||
My friend, you have invented management.
Nizoss 23 hours ago|
Not throwing shade at anyone here but the thought has definitely crossed my mind that we are recreating SAFe but for agents when looking at some of the orchestration setups out there. I think that it is better to not force the same hierarchical processes that worked for humans in large organizations onto agents and instead look at what they need to give better results and what their failure modes look like.
marvinified 20 hours ago||
Depends on the use case
ncrmro 19 hours ago||
deepwork.md is made for this.
cookiengineer 20 hours ago||
We have control flow. It's requirements specifications and test driven development. You just have to enforce it, so the agents cannot cheat their way around it.
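
A rough Python sketch of what enforcing that in the harness could look like: the loop, not the prompt, decides when a candidate is accepted. `run_gated`, `REQUIREMENTS`, and `fake_agent` are hypothetical names, and the canned agent stands in for a real model call.

```python
from typing import Callable

# Hypothetical machine-checkable requirements; a real setup would run a test suite.
REQUIREMENTS: dict[str, Callable[[str], bool]] = {
    "mentions rollback": lambda text: "rollback" in text,
    "under 80 chars": lambda text: len(text) < 80,
}


def check(candidate: str) -> list[str]:
    """Return the names of all requirements the candidate fails."""
    return [name for name, ok in REQUIREMENTS.items() if not ok(candidate)]


def run_gated(propose: Callable[[str], str],
              check: Callable[[str], list[str]],
              max_attempts: int = 3) -> str:
    """Accept a candidate only when `check` reports no failures."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(feedback)
        failures = check(candidate)
        if not failures:
            return candidate
        feedback = "; ".join(failures)  # fed back into the next attempt
    raise RuntimeError(
        f"no passing candidate after {max_attempts} attempts: {feedback}"
    )


# Stand-in for a model: returns canned answers and ignores the feedback.
answers = iter(["deploy now", "deploy with rollback plan"])


def fake_agent(feedback: str) -> str:
    return next(answers)


result = run_gated(fake_agent, check)
print(result)
```

The agent has no code path that marks its own work as passing, and the attempt count is bounded, so a stuck agent fails loudly instead of looping.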

I decided to build my agentic environment differently. Local only, sandboxed, and enforced with Go-specific requirement definitions that act as a contract the different agent roles cannot break.

That alone is far better than any hyped markdown-storage-sold-as-memory project I've seen in the last few weeks.

Currently I am experimenting with skills tailored to other languages, because agentskills are kinda useless: they're not enforced, and none of their metadata can be used to predictably verify their behavior.

My recommendation to others is: Treat LLM output as malware. Analyse its behavior, not its code. Never let LLMs work outside your sandbox. Make it impossible for them to escape the sandbox. And that includes removing the Bash tool, for example, because that's not a reproducible sandbox.
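
One way to picture the "no Bash tool" rule is an allowlist dispatcher, sketched here in Python. Every name in it (`ALLOWED_TOOLS`, `tool`, `dispatch`, `SANDBOX_FILES`) is hypothetical, and an in-process allowlist is not a sandbox by itself; OS-level isolation is still assumed underneath.

```python
from typing import Callable

ALLOWED_TOOLS: dict[str, Callable[..., str]] = {}

# Hypothetical in-memory stand-in for a sandboxed file system.
SANDBOX_FILES = {"notes.txt": "hello"}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function as a tool the agent may call."""
    ALLOWED_TOOLS[fn.__name__] = fn
    return fn


@tool
def read_file(name: str) -> str:
    if name not in SANDBOX_FILES:
        raise PermissionError(f"{name!r} is outside the sandbox")
    return SANDBOX_FILES[name]


def dispatch(call: dict) -> str:
    """Run a model-issued tool call only if the tool is on the allowlist."""
    name = call.get("tool")
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not allowed")
    args = call.get("args", {})
    if not isinstance(args, dict):  # checked at runtime, not just by type hints
        raise TypeError("args must be an object")
    return ALLOWED_TOOLS[name](**args)


print(dispatch({"tool": "read_file", "args": {"name": "notes.txt"}}))

try:
    dispatch({"tool": "bash", "args": {"cmd": "rm -rf /"}})
except PermissionError as err:
    print(err)
```

Since no `bash` tool is ever registered, the second call is rejected before anything runs, and each registered tool is small enough to unit test.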

Also, choose a language that comes with a strong unit testing methodology. I chose Go because it allows me to write unit tests for my tools, and even for agent-to-agent communication down the line (with some limitations due to TestMain, but at least it's possible).

If you write your agent environment or harness in TypeScript, you already failed before you started. The compiled code isn't type-safe at runtime, because the compiler erases the types and doesn't generate type checks in the resulting JS code.

Anyways, my two cents from the purpleteaming perspective that tries to make LLMs as deterministic as possible.

carterschonwald 20 hours ago||
i mean, of course. i've been working on this for the past few months and i have a bunch of tech towards this in flight, including some harness forks to layer my ideas in, e.g. my oh punkin pi test bed on my github.com/cartazio page. there are some shockingly obvious-once-you-see-them tricks that i think i can stack into a really nice harness product for doing hard, real work with these models more easily
droolingretard 1 day ago||
Are you the guy who used to write MapleStory hacks?
ltbarcly3 1 day ago||
Don't listen to anyone who knows what should be done without proof. If someone 'knows' what agents 'need' then that knowledge is worth millions of dollars right now. If they haven't built it they are probably just talking shit.