Posted by vinhnx 5 hours ago
Right now when Claude Code (or any agent) executes a plan, it typically has the same broad permissions for every step. But ideally, each execution step should only have access to the specific tools and files it needs — least privilege, applied to AI workflows.
I've been experimenting with declarative permission manifests for agent tasks. Instead of giving the agent blanket access, you define upfront what each skill can read, write, and execute. Makes the planning phase more constrained but the execution phase much safer.
Anyone else thinking about this from a security-first angle?
The practice is:
- simple
- effective
- retains control and quality
Certainly the “unsupervised agent” workflows are getting a lot of attention right now, but they require a specific set of circumstances to be effective:
- clear validation loop (eg. Compile the kernel, here is gcc that does so correctly)
- ai enabled tooling (mcp / cli tool that will lint, test and provide feedback immediately)
- oversight to prevent sgents going off the rails (open area of research)
- an unlimited token budget
That means that most people can't use unsupervised agents.
Not that they dont work; Most people have simply not got an environment and task that is appropriate.
By comparison, anyone with cursor or claude can immediately start using this approach, or their own variant on it.
It does not require fancy tooling.
It does not require an arcane agent framework.
It works generally well across models.
This is one of those few genunie pieces of good practical advice for people getting into AI coding.
Simple. Obviously works once you start using it. No external dependencies. BYO tools to help with it, no “buy my AI startup xxx to help”. No “star my github so I can a job at $AI corp too”.
Great stuff.
The “easy” path of “short prompt declaring what I want” works OK for simple tasks but consistently breaks down for medium to high complexity tasks.
This makes no sense to my intuition of how an LLM works. It's not that I don't believe this works, but my mental model doesn't capture why asking the model to read the content "more deeply" will have any impact on whatever output the LLM generates.
Same reason that "Pretend you are an MIT professor" or "You are a leading Python expert" or similar works in prompts. It tells the model to pay attention to the part of the corpus that has those terms, weighting them more highly than all the other programming samples that it's run across.
Just a theory.
So if you send a python code then the first one in function can be one expert, second another expert and so on.
Unless someone can come up with some kind of rigorous statistics on what the effect of this kind of priming is it seems no better than claiming that sacrificing your first born will please the sun god into giving us a bountiful harvest next year.
Sure, maybe this supposed deity really is this insecure and needs a jolly good pep talk every time he wakes up. or maybe you’re just suffering from magical thinking that your incantations had any effect on the random variable word machine.
The thing is, you could actually prove it, it’s an optimization problem, you have a model, you can generate the statistics, but no one as far as I can tell has been terribly forthcoming with that , either because those that have tried have decided to try to keep their magic spells secret, or because it doesn’t really work.
If it did work, well, the oldest trick in computer science is writing compilers, i suppose we will just have to write an English to pedantry compiler.
"Add tests to this function" for GPT-3.5-era models was much less effective than "you are a senior engineer. add tests for this function. as a good engineer, you should follow the patterns used in these other three function+test examples, using this framework and mocking lib." In today's tools, "add tests to this function" results in a bunch of initial steps to look in common places to see if that additional context already exists, and then pull it in based on what it finds. You can see it in the output the tools spit out while "thinking."
So I'm 90% sure this is already happening on some level.
Now? We have AGENTS.md files that look like a parent talking to a child with all the bold all-caps, double emphasis, just praying that's enough to be sure they run the commands you want them to be running
(1 Outside of some core ML developers at the big model companies)
Practice playing songs by ear and after 2 weeks, my brain has developed an inference model of where my fingers should go to hit any given pitch.
Do I have any idea how my brain’s model works? No! But it tickles a different part of my brain and I like it.
thats hilarious. i definitely treat claude like shit and ive noticed the falloff in results.
if there's a source for that i'd love to read about it.
See, uhhh, https://pmc.ncbi.nlm.nih.gov/articles/PMC8052213/ and maybe have a shot at running claude while playing Enya albums on loop.
/s (??)
sometimes internet arguments get messy, people die on their hills and double / triple down on internet message boards. since historic internet data composes a bit of what goes into an llm, would it make sense that bad-juju prompting sends it to some dark corners of its training model if implementations don't properly sanitize certain negative words/phrases ?
in some ways llm stuff is a very odd mirror that haphazardly regurgitates things resulting from the many shades of gray we find in human qualities.... but presents results as matter of fact. the amount of internet posts with possible code solutions and more where people egotistically die on their respective hills that have made it into these models is probably off the charts, even if the original content was a far cry from a sensible solution.
all in all llm's really do introduce quite a bit of a black box. lot of benefits, but a ton of unknowns and one must be hyperviligant to the possible pitfalls of these things... but more importantly be self aware enough to understand the possible pitfalls that these things introduce to the person using them. they really possibly dangerously capitalize on everyones innate need to want to be a valued contributor. it's really common now to see so many people biting off more than they can chew, often times lacking the foundations that would've normally had a competent engineer pumping the brakes. i have a lot of respect/appreciation for people who might be doing a bit of claude here and there but are flat out forward about it in their readme and very plainly state to not have any high expectations because _they_ are aware of the risks involved here. i also want to commend everyone who writes their own damn readme.md.
these things are for better or for worse great at causing people to barrel forward through 'problem solving', which is presenting quite a bit of gray area on whether or not the problem is actually solved / how can you be sure / do you understand how the fix/solution/implementation works (in many cases, no). this is why exceptional software engineers can use this technology insanely proficiently as a supplementary worker of sorts but others find themselves in a design/architect seat for the first time and call tons of terrible shots throughout the course of what it is they are building. i'd at least like to call out that people who feel like they "can do everything on their own and don't need to rely on anyone" anymore seem to have lost the plot entirely. there are facets of that statement that might be true, but less collaboration especially in organizations is quite frankly the first steps some people take towards becoming delusional. and that is always a really sad state of affairs to watch unfold. doing stuff in a vaccuum is fun on your own time, but forcing others to just accept things you built in a vaccuum when you're in any sort of team structure is insanely immature and honestly very destructive/risky. i would like to think absolutely no one here is surprised that some sub-orgs at Microsoft force people to use copilot or be fired, very dangerous path they tread there as they bodyslam into place solutions that are not well understood. suddenly all the leadership decisions at many companies that have made to once again bring back a before-times era of offshoring work makes sense: they think with these technologies existing the subordinate culture of overseas workers combined with these techs will deliver solutions no one can push back on. great savings and also no one will say no.
Without something quantifiable it's not much better then someone who always wears the same jersey when their favorite team plays, and swears they play better because of it.
think of the latent space inside the model like a topological map, and when you give it a prompt, you're dropping a ball at a certain point above the ground, and gravity pulls it along the surface until it settles.
caveat though, thats nice per-token, but the signal gets messed up by picking a token from a distribution, so each token you're regenerating and re-distorting the signal. leaning on language that places that ball deep in a region that you want to be makes it less likely that those distortions will kick it out of the basin or valley you may want to end up in.
if the response you get is 1000 tokens long, the initial trajectory needed to survive 1000 probabilistic filters to get there.
or maybe none of that is right lol but thinking that it is has worked for me, which has been good enough
The claw machine is also a sort-of-lie, of course. Its main appeal is that it offers the illusion of control. As a former designer and coder of online slot machines... totally spin off into pages on this analogy, about how that illusion gets you to keep pulling the lever... but the geographic rendition you gave is sort of priceless when you start making the comparison.
- You are a Python Developer... or - You are a Professional Python Developer... or - You are one of the World most renowned Python Experts, with several books written on the subject, and 15 years of experience in creating highly reliable production quality code...
You will notice a clear improvement in the quality of the generated artifacts.
That's very different from "think deeper". I'm just curious about this case in specific :)
"Large Language Models Understand and Can be Enhanced by Emotional Stimuli": https://arxiv.org/abs/2307.11760
In image generation, it's fairly common to add "masterpiece", for example.
I don't think of the LLM as a smart assistant that knows what I want. When I tell it to write some code, how does it know I want it to write the code like a world renowned expert would, rather than a junior dev?
I mean, certainly Anthropic has tried hard to make the former the case, but the Titanic inertia from internet scale data bias is hard to overcome. You can help the model with these hints.
Anyway, luckily this is something you can empirically verify. This way, you don't have to take anyone's word. If anything, if you find I'm wrong in your experiments, please share it!
I am not sure if we know why really, but they are that way and you need to explicitly prompt around it.
Lazy thinking makes LLMs do surface analysis and then produce things that are wrong. Neurotic thinking will see them over-analyze, and then repeatedly second-guess themselves, repeatedly re-derive conclusions.
Something very similar to an anxiety loop in humans, where problems without solutions are obsessed about in circles.
I craft a detailed and ordered set of lecture notes in a Quarto file and then have a dedicated claude code skill for translating those notes into Slidev slides, in the style that I like.
Once that's done, much like the author, I go through the slides and make commented annotations like "this should be broken into two slides" or "this should be a side-by-side" or "use your generate clipart skill to throw an image here alongside these bullets" and "pull in the code example from ../examples/foo." It works brilliantly.
And then I do one final pass of tweaking after that's done.
But yeah, annotations are super powerful. Token distance in-context and all that jazz.
The author mentions annotations but doesn't go into detail about how to feed the annotations to Claude.
<!-- TODOCLAUDE: Split this into a two-cols-title, divide the examples between -->
or <!-- TODOCLAUDE: Use clipart skill to make an image for this slide -->
And then, when I finish annotating I just say: "Address all the TODOCLAUDEs"But it's not hard to build one. The key for me was describing, in great detail:
1. How I want it to read the source material (e.g., H1 means new section, H2 means at least one slide, a link to an example means I want code in the slide)
2. How to connect material to layouts (e.g., "comparison between two ideas should be a two-cols-title," "walkthrough of code should be two-cols with code on right," "learning objectives should be side-title align:left," "recall should be side-title align:right")
Then the workflow is:
1. Give all those details and have it do a first pass.
2. Give tons of feedback.
3. At the end of the session, ask it to "make a skill."
4. Manually edit the skill so that you're happy with the examples.
First, the "big bang" write it all at once. You are going to end up with thousands of lines of code that were monolithically produced. I think it is much better to have it write the plan and formulate it as sensible technical steps that can be completed one at a time. Then you can work through them. I get that this is not very "vibe"ish but that is kind of the point. I want the AI to help me get to the same point I would be at with produced code AND understanding of it, just accelerate that process. I'm not really interested in just generating thousands of lines of code that nobody understands.
Second, the author keeps refering to adjusting the behaviour, but never incorporating that into long lived guidance. To me, integral with the planning process is building an overarching knowledge base. Every time you're telling it there's something wrong, you need to tell it to update the knowledge base about why so it doesn't do it again.
Finally, no mention of tests? Just quick checks? To me, you have to end up with comprehensive tests. Maybe to the author it goes without saying, but I find it is integral to build this into the planning. Certain stages you will want certain types of tests. Some times in advance of the code (so TDD style) other times built alongside it or after.
It's definitely going to be interesting to see how software methodology evolves to incorporate AI support and where it ultimately lands.
I get the PLAN.md (or equivalent) to be separated into "phases" or stages, then carefully prompt (because Claude and Codex both love to "keep going") it to only implement that stage, and update the PLAN.md
Tests are crucial too, and form another part of the plan really. Though my current workflow begins to build them later in the process than I would prefer...
There's no winner for "least amount of code written regardless of productivity outcomes.", except for maybe Anthropic's bank account.
Yesterday I had Claude write an audit logging feature to track all changes made to entities in my app. Yeah you get this for free with many frameworks, but my company's custom setup doesn't have it.
It took maybe 5-10 minutes of wall-time to come up with a good plan, and then ~20-30 min for Claude implement, test, etc.
That would've taken me at least a day, maybe two. I had 4-5 other tasks going on in other tabs while I waited the 20-30 min for Claude to generate the feature.
After Claude generated, I needed to manually test that it worked, and it did. I then needed to review the code before making a PR. In all, maybe 30-45 minutes of my actual time to add a small feature.
All I can really say is... are you sure you're using it right? Have you _really_ invested time into learning how to use AI tools?
Fast forward to today and I tried the tools again--specifically Claude Code--about a week ago. I'm blown away. I've reproduced some tools that took me weeks at full-time roles in a single day. This is while reviewing every line of code. The output is more or less what I'd be writing as a principal engineer.
The problem is LLMs are great at simple implementation, even large amounts of simple implementation, but I've never seen it develop something more than trivial correctly. The larger problem is it's very often subtly but hugely wrong. It makes bad architecture decisions, it breaks things in pursuit of fixing or implementing other things. You can tell it has no concept of the "right" way to implement something. It very obviously lacks the "senior developer insight".
Maybe you can resolve some of these with large amounts of planning or specs, but that's the point of my original comment - at what point is it easier/faster/better to just write the code yourself? You don't get a prize for writing the least amount of code when you're just writing specs instead.
The original article is, to me, seemingly not that novel. Not because it's a trite example, but because I've begun to experience massive gains from following the same basic premise as the article. And I can't believe there's others who aren't using like this.
I iterate the plan until it's seemingly deterministic, then I strip the plan of implementation, and re-write it following a TDD approach. Then I read all specs, and generate all the code to red->green the tests.
If this commenter is too good for that, then it's that attitude that'll keep him stuck. I already feel like my projects backlog is achievable, this year.
This is 100% incorrect, but the real issue is that the people who are using these llms for non-trivial work tend to be extremely secretive about it.
For example, I view my use of LLMs to be a competitive advantage and I will hold on to this for as long as possible.
Does it write maintainable code? Does it write extensible code? Does it write secure code? Does it write performant code?
My experience has been it failing most of these. The code might "work", but it's not good for anything more than trivial, well defined functions (that probably appeared in it's training data written by humans). LLMs have a fundamental lack of understanding of what they're doing, and it's obvious when you look at the finer points of the outcomes.
That said, I'm sure you could write detailed enough specs and provide enough examples to resolve these issues, but that's the point of my original comment - if you're just writing specs instead of code you're not gaining anything.
But the aha moment for me was what’s maintainable by AI vs by me by hand are on different realms. So maintainable has to evolve from good human design patterns to good AI patterns.
Specs are worth it IMO. Not because if I can spec, I could’ve coded anyway. But because I gain all the insight and capabilities of AI, while minimizing the gotchas and edge failures.
How do you square that with the idea that all the code still has to be reviewed by humans? Yourself, and your coworkers
So maybe it's that we won't be reviewing by hand anymore? I.e. it's LLMs all the way down. Trying to embrace that style of development lately as unnatural as it feels. We're obv not 100% there yet but Claude Opus is a significant step in that direction and they keep getting better and better.
yes, if I steer it properly.
It's very good at spotting design patterns, and implementing them. It doesn't always know where or how to implement them, but that's my job.
The specs and syntactic sugar are just nice quality of life benefits.
The compounding is much greater than my brain can do on its own.
But did you truly think about such feature? Like guarantees that it should follow (like how do it should cope with entities migration like adding a new field) or what the cost of maintaining it further down the line. This looks suspiciously like drive-by PR made on open-source projects.
> That would've taken me at least a day, maybe two.
I think those two days would have been filled with research, comparing alternatives, questions like "can we extract this feature from framework X?", discussing ownership and sharing knowledge,.. Jumping on coding was done before LLMs, but it usually hurts the long term viability of the project.
Adding code to a project can be done quite fast (hackatons,...), ensuring quality is what slows things down in any any well functioning team.
Some things are complex.
You could've been curious and ask why it would take 1-2 days, and I would've happily told you.
I wanted to add audit logging for all endpoints we call, all places we call the DB, etc. across areas I haven't touched before. It would have taken me a while to track down all of the touchpoints.
Granted, I am not 100% certain that Claude didn't miss anything. I feel fairly confident that it is correct given that I had it research upfront, had multiple agents review, and it made the correct changes in the areas that I knew.
Also I'm realizing I didn't mention it included an API + UI for viewing events w/ pretty deltas
I think the method in TFA is overall less stressful for the dev. And you can always fix it up manually in the end; AI coding vs manual coding is not either-or.
That said, if you're on a serious team writing professional software there is still tons of value in always telling AI to plan first, unless it's a small quick task. This post just takes it a few steps further and formalizes it.
I find Cursor works much more reliably using plan mode, reviewing/revising output in markdown, then pressing build. Which isn't a ton of overhead but often leads to lots of context switching as it definitely adds more time.
I find the best way to use agents (and I don't use claude) is to hash it out like I'm about to write these changes and I make my own mental notes, and get the agent to execute on it.
Agents don't get tired, they don't start fat fingering stuff at 4pm, the quality doesn't suffer. And they can be parallelised.
Finally, this allows me to stay at a higher level and not get bogged down of "right oh did we do this simple thing again?" which wipes some of the context in my mind and gets tiring through the day.
Always, 100% review every line of code written by an agent though. I do not condone committing code you don't 'own'.
I'll never agree with a job that forces developers to use 'AI', I sometimes like to write everything by hand. But having this tool available is also very powerful.
This new version that I'm doing (from scratch with ChatGPT web) has a far more ambitious scope and is already at the "usable" point. Now I'm primarily solidifying things and increasing test coverage. And I've tested the key parts with IRL scenarios to validate that it's not just passing tests; the thing actually fulfills its intended function so far. Given the increased scope, I'm guessing it'd take me a few months to get to this point on my own, instead of under a week, and the quality wouldn't be where it is. Not saying I haven't had to wrangle with ChatGPT on a few bugs, but after a decent initial planning phase, my prompts now are primarily "Do it"s and "Continue"s. Would've likely already finished it if I wasn't copying things back and forth between browser and editor, and being forced to pause when I hit the message limit.
I recommend to try out Opencode with this approach, you might find it less tiring than ChatGPT web (yes it works with your ChatGPT Plus sub).
This! Once I'm familiar with the codebase (which I strive to do very quickly), for most tickets, I usually have a plan by the time I've read the description. I can have a couple of implementation questions, but I knew where the info is located in the codebase. For things, I only have a vague idea, the whiteboard is where I go.
The nice thing with such a mental plan, you can start with a rougher version (like a drawing sketch). Like if I'm starting a new UI screen, I can put a placeholder text like "Hello, world", then work on navigation. Once that done, I can start to pull data, then I add mapping functions to have a view model,...
Each step is a verifiable milestone. Describing them is more mentally taxing than just writing the code (which is a flow state for me). Why? Because English is not fit to describe how computer works (try describe a finite state machine like navigation flow in natural languages). My mental mental model is already aligned to code, writing the solution in natural language is asking me to be ambiguous and unclear on purpose.
Sadly my post didn't much attention at the time.
https://thegroundtruth.media/p/my-claude-code-workflow-and-p...
- Specs: these are generally static, but updatable as the project evolves. And they're broken out to an index file that gives a project overview, a high-level arch file, and files for all the main modules. Roughly ~1k lines of spec for 10k lines of code, and try to limit any particular spec file to 300 lines. I'm intimately familiar with every single line in these.
- Plans: these are the output of a planning session with an LLM. They point to the associated specs. These tend to be 100-300 lines and 3 to 5 phases.
- Working memory files: I use both a status.md (3-5 items per phase roughly 30 lines overall), which points to a latest plan, and a project_status (100-200 lines), which tracks the current state of the project and is instructed to compact past efforts to keep it lean)
- A planner skill I use w/ Gemini Pro to generate new plans. It essentially explains the specs/plans dichotomy, the role of the status files, and to review everything in the pertinent areas of code and give me a handful of high-level next set of features to address based on shortfalls in the specs or things noted in the project_status file. Based on what it presents, I select a feature or improvement to generate. Then it proceeds to generate a plan, updates a clean status.md that points to the plan, and adjusts project_status based on the state of the prior completed plan.
- An implementer skill in Codex that goes to town on a plan file. It's fairly simple, it just looks at status.md, which points to the plan, and of course the plan points to the relevant specs so it loads up context pretty efficiently.
I've tried the two main spec generation libraries, which were way overblown, and then I gave superpowers a shot... which was fine, but still too much. The above is all homegrown, and I've had much better success because it keeps the context lean and focused.
And I'm only on the $20 plans for Codex/Gemini vs. spending $100/month on CC for half year prior and move quicker w/ no stall outs due to token consumption, which was regularly happening w/ CC by the 5th day. Codex rarely dips below 70% available context when it puts up a PR after an execution run. Roughly 4/5 PRs are without issue, which is flipped against what I experienced with CC and only using planning mode.
I have found it to work very well with Claude by giving it context and guardrails. Basically I just tell it "follow the guidance docs" and it does. Couple that with intense testing and self-feedback mechanisms and you can easily keep Claude on track.
I have had the same experience with Codex and Claude as you in terms of token usage. But I haven't been happy with my Codex usage; Claude just feels like it's doing more of what I want in the way I want.