How I use Claude Code: Separation of planning and execution

Posted by vinhnx 23 hours ago

How I use Claude Code: Separation of planning and execution(boristane.com)

857 points | 541 commentspage 11

gregman1 15 hours ago|

It is really fun to watch how a baby makes its first steps and also how experienced professionals rediscover what standards were telling us for 80+ years.

__bjoernd 14 hours ago||

Sounds a bit like what Claude Plan Mode or Amazon's Kiro were built for. I agree it's a useful flow, but you can also overdo it.

connectsnk 18 hours ago||

Is it required to tell Claude to re-read the code folder again when you come back some day later or should we ask Claude to just pickup from research.md file thus saving some tokens?

skybrian 22 hours ago||

I do something broadly similar. I ask for a design doc that contains an embedded todo list, broken down into phases. Looping on the design doc asking for suggestions seems to help. I'm up to about 40 design docs so far on my current project.

des429 8 hours ago||

The author discovered plan mode in cursor.

jamesmcq 22 hours ago||

This all looks fine for someone who can't code, but for anyone with even a moderate amount of experience as a developer all this planning and checking and prompting and orchestrating is far more work than just writing the code yourself.

There's no winner for "least amount of code written regardless of productivity outcomes.", except for maybe Anthropic's bank account.

shepherdjerred 22 hours ago||

I really don't understand why there are so many comments like this.

Yesterday I had Claude write an audit logging feature to track all changes made to entities in my app. Yeah you get this for free with many frameworks, but my company's custom setup doesn't have it.

It took maybe 5-10 minutes of wall-time to come up with a good plan, and then ~20-30 min for Claude implement, test, etc.

That would've taken me at least a day, maybe two. I had 4-5 other tasks going on in other tabs while I waited the 20-30 min for Claude to generate the feature.

After Claude generated, I needed to manually test that it worked, and it did. I then needed to review the code before making a PR. In all, maybe 30-45 minutes of my actual time to add a small feature.

All I can really say is... are you sure you're using it right? Have you _really_ invested time into learning how to use AI tools?

tyleo 22 hours ago|||

Same here. I did bounce off these tools a year ago. They just didn't work for me 60% of the time. I learned a bit in that initial experience though and walked away with some tasks ChatGPT could replace in my workflow. Mainly replacing scripts and reviewing single files or functions.

Fast forward to today and I tried the tools again--specifically Claude Code--about a week ago. I'm blown away. I've reproduced some tools that took me weeks at full-time roles in a single day. This is while reviewing every line of code. The output is more or less what I'd be writing as a principal engineer.

jamesmcq 22 hours ago||||

Trust me I'm very impressed at the progress AI has made, and maybe we'll get to the point where everything is 100% correct all the time and better than any human could write. I'm skeptical we can get there with the LLM approach though.

The problem is LLMs are great at simple implementation, even large amounts of simple implementation, but I've never seen it develop something more than trivial correctly. The larger problem is it's very often subtly but hugely wrong. It makes bad architecture decisions, it breaks things in pursuit of fixing or implementing other things. You can tell it has no concept of the "right" way to implement something. It very obviously lacks the "senior developer insight".

Maybe you can resolve some of these with large amounts of planning or specs, but that's the point of my original comment - at what point is it easier/faster/better to just write the code yourself? You don't get a prize for writing the least amount of code when you're just writing specs instead.

fourthark 21 hours ago|||

This is exactly what the article is about. The tradeoff is that you have to throughly review the plans and iterate on them, which is tiring. But the LLM will write good code faster than you, if you tell it what good code is.

reg_dunlop 20 hours ago||

Exactly; the original commenter seems determined to write-off AI as "just not as good as me".

The original article is, to me, seemingly not that novel. Not because it's a trite example, but because I've begun to experience massive gains from following the same basic premise as the article. And I can't believe there's others who aren't using like this.

I iterate the plan until it's seemingly deterministic, then I strip the plan of implementation, and re-write it following a TDD approach. Then I read all specs, and generate all the code to red->green the tests.

If this commenter is too good for that, then it's that attitude that'll keep him stuck. I already feel like my projects backlog is achievable, this year.

fourthark 20 hours ago||

Strongly agree about the deterministic part. Even more important than a good design, the plan must not show any doubt, whether it's in the form of open questions or weasel words. 95% of the time those vague words mean I didn't think something through, and it will do something hideous in order to make the plan work

nojito 22 hours ago|||

>I've never seen it develop something more than trivial correctly.

This is 100% incorrect, but the real issue is that the people who are using these llms for non-trivial work tend to be extremely secretive about it.

For example, I view my use of LLMs to be a competitive advantage and I will hold on to this for as long as possible.

jamesmcq 21 hours ago||

The key part of my comment is "correctly".

Does it write maintainable code? Does it write extensible code? Does it write secure code? Does it write performant code?

My experience has been it failing most of these. The code might "work", but it's not good for anything more than trivial, well defined functions (that probably appeared in it's training data written by humans). LLMs have a fundamental lack of understanding of what they're doing, and it's obvious when you look at the finer points of the outcomes.

That said, I'm sure you could write detailed enough specs and provide enough examples to resolve these issues, but that's the point of my original comment - if you're just writing specs instead of code you're not gaining anything.

cowlby 21 hours ago|||

I find “maintainable code” the hardest bias to let go of. 15+ years of coding and design patterns are hard to let go.

But the aha moment for me was what’s maintainable by AI vs by me by hand are on different realms. So maintainable has to evolve from good human design patterns to good AI patterns.

Specs are worth it IMO. Not because if I can spec, I could’ve coded anyway. But because I gain all the insight and capabilities of AI, while minimizing the gotchas and edge failures.

girvo 19 hours ago||

> But the aha moment for me was what’s maintainable by AI vs by me by hand are on different realms. So maintainable has to evolve from good human design patterns to good AI patterns.

How do you square that with the idea that all the code still has to be reviewed by humans? Yourself, and your coworkers

cowlby 19 hours ago||

I picture like semi conductors; the 5nm process is so absurdly complex that operators can't just peek into the system easily. I imagine I'm just so used to hand crafting code that I can't imagine not being able to peek in.

So maybe it's that we won't be reviewing by hand anymore? I.e. it's LLMs all the way down. Trying to embrace that style of development lately as unnatural as it feels. We're obv not 100% there yet but Claude Opus is a significant step in that direction and they keep getting better and better.

reg_dunlop 20 hours ago||||

To answer all of your questions:

yes, if I steer it properly.

It's very good at spotting design patterns, and implementing them. It doesn't always know where or how to implement them, but that's my job.

The specs and syntactic sugar are just nice quality of life benefits.

jmathai 21 hours ago|||

You’d be building blocks which compound over time. That’s been my experience anyway.

The compounding is much greater than my brain can do on its own.

skydhash 21 hours ago||||

> Yesterday I had Claude write an audit logging feature to track all changes made to entities in my app. Yeah you get this for free with many frameworks, but my company's custom setup doesn't have it.

But did you truly think about such feature? Like guarantees that it should follow (like how do it should cope with entities migration like adding a new field) or what the cost of maintaining it further down the line. This looks suspiciously like drive-by PR made on open-source projects.

> That would've taken me at least a day, maybe two.

I think those two days would have been filled with research, comparing alternatives, questions like "can we extract this feature from framework X?", discussing ownership and sharing knowledge,.. Jumping on coding was done before LLMs, but it usually hurts the long term viability of the project.

Adding code to a project can be done quite fast (hackatons,...), ensuring quality is what slows things down in any any well functioning team.

streetfighter64 22 hours ago|||

I mean, all I can really say is... if writing some logging takes you one or two days, are you sure you _really_ know how to code?

boxedemp 21 hours ago|||

Ever worked on a distributed system with hundreds of millions of customers and seemingly endless business requirements?

Some things are complex.

shepherdjerred 22 hours ago||||

You're right, you're better than me!

You could've been curious and ask why it would take 1-2 days, and I would've happily told you.

jamesmcq 21 hours ago||

I'll bite, because it does seem like something that should be quick in a well-architected codebase. What was the situation? Was there something in this codebase that was especially suited to AI-development? Large amounts of duplication perhaps?

shepherdjerred 21 hours ago||

It's not particularly interesting.

I wanted to add audit logging for all endpoints we call, all places we call the DB, etc. across areas I haven't touched before. It would have taken me a while to track down all of the touchpoints.

Granted, I am not 100% certain that Claude didn't miss anything. I feel fairly confident that it is correct given that I had it research upfront, had multiple agents review, and it made the correct changes in the areas that I knew.

Also I'm realizing I didn't mention it included an API + UI for viewing events w/ pretty deltas

fragmede 21 hours ago|||

We're not as good at coding as you, naturally.

psvv 18 hours ago|||

I'd find it deeply funny if the optimal vibe coding workflow continues to evolve to include more and more human oversight, and less and less agent autonomy, to the point where eventually someone makes a final breakthrough that they can save time by bypassing the LLM entirely and writing the code themselves. (Finally coming full circle.)

skeledrew 21 hours ago|||

Researching and planning a project is a generally usefully thing. This is something I've been doing for years, and have always had great results compared to just jumping in and coding. It makes perfect sense that this transfers to LLM use.

kburman 21 hours ago|||

Since Opus 4.5, things have changed quite a lot. I find LLMs very useful for discussing new features or ideas, and Sonnet is great for executing your plan while you grab a coffee.

roncesvalles 20 hours ago|||

Well it's less mental load. It's like Tesla's FSD. Am I a better driver than the FSD? For sure. But is it nice to just sit back and let it drive for a bit even if it's suboptimal and gets me there 10% slower, and maybe slightly pisses off the guy behind me? Yes, nice enough to shell out $99/mo. Code implementation takes a toll on you in the same way that driving does.

I think the method in TFA is overall less stressful for the dev. And you can always fix it up manually in the end; AI coding vs manual coding is not either-or.

dmix 22 hours ago|||

Most of these AI coding articles seem to be about greenfield development.

That said, if you're on a serious team writing professional software there is still tons of value in always telling AI to plan first, unless it's a small quick task. This post just takes it a few steps further and formalizes it.

I find Cursor works much more reliably using plan mode, reviewing/revising output in markdown, then pressing build. Which isn't a ton of overhead but often leads to lots of context switching as it definitely adds more time.

stealthyllama 19 hours ago|||

There is a miscommunication happening, this entire time we all had surprisingly different ideas about what quality of work is acceptable which seems to account for differences of opinion on this stuff.

keyle 22 hours ago|||

I partly agree with you. But once you have a codebase large enough, the changes become longer to even type in, once figured out.

I find the best way to use agents (and I don't use claude) is to hash it out like I'm about to write these changes and I make my own mental notes, and get the agent to execute on it.

Agents don't get tired, they don't start fat fingering stuff at 4pm, the quality doesn't suffer. And they can be parallelised.

Finally, this allows me to stay at a higher level and not get bogged down of "right oh did we do this simple thing again?" which wipes some of the context in my mind and gets tiring through the day.

Always, 100% review every line of code written by an agent though. I do not condone committing code you don't 'own'.

I'll never agree with a job that forces developers to use 'AI', I sometimes like to write everything by hand. But having this tool available is also very powerful.

jamesmcq 22 hours ago||

I want to be clear, I'm not against any use of AI. It's hugely useful to save a couple of minutes of "write this specific function to do this specific thing that I could write and know exactly what it would look like". That's a great use, and I use it all the time! It's better autocomplete. Anything beyond that is pushing it - at the moment! We'll see, but spending all day writing specs and double-checking AI output is not more productive than just writing correct code yourself the first time, even if you're AI-autocompleting some of it.

skeledrew 21 hours ago||

For the last few days I've been working on a personal project that's been on ice for at least 6 years. Back when I first thought of the project and started implementing it, it took maybe a couple weeks to eke out some minimally working code.

This new version that I'm doing (from scratch with ChatGPT web) has a far more ambitious scope and is already at the "usable" point. Now I'm primarily solidifying things and increasing test coverage. And I've tested the key parts with IRL scenarios to validate that it's not just passing tests; the thing actually fulfills its intended function so far. Given the increased scope, I'm guessing it'd take me a few months to get to this point on my own, instead of under a week, and the quality wouldn't be where it is. Not saying I haven't had to wrangle with ChatGPT on a few bugs, but after a decent initial planning phase, my prompts now are primarily "Do it"s and "Continue"s. Would've likely already finished it if I wasn't copying things back and forth between browser and editor, and being forced to pause when I hit the message limit.

keyle 20 hours ago||

This is a great come-back story. I have had a similar experience with a photoshop demake of mine.

I recommend to try out Opencode with this approach, you might find it less tiring than ChatGPT web (yes it works with your ChatGPT Plus sub).

phantomathkg 21 hours ago|||

Surely Addy Osmani can code. Even he suggests plan first.

https://news.ycombinator.com/item?id=46489061

skydhash 20 hours ago||

> planning and checking and prompting and orchestrating is far more work than just writing the code yourself.

This! Once I'm familiar with the codebase (which I strive to do very quickly), for most tickets, I usually have a plan by the time I've read the description. I can have a couple of implementation questions, but I knew where the info is located in the codebase. For things, I only have a vague idea, the whiteboard is where I go.

The nice thing with such a mental plan, you can start with a rougher version (like a drawing sketch). Like if I'm starting a new UI screen, I can put a placeholder text like "Hello, world", then work on navigation. Once that done, I can start to pull data, then I add mapping functions to have a view model,...

Each step is a verifiable milestone. Describing them is more mentally taxing than just writing the code (which is a flow state for me). Why? Because English is not fit to describe how computer works (try describe a finite state machine like navigation flow in natural languages). My mental mental model is already aligned to code, writing the solution in natural language is asking me to be ambiguous and unclear on purpose.

jrs235 21 hours ago||

Claude appeared to just crash in my session: https://news.ycombinator.com/item?id=47107630

willsmith72 13 hours ago||

this sounds... really slow. for large changes for sure i'm investing time into planning. but such a rigid system can't possible be as good as a flexible approach with variable amounts of planning based on complexity

growt 16 hours ago||

That is just spec driven development without a spec, starting with the plan step instead.

RVuRnvbM2e 18 hours ago|

This is just Waterfall for LLMs. What happens when you explore the problem space and need to change up the plan?

More comments...