Posted by vinhnx 6 hours ago
The key insight that most people miss: this isn't a new workflow invented for AI - it's how good senior engineers already work. You read the code deeply, write a design doc, get buy-in, then implement. The AI just makes the implementation phase dramatically faster.
What I've found interesting is that the people who struggle most with AI coding tools are often junior devs who never developed the habit of planning before coding. They jump straight to "build me X" and get frustrated when the output is a mess. Meanwhile, engineers with 10+ years of experience who are used to writing design docs and reviewing code pick it up almost instantly - because the hard part was always the planning, not the typing.
One addition I'd make to this workflow: version your research.md and plan.md files in git alongside your code. They become incredibly valuable documentation for future maintainers (including future-you) trying to understand why certain architectural decisions were made.
Experimentally, i've been using mfbt.ai [https://mfbt.ai] for roughly the same thing in a team context. it lets you collaboratively nail down the spec with AI before handing off to a coding agent via MCP.
Avoids the "everyone has a slightly different plan.md on their machine" problem. Still early days but it's been a nice fit for this kind of workflow.
However, Opus made me rethink my entire workflow. Now, I do it like this:
* PRD (Product Requirements Document)
* main.py + requirements.txt + readme.md (I ask for minimal, functional, modular code that fits the main.py)
* Ask for a step-by-step ordered plan
* Ask to focus on one step at a time
The super powerful thing is that I don’t get stuck on missing accounts, keys, etc. Everything is ordered and runs smoothly. I go rapidly from idea to working product, and it’s incredibly easy to iterate if I figure out new features are required while testing. I also have GLM via OpenCode, but I mainly use it for "dumb" tasks.
Interestingly, for reasoning capabilities regarding standard logic inside the code, I found Gemini 3 Flash to be very good and relatively cheap. I don't use Claude Code for the actual coding because forcing everything via chat into a main.py encourages minimal code that's easy to skim—it gives me a clearer representation of the feature space
> ...
> never let Claude write code until you’ve reviewed and approved a written plan
I certainly always work towards an approved plan before I let it lost on changing the code. I just assumed most people did, honestly. Admittedly, sometimes there's "phases" to the implementation (because some parts can be figured out later and it's more important to get the key parts up and running first), but each phase gets a full, reviewed plan before I tell it to go.
In fact, I just finished writing a command and instruction to tell claude that, when it presents a plan for implementation, offer me another option; to write out the current (important parts of the) context and the full plan to individual (ticket specific) md files. That way, if something goes wrong with the implementation I can tell it to read those files and "start from where they left off" in the planning.
We all tend to regress to average (same thoughts/workflows)...
Have had many users already doing the exact same workflow with: https://github.com/backnotprop/plannotator
And “Don’t change this function signature” should be enforced not by anticipating that your coding agent “might change this function signature so we better warn it not to” but rather via an end to end test that fails if the function signature is changed (because the other code that needs it not to change now has an error). That takes the author out of the loop and they can not watch for the change in order to issue said correction, and instead sip coffee while the agent observes that it caused a test failure then corrects it without intervention, probably by rolling back the function signature change and changing something else.
Speckit is worth trying as it automates what is being described here, and with Opus 4.6 it's been a kind of BC/AD moment for me.