Posted by signa11 8 hours ago
I used to be on that side of the argument - clearly code is more precise so it MUST be simpler than wrangling with the uncertainty of prose. But precision isn't the only factor in play.
The argument here is that essential complexity lives on and you can only convert between expressions of it - that is certainly true but it's is overlooking both accidental complexity and germane complexity.
Specs in prose give you an opportunity to simplify by right-sizing germane complexity in a way that code can't.
You might say "well i could create a library or a framework and teach everyone how to use it" and so when we're implementing the code to address the essential complexity, we benefit from the germane complexity of the library. True, but now consider the infinite abstraction possible in prose. Which has more power to simplify by replacing essential complexity with germane complexity?
Build me a minecraft clone - there's almost zero precision here, if it weren't for the fact that word minecraft is incredibly load bearing in this sentence, then you'd have no chance of building the right thing. One sentence. Contrast with the code you'd have to write and read to express the same.
- Define the data structures in the code yourself. Add comments on what each struct/enum/field does.
- Write the definitions of any classes/traits/functions/interfaces that you will add or change. Either leave the implementations empty or write them yourself if they end up being small or important enough to write by hand (or with AI/IDE autocompletion).
- Write the signatures of the tests with a comment on what it's verifying. Ideally you would write the tests yourself, specially if they are short, but you can leave them empty.
- Then at this point you involve the agent and tell it to plan how to complete the changes without barely having to specify anything in the prompt. Then execute the plan and ask the agent to iterate until all tests and lints are green.
- Go through the agent's changes and perform clean up. Usually it's just nitpicks and changes to conform to my specific style.
If the change is small enough, I find that I can complete this with just copilot in about the same amount of time it would take to write an ambiguous prompt. If the change is bigger, I can either have the agent do it all or do the fun stuff myself and task the agent with finishing the boring stuff.
So I would agree with the title and the gist of the post but for different reasons.
Example of a large change using that strategy: https://github.com/trane-project/trane/commit/d5d95cfd331c30...
I found that to be really vital for good code. https://fsharpforfunandprofit.com/rop/
I’ve found that two to three iterations with various prompts or different models will often yield a surprising solution or some aspect I hadn’t thought of or didn’t know about.
Then I throw away most or all of the code and follow your process, but with care to keep the good ideas from the LLMs, if any.
The only vibecoded thing was an iOS app and I didn't follow this process because I don't know iOS programming nor do I want to learn it. This only works if you know at least how to define functions and data structures in the language, but I think most PMs could learn that if they set their minds to it.
- Do a brainstorming session with the LLM about your idea. Flesh out the major points of the product, who the stakeholders are, what their motivations and goals are, what their pain points are. Research potential competitors. Find out what people are saying about them, especially the complaints.
- Build a high level design document with the LLM. Go through user workflows and scenarios to discern what kinds of data are needed, and at what scale.
- Do more market research to see how novel this approach is. Figure out what other approaches are used, and how successful they are. Get user pain points with each approach if you can. Then revisit your high level design.
- Start a technical design document with the LLM. Figure out who the actors of the system are, what their roles are in the system, and what kinds of data they'll need in order to do their job.
- Research the technologies that could help you build the system. Find out how popular they are, how reliable they are, how friendly they are (documentation, error messaging, support, etc), their long-term track record, etc. These go into a research document.
- Decide based on the research which technologies match your system best. Start a technical document with the LLM. Go through the user scenarios and see how the technologies fit.
- Decide on the data structures and flows through the system. Caching, load balancing, reliability, throughput requirements at the scale you plan to reach for your MVP and slightly beyond. Some UX requirements at this point are good as well.
- Start to flesh out your interfaces, both user and machine. Prototype some ideas and see how well they work.
- Circle back to research and design based on your findings. Iterate a few times and update the documents as you go using your LLM. Try to find ways to break it.
- Once you're happy with your design, build an architecture document that shows how the whole system will concretely fit together.
- Build an implementation plan. Run it through multiple critique rounds. Try to find ways to break it.
- Now you're at the threshold where changes get harder. Build the implementation piece by piece, checking to make sure they work as expected. This can be done quickly with multiple LLMs in parallel. Expect that the pieces won't fit and you'll need to rethink a lot of your assumptions. Code will change a LOT, so don't waste much time making it nice. You should have unit and integration tests and possibly e2e tests, which are cheap for the LLM to maintain, even if a lot of them suck.
- Depending on how the initial implementation went, decide whether to keep the codebase and refine it, or start the implementation over using the old codebase for lessons learned.
Basically, more of the same of what we've been doing for decades, just with faster tools.
I have always done that steps with my team and the results are great.
If you are a solo developer I understand that the LLM can help somewhat but not replace a real team of developers.
The code will always be an imperfect projection of the specification, and that is a feature. It must be decoupled to some extent or everything would become incredibly brittle. You do not need your business analysts worrying about which SQLite provider is to be used in the final shipped product. Forcing code to be isomorphic with spec means everyone needs to know everything all the time. It can work in small tech startups, but it doesn't work anywhere else.
Most people don't specify or know that they want a U bend in their pipes or what kind or joints should be used for their pipes.
The absence of U bends or use or poor joints will be felt immediately.
Thankfully home building is a relatively solved problem whereas software is everything bespoke and every problem slightly different... Not to mention forever changing
This article is really attacking vague prose that pushes ambiguity onto the agent - okay, fair enough. But that's a tooling problem. What if you could express structure and relationships at a higher level than text, or map domain concepts directly to library components? People are already working on new workflows and tools to do just that!
Also, dismissing the idea that "some day we'll be able to just write the specs and the program will write itself" is especially perplexing. We're already doing it, aren't we? Yes, it has major issues but you can't deny that AI agents are enabling literally that. Those issues will get fixed.
The historical parallel matters here as well. Grady Booch (co-creator of UML) argues we're in the third golden age of software engineering:
- 1940s: abstracted away the machine -> structured programming
- 1970s: abstracted away the algorithm -> OOP, standard libraries, UML
- Now: abstracting away the code itself
Recent interview here: https://www.youtube.com/watch?v=OfMAtaocvJw
Each previous transition had engineers raising the same objections: "this isn't safe", "you're abstracting away my craft". They were right that something was lost, but wrong that it was fatal. Eventually the new tools worked well enough to be used in production.
Which was mostly a failure, to the point that there is a major movement towards languages that "abstract away" (in this sense) less, e.g. Rust.
Certainly if the creators of UML are saying that AI is great, that gives me more confidence than ever that it's bunk.
Rust uses crates to import those dependencies, which was one of its biggest innovations.
A sufficiently detailed spec need only concern itself with essential complexity.
Applications are chock-full of accidental complexity.
- Delete code and start all over with the spec. I don't think anyone's ready to do that.
- Buy a software product / business and be content with just getting markdown files in a folder.
I haven't heard of being content paying for a product consisting of markdown files. Though I could imagine people paying for agent skill files. But yet, the skills are not the same product as say, linear.
That is the great insight I can offer
Like they say “everything comes round again”
Could be that the truth is somewhere in between?