Posted by theorchid 4 days ago
I assure you, they do not.
My most recent Claude Code fix consisted of one line: a call to `third_party_lib._connect()`. It reaches into the internals of an external library. The fix worked, but depending on a specific implementation detail like that is improper. The correct fix was about 20 lines.
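To make the contrast concrete, here's a minimal sketch; `third_party_lib`, `ConnectionClosed`, and `reconnect` are hypothetical stand-ins, and only the `_connect()` call reflects the actual fix:

```python
import third_party_lib  # hypothetical stand-in for the external dependency

# The one-line "fix": reach into a private method to force a reconnect.
# Works today, but _connect() is an implementation detail that any minor
# release of the library is free to rename or remove.
third_party_lib._connect()

# The shape of the proper ~20-line fix: stay on the public API and handle
# the failure mode explicitly (these names are assumptions, not real API).
def fetch_with_reconnect(client, request, retries=3):
    for _ in range(retries):
        try:
            return client.fetch(request)
        except third_party_lib.ConnectionClosed:
            client = third_party_lib.reconnect(client)
    raise RuntimeError("could not re-establish connection")
```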
(Tangentially, this is why I think LLMs are more useful for senior developers: junior developers tend not to have a sense for what good quality looks like and accept whatever works.)
In short:
1. stacked-commits automation (you cannot skip writing the context/why/verify sections)
2. product specs (full ERD: https://excalidraw.com/#json=WT-oRUdyKBhAsDZJ3NwAR,WAbVgfO39...)
3. linking specs to code via SCIP indexes, and commits to ACs; later you can attach anything you want (rough sketch of a link record below)
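To make (3) concrete: a link is just a small record. A minimal sketch under my own assumptions; the field names are illustrative, and only the symbol string is meant to come from a SCIP index:

```python
from dataclasses import dataclass

@dataclass
class SpecLink:
    """One edge between an intent artifact and code.
    All field names here are illustrative; only scip_symbol refers to a
    real concept (the stable symbol identifiers a SCIP index provides)."""
    spec_id: str       # e.g. "SPEC-042" -- a record in the Spec table
    ac_id: str | None  # acceptance criterion, when the link is that granular
    scip_symbol: str   # symbol identifier taken from the SCIP index
    file_path: str     # denormalized, so blast-radius lookups stay cheap
    commit_sha: str    # the commit that claims to satisfy the AC
```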
My version is roughly: if execution and verification are getting cheap, but taste, design, intent, and planning are still expensive, then the workflow should focus on how intent gets executed in each change and how it stays intact over time instead of slowly drifting.
I’m more hesitant to formalize all of that into heavy tooling, but there are clear parallels here. I wrote up more of what I mean here: https://russellromney.com/blog/intent-driven-development
When you are working on something, you don't load the entire context of the whole app into your head, only a "minimal viable context" (MVC). In my ERD this is a record in the Spec table. And sure, some specs might reference other specs; some can depend on each other implicitly or explicitly. But this is a clear anchor point you can approach in two ways: either by searching a list of specs, or by going to a file you know needs editing and calling the `blast-radius` script to fetch which specs link to it.
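The `blast-radius` script can be nearly trivial once the links exist. A minimal sketch, assuming the spec-to-file links live in a SQLite table (the schema and table name are my assumptions, not part of any existing tool):

```python
#!/usr/bin/env python3
"""blast-radius: given a file, list the specs that link to it."""
import sqlite3
import sys

def blast_radius(db_path: str, file_path: str) -> list[tuple[str, str]]:
    # spec_links(spec_id, title, file_path) is an assumed schema.
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT spec_id, title FROM spec_links WHERE file_path = ?",
        (file_path,),
    ).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for spec_id, title in blast_radius("specs.db", sys.argv[1]):
        print(f"{spec_id}\t{title}")
```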
Intent can be represented as granular artifacts. The finest granularity in my schema is the Acceptance Criterion, which is represented in code as a test. It can be e2e, unit, or integration, depending on the context and on what counts as an "acceptable" way to prove the AC works at the system level.
For example, an AC for a product-level spec that says "Invalid input is rejected with clear feedback" (submitting invalid input shows a validation error and does not create a short URL record) means we have an e2e test in Playwright that starts from the URL creation page and follows the user across the steps until they receive the error. And the engineers working on this AC see it with their own eyes.
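As a test, that AC could look something like this (the Playwright calls are real; the route, selectors, and local URL are placeholders for this hypothetical URL shortener):

```python
from playwright.sync_api import expect, sync_playwright

# AC: "Invalid input is rejected with clear feedback."
def test_invalid_url_shows_validation_error():
    with sync_playwright() as p:
        page = p.chromium.launch().new_page()
        page.goto("http://localhost:3000/create")       # assumed route
        page.fill("[data-test=url-input]", "not a url")  # assumed selector
        page.click("[data-test=submit]")
        expect(page.locator("[data-test=validation-error]")).to_be_visible()
```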
But this might not be enough. Why?
Because you can still show an error on screen while the record gets created in the DB anyway. So I decided there should be one more window that shows the state of the objects (their properties) in real time. This could be a simple TUI that just visualizes "evidence artifacts". Think of how a debugger works: you step forward and you have local variables you can track.
So the idea is simple: "Watch how the Playwright script executes the steps for an AC one by one, while you look at the DB table. After every step you see the updated object state. If multiple state changes happened after a single step, ideally we want to display a history of updates. At the end of the run we compare the before/after evidence state for every tracked object."
I call this the "runtime engine". It is approximately what engineers simulate in their heads when reading code. But why bother simulating it when we can simply run it and observe what happens?
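A minimal sketch of that loop, assuming the app writes to a SQLite table called `short_urls` and each AC step is a plain callable (both are my assumptions; a real version would hook Playwright's step events and keep full update history, not just row diffs):

```python
import sqlite3

def snapshot(con: sqlite3.Connection, table: str) -> dict:
    """Capture every row of the tracked table (assumes column 0 is the PK)."""
    return {row[0]: row for row in con.execute(f"SELECT * FROM {table}")}

def run_with_evidence(steps, con, table="short_urls"):
    """Execute AC steps one by one, printing a state diff after each."""
    before = snapshot(con, table)
    for name, step in steps:
        step()                       # e.g. one Playwright action
        after = snapshot(con, table)
        created = after.keys() - before.keys()
        deleted = before.keys() - after.keys()
        changed = {k for k in after.keys() & before.keys()
                   if after[k] != before[k]}
        print(f"[{name}] +{len(created)} -{len(deleted)} ~{len(changed)}")
        before = after
    return before  # final state, for the end-of-run before/after comparison
```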
That line (among your other values?) was uproarious; I apologise for not upvoting it, partly because I couldn't vocalise my peculiar fetish at the time (+ "gnarliness-pornstar" doesn't sound nearly as enticing as "AI-affordability-pornstar" x)