Posted by evakhoury 12/16/2025
I agree with the author that if you have the code (and, with an LLM, you do) and a specification, AI agents could be helpful to generate the proof. This is a huge win!
But it certainly doesn't confront the important problem of writing a spec that captures the properties you actually care about. If the LLM writes that for you, I don't see a reason to trust that any more than you trust anything else it writes.
I'm not an expert here, so I invite correction.
https://news.ycombinator.com/item?id=46216274 - 4 comments, 6 days ago
https://news.ycombinator.com/item?id=46203508 - 1 comment, 7 days ago
https://news.ycombinator.com/item?id=46198874 - 2 comments, 8 days ago
I'm curious what a proof would look like compared to my RSpec unit tests and what the actual implementation would look like.
A proof establishes that your code behaves correctly on all paths and for all inputs, not just the examples a test suite happens to exercise.
You could still have written the model wrong, or failed to account for something important in your constraints. But at least it gives you a definite answer to “Does this model do what you say it does?”.
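To make the contrast with example-based unit tests concrete, here is a toy sketch in Lean 4 (my own illustration, with a made-up function, not anything from the article). An RSpec-style test checks one input; a theorem quantifies over every input:

```lean
def double (n : Nat) : Nat := n + n

-- A unit test checks a single concrete input:
example : double 3 = 6 := rfl

-- A proof covers every possible input at once:
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

If the theorem checks, there is no input anywhere for which `double` disagrees with the specification `2 * n` — which is exactly the guarantee no finite set of unit tests can give.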
Then there are cases where an exhaustive search of the entire state space is not possible, and the best you can do is a stochastic search within it and a statement like: “Given a random sampling of all possible inputs, this model has respected the constraints for T amount of time, so we can say with 99.99999998% certainty that the model follows the constraints”.
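A minimal sketch of that kind of stochastic check in Python, with a made-up toy model (a clamped counter) and arbitrary constants. Finding no counterexample only raises confidence; unlike a proof, it is not a guarantee:

```python
import random

# Hypothetical model: a counter that must never leave [0, LIMIT].
LIMIT = 10

def step(state: int, action: str) -> int:
    """One transition of the model; clamps at both boundaries."""
    if action == "inc":
        return min(state + 1, LIMIT)
    return max(state - 1, 0)

def stochastic_check(trials: int = 10_000, depth: int = 100, seed: int = 0) -> bool:
    """Sample random trajectories through the state space and check the
    invariant 0 <= state <= LIMIT after every transition."""
    rng = random.Random(seed)
    for _ in range(trials):
        state = rng.randint(0, LIMIT)
        for _ in range(depth):
            state = step(state, rng.choice(["inc", "dec"]))
            if not (0 <= state <= LIMIT):
                return False  # counterexample found: invariant violated
    return True  # no violation seen in the sample; confidence, not proof

print(stochastic_check())  # → True
```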
OK.
"Reading and writing such formal specifications still requires expertise and careful thought."
So the promise here is that bosses in IT organisations will, in the future, have expertise and prioritise careful thought, or at least allow their subordinates to develop and practise these traits.
It would be great news if that turns out to be the case.
In pre-silicon verification, formal verification has been used successfully for decades, but it is not a replacement for simulation-based verification.
The future of verification (for hardware and software) is to eliminate verification altogether, by synthesizing intent into correct code and tests.
-- https://www.verifai.ai
Much better to have AI write deterministic test suites for your project.
AI can help write test code and suggest edge cases, but it shouldn’t be trusted to decide whether behavior is correct.
When software is hard to test, that's usually a sign that the design is too tightly coupled or full of side effects, or that the architecture is unnecessarily complicated. Not that the testing tools are bad.
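A small illustration of the coupling point, using hypothetical functions of my own: the first version is hard to test deterministically because it reads the system clock; the second takes the date as a parameter, so the same inputs always give the same output and the tests need no mocking:

```python
import datetime

# Hard to test: the discount rule is coupled to the system clock.
def discounted_price_coupled(price: float) -> float:
    today = datetime.date.today()
    return price * 0.9 if today.weekday() >= 5 else price

# Easy to test: the rule is a pure function; the caller supplies the date.
def discounted_price(price: float, today: datetime.date) -> float:
    """Apply a 10% weekend discount. Same inputs, same output."""
    return price * 0.9 if today.weekday() >= 5 else price

# Deterministic tests, no patching of the clock required:
saturday = datetime.date(2025, 12, 20)  # a Saturday
monday = datetime.date(2025, 12, 22)    # a Monday
assert discounted_price(100.0, saturday) == 90.0
assert discounted_price(100.0, monday) == 100.0
```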