
Posted by evakhoury 12/16/2025

AI will make formal verification go mainstream (martin.kleppmann.com)
826 points | 432 comments
skeeter2020 12/16/2025|
Prediction: AI hypers - both those who are clueless and those who know perfectly well - will love this because it makes their "AI replaces every developer" wet dream come true, by shifting the heavy lifting from this thing called "software development" to the tiny problem of just formally verifying the software product. Your average company can bury this close to QA, where they're probably already skimping, save a bunch of money and get out with the rewards before the full damage is apparent.
jkaptur 12/16/2025||
I think that there are three relevant artifacts: the code, the specification, and the proof.

I agree with the author that if you have the code (and, with an LLM, you do) and a specification, AI agents could be helpful to generate the proof. This is a huge win!

But it certainly doesn't confront the important problem of writing a spec that captures the properties you actually care about. If the LLM writes that for you, I don't see a reason to trust that any more than you trust anything else it writes.

I'm not an expert here, so I invite correction.
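A toy sketch (in Python, all names hypothetical) of why the spec is the part you still have to trust: a prover can only verify the properties the spec actually states, so a weak spec lets an obviously wrong program through.

```python
# Suppose the spec for a sort function only says "the output is sorted":
def weak_spec(inp, out):
    return all(out[i] <= out[i + 1] for i in range(len(out) - 1))

# This implementation satisfies weak_spec for EVERY input -- a prover
# would verify it without complaint -- yet it is obviously wrong:
def bogus_sort(xs):
    return []

# The property we actually care about also requires the output to be
# a permutation of the input:
def full_spec(inp, out):
    return weak_spec(inp, out) and sorted(inp) == sorted(out)

inp = [3, 1, 2]
print(weak_spec(inp, bogus_sort(inp)))  # True: the weak spec misses the bug
print(full_spec(inp, bogus_sort(inp)))  # False: the stronger spec catches it
```

If an LLM writes `weak_spec` instead of `full_spec`, the proof goes through and tells you nothing useful, which is the trust problem described above.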

zamadatix 7 days ago||
Some past discussion (excluding 0 comment submissions) to add:

https://news.ycombinator.com/item?id=46216274 - 4 comments, 6 days ago

https://news.ycombinator.com/item?id=46203508 - 1 comment, 7 days ago

https://news.ycombinator.com/item?id=46198874 - 2 comments, 8 days ago

hedayet 7 days ago||
I can see the inspiration. But then again, how much investment will be required to verify the verifier? (It's still code, and it's generated by a non-deterministic system.)
tasuki 7 days ago||
I'd love it, but this is almost textbook wishful thinking.
bluerooibos 7 days ago||
Is there a difference here between a proof and an automated test?

I'm curious what a proof would look like compared to my RSpec unit tests and what the actual implementation would look like.

pizlonator 7 days ago|
A test tests some - but not all - of the paths through your code. And some - but not all - of the inputs you might see.

A proof establishes that your code works correctly on all paths, inputs, etc.
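The distinction above can be sketched in Python (a toy example with hypothetical names): a unit test spot-checks a few inputs, while a proof covers the whole input space. For a small finite domain you can even get the "proof" by exhaustively checking every input.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))

# A test: checks a handful of hand-picked points, nothing more.
assert clamp(5, 0, 10) == 5
assert clamp(-1, 0, 10) == 0
assert clamp(99, 0, 10) == 10

# A finite-domain "proof" by exhaustion: check the properties we care
# about for EVERY input in a bounded space. Real provers reason
# symbolically instead, so they cover unbounded domains too.
for x in range(-50, 51):
    for lo in range(-10, 11):
        for hi in range(lo, 11):
            r = clamp(x, lo, hi)
            assert lo <= r <= hi          # result is always in range
            assert r == x or r in (lo, hi)  # only clamps at the boundary
print("all inputs in the bounded domain verified")
```

The exhaustive loop is the spirit of a proof; tools like SMT solvers and proof assistants get the same "all paths, all inputs" guarantee without enumerating anything.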

i_am_a_peasant 7 days ago||
A “proof” in the formal verification sense is just an exhaustive search through a state space, checking that a model of your system respects a set of constraints you have to explicitly write.

You could still have written the model wrong, you could still have not accounted for something important in your constraints. But at least it will give you a definite answer to “Does this model do what you say it does?”.

Then there are cases where an exhaustive search of the entire state space is not possible, and the best you can do is a stochastic search within it and hope that “given a random sampling of all possible inputs, this model has respected the constraints for T amount of time, so we can say with 99.99999998% certainty that the model follows the constraints”.
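The exhaustive-search idea above can be sketched as a tiny explicit-state model checker in Python (a toy model, all names hypothetical): two processes loop through idle → trying → critical, guarded by one shared lock, and we search every reachable state for a violation of the mutual-exclusion constraint.

```python
from collections import deque

# State: (program counters of the two processes, holder of the lock or None).
def successors(state):
    pcs, lock = state
    for i in (0, 1):
        if pcs[i] == "idle":
            yield (tuple("trying" if j == i else p for j, p in enumerate(pcs)), lock)
        elif pcs[i] == "trying" and lock is None:
            # Acquire the lock and enter the critical section.
            yield (tuple("critical" if j == i else p for j, p in enumerate(pcs)), i)
        elif pcs[i] == "critical":
            # Leave the critical section and release the lock.
            yield (tuple("idle" if j == i else p for j, p in enumerate(pcs)), None)

def invariant(state):
    # The constraint we explicitly wrote: never both in the critical section.
    pcs, _ = state
    return pcs != ("critical", "critical")

def check(init):
    """Breadth-first search over every reachable state."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return False, s          # counterexample state
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True, len(seen)          # invariant holds in all reachable states

print(check((("idle", "idle"), None)))
```

As the comment says, this only answers “does this *model* respect these *constraints*?” If the model omits a real behaviour, or the invariant doesn't capture what you actually care about, the checker will happily say yes anyway.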

cess11 7 days ago||
"As the verification process itself becomes automated, the challenge will move to correctly defining the specification"

OK.

"Reading and writing such formal specifications still requires expertise and careful thought."

So the promise here is that bosses in IT organisations will in the future have expertise and prioritise careful thought, or at least allow their subordinates to have and exercise those qualities.

It would be great news if that turns out to be the case.

sandeepsr 7 days ago||
Formal verification does not scale, and has not scaled for two decades. LLMs can help write properties, though, which used to require a skilled person (a PhD) to write.

In pre-silicon verification, formal has been used successfully for decades, but is not a replacement for simulation based verification.

The future of verification (for hardware and software) is to eliminate verification altogether, by synthesizing intent into correct code and tests.

-- https://www.verifai.ai

igornotarobot 7 days ago|
Afaik, formal verification worked well for hardware because most of the things in hardware were deterministic and could be captured precisely. Most of the software these days is concurrent and distributed. Does the LLM + RL approach work for concurrent and distributed code? We barely know how to do systematic simulation there.
Sevii 12/16/2025||
If AI is good enough to write formal verification, why wouldn't it be good enough to do QA? Why not just have AI do a full manual test sweep after every change?
simonw 12/16/2025||
You can do that as well, but it's non-deterministic and also expensive to run.

Much better to have AI write deterministic test suites for your project.

apercu 12/16/2025||
I guess I am luddite-ish in that I think people still need to decide what must always be true in a system. Tests should exist to check those rules.

AI can help write test code and suggest edge cases, but it shouldn’t be trusted to decide whether behavior is correct.

When software is hard to test, that’s usually a sign the design is too tightly coupled or full of side effects, or that the architecture is unnecessarily complicated. Not that the testing tools are bad.
