
Posted by evakhoury 12/16/2025

AI will make formal verification go mainstream (martin.kleppmann.com)
826 points | 432 comments
skeeter2020 12/16/2025|
Prediction: AI hypers - both those who are clueless and those who know perfectly well - will love this because it makes their "AI replaces every developer" wet dream come true, by shifting the heavy lifting from this thing called "software development" to the tiny problem of just formally verifying the software product. Your average company can bury this close to QA, where they're probably already skimping, save a bunch of money and get out with the rewards before the full damage is apparent.
jkaptur 12/16/2025||
I think that there are three relevant artifacts: the code, the specification, and the proof.

I agree with the author that if you have the code (and, with an LLM, you do) and a specification, AI agents could be helpful to generate the proof. This is a huge win!

But it certainly doesn't confront the important problem of writing a spec that captures the properties you actually care about. If the LLM writes that for you, I don't see a reason to trust that any more than you trust anything else it writes.

I'm not an expert here, so I invite correction.
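A toy sketch (in Python, all names hypothetical) of why the spec is the part you still have to trust: a prover can only verify the properties the spec actually states, so a weak spec lets an obviously wrong program through.

```python
# Suppose the spec for a sort function only says "the output is sorted":
def weak_spec(inp, out):
    return all(out[i] <= out[i + 1] for i in range(len(out) - 1))

# This implementation satisfies weak_spec for EVERY input -- a prover
# would verify it without complaint -- yet it is obviously wrong:
def bogus_sort(xs):
    return []

# The property we actually care about also requires the output to be
# a permutation of the input:
def full_spec(inp, out):
    return weak_spec(inp, out) and sorted(inp) == sorted(out)

inp = [3, 1, 2]
print(weak_spec(inp, bogus_sort(inp)))  # True: the weak spec misses the bug
print(full_spec(inp, bogus_sort(inp)))  # False: the stronger spec catches it
```

If an LLM writes `weak_spec` instead of `full_spec`, the proof goes through and tells you nothing useful, which is the trust problem described above.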

zamadatix 7 days ago||
Some past discussion (excluding 0 comment submissions) to add:

https://news.ycombinator.com/item?id=46216274 - 4 comments, 6 days ago

https://news.ycombinator.com/item?id=46203508 - 1 comment, 7 days ago

https://news.ycombinator.com/item?id=46198874 - 2 comments, 8 days ago

hedayet 7 days ago||
I can see the inspiration. But then again, how much investment will be required to verify the verifier? (It's still code, and it's generated by a non-deterministic system.)
tasuki 7 days ago||
I'd love it, but this is almost textbook wishful thinking.
bluerooibos 7 days ago||
Is there a difference here between a proof and an automated test?

I'm curious what a proof would look like compared to my RSpec unit tests and what the actual implementation would look like.

pizlonator 7 days ago|
A test tests some - but not all - of the paths through your code. And some - but not all - of the inputs you might see.

A proof establishes that your code works correctly on all paths, inputs, etc.
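The distinction above can be sketched in Python (a toy example with hypothetical names): a unit test spot-checks a few inputs, while a proof covers the whole input space. For a small finite domain you can even get the "proof" by exhaustively checking every input.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))

# A test: checks a handful of hand-picked points, nothing more.
assert clamp(5, 0, 10) == 5
assert clamp(-1, 0, 10) == 0
assert clamp(99, 0, 10) == 10

# A finite-domain "proof" by exhaustion: check the properties we care
# about for EVERY input in a bounded space. Real provers reason
# symbolically instead, so they cover unbounded domains too.
for x in range(-50, 51):
    for lo in range(-10, 11):
        for hi in range(lo, 11):
            r = clamp(x, lo, hi)
            assert lo <= r <= hi          # result is always in range
            assert r == x or r in (lo, hi)  # only clamps at the boundary
print("all inputs in the bounded domain verified")
```

The exhaustive loop is the spirit of a proof; tools like SMT solvers and proof assistants get the same "all paths, all inputs" guarantee without enumerating anything.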

i_am_a_peasant 7 days ago||
A “proof” in the formal verification sense is just an exhaustive search through a state space, checking that a model of your system respects a set of constraints you have to explicitly write.

You could still have written the model wrong, you could still have not accounted for something important in your constraints. But at least it will give you a definite answer to “Does this model do what you say it does?”.

Then there are cases where an exhaustive search of the entire state space is not possible, and the best you can do is a stochastic search within it and hope that “given a random sampling of all possible inputs, this model has respected the constraints for T amount of time, so we can say with 99.99999998% certainty that the model follows the constraints”.
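The exhaustive-search idea above can be sketched as a tiny explicit-state model checker in Python (a toy model, all names hypothetical): two processes loop through idle → trying → critical, guarded by one shared lock, and we search every reachable state for a violation of the mutual-exclusion constraint.

```python
from collections import deque

# State: (program counters of the two processes, holder of the lock or None).
def successors(state):
    pcs, lock = state
    for i in (0, 1):
        if pcs[i] == "idle":
            yield (tuple("trying" if j == i else p for j, p in enumerate(pcs)), lock)
        elif pcs[i] == "trying" and lock is None:
            # Acquire the lock and enter the critical section.
            yield (tuple("critical" if j == i else p for j, p in enumerate(pcs)), i)
        elif pcs[i] == "critical":
            # Leave the critical section and release the lock.
            yield (tuple("idle" if j == i else p for j, p in enumerate(pcs)), None)

def invariant(state):
    # The constraint we explicitly wrote: never both in the critical section.
    pcs, _ = state
    return pcs != ("critical", "critical")

def check(init):
    """Breadth-first search over every reachable state."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return False, s          # counterexample state
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True, len(seen)          # invariant holds in all reachable states

print(check((("idle", "idle"), None)))
```

As the comment says, this only answers “does this *model* respect these *constraints*?” If the model omits a real behaviour, or the invariant doesn't capture what you actually care about, the checker will happily say yes anyway.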

cess11 7 days ago||
"As the verification process itself becomes automated, the challenge will move to correctly defining the specification"

OK.

"Reading and writing such formal specifications still requires expertise and careful thought."

So the promise here is that bosses in IT organisations will in the future have expertise and prioritise careful thought, or at least allow their subordinates to have and exercise those qualities.

It would be great news if that turns out to be the case.

sandeepsr 7 days ago||
Formal verification does not scale, and has not scaled for two decades. LLMs can help write properties, though, which used to require a skilled person (a PhD) to write.

In pre-silicon verification, formal has been used successfully for decades, but is not a replacement for simulation based verification.

The future of verification (for hardware and software) is to eliminate verification altogether, by synthesizing intent into correct code and tests.

-- https://www.verifai.ai

igornotarobot 7 days ago|
Afaik, formal verification worked well for hardware because most of the things in hardware were deterministic and could be captured precisely. Most of the software these days is concurrent and distributed. Does the LLM + RL approach work for concurrent and distributed code? We barely know how to do systematic simulation there.
Sevii 12/16/2025||
If AI is good enough to write formal verification, why wouldn't it be good enough to do QA? Why not just have AI do a full manual test sweep after every change?
simonw 12/16/2025||
You can do that as well, but it's non-deterministic and also expensive to run.

Much better to have AI write deterministic test suites for your project.

apercu 12/16/2025||
I guess I am luddite-ish in that I think people still need to decide what must always be true in a system. Tests should exist to check those rules.

AI can help write test code and suggest edge cases, but it shouldn’t be trusted to decide whether behavior is correct.

When software is hard to test, that’s usually a sign the design is too tightly coupled or full of side effects, or that the architecture is unnecessarily complicated. Not that the testing tools are bad.
