AI will make formal verification go mainstream

Posted by evakhoury 12/16/2025

AI will make formal verification go mainstream(martin.kleppmann.com)

827 points | 434 commentspage 9

wagwang 12/16/2025|

Disagree, the ideal agentic coding workflow for high tolerance programming is to give the agent access to software output and have it iterate in a loop. You can let the agent do TDD, you can give it access to server logs, or even browser access.

okaleniuk 12/17/2025||

AI will definitely make more people thing about verification. Formal or otherwise.

gttalbot 12/17/2025||

If the AI is going to lie and cheat on the code it writes (this is the largest thing I bump into regularly and have to nudge it on), what makes the author think that the same AI won't cheat on the proof too?

raincole 12/17/2025||

God I hope not. Many people (me included) find Rust types hard to read enough. If formal verification ever goes mainstream I guarantee you that means no one reads them and the whole SWE completely depends on AI.

chhxdjsj 12/17/2025||

An excellent use case for this is ethereum smart contract verification. People store millions of dollars in smart contracts that are probably a one or two iterations of claude or gemini away from being pwned.

grahar64 12/17/2025||

Won't people just use AI to define specification? Like if they are getting most of it done with AI, won't they won't read/test the code to verify won't they also not read the spec

topazas 12/17/2025||

Right, but let's assume we have really simple crud application in mainstream language, could be ts or python. What are current best approaches to have this semi-formally verified, tools, methods, etc?

Havoc 12/16/2025||

I've been toying with vibecoding rust - hardly formal verification, but it is a step closer than python that's for sure.

So far so good, though the smaller amount of training data is noticeable.

iwontberude 12/16/2025|

vibecoding rust sounds cool, which model are you using? I have tried in the past with GPT4o and Sonnet 4, but they were so bad I thought I should just wait a few years.

Havoc 12/18/2025|||

Using GLM. It works but chews tokens pretty hard

iwontberude 12/17/2025|||

Or downvote me without a reply proving my point

ktimespi 12/17/2025||

Can you really rely on an LLM to write valid proofs though? What if one of the assumptions is false? I can very well think of a lot of ways that this can happen in Rocq, for example.

rf15 12/17/2025|

Isn't formal verification a "just write it twice" approach with different languages? (and different logical constraints on the way you write the languages)

More comments...