Top
Best
New

Posted by evakhoury 12/16/2025

AI will make formal verification go mainstream(martin.kleppmann.com)
827 points | 434 commentspage 9
wagwang 12/16/2025|
Disagree, the ideal agentic coding workflow for high tolerance programming is to give the agent access to software output and have it iterate in a loop. You can let the agent do TDD, you can give it access to server logs, or even browser access.
okaleniuk 12/17/2025||
AI will definitely make more people thing about verification. Formal or otherwise.
gttalbot 12/17/2025||
If the AI is going to lie and cheat on the code it writes (this is the largest thing I bump into regularly and have to nudge it on), what makes the author think that the same AI won't cheat on the proof too?
raincole 12/17/2025||
God I hope not. Many people (me included) find Rust types hard to read enough. If formal verification ever goes mainstream I guarantee you that means no one reads them and the whole SWE completely depends on AI.
chhxdjsj 12/17/2025||
An excellent use case for this is ethereum smart contract verification. People store millions of dollars in smart contracts that are probably a one or two iterations of claude or gemini away from being pwned.
grahar64 12/17/2025||
Won't people just use AI to define specification? Like if they are getting most of it done with AI, won't they won't read/test the code to verify won't they also not read the spec
topazas 12/17/2025||
Right, but let's assume we have really simple crud application in mainstream language, could be ts or python. What are current best approaches to have this semi-formally verified, tools, methods, etc?
Havoc 12/16/2025||
I've been toying with vibecoding rust - hardly formal verification, but it is a step closer than python that's for sure.

So far so good, though the smaller amount of training data is noticeable.

iwontberude 12/16/2025|
vibecoding rust sounds cool, which model are you using? I have tried in the past with GPT4o and Sonnet 4, but they were so bad I thought I should just wait a few years.
Havoc 12/18/2025|||
Using GLM. It works but chews tokens pretty hard
iwontberude 12/17/2025|||
Or downvote me without a reply proving my point
ktimespi 12/17/2025||
Can you really rely on an LLM to write valid proofs though? What if one of the assumptions is false? I can very well think of a lot of ways that this can happen in Rocq, for example.
rf15 12/17/2025|
Isn't formal verification a "just write it twice" approach with different languages? (and different logical constraints on the way you write the languages)
More comments...