AI will make formal verification go mainstream

Posted by evakhoury 12/16/2025

AI will make formal verification go mainstream (martin.kleppmann.com)

827 points | 434 comments | page 8

robot-wrangler 12/16/2025

I don't like sharing unpolished WIP and my project is still more at a lab-notebook phase than anything else, but the think-pieces are always light on code and maybe someone is looking for a fun holiday hackathon: https://mattvonrocketstein.github.io/py-mcmas/

dkga 12/17/2025

As an economist I completely agree with the post. In fact, I assume that in a few years' time we will see an explosion of journals requiring formal proofs in code form. Not to mention this would make it much easier and faster to publish papers in Economic Theory, where checking the statements, and sometimes the proofs, simply takes reviewers a long time.

mkleczek 12/17/2025

The article only discusses reasons why formal verification is needed. It does not provide any information on how AI would solve the fundamental issues that make it difficult: https://pron.github.io/posts/correctness-and-complexity

gaogao 12/16/2025

Topical to my interests: I used Claude Code the other day to formally verify some matrix multiplication in Rust. Writing the spec actually wasn't too hard. I wrote it as post-conditions in the code, since proving equivalence to a simpler version of the code, as you would for an optimization, is pretty straightforward. Maybe I should write up a proper post on it.
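
For readers who want to picture the shape of such a spec, here is a minimal sketch, not the commenter's actual code. It assumes square matrices stored row-major in flat slices and uses integers so that reordering additions cannot change the result. The function names `matmul_naive` and `matmul_ikj` are hypothetical. The comment does not name a verification tool, so a plain `debug_assert_eq!` stands in for the post-condition; a verifier would aim to prove that condition for all inputs, whereas this only checks it at runtime in debug builds.

```rust
/// Reference implementation: the obvious triple loop. This plays the role of the spec.
/// Matrices are n x n, stored row-major in flat slices (an assumption of this sketch).
fn matmul_naive(a: &[i64], b: &[i64], n: usize) -> Vec<i64> {
    let mut c = vec![0i64; n * n];
    for i in 0..n {
        for j in 0..n {
            for k in 0..n {
                c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
        }
    }
    c
}

/// "Optimized" variant: i-k-j loop order, which walks `b` and `c` row by row.
fn matmul_ikj(a: &[i64], b: &[i64], n: usize) -> Vec<i64> {
    let mut c = vec![0i64; n * n];
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    // Post-condition: the result equals the simpler reference version.
    // Checked at runtime in debug builds only; a verification tool would
    // try to prove this statically instead.
    debug_assert_eq!(c, matmul_naive(a, b, n));
    c
}
```

Calling `matmul_ikj(&[1, 2, 3, 4], &[5, 6, 7, 8], 2)` yields `[19, 22, 43, 50]`, the same as the reference version.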

zkmon 12/17/2025

Any proof system or method is relative to its frame of reference, axiomatic context, known proofs, etc. Modern software doesn't live in an isolated lab context, unless you are building an air-gapped HSM or the like. The proof system itself would have to evolve to communicate the changing specs to the underlying software.

So, the job is not done for humans yet.

jstummbillig 12/16/2025

Interesting. Here is my AI-powered dev prediction: We'll move toward event-sourced systems, because AI will be able to discover patterns and workflow correlations that are hard or impossible to recover from state-only CRUD. It seems silly not to preserve all that business information, given the analysis machine we now have at our disposal.
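
To make the contrast with state-only CRUD concrete, here is a minimal sketch of event sourcing under assumed names: `OrderEvent`, `Cart`, `replay`, and `removed_skus` are all hypothetical and come from no particular framework. Current state is derived by folding over an append-only log, and the same log can still answer questions that the final state has lost.

```rust
use std::collections::BTreeMap;

/// Hypothetical domain events for a shopping cart.
#[derive(Debug, Clone, PartialEq)]
enum OrderEvent {
    ItemAdded { sku: String, qty: u32 },
    ItemRemoved { sku: String },
    CheckedOut,
}

/// The state a CRUD system would store directly.
#[derive(Debug, Default, PartialEq)]
struct Cart {
    items: BTreeMap<String, u32>,
    checked_out: bool,
}

/// Apply one event to the state, returning the new state.
fn apply(mut cart: Cart, event: &OrderEvent) -> Cart {
    match event {
        OrderEvent::ItemAdded { sku, qty } => {
            *cart.items.entry(sku.clone()).or_insert(0) += *qty;
        }
        OrderEvent::ItemRemoved { sku } => {
            cart.items.remove(sku);
        }
        OrderEvent::CheckedOut => {
            cart.checked_out = true;
        }
    }
    cart
}

/// Rebuild the current state by replaying the whole log from an empty cart.
fn replay(events: &[OrderEvent]) -> Cart {
    events.iter().fold(Cart::default(), apply)
}

/// A question the final state cannot answer: which SKUs were ever removed?
fn removed_skus(events: &[OrderEvent]) -> Vec<String> {
    events
        .iter()
        .filter_map(|e| match e {
            OrderEvent::ItemRemoved { sku } => Some(sku.clone()),
            _ => None,
        })
        .collect()
}
```

If a customer adds items A and B, removes B, and checks out, the replayed cart holds only A, yet `removed_skus` still reports B. A state-only table would have discarded that information.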

ursAxZA 12/17/2025

The theory is sound, but the practical bridge is missing.

There’s a gap that current LLM architectures simply can’t cross yet.

henning 12/16/2025

Unless you feed a spec to the LLM, and it nitpicks compiled TLA+ output generated by your PlusCal input, gaslights you into saying the code you just ran and pasted the output of is invalid, then generates invalid TLA+ output in response. Which is exactly what happened when I tried coding with Gemini via formal verification.

m00dy 12/17/2025

I just write in safe Rust, if it compiles then it is formally verified for me.

I used Rust in my recent project, Deepwalker [0]. I wrote it once and never looked back.

[0]: https://deepwalker.xyz
