When AI writes the software, who verifies it? - Hacker News

Posted by todsacerdoti 20 hours ago

When AI writes the software, who verifies it?(leodemoura.github.io)

252 points | 252 commentspage 5

milind-soni 5 hours ago|

At this point, its the users

ozten 14 hours ago||

This is really great and important progress, but Lean is still an island floating in space. Too hard to get actual work done building any real world system.

mentalgear 17 hours ago||

> Where this leads is clear. Layer by layer, the critical software stack will be reconstructed with mathematical proofs built in. The question is not whether this happens, but when.

Nevermark 9 hours ago||

> When AI writes the software, who verifies it?

The users. It's a start.

bryanlarsen 17 hours ago||

You can use AI to make a reviewers job much easier. Add documents, divide your MR into reviewable chunks, et cetera.

If reviewing is the expensive part now, optimize for reviewability.

anonhacker199 15 hours ago||

The biggest issue right now is most AI tools aren't hooked up appropriately to an environment they can test in (Chrome typically). Replit works extremely well because it has an integrated browser and testing strategy. AI works very well when it has the ability to check its own work.

yoaviram 18 hours ago||

I just finished writing a post about exactly this. Software development, as the act of manually producing code, is dying. A new discipline is being born. It is much closer to proper engineering.

Like an engineer overseeing the construction of a bridge, the job is not to lay bricks. It is to ensure the structure does not collapse.

The marginal cost of code is collapsing. That single fact changes everything.

https://nonstructured.com/zen-of-ai-coding/

shinycode 14 hours ago||

Our CEO, an expert in marketing has discovered Claude Code and is the one having the most open PR of all developers and is pushing for us to « quickly review ». He does not understand why review are so slow because it’s « the easiest part ». We live in a new world.

skydhash 14 hours ago|||

> I just finished writing a post about exactly this. Software development, as the act of manually producing code, is dying.

It was never that. Take any textbook on software engineering and the focus was never the code, but on systems design and correctness. I'm looking at the table of contents of one (Software Engineering by David C. Kung) and these are a few sample chapters:

  ...
  4. Software Requirement Elicitation
  5. Domain Modelling
  6. Architectural Design
  ...  
  8. Actor-System Interaction Modeling
  9. Object Interaction Modeling
  ...
  15. Modeling and Design of Rule-Based Systems
  ...
  19. Software Quality Assurance
  ...
  24. Software Security

What you're talking about was coding, which has never been the bottleneck other than for beginners in some programming languages.

MattDaEskimo 18 hours ago|||

Accountability then

yoaviram 18 hours ago||

Anticipating modes of failure, creating tooling to identify and hedge against risks.

sjajzh 13 hours ago||

If we could do this it would have been done already. Outsourced devs would be ubiquitous.

imiric 13 hours ago||

In what world do these new tools help with "laying bricks", but not with ensuring that the structure does not collapse? How is that work any more difficult than producing the software in the first place? It wasn't that long ago that these tools could barely produce a simple program. If you're buying into the promises of this tech, then what's stopping it from also being able to handle those managerial tasks much better than a human?

The seemingly profound points of your marketing slop article ignore that these new tools are not a higher level of abstraction, but a replacement of all cognitive work. The tech is coming for your job just as it is coming for the job of the "bricklayer" you think is now worthless. The work you're enjoying now is just a temporary transition period, not an indication of the future of this industry.

If you enjoy managing a system that hallucinates solutions and disregards every other instruction, that's great. When you reach a dead end with that approach, and the software is exposing customer data, or failing in unpredictable ways, hopefully you know some good "bricklayers" that can help you with that.

sslayer 16 hours ago||

State Sponsored Hackers AI will verify it.

csense 13 hours ago|

It seems like sound testing methodology to identify important theorems related to the code, prove them, and then verify the proof.

Verification gets sold as "bulletproof" but I'm skeptical for a couple reasons:

- How do you establish the relationship between the code and the theorem? Lean theorem can be applied to zlib implemented in Lean, what if you want to check zlib implemented in a normal programming language like C, JS, Zig, or whatever?

- How do you know the key properties mean what you think they mean? E.g. the theorem says "ZlibDecode.decompressSingle (ZlibEncode.compress data level) = .ok data" but it feels like it would be very easy to accidentally prove ∃ x s.t. decompress(compress(x)) == x while thinking you proved ∀ x, decompress(compress(x)) == x.

I've tried Lean and Coq and...I don't really like them. The proofs use specialized programming languages. And they seem deliberately designed to require you to use a context explorer to have any hope of understanding the proof at all. OTOH a normal unit test is written in a general purpose programming language (usually the same one as the program being tested), I'm much more comfortable checking that a Claude-written unit test does what I think it's doing than a Claude-written Lean proof of correctness.

anonymous6500 2 hours ago||

The article does not reveal it to me either how the existing code would be mapped to Lean and back. The impression from zlib example is that I'd be expected to program in Lean. No way it's going to happen. The language is too complex for me and my average colleague. We're also not going to have two parallel implementations in ordinary language and Lean and compare them with 'differential random testing' (see https://aws.amazon.com/blogs/opensource/lean-into-verified-s... someone linked in the discussion), that's just too taxing for bigger products, let alone we typically don't have enough time to do one implementation right.

The gap of having succinct, expressive, powerful and executable specification to be able to continuously verify AI-generated programs is real, but I don't see how Lean alone closes it. If the author's intention was to attract community to help build that out with Lean in the center, it's not clear to me where to even start. Since the author provided no hints or direction, I've a feeling it's not clear to them either.

nextos 9 hours ago||

Lean and Coq might not be the right choice. Perhaps something like Dafny, F*, or Why3, where code and theorems live together.

Nevertheless, Lean 4 has closed the gap and it's closer to those than Coq at this stage.

More comments...