Toward automated verification of unreviewed AI-generated code

Posted by peterlavigne 1 day ago

Toward automated verification of unreviewed AI-generated code (peterlavigne.com)

37 points | 32 comments | page 2

boombapoom 2 hours ago

production ready "fizz buzz" code. lol. I can't even continue typing this response.

Ancalagon 3 hours ago

Even with mutation testing, doesn't this still require review of the test code?

Animats 3 hours ago

Mutation testing is a test of the test suite. The question is whether the tests detect a change to the program. If they don't, the test suite lacks coverage. That's a high standard for test suites, and it requires heavy testing of the obvious.
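To make that concrete, here is a minimal hand-rolled sketch of the idea. The `is_adult` function, the three mutants, and both test suites are hypothetical, made up for illustration; real mutation tools generate mutants by rewriting the source automatically rather than by swapping an operator by hand.

```python
import operator

def make_is_adult(compare):
    """Build an age check around a comparison operator.

    Swapping the operator produces a "mutant" of the original program.
    """
    def is_adult(age):
        return compare(age, 18)
    return is_adult

# The original program uses >=. Each mutant changes only the operator.
original = make_is_adult(operator.ge)
mutants = {
    ">= replaced by >": make_is_adult(operator.gt),
    ">= replaced by <": make_is_adult(operator.lt),
    ">= replaced by ==": make_is_adult(operator.eq),
}

def weak_suite(fn):
    """Checks only values far from the boundary."""
    return fn(30) is True and fn(5) is False

def strong_suite(fn):
    """Also checks the 'obvious' boundary values 18 and 17."""
    return weak_suite(fn) and fn(18) is True and fn(17) is False

def surviving_mutants(suite):
    """Return the mutants the suite fails to detect (they still pass)."""
    return [name for name, mutant in mutants.items() if suite(mutant)]

if __name__ == "__main__":
    print("weak suite survivors:", surviving_mutants(weak_suite))
    print("strong suite survivors:", surviving_mutants(strong_suite))
```

Both suites pass on the original, but only the one that tests the boundary kills every mutant. That is the "heavy testing of the obvious" part.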

But if you actually can specify what the program is supposed to do, this can work. It's appropriate where the task is hard to do but easy to specify. A file system or a database can be specified in terms of large arrays. Most of the complexity of a file system is in performance and reliability. What it's supposed to do from the API perspective isn't that complicated. The same can be said for garbage collectors, databases, and other complex systems that do something that's conceptually simple but hard to do right.
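A rough sketch of what "easy to specify" can look like in practice, assuming a toy key-value store instead of a real file system or database: the spec is a plain `dict`, and the implementation is checked against it under random operations. `SortedStore` and `check_against_model` are hypothetical names invented for this example, and a random check like this is testing, not a proof.

```python
import bisect
import random

class SortedStore:
    """Hypothetical implementation under test: keys kept in a sorted list."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        # Return (index, found) for key in the sorted key list.
        i = bisect.bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def put(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        i, found = self._find(key)
        return self._values[i] if found else None

    def delete(self, key):
        i, found = self._find(key)
        if found:
            del self._keys[i]
            del self._values[i]

def check_against_model(steps=2000, seed=0):
    """Run random operations on the store and on a dict acting as the spec."""
    rng = random.Random(seed)
    store, model = SortedStore(), {}
    for _ in range(steps):
        op = rng.choice(["put", "get", "delete"])
        key = rng.randrange(20)
        if op == "put":
            value = rng.randrange(1000)
            store.put(key, value)
            model[key] = value
        elif op == "delete":
            store.delete(key)
            model.pop(key, None)
        else:
            assert store.get(key) == model.get(key), (op, key)
    # Final sweep: every key must agree with the model.
    for key in range(20):
        assert store.get(key) == model.get(key), key
    return True

if __name__ == "__main__":
    print(check_against_model())
```

The model is a few lines and obviously right; all the effort goes into the implementation, which is the situation described above.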

Probably not going to help with a web page user interface. If you had a spec for what it was supposed to do, you'd have the design.

jryio 3 hours ago

Correct. Where did the engineering go? First it was in code files. Then it went to prompts, followed by context, and then agent harnesses. I think the engineering has gone into architecture and testing now.

We are simply shuffling cognitive and entropic complexity around and calling it intelligence. As you said, at the end of the day the engineer, like the pilot, is ultimately the responsible party at all stages of the journey.

morpheos137 2 hours ago

I think we need to move toward provably correct code.

otabdeveloper4 2 hours ago

This one is pretty easy!

Just write your business requirements in a clear, unambiguous, and exhaustive manner using a formal specification language.

Bam, no coding required.

rsoto2 17 minutes ago

Damn, if only this language could be made to work with numbers, we would really have something. Let's ask an LLM about it.

ventana 2 hours ago

I might be missing the point of the article, but from what I understand, the TL;DR is "cover your code with tests", be it unit tests, functional tests, or mutation tests.

Each of these approaches is just fine and widely used, and none of them can be called "automated verification", which, if my understanding of the term is correct, is more about a mathematical proof that the program works as expected.

The article mostly talks about automatic test generation.

andai 2 hours ago

...in FizzBuzz
