Define policy forbidding use of AI code generators (github.com)

Posted by todsacerdoti 6/25/2025

551 points | 413 comments | page 4

jekwoooooe 6/25/2025||

When will people give up this archaic practice of sending patches over emails?

gerdesj 6/26/2025||

When enough people don't want to do it anymore. Feel free to step up, live with email patches, and add to the numbers of those who don't like it and say so.

Why is it archaic if it works? I get there might be other ways to do patch sharing and discussion but what exactly is your problem with email as a transport?

You might as well describe voice and ears as archaic!

jekwoooooe 6/26/2025||

Archaic:

Very old or old fashioned

gerdesj 7/1/2025||

Dictionaries work fine for those with eyes, minds and fingers.

This mechanistic effort at a definition is insufficient. How on earth can whatever this is manage less than a thumb through a dictionary?

Archaic has way more meanings than just "old".

MobiusHorizons 6/26/2025|||

likely when it stops being a useful way to cut out noise

SchemaLoad 6/25/2025||

Sending patches over email is basically a filter for slop. It stops the low-effort drive-by PRs, and anyone who actually wants to invest some time into contributing won't have a problem working out the workflow.

jnwatson 6/26/2025|||

AI can figure out how to send a patch via email a lot faster than a human.

Art9681 6/26/2025||

This is a "BlockBuster laughs Netflix out of the room" moment. I am a huge fan of QEMU and used it throughout my career. The maintainers have every right to govern their project as they see fit. But this is a lot of mental gymnastics to justify clinging to punchcards in a world where we now have magnetic tape and keyboards to do things faster. This tech didn't spawn weeks ago. Every major project has had at least two years to prepare for this moment.

Pull your pants up.

catlifeonmars 6/26/2025||

2 years isn’t that long. It took the Linux kernel 10 years to start accepting code written in Rust. This isn’t quite the same as the typical frontend flavor-of-the-week JavaScript library.

9283409232 6/26/2025|||

You're so dramatic. As they said in the declaration, these are the early days of AI development, and all the problems they mention will eventually be resolved, so they have no problem taking a back seat while things sort themselves out. I respect that choice.

add-sub-mul-div 6/26/2025||

> This is a "BlockBuster laughs Netflix out of the room" moment

I'm not sure that's the dunk you think it is. Good for Netflix for making money, but we're drowning in their empty slop content now and worse off for it.

danielbln 6/26/2025|||

Who is forcing you to watch slop? And mind you, there was a TON of garbage at any local Blockbuster back in the day, with the added joy of having to go somewhere to rent it, being slapped with late and rewind fees or not even have availability of what you want to watch.

Choice is good. It means more slop, but also more gold. Figure out how to find the gold.

teruakohatu 6/25/2025||

So essentially it’s “let us cover ourselves by saying it’s not allowed” and in practice that means not allowing code that a human thinks is AI generated code.

Universities have this issue too, despite many offering students and staff Grammarly (Gen AI) while also trying to ban Gen AI.

SchemaLoad 6/25/2025||

Sounds like a good idea to ensure developers are owning the code they submit rather than hiding behind "I don't know why it does that, ChatGPT wrote it".

Use AI if you want to, but if the person on the other side can tell, and you can't defend the submission as your own, that's a problem.

JoshTriplett 6/25/2025||

> Use AI if you want to, but if the person on the other side can tell, and you can't defend the submission as your own, that's a problem.

The actual policy is "don't use AI code generators"; don't try to weasel that into "use it if you want to, but if the person on the other side can tell". That's effectively "it's only cheating if you get caught".

By way of analogy, Open Source projects also typically have policies (whether written or unwritten) that you only submit code you are legally allowed to submit. In theory, you could take a pile of proprietary reverse-engineered code that you have no license to, or a pile of code from another project that you aren't respecting the license of, and submit it anyway, and slap a `Signed-off-by` on it. Nothing will physically stop you, and people might not be able to tell. That doesn't make it OK.

SchemaLoad 6/26/2025||

The way I interpret it is that if you brainstorm using ChatGPT but write your own code using the ideas created in this step that would be fine, the reviewer wouldn't suspect the code of being AI generated because you've made sure it fits in with the project and actually works. The exact wording here is that they will reject changes they suspect of being AI generated, not that you can't have read anything AI generated in the process.

Getting AI to remind you of the library's API is a fair bit different from having it generate 1000 lines of code you have hardly read before submitting.

Art9681 6/26/2025||

What if the code is AI generated and the developer that drove it also understands the code and can explain it?

Filligree 6/26/2025||

Well, then you’re not allowed to submit it. This isn’t hard.

_fat_santa 6/25/2025|||

Well, I guess the key difference is that code is deterministic: whether a paper accomplishes its goals is somewhat subjective, but with code it's an absolute certainty.

I'm sure that if a contributor working on a feature used Cursor to initially generate the code but then went over it to ensure it's working as expected, that would be allowed. This is more for those folks who just want to jam in a quick vibe-coded PR so they can add "contributed to the QEMU project" to their resumes.

hananova 6/25/2025||

You'd be wrong, the linked commit clearly says that anything written by, or derived from, AI code generation is not allowed.

GuB-42 6/25/2025||

It's more like a clarification.

The rules regarding the origin of code contributions are rather strict: you can't contribute other people's code unless you can make sure that the licence is appropriate. An LLM may output a copy of someone else's code, sometimes verbatim, without giving you its origin, so you can't contribute code written by an LLM.

pretoriusdre 6/26/2025||

AI generated code is generally pretty good and incredibly fast.

Seeing this new phenomenon must be difficult for those people who have spent a long time perfecting their craft. Essentially, they might feel that their skillsets are being undermined. It would be especially hard for people who associate a lot of their self-identity with their job.

Being a purist is noble, but I think that this stance is foolish. Essentially, people who choose not to use AI code tools will be overtaken by the people who do. That's the unfortunate reality.

loktarogar 6/26/2025|

It's not a stance about the merits of AI generated code but about the legal status of it, in terms of who owns it and related concepts.

pretoriusdre 6/26/2025||

Yes, the reasoning behind the decision is clear and as you described. But I would also make the point that the decision comes with certain consequences, to which a discussion about merits is directly relevant.

loktarogar 6/26/2025||

> Essentially, people who choose not to use AI code tools will be overtaken by the people who do. That's the unfortunate reality.

Who is going to "overtake" QEMU, what exactly does that mean, and what will it matter if they are?

danielbln 6/26/2025||

OP said people. QEMU is not people.

loktarogar 6/26/2025||

We're talking about a decision that the people behind QEMU made that affects people, the consequences of which made the discussion of merits "directly relevant".

If we're talking about something that involves neither QEMU nor the people behind it, where is the relevance? It's just a rant on AI at that point.

sysmax 6/26/2025|

I wish people would make a distinction regarding the size/scope of the AI-generated parts. Like with video copyright laws, where a 5-second clip from a copyrighted movie is usually considered fair use and not frowned upon.

Because for projects like QEMU, current AI models can actually do mind-boggling stuff. You can give it a PDF describing an instruction set, and it will generate you wrapper classes for emulating particular instructions. Then you can give it one class like this and a few paragraphs from the datasheet, and it will spit out unit tests checking that your class works as the CPU vendor describes.

Like, you can get from 0% to 100% test coverage several orders of magnitude faster than doing it by hand. Or refactoring, where you want to add support for a particular memory virtualization trick, and you need to update 100 instruction classes based on a straightforward, but not 100% formal, rule. A human developer would be pulling their hair out, while an LLM will do it faster than you can get a coffee.

halostatue 6/26/2025||

Not all jurisdictions are the US, and not all jurisdictions allow fair use, but instead have specific fair dealing laws. Not all jurisdictions have fair dealing laws, meaning that every use has to be cleared.

There are simple algorithms that everyone will implement the same way down to the variable names, but aside from those fairly rare exceptions, there's no "maximum number of lines" metric to describe how much code is "fair use" regardless of the licence of the code "fair use"d in your scenario.

Depending on the context, even in the US that 5-second clip would not pass fair use doctrine muster. If I made a new film cut entirely from five second clips of different movies and tried a fair use doctrine defence, I would likely never see the outside of a courtroom for the rest of my life. If I tried to do so with licensing, I would probably pay more than it cost to make all those movies.

Look up the decisions over the last two decades over sampling (there are albums from the late 80s and 90s — when sampling was relatively new — which will never see another pressing or release because of these decisions). The musicians and producers who chose the samples thought they would be covered by fair use.

echelon 6/26/2025|||

Qemu can make the choice to stay in the "stone age" if they want. Contributors who prefer AI assistance can spend their time elsewhere.

It might actually be prudent for some (perhaps many foundational) OSS projects to reject AI until the full legal case law precedent has been established. If they begin taking contributions and we find out later that courts find this is in violation of some third party's copyright (as shocking as that outcome may seem), that puts these projects in jeopardy. And they certainly do not have the funding or bandwidth to avoid litigation. Or to handle a complete rollback to pre-AI background states.

762236 6/26/2025||

It sounds like you're saying someone could rewrite Qemu on their own, with the help of AI. That would be pretty funny.

mrheosuper 6/26/2025||

Given enough time, a monkey randomly typing on a typewriter can rewrite QEMU.