Posted by dw64 4 days ago
That said, AI-generated papers have already been spotted in disciplines besides CS, and some of them are really obvious (arXiv:2508.11634v1 opens with a review of a non-existent paper). I really hope arXiv won't react by narrowing its scope to "novel research only"; in fact there is already AI slop in that category, and it is harder for a moderator to spot.
("Peer-reviewed papers only" is mostly equivalent to "go away". Authors post on the arXiv in order to get early feedback, not just to have their paper openly accessible. And most journals at least formally discourage authors from posting their papers on the arXiv.)
I don't understand why arXiv restricted just one category when the problem spans multiple categories.
So many "research papers" from "AI companies" are just blog posts or marketing dressed up as research. They contribute nothing and exist only so the dudes running the company can point to all their "published research".
PaperMatch [1] helps with this problem (the large influx of papers) by running semantic search over the abstracts of everything on arXiv.
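For anyone curious what that looks like under the hood, here's a minimal sketch of embedding-based search over abstracts. PaperMatch's actual stack isn't described here, so the model choice, the toy corpus, and the `search` helper are all assumptions:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy stand-in corpus; a real index would hold every arXiv abstract.
abstracts = [
    "We propose a sparse attention mechanism for long-context transformers.",
    "A survey of diffusion models for image synthesis.",
    "Benchmarking retrieval-augmented generation on scientific QA.",
]

# Any sentence-embedding model works; this small one is a common default.
model = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = model.encode(abstracts, normalize_embeddings=True)

def search(query: str, k: int = 2):
    """Return the k abstracts most semantically similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = corpus_emb @ q  # cosine similarity, since vectors are unit-norm
    top = np.argsort(-scores)[:k]
    return [(float(scores[i]), abstracts[i]) for i in top]

print(search("efficient attention for long sequences"))
```

The nice property of searching embeddings rather than keywords is that "efficient attention" matches the sparse-attention abstract even though they share no exact terms.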
Are you saying that there's an automated method for reliably verifying that something was created by an LLM?
No, not really. From the blog post:
> In the past few years, arXiv has been flooded with papers. Generative AI / large language models have added to this flood by making papers – especially papers not introducing new research results – fast and easy to write. While categories across arXiv have all seen a major increase in submissions, it’s particularly pronounced in arXiv’s CS category.
>
> [...]
>
> Fast forward to present day – submissions to arXiv in general have risen dramatically, and we now receive hundreds of review articles every month. The advent of large language models have made this type of content relatively easy to churn out on demand, and the majority of the review articles we receive are little more than annotated bibliographies, with no substantial discussion of open research issues.
If so, I think the solution is obvious.
(But I remind myself that all complex problems have a simple solution that is wrong.)
That's without even being able to backprop through the annotator, and with me actively trying to avoid reward hacking. If arXiv used an open model for review, it would be trivial for people to insert a few grammatical mistakes that cause their papers to receive max points.
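To make that concrete: against any scorer whose outputs you can query, you don't need gradients at all; black-box hill-climbing is enough. A minimal sketch, where the buzzword-counting `review_score` is a toy stand-in for whatever open reviewer model might be exposed:

```python
import random

BUZZWORDS = ["novel", "rigorous", "comprehensive", "significant"]

def review_score(text: str) -> float:
    # Toy stand-in for an open reviewer model: rewards buzzwords.
    # A real attack would query the actual published model here.
    return sum(text.count(w) for w in BUZZWORDS)

def hill_climb(text: str, steps: int = 100) -> str:
    """Randomly perturb the text, keeping only changes that raise the score."""
    best, best_score = text, review_score(text)
    for _ in range(steps):
        words = best.split()
        pos = random.randrange(len(words) + 1)
        candidate = " ".join(words[:pos] + [random.choice(BUZZWORDS)] + words[pos:])
        s = review_score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best

print(hill_climb("We study attention mechanisms in transformers."))
```

Swap the random insertions for the grammatical-mistake trick and the loop is exactly the exploit described above: no backprop, just repeated queries against a fixed public scorer.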
Doubt
LLMs are experts at generating junk and generally terrible at anything novel. Classifying novel vs. junk is a much harder problem.
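For a sense of why detection is hard: the usual automated baseline flags text whose perplexity under a public model is suspiciously low, but light paraphrasing (or the deliberate grammatical mistakes mentioned above) shifts the score. A minimal sketch, with GPT-2 as an assumed reference model and no claim that any threshold is reliable:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2; lower is often read as more machine-like."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

# Fluent boilerplate tends to score low; quirky human prose scores higher.
# Neither gives a reliable verdict, which is the point.
print(perplexity("This paper presents a comprehensive survey of recent advances."))
print(perplexity("Honestly? My cat walked on the keyboard and it still got cited."))
```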
1. Require LLM-produced papers to be attributed to the relevant LLM, not to the person who wrote the prompt.
2. Treat submissions that misrepresent authorship as plagiarism. Remove the article, but leave an entry for it so that there is a clear indication that the author engaged in an act of plagiarism.
Review papers are valuable. Writing one is a great way to gain, or deepen, mastery over a field. It forces you to branch out and fully assimilate papers that you may have only skimmed, and then place them in their proper context. Reading quality review papers is also valuable. They're a great way for people new to a field to get up to speed and they can bring things that were missed to the fore, even for veterans of the field.
While the current generation of AI does a poor job of judging significance and highlighting what is actually important, it could improve in the future. However, there's no need for arXiv to accept hundreds of review papers written by the same model on the same field, and readers certainly don't want to sift through them all.
Clearly marking AI submissions and removing credit from the prompters would adequately future-proof things for when, and if, AI can produce high quality review papers. Clearly marking authors who engage in plagiarism as plagiarists will, hopefully, remove most of the motivation to spam arXiv with AI slop that is misrepresented as the work of humans.
My only concern is the cost to arXiv of dealing with the inevitable lawsuits. The policy arXiv has chosen is worse for science, but it's less likely to get them sued by butt-hurt plagiarists or by the very occasional false positive.
If you want to blame someone, blame all the people LARPing as AI researchers.
Meanwhile, banning review articles written by humans would be harmful in many fields. I'm not in CS, but I'd hate to see this policy become the norm for all disciplines.
I have to agree with their justification. Since "Attention Is All You Need" (2017), I have seen maybe four papers with similar impact in the AI/ML space. The signal-to-noise ratio is really awful. If I had to pick a semi-related paper published since 2020 that I actually found interesting, it would have to be this one: https://arxiv.org/abs/2406.19108. I cannot think of a close second right now.
All of the machine learning papers are pure slop to me now. The last one I looked at had an abstract that was so long it put me to sleep. Many of these papers aren't attempting basic decorum anymore. Mandatory peer review would fix a lot of this. I don't think it is acceptable for the staff at arXiv to have to endure a Sisyphean mountain of LLM shit. They definitely need to push back.
Not every paper can be a world-changing breakthrough. Which doesn't mean that more modest papers are noise (although some definitely are). What Kuhn calls "normal science" is also needed for science to work.