AI Reviews AI Research, Eroding Trust in Peer Review
As submissions explode, human reviewers secretly deploy AI for feedback, exposing a deep crisis of integrity in scientific validation.
November 24, 2025

A specter is haunting the hallowed halls of academic peer review, and it is powered by the very technology its practitioners are striving to advance. A growing number of frustrated authors at major artificial intelligence conferences are withdrawing their research papers after receiving peer reviews they believe were written not by human experts, but by lazy large language models, or LLMs.[1][2] This troubling trend exposes deep-seated issues within the scientific validation process, ironically revealing a crisis of integrity at the heart of the AI research community. The core of the problem lies in a system straining under its own weight, leading to corner-cutting that threatens the meticulous evaluation necessary for scientific progress.
Researchers are becoming adept at spotting the digital fingerprints of AI-generated critiques. The tell-tale signs often include generic, superficial feedback that lacks the deep, nuanced engagement expected from a human expert.[3][4] These reviews may praise a paper’s “innovative approach” while completely missing significant methodological flaws.[4] Authors report receiving feedback that is oddly polite and grammatically perfect but devoid of substance, essentially a high-level summary rather than a critical analysis.[4] In a perverse twist, some authors have started embedding hidden prompts in their papers, such as white text invisible to the human eye, instructing LLMs to give a positive review—a tactic designed to catch reviewers who feed the manuscript directly into an AI tool.[5][6][7] This escalating arms race between authors and reviewers highlights a profound breakdown of trust in a process that is fundamental to scientific credibility.
The resort to AI for peer review is not born out of malice, but rather from a system under immense pressure. Major AI conferences like ICLR and NeurIPS are now inundated with tens of thousands of submissions annually, a volume that has exploded in recent years.[8][9] This deluge of papers places an unsustainable burden on a limited pool of qualified human reviewers.[10] These experts, who are typically uncompensated for their time-consuming and intellectually demanding work, face a classic "publish or perish" culture that incentivizes focusing on their own research over the thankless task of reviewing others'.[5] Consequently, some reviewers are turning to LLMs as a shortcut to manage their overwhelming workload, a decision that has significant and damaging consequences. The use of LLMs in this context is often explicitly against conference policies, which hold reviewers responsible for the content they submit and for maintaining the confidentiality of unpublished work—a promise broken when a paper is uploaded to a third-party AI service.[11][6]
The implications of this trend extend far beyond individual papers and authors. The peer review process is the bedrock of scientific publishing, a critical mechanism designed to ensure the quality, validity, and integrity of research.[12][13] When this process is outsourced to AI that is not yet capable of truly understanding novel scientific concepts, the risk of publishing flawed or even fabricated research increases dramatically.[3][5] This not only pollutes the well of scientific knowledge but also erodes public trust in the scientific enterprise as a whole. Within the AI community itself, the failure to rigorously self-regulate could stifle genuine innovation, as groundbreaking but complex ideas may be unfairly dismissed by superficial AI reviews, while methodologically unsound papers are given a pass. Some studies estimate that a significant percentage of reviews at top AI conferences, perhaps as high as 17%, may already be substantially modified by LLMs.[14][4]
In response to this emerging crisis, conference organizers are beginning to take action. Major conferences like ICLR have instituted new policies requiring authors to disclose their use of LLMs in preparing their papers and are implementing stricter rules against reviewers using AI to generate critiques.[8][11][6][15] Violations can lead to severe consequences, including the desk rejection of the reviewer's own submitted papers.[11][6] Concurrently, some researchers are developing sophisticated methods to detect AI-generated text, such as embedding hidden "watermarks" in papers that an LLM would then unknowingly include in its review.[16][17][18][13][19] While these technological fixes and policy changes are crucial first steps, many believe a more fundamental reform of the peer review system is necessary. This includes exploring new models that might formally credit or reward reviewers for their essential contributions, thereby creating a more sustainable and accountable framework for the future of scientific publishing in the age of artificial intelligence.
Sources
[2]
[3]
[4]
[6]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]