Fake AI citations surge twelvefold in medical journals, threatening patient safety

As fake AI citations flood medical journals, researchers warn of rising threats to patient safety and scientific trust.

May 26, 2026

The rapid integration of artificial intelligence into academic writing has sparked a quiet but profound crisis in the scientific community, as fake references generated by large language models increasingly slip into peer-reviewed medical literature[1]. A large-scale audit of biomedical papers, conducted by researchers at Columbia University and other institutions, found that the rate of fabricated citations has risen more than twelvefold since consumer artificial intelligence tools first came into widespread use[2][3]. Using an automated verification system, investigators analyzed two and a half million papers from a prominent open-access biomedical database[2]. Among ninety-seven million references checked, they identified over four thousand entirely fabricated citations spread across nearly three thousand published peer-reviewed papers[2]. The trajectory is particularly alarming: the curve climbs steeply in step with the release and popularization of advanced writing assistants[2]. The rate of fake citations was almost negligible at first but spiked in subsequent years, to the point where more than one in every three hundred recently published biomedical papers contains at least one entirely fabricated reference[4][1].
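For a rough sense of scale, the audit's headline figures can be turned into rates. The sketch below uses only the rounded numbers reported in this article ("over four thousand", "nearly three thousand"), so the results are approximations; the function and key names are illustrative and do not come from the study.

```python
def audit_rates(papers, references, fabricated_refs, affected_papers):
    """Turn raw audit counts into simple rates."""
    return {
        # Average length of a reference list in the audited corpus.
        "refs_per_paper": references / papers,
        # Fabricated citations per 100,000 references checked.
        "fabricated_per_100k_refs": fabricated_refs / references * 100_000,
        # How many audited papers there are per paper with a fake citation.
        "papers_per_affected_paper": papers / affected_papers,
    }


# Rounded figures from the article, not exact counts from the audit.
rates = audit_rates(
    papers=2_500_000,
    references=97_000_000,
    fabricated_refs=4_000,
    affected_papers=3_000,
)

if __name__ == "__main__":
    for name, value in rates.items():
        print(f"{name}: {value:.1f}")
```

Averaged over the whole audit, these rounded inputs work out to roughly one affected paper in every 830. That is far below the recent rate of more than one in three hundred, which is what a steep rise concentrated in the latest years would produce.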
These non-existent references are not mere administrative typos; they are sophisticated fabrications that match the topic of the paper, follow standard citation formatting, and frequently attribute work to real, prominent scientists[3][1]. The investigation was born out of personal embarrassment: a lead investigator at Columbia University realized that an artificial intelligence writing assistant, used simply to polish the prose of a research manuscript, had seamlessly hallucinated a highly plausible citation[5][1]. Although that specific error was caught by an editor before publication, it exposed a dangerous loophole in the scientific publishing workflow[5]. Large language models operate on probabilistic text generation rather than database retrieval, meaning they are designed to predict the next likely word or phrase rather than to verify facts. When prompted to provide evidence for a scientific claim, these models can generate references that look impeccably real, complete with realistic titles, plausible publication dates, and correctly formatted digital identifiers[1]. Because these citations blend into the text and look identical to legitimate academic references, they easily deceive human peer reviewers who lack the time or resources to manually verify every source[1].
The presence of fabricated research in peer-reviewed journals has direct, high-stakes consequences for public health[6]. Clinical guidelines (the standard protocols that physicians use to diagnose illnesses, prescribe medications, and perform surgeries) are built on a hierarchy of evidence[6]. Systematic reviews and meta-analyses aggregate years of published research to determine which medical interventions are safe and effective[2]. When papers containing hallucinated citations are published, they do not remain isolated; they are cited by other researchers and gradually become woven into the fabric of accepted scientific consensus[2]. A medical professional or guideline developer has no easy way of knowing that the pivotal study supporting a specific clinical recommendation does not actually exist[2]. If clinical decisions rest on a house of cards built from non-existent data and phantom clinical trials, patient safety is put at direct risk[5]. Trust between the public and the medical establishment is deeply compromised when treatments are prescribed based on studies that never occurred[5].
Despite the severe implications of these findings, the academic publishing industry has been slow to respond[4]. The audit revealed that more than ninety-eight percent of the papers identified as containing fabricated references had received no correction, retraction, or public warning from their publishers[4][3]. This silence highlights a systemic vulnerability in the peer-review and editorial process. Scientific journals, operating under pressure to publish high volumes of papers quickly, often rely on volunteer peer reviewers who are already overwhelmed. These reviewers are trained to evaluate the methodology, logic, and significance of a study, not to act as forensic investigators verifying dozens of reference links. The vulnerability to generated citations is also not evenly distributed; research indicates that solo researchers, smaller research teams, and less experienced authors are disproportionately associated with the papers containing these fake references[7]. Under intense pressure to publish, these authors may rely heavily on automated tools to accelerate their writing without fully understanding the technology's tendency to hallucinate.
Beyond the immediate clinical risks, the proliferation of AI-hallucinated citations also threatens the future usefulness of the technology itself. Large language models are trained on massive datasets of human-written text, much of which is scraped from open-access scientific repositories[7]. As peer-reviewed journals become increasingly contaminated with fabricated references, these synthetic errors are fed back into the training pipelines of next-generation models[7]. This creates a self-reinforcing feedback loop, often described as model collapse or a form of data poisoning, in which future artificial intelligence systems absorb hallucinated citations as established facts and reproduce them with even greater frequency and authority[7]. If the primary sources of human knowledge become saturated with generated falsehoods, the ability of future technology to assist in genuine scientific discovery will be severely degraded[7]. Furthermore, when these models fabricate citations, they frequently attribute the fake studies to prominent, widely cited scholars in the relevant field. This makes the references appear even more credible to reviewers and artificially inflates the perceived consensus around specific, unproven scientific claims[7].
Addressing this rapidly growing threat will require a coordinated effort from publishers, academic institutions, and technology developers. Publishers must shift from passive reliance on manual peer review to active, automated reference verification systems that cross-reference every citation against established global databases[4][8]. Some major publishing groups have begun investing in specialist staff and automated screening technologies, but system-wide adoption remains slow[4]. Simultaneously, researchers must be educated on the inherent limitations of generative tools, treating these systems as rough drafting aids rather than authoritative reference managers[4][1]. Ultimately, the integrity of the scientific record depends on the absolute verifiability of its foundations[8]. If the medical community fails to purge these hallucinated citations, the very evidence-based guidelines designed to save lives could instead be guided by ghosts[2][5].
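The kind of automated reference check described above can be sketched in a few lines. This is a minimal illustration, not the system used in the audit or by any publisher. It assumes a small in-memory list standing in for a bibliographic database, and the `Reference` class, the `verify` function, the 0.9 similarity threshold, and every record and DOI shown are hypothetical placeholders.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class Reference:
    title: str
    year: int
    doi: Optional[str] = None


# Hypothetical stand-in for a bibliographic database. These records and DOIs
# are placeholders, not real publications.
KNOWN_RECORDS = [
    Reference("Example trial of drug A in condition B", 2019, "10.5555/example.0001"),
    Reference("Placeholder cohort study of outcome C", 2021, "10.5555/example.0002"),
]


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def title_similarity(a: str, b: str) -> float:
    """Similarity of two titles in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def verify(ref: Reference, records=KNOWN_RECORDS, threshold: float = 0.9) -> str:
    """Return 'verified', 'mismatch', or 'not_found' for a cited reference."""
    by_doi = {r.doi.lower(): r for r in records if r.doi}
    if ref.doi and ref.doi.lower() in by_doi:
        # The DOI exists: confirm it points at the work the citation names.
        record = by_doi[ref.doi.lower()]
        if title_similarity(ref.title, record.title) >= threshold:
            return "verified"
        return "mismatch"
    # No DOI, or an unknown one: fall back to the closest title plus a year check.
    best = max(records, key=lambda r: title_similarity(ref.title, r.title), default=None)
    if (
        best is not None
        and title_similarity(ref.title, best.title) >= threshold
        and abs(best.year - ref.year) <= 1
    ):
        return "verified"
    return "not_found"


if __name__ == "__main__":
    cited = [
        Reference("Example Trial of Drug A in Condition B.", 2019, "10.5555/EXAMPLE.0001"),
        Reference("Long-term phantom outcomes of drug A in condition B: a randomized trial",
                  2022, "10.5555/example.9999"),
    ]
    for ref in cited:
        print(verify(ref), "-", ref.title)
```

A production system would query an external bibliographic service rather than a local list, and a "not_found" result would call for human follow-up, not automatic rejection, since legitimate sources can be missing from any single database.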

Sources