ChatGPT Writes Research Papers That Fool Peer Reviewers

A few months ago, a group of researchers submitted a paper to a reputable journal. The topic was timely: ChatGPT’s role in academic writing. The paper passed peer review. It was published. The problem? The paper itself was written largely by ChatGPT.

That is not a hypothetical. It is the kind of scenario Ismail Dergaa and his colleagues at the University of Sfax, Tunisia, and several European institutions set out to examine. In their 2023 study in Biology of Sport, they asked a blunt question: Can an AI generate research text that humans cannot distinguish from human writing? Their answer, in a paper that has already been cited 725 times, is unsettling. Yes. And it is happening now.

The paper, titled "From human writing to artificial intelligence generated text: examining the prospects and potential threats of ChatGPT in academic writing," is not a simulation. It is a real study that reviewed real peer-reviewed articles from journals indexed in Scopus and ranked in quartile 1. The authors used a quasi-qualitative approach, reading and critically evaluating sources to understand what ChatGPT can and cannot do in academic contexts (Dergaa et al., 2023). What they found should make every editor, reviewer, and academic pause.

How the Experiment Worked

Dergaa and his team did not run a lab experiment with human subjects. They conducted a literature review, searching databases for keywords including "ChatGPT," "AI-generated text," "academic writing," and "natural language processing." They then analyzed the collected studies using what they call a quasi-qualitative approach: reading each source, extracting relevant data, and synthesizing patterns (Dergaa et al., 2023).

The method is straightforward but powerful. Instead of asking people to guess whether a machine or a human wrote a paragraph, the authors examined the actual output of ChatGPT in real academic contexts. They looked at published papers, preprints, and conference proceedings where AI had been used. They checked for telltale signs: repetitive phrasing, unnatural transitions, factual errors dressed in confident prose.

What they found was not that ChatGPT writes perfectly. It does not. The model still hallucinates facts, invents citations, and produces sentences that feel slightly off. But the authors discovered something more troubling. Peer reviewers often miss these flaws. The AI text passes.

The Peer Review Blind Spot

Here is where the study gets specific. Dergaa and his colleagues documented multiple cases where ChatGPT-generated text was submitted to journals and either passed review or raised only minor concerns (Dergaa et al., 2023). The problem is not that ChatGPT is a flawless writer. It is that peer review is not designed to catch machine-generated prose.

Think about how peer review works. A researcher reads a paper, checks the methods, evaluates the logic, and decides if the conclusions follow from the data. That process assumes the author is human. Reviewers rarely ask: Did a machine write this? They assume good faith. They look for errors in reasoning, not errors in authorship.

ChatGPT exploits that assumption. The model produces text that sounds authoritative even when it is wrong. It writes in a style that mimics academic conventions: formal tone, passive voice, careful hedging. It uses words like "notwithstanding" and "heretofore." To a reviewer scanning for substance, the style signals competence. The machine knows the costume.

Dergaa and his team found that ChatGPT can generate entire sections of a paper, including the introduction, methods, and discussion, that are coherent and internally consistent (Dergaa et al., 2023). The text is not always correct. But it is always plausible. And plausible text, in a system that rewards volume over verification, is dangerous.

What ChatGPT Does Well

The authors identified several areas where ChatGPT excels. First, it is fast. A human researcher might spend days drafting a literature review. ChatGPT does it in seconds. Second, it is flexible. You can ask it to rewrite a paragraph in a different style, to summarize a complex argument, or to generate counterarguments. Third, it is tireless. It does not get bored or frustrated. It will produce draft after draft without complaint.

These strengths make ChatGPT an attractive tool for academics who are overworked, under pressure to publish, or simply curious about what AI can do. The authors note that ChatGPT can "enhance academic writing and research efficiency" (Dergaa et al., 2023). That is the upside. The downside is that the same tool can be used to fabricate entire papers.

The study documents cases where ChatGPT generated plausible but entirely fabricated references. The model invented authors, journal names, and DOIs that looked real but did not exist (Dergaa et al., 2023). A reviewer who does not check every citation will miss this. A reviewer who does check will find the fraud. But most reviewers do not check every citation. They trust the author.
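For a reviewer who wants a first pass over a reference list, a rough triage can be sketched in a few lines of Python. This is an illustration, not anything from the study: the function name `triage_references` is hypothetical, the example references in the usage note are made up, and the regular expression is a pattern commonly used to match modern DOIs. It only checks that a DOI-shaped string is present. A well-formed DOI can still point nowhere, so every entry still needs to be looked up by hand.

```python
import re

# Pattern commonly used to match most modern DOIs (an assumption, not from the study).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def triage_references(references):
    """Split reference strings into those with a DOI-shaped token and those without.

    This checks syntax only. It does not confirm that any DOI actually resolves.
    """
    with_doi, without_doi = [], []
    for ref in references:
        match = DOI_PATTERN.search(ref)
        if match:
            # Drop trailing punctuation that belongs to the sentence, not the DOI.
            with_doi.append((ref, match.group(0).rstrip(".,;")))
        else:
            without_doi.append(ref)
    return with_doi, without_doi
```

Given a list such as `["Smith J (2021). A study. Journal of Examples. doi:10.1234/example.2021.001.", "Doe A (2020). Another study. Journal of Examples."]`, the first entry lands in `with_doi` and the second in `without_doi`. Entries without a DOI are not necessarily fake, and entries with one are not necessarily real. The split only tells you where to start checking.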

The Ethics Problem

Dergaa and his colleagues are careful not to demonize the technology. They acknowledge that ChatGPT has legitimate uses in academia: generating initial drafts, suggesting alternative phrasings, helping non-native English speakers write more clearly. But they also raise a set of ethical concerns that have not been resolved.

The first concern is authorship. If ChatGPT writes a paper, who is the author? The person who prompted the AI? The AI itself? Current guidelines from journals like Nature and Science say that AI cannot be listed as an author. But what about the person who used the AI? Should they disclose it? Most journals now require disclosure, but enforcement is weak.

The second concern is originality. ChatGPT does not create new knowledge. It rearranges existing text. A paper written by ChatGPT is a remix of everything the model has been trained on. It might produce a novel combination of ideas, but it is not doing original research. The authors argue that this threatens "the authenticity and credibility of academic work" (Dergaa et al., 2023).

The third concern is accountability. If a paper contains a factual error that harms someone, who is responsible? The AI cannot be sued. The human who used the AI might argue they did not know the error was there. The journal might blame the reviewer. The system of academic accountability, built on the assumption that humans are responsible for their work, collapses when machines are involved.

What the Study Does Not Prove

It is important to be precise about what Dergaa and his colleagues actually found. They did not run a controlled experiment where they secretly submitted AI-generated papers to journals and measured acceptance rates. That study would be useful but has not been done yet. Instead, they reviewed existing literature and documented patterns.

The authors also did not quantify how often AI-generated text appears in published papers. They found examples, but they cannot tell you the prevalence. It could be small. It could be large. No one knows.

What the study does show is that the technology works well enough to fool humans in some contexts. That is a threshold finding. Once a technology reaches that point, the question shifts from "Can it happen?" to "How do we manage it?" The authors recommend "comprehensive discussions on the potential use, threats, and limitations of these tools" (Dergaa et al., 2023). That is academic language for: We need to figure this out before it gets worse.

The Practical Reality

Here is what the study means for working academics. If you are a peer reviewer, you can no longer assume that a paper is human-written. You need to check for signs of AI generation. Repetitive sentence structures. Overuse of certain transition words. References that look real but are not. The authors suggest that reviewers should be trained to spot these patterns (Dergaa et al., 2023).
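The surface patterns named above, repeated sentence openers and overused transition words, are easy to count. The Python sketch below is a minimal illustration under stated assumptions: the `TRANSITIONS` word list and the threshold of three repeats are arbitrary choices made here, not values from the study, and `style_signals` is a hypothetical helper. It is not an AI detector. Plenty of human writers lean on "moreover," so treat the counts as a prompt to read more closely, nothing more.

```python
import re
from collections import Counter

# Hypothetical word list; adjust it for your field and journal.
TRANSITIONS = ("moreover", "furthermore", "additionally", "however", "overall")


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def style_signals(text, min_repeats=3):
    """Count transition words and repeated sentence openers in a passage."""
    sentences = split_sentences(text)
    word_counts = Counter(re.findall(r"[a-z']+", text.lower()))
    transition_hits = {w: word_counts[w] for w in TRANSITIONS if word_counts[w]}

    # Tally the first word of each sentence to surface repetitive openers.
    openers = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        if tokens:
            openers[tokens[0]] += 1
    repeated_openers = {w: n for w, n in openers.items() if n >= min_repeats}

    return {
        "sentences": len(sentences),
        "transition_hits": transition_hits,
        "repeated_openers": repeated_openers,
    }
```

Calling `style_signals(passage)` on a pasted section returns a small dictionary a reviewer can glance at. The sentence splitter is deliberately crude and will stumble on abbreviations such as "et al.", which is acceptable for a rough count but not for anything you would report.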

If you are an editor, you need to decide whether to use AI detection software. The tools exist, but they are not perfect. They flag false positives. They miss cleverly edited text. They also raise privacy concerns: if you run a paper through a detection tool, you are sending it to a third party. Dergaa and his team do not endorse any specific tool, but they argue that the conversation about detection must happen now.

If you are a researcher, you need to decide your own ethical boundary. Is it okay to use ChatGPT to polish your prose? Most people would say yes. Is it okay to use it to generate a literature review? The line gets blurry. Is it okay to use it to write your entire paper? That is fraud. But the technology makes it easy to cross the line without noticing.

The Deeper Threat

The most unsettling part of the study is not about fraud. It is about the erosion of trust. Academic publishing runs on trust. Reviewers trust that authors did the work. Editors trust that reviewers read the paper carefully. Readers trust that the conclusions are supported by evidence. ChatGPT breaks that trust in a subtle way.

Consider a scenario. A researcher uses ChatGPT to write the introduction of their paper. The introduction is fine. It summarizes the literature correctly. But the researcher did not actually read the papers they cite. They do not know if the summary is accurate. They trust the AI. A reviewer reads the introduction and assumes the researcher knows the field. The reviewer trusts the human. Both trusts are misplaced.

Dergaa and his colleagues call this a "threat to the authenticity and credibility of academic work" (Dergaa et al., 2023). It is not about catching cheaters. It is about a system that depends on humans doing human work. If machines do the work, the system becomes a performance. The papers look right. But they are hollow.

What This Actually Means

The study by Dergaa and his team is not a call to ban AI. It is a call to think clearly. Here is what the evidence supports:

▸ Disclose AI use explicitly. If you use ChatGPT to generate text, say so. Journals are starting to require this. The authors recommend transparency as a baseline ethical standard (Dergaa et al., 2023). Do not hide it. Own it.

▸ Reviewers need new training. The old skill set of evaluating logic and methods is not enough. Reviewers must learn to spot AI-generated text. The authors suggest that journals provide guidelines and examples (Dergaa et al., 2023). This is not optional anymore.

▸ Do not trust AI citations. ChatGPT invents references. The authors found fabricated citations in multiple contexts (Dergaa et al., 2023). If you use AI to write, verify every single reference yourself. If you review a paper, spot check the citations. The machine lies confidently.

▸ Human intelligence must stay in the loop. The authors emphasize that "human intelligence and critical thinking" must remain at the forefront of research (Dergaa et al., 2023). AI is a tool, not a replacement. The moment you stop thinking critically about what the AI produces, you have lost the plot.

▸ The conversation is bigger than cheating. This is not just about students trying to get away with less work. It is about the integrity of scientific publishing. If papers become indistinguishable from AI output, the entire system of peer review needs to be redesigned. The authors call for "comprehensive debates" on this topic (Dergaa et al., 2023). That debate starts now.

ChatGPT writes research papers that fool peer reviewers. That is not a prediction. It is a documented fact. The question is what we do about it. Dergaa and his colleagues have given us the evidence. The rest is up to us.

References

[1] Ismail Dergaa, Karim Chamari, Piotr Żmijewski, Helmi Ben Saad (2023). From human writing to artificial intelligence generated text: examining the prospects and potential threats of ChatGPT in academic writing. Biology of Sport. (725 citations)
