Artificial intelligence in medical writing: the inconvenient truth behind our automated future

Koulaouzidis, Anastasios; Koulaouzidis, George; Marlicz, Maria; Charisopoulou, Dafni; Marlicz, Wojciech

Letters to the Editor

Anastasios Koulaouzidis¹,²,³, George Koulaouzidis⁴, Maria Marlicz³, Dafni Charisopoulou⁵, Wojciech Marlicz³

¹ Department of Clinical Research, University of Southern Denmark, Odense, Denmark

² Research Unit, Department of Surgery, Odense University Hospital, Svendborg, Denmark

³ Department of Gastroenterology, Pomeranian Medical University, Szczecin, Poland

⁴ Department of Biochemical Sciences, Pomeranian Medical University, Szczecin, Poland

⁵ Paediatric Cardiology Department, Great Ormond Street Hospital, London, United Kingdom

DOI: 10.20452/pamw.16985

Published online: March 24, 2025.
CC BY 4.0

To the editor

Artificial intelligence (AI), when appropriately applied in medical writing, is being heralded as a seismic shift in the field—akin to a “Gutenberg moment” in medicine—that has the potential to enhance the efficiency and value of medical communications.¹ On the other hand, we are beginning to witness the dawn of a crisis in the medical literature. The implications are profound. With projections predicting an exponential increase in generative AI content, a near‑future scenario could see a body of scientific knowledge created, edited, and reviewed by machines rather than humans.

Present role of artificial intelligence in scientific writing

The number of AI‑generated papers has grown exponentially since the early 2020s. A recent editorial published in the International Journal of Information Management discusses the rapidly increasing integration of AI tools, such as ChatGPT, into academic workflows, highlighting their growing role in research synthesis and in manuscript drafting and editing.² Another editorial published in the Journal of the American Medical Association highlights the growing prevalence of AI in academic research, noting its increasing contribution to various aspects of the research and writing processes, particularly in submissions to high‑impact journals.³ This trend also extends to lower‑impact journals, where AI tools are being used to streamline research synthesis and improve the overall quality of submissions. The initial hostility toward AI‑enhanced submissions is gradually giving way to tolerance and acceptance. Why is this shift happening? The answer lies in efficiency. AI‑driven tools can condense months of literature searches into just a few days.⁴ AI is estimated to reduce the time required to write a systematic review by up to 70%,⁴ enabling researchers to publish more frequently, an invaluable advantage in an industry where the pressure to publish is relentless.

Exponential growth: forecast and consequences

According to projected growth rates, AI‑assisted and AI‑generated papers will account for 15% of all papers in the medical literature (including articles indexed in PubMed / Medline) by 2025, and their share will be even greater by 2030.⁵ Experts predict that by 2032, over half of medical research papers will be at least partly written by AI. These projections should not be taken lightly—AI significantly accelerates publication, access, and preprint processes, at least in the short term. As we stand on the threshold of an AI‑driven era in medical writing, we must grapple with the profound implications of letting algorithms take the lead. Could this shift risk overshadowing the unique, irreplaceable insights that emerge from lived clinical experience, which algorithms cannot replicate?⁶ Are we creating a body of medical literature resembling a “join‑the‑dots” data mashup devoid of new wisdom? The implications of this trend could fundamentally alter the nature of medical research.⁷

The “black box” problem: is artificial intelligence–authored research trustworthy?

A significant problem with AI‑generated content is transparency, or rather, the lack of it. This is especially true of machine learning models, and deep learning models in particular, which are often regarded as black boxes. Such models can produce outputs based on weak or unverifiable premises, making it impracticable (if not futile) to trace how a given output was produced. This issue is of great concern in medical science, where understanding the “why” behind every observation is arguably more important than the observation itself. We face a considerable risk of replacing human insight with opaque black‑box outputs, which cannot always be thoroughly checked or understood.⁸ Explainable AI aims to make the decision‑making mechanisms of AI systems interpretable to users in order to build trust.⁹ However, many AI users are unfamiliar with the processes and products of these models, making it difficult for them to form proper expectations or trust. This often results in skepticism regarding AI’s capabilities¹⁰ and highlights the necessity for transparency of AI processes to enhance user trust and acceptance.

Additionally, the quality of AI’s output remains a subject of doubt. A 2022 study published in Nature by Stanford researchers found that although AI‑generated reviews were well structured, they often lacked nuance, context, and ethical sensitivity. For example, when provided with insufficient context, AI erroneously recommended contraindicated medications—a simple mistake that a seasoned clinician or researcher would immediately recognize.¹¹ Moreover, researchers found that people tend to favor human authors over AI authors, primarily due to entrenched biases against AI and the inherent credibility of human content.¹²

Ethics and integrity issues: where does authorship start and stop?

The ethical challenges surrounding AI are piling up. The question of authorship of work created by AI is particularly complex. As AI systems become increasingly sophisticated, whether a piece of work is human- or machine‑generated becomes less apparent, posing significant challenges for authorship attribution. This has far‑reaching implications for questions of ownership over the moral rights associated with AI‑generated works. Should these rights belong to the AI, its creators, or its users? Addressing the dilemma requires re‑evaluating the existing copyright system and moral rights framework to account for the unique nature of AI‑generated content.¹³ This year, one of the major academic publishers, Springer Nature, announced that they would accept AI‑generated text, provided that the AI is credited as a coauthor.¹⁴ This sets a dangerous precedent, and carries a risk of becoming a slippery slope.

If AI is considered the author, should we assume the researcher’s integrity is compromised? Alternatively, if large language models (LLMs) become part of the medical canon, should they be held to the same rigorous standards of scrutiny as human investigators?

AI also poses a significant risk of contributing to the spread of disinformation. Tools such as ChatGPT have demonstrated a troubling capacity to produce plausible but false information. Moreover, AI models are only as reliable as the datasets they are trained on, and those datasets may inadvertently reflect biases within the health system. For instance, systemic biases can be introduced when an AI model is trained on a dataset of patients that is not sufficiently representative, leading to poorer performance for minority communities.¹⁵

Additionally, based on the historical context of medical prejudice, common misconceptions—such as the belief in relatively greater pain tolerance in black patients as compared with white patients—can lead to suboptimal treatment decisions despite being fundamentally flawed.¹¹ A recent study published in npj Digital Medicine found that medical LLMs are vulnerable to misinformation attacks, where minor manipulations in their training data or parameters can lead them to generate incorrect but believable biomedical information.¹⁵ This is no minor issue, especially in a domain where quality of life is at stake.

The next great bugbear: will artificial intelligence abuse its power?

The rapid incorporation of AI technology presents a significant risk of abuse due to the absence of a robust regulatory framework or ethical guidelines. Currently, no widely accepted standards exist to regulate AI technology effectively. Pharmaceutical companies and other industry players are already using AI to rapidly create and submit research papers for journal placement.¹ This practice has shifted from being a mere convenience to a competitive advantage.

More narrowly, the risk of AI bias being exploited by industry corporations to oversaturate reputable journals with desirable but shallow studies is a concerning prospect for which regulatory authorities are largely unprepared. The medical community must establish clear use cases for AI in research and writing. This includes setting standards for the ethical use of AI tools, training medical writers to evaluate AI‑generated content critically, and fostering a culture of accountability among medical professionals.¹³

If this trend is left unchecked, we risk approaching an academic system in which quantity takes precedence over quality. AI could produce content that gives the illusion of “research findings” at an unprecedented pace but fails to meet the rigorous standards expected in science.

A 2023 survey published in PLOS One revealed that 62% of respondents believed AI‑assisted papers would shape the future academic landscape. This finding raised concerns, as respondents remarked that the proliferation of such papers could devalue unique, labor‑intensive research.¹²

The future: rules or runaway automation?

The answer is not, in any sense, the “unmaking” of AI in medical writing but, rather, precise and, as far as possible, unbiased control over its use. Institutions, academic journals, and government bodies must establish clear boundaries that prevent AI from encroaching on or overpowering human intelligence, so that it instead enhances human knowledge and thinking. This includes setting clear criteria for when and how AI should be used in research and defining what “authorship” truly means.

Failing to address these challenges could have disastrous consequences. A decade from now, we may question why we allowed machines to take control of our clinical science dialogue. The uncontrolled proliferation of AI in medical writing is not merely an academic concern but an existential threat to the integrity of medicine.

But what happens if the academic and regulatory communities do nothing? In that case, we risk drowning in a sea of AI‑generated data, where truth and precision are sacrificed for speed and convenience. Ultimately, it is up to society to decide how much we value the unique contributions of human beings relative to those of algorithms.

ARTICLE INFORMATION

Conflict of interest: None declared.

References

1. Thomson E. The Gutenberg Parenthesis: how does studying 500 years of the printing press help us tackle the era of AI? World Economic Forum. https://www.weforum.org/stories/2024/01/gutenberg-parenthesis-ai-internet-printing-press/. Accessed January 27, 2025.
2. Dwivedi YK, Kshetri N, Hughes L, et al. Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. Int J Inf Manage. 2023; 71: 102642.
3. Flanagin A, Bibbins‑Domingo K, Berkwits M, Christiansen SL. Nonhuman “authors” and implications for the integrity of scientific publication and medical knowledge. JAMA. 2023; 329: 637‑639.
4. Joos L, Keim DA, Fischer MT. Cutting through the clutter: the potential of LLMs for efficient filtration in systematic literature reviews. arXiv preprint. 2024; 2407.10652.
5. Berbís MA, McClintock DS, Bychkov A, et al. Computational pathology in 2030: a Delphi study forecasting the role of AI in pathology within the next decade. EBioMedicine. 2023; 88: 104427.