ChatGPT-4 can enhance clinical workflow by recommending case-specific MRI protocols, optimizing protocol selection, and delivering meaningful savings in acquisition time, a new German-led study has found.
“ChatGPT-4 demonstrated a very high agreement with board-certified (neuro-)radiologists in selecting MRI protocols and was able to suggest approved time saving protocols from the set of available sequences,” noted Dr. Zeynep Bendella, a neuroradiologist at the Clinic of Neuroradiology, University Hospital Bonn, and the German Center for Neurodegenerative Diseases (DZNE), and colleagues in an article posted on 13 September by the European Journal of Radiology (EJR).
While ChatGPT-4 is emerging as a promising tool, its integration should be undertaken with care and should reflect the synergistic relationship between AI and human expertise, according to the authors. “It must be emphasized that the final decision-making authority regarding the protocol to be applied and the full responsibility remain with the radiologist.”
Study aims and logistics
The central aim of the study was to determine whether ChatGPT-4 can correctly suggest MRI protocols and additional MRI sequences based on real-world Radiology Request Forms (RRFs), as well as to investigate the ability of ChatGPT-4 to suggest time-saving protocols.
The team analyzed a total of 1,001 RRFs from the Department of Neuroradiology in Bonn (the "in-house" dataset), along with another 300 RRFs from overseas (the "external" dataset). Data acquisition was conducted between August 2023 and July 2024. The patients’ age, sex, and clinical information were extracted from the RRFs and used to prompt ChatGPT-4 to choose an adequate MRI protocol from predefined institutional lists. Four independent raters then assessed its performance. ChatGPT-4 was also tasked with creating case-specific protocols aimed at saving time.
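As a rough illustration of the prompting step described above, the sketch below assembles a plain-text prompt from the three extracted fields and a fixed protocol list. It is not the authors' prompt: the wording, the `RequestForm` and `build_prompt` names, the protocol names, and the example case are all hypothetical, and no call to any model is made.

```python
from dataclasses import dataclass


@dataclass
class RequestForm:
    """The three fields the article says were extracted from each RRF."""
    age: int
    sex: str
    clinical_info: str


def build_prompt(form: RequestForm, protocols: list[str]) -> str:
    """Assemble a prompt asking for exactly one protocol from a fixed list."""
    options = "\n".join(
        f"{i}. {name}" for i, name in enumerate(protocols, start=1)
    )
    return (
        "Select the most appropriate MRI protocol for this patient.\n"
        f"Age: {form.age}\n"
        f"Sex: {form.sex}\n"
        f"Clinical information: {form.clinical_info}\n"
        "Choose exactly one of the following protocols:\n"
        f"{options}"
    )


# Hypothetical protocol names and case, for illustration only.
EXAMPLE_PROTOCOLS = ["Standard brain", "Stroke", "Epilepsy"]
example_form = RequestForm(
    age=67, sex="female", clinical_info="Sudden left-sided weakness"
)
example_prompt = build_prompt(example_form, EXAMPLE_PROTOCOLS)
```

Restricting the answer to a numbered list mirrors the study's use of predefined institutional lists and makes the model's choice easy for a human rater to check.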
When ChatGPT-4 was tasked with freely composing time-efficient MRI protocols from the available sequences, its suggestions were approved in 766 of the 1,001 in-house cases (76.5%) and 140 of the 300 external cases (46.7%), leading to clear time savings, as shown in the table below.
Time savings when ChatGPT-4 freely composed time-efficient MRI protocols from the given sequences. Time savings were calculated only for protocols adopted as diagnostically acceptable and shorter than the predefined standard protocols. Courtesy of Dr. Zeynep Bendella et al. and EJR.
In the total sample, the MRI protocol suggestions by ChatGPT-4 saved between 16% and 17% of the acquisition time compared to the predefined MRI protocols in the adopted cases, the researchers stated.
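The calculation rule stated in the table caption (count only suggestions that were adopted and shorter than the standard protocol) can be sketched as follows. This is an assumption-laden illustration: the article does not say how savings were aggregated, so the ratio of summed durations used here is a guess, the `relative_time_saving` name is hypothetical, and the durations are made up rather than taken from the study.

```python
def relative_time_saving(cases):
    """Return the percentage of acquisition time saved over qualifying cases.

    Each case is a tuple (standard_minutes, suggested_minutes, adopted).
    Only adopted suggestions that are shorter than the standard count.
    """
    qualifying = [
        (std, sug) for std, sug, adopted in cases if adopted and sug < std
    ]
    if not qualifying:
        return 0.0
    total_standard = sum(std for std, _ in qualifying)
    total_suggested = sum(sug for _, sug in qualifying)
    return 100.0 * (total_standard - total_suggested) / total_standard


# Hypothetical durations in minutes; not data from the study.
example_cases = [
    (30.0, 24.0, True),   # adopted and shorter: counts
    (20.0, 18.0, True),   # adopted and shorter: counts
    (25.0, 15.0, False),  # not adopted by the raters: ignored
    (20.0, 20.0, True),   # adopted but not shorter: ignored
]
saving = relative_time_saving(example_cases)
```

Averaging per-case percentages instead of dividing summed durations would weight short and long examinations equally and could give a different figure, so the choice of aggregation matters when comparing reported savings.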
“These time savings were particularly pronounced when ChatGPT-4 was allowed to freely compose MRI protocols based solely on the clinical question and available sequences without being constrained by predefined institutional protocols,” they added. “This flexible approach enabled the model to tailor imaging strategies more precisely to the diagnostic need, often omitting unnecessary sequences while maintaining diagnostic quality.”
While predefined protocols aim to standardize imaging and ensure comprehensive diagnostic assessment, ChatGPT-4 demonstrated the ability to dynamically adjust protocol complexity in a time-efficient manner, offering a promising tool to optimize workflow and resource use in clinical practice, the authors continued.
“To assess the broader applicability of ChatGPT-4 in radiology, future research should investigate its performance in musculoskeletal, thoracic, abdominal, and other subspecialties using similarly structured validation frameworks,” they pointed out. “Future studies will explore other open source large-language models (LLMs) hosted publicly as well as locally.”
Overall, the study confirms the feasibility of integrating AI tools like ChatGPT-4 as a means of support within clinical workflows, potentially leading to more efficient and patient-tailored radiological assessments, the researchers concluded.
“Furthermore, it could serve as a valuable adjunct for educational and training purposes, complementing the standard care provided by experienced neuroradiologists,” they wrote. “Importantly, this application of ChatGPT-4 is readily deployable in the clinical environment without requiring additional implementation efforts or raising new data protection concerns.”
Reflections from the authors
The researchers admitted they were surprised by the results, Bendella told AuntMinnieEurope on 15 September.
"What stood out most was the finding that only a very small number of MRI protocol suggestions were rated as unacceptable, even when tested across independent and external datasets," she noted. "In addition, we had not anticipated that ChatGPT-4 would provide clear explanations for its protocol choices, which made its output more transparent and educational."
Another notable finding was its ability to suggest alternative, time-saving protocols that were still diagnostically valid, highlighting a potential for workflow optimization beyond simple protocol matching, she added.
In Bonn, ChatGPT is primarily used in educational settings and occasionally to support general MRI protocol discussions, but always under strict adherence to data protection regulations, Bendella said. "It is currently not applied in routine patient care or final decision-making. However, the study has shown the potential of LLMs to provide meaningful input and explanations in the context of MRI protocol selection, which may translate into routine clinical practice, particularly if locally deployed models can be implemented in a secure environment."
LLMs have improved markedly, she continued. "We see progress in reasoning ability, accuracy, and stability, as well as better handling of clinical free-text. At Bonn, we currently use GPT-4 for research and pilot projects in radiology, and we are also evaluating locally hosted LLMs (e.g. Llama, Mixtral, and others) for data protection and reproducibility reasons. Additionally, we benchmark open-source models to understand their potential in medical imaging workflows."
The group is now working on a follow-up project that involves training a locally hosted open-source LLM to automatically select appropriate MRI sequences.
Top left: Dr. Zeynep Bendella. Top right: Dr. Barbara Wichtmann. Bottom left: Dr. Katerina Deike. Bottom right: Prof. Dr. Daniel Paech.
You can read the full EJR article here. The co-authors were Drs. Barbara Daria Wichtmann, Ralf Clauberg, Vera C. Keil, Nils C. Lehnen, Robert Haase, Laura C. Sáez, Isabella C. Wiest, Jakob Nikolas Kather, Christoph Endler, Alexander Radbruch, Daniel Paech, and Katerina Deike.