JNCI: Breast screening AI software equals radiologists

Mar 6, 2019


Artificial intelligence (AI)-based software can detect cancer on screening mammography studies as accurately as an average breast radiologist, according to research published online on 5 March in the Journal of the National Cancer Institute (JNCI). But it could not beat the best radiologists in the study.

A team of researchers led by first author Alejandro Rodriguez-Ruiz and senior author Ioannis Sechopoulos, PhD, of Radboud University Medical Center in Nijmegen, the Netherlands, retrospectively compared the performance of a deep learning-based software application to that of more than 100 radiologists on nine diverse datasets of screening mammograms.

The AI-based software was statistically noninferior to the average performance of the radiologists for accurately detecting breast cancer, they found. Although it didn't outperform the best radiologists, the software was more accurate than over 60% of the readers.

"Before we could decide what is the best way for AI systems to be introduced in the realm of breast cancer screening with mammography, we wanted to know how good can these systems really be," Sechopoulos said in a statement from Oxford University Press. "It was exciting to see that these systems have reached the level of matching the performance of not just radiologists but of radiologists who spend at least a substantial portion of their time reading screening mammograms."

Noting that breast cancer screening could be more accurate and efficient if AI systems could perform similarly to radiologists, the researchers sought to compare the standalone performance of an AI-based software application (Transpara 1.4.0, ScreenPoint Medical) with that of radiologists on nine different datasets of digital mammography exams acquired on systems from four different vendors.

The nine study cohorts included a total of 2,652 exams, 653 of which were malignant cases. The exams had been read by 101 radiologists, yielding a total of 28,296 independent interpretations, according to the authors.
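
As a rough check on how cancer-enriched the pooled cohort is, the figures above can be reduced to two simple ratios. The short Python sketch below uses only the numbers quoted in this article. The variable names are illustrative, and the per-exam figure is just a pooled average, since the article does not say how readings were distributed across the nine datasets.

```python
# Cohort figures as quoted in the article (names are illustrative).
total_exams = 2652
malignant_cases = 653
total_interpretations = 28296

# Share of exams that were malignant, as a percentage.
malignant_share_pct = round(100 * malignant_cases / total_exams, 1)

# Pooled average number of independent interpretations per exam.
# Assumption: a simple mean over all exams, not a per-dataset figure.
reads_per_exam = round(total_interpretations / total_exams, 1)

print(malignant_share_pct, reads_per_exam)
```

By this arithmetic, roughly a quarter of the exams were malignant, far above what would be expected in a routine screening population.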

Radiologists vs. AI algorithm for breast cancer detection
Measure	Average of 101 radiologists	AI-based software
Area under the curve (AUC)	0.814	0.84
95% confidence interval for AUC	0.787 to 0.841	0.82 to 0.86

The AI-based software was statistically noninferior to the average performance of the 101 radiologists, the researchers found. The software also had a higher area under the curve than 62 (61.4%) of the 101 radiologists and higher sensitivity than 55 (57.9%) of 95 radiologists.
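
The reported proportions and the gap between the two AUC values can be reproduced with basic arithmetic. The Python sketch below is only a sanity check on the numbers quoted in this article. It is not the statistical noninferiority analysis the authors performed, and the helper name `share` is hypothetical.

```python
# AUC values as quoted in the article's table.
radiologist_auc = 0.814
ai_auc = 0.84

# Absolute difference in AUC (AI minus average radiologist).
auc_difference = round(ai_auc - radiologist_auc, 3)


def share(count, total):
    """Return count/total as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Readers the software outperformed, by AUC and by sensitivity.
auc_share = share(62, 101)
sensitivity_share = share(55, 95)

print(auc_difference, auc_share, sensitivity_share)
```

Run as written, the script prints 0.026, 61.4, and 57.9, which matches the percentages reported by the authors.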

"Our results clearly show that recent advances in AI algorithms have narrowed the gap between computers and human experts in detecting breast cancer in digital mammograms," the authors wrote. "Nevertheless, the performance of AI was consistently lower than the best radiologists in all datasets."

However, the AI system had similar performance to an average radiologist for detecting breast cancer on mammography, the authors concluded.

"These results were consistently observed across a large, heterogeneous, multicenter, multivendor, cancer-enriched cohort of mammograms," they wrote. "Although promising, the performance and fashion of implementation of such an AI system in a screening setting remain to be further investigated."