AI software can help increase detection of interval cancers on screening mammograms that were missed by two human readers, according to research published in Radiology.
A research team led by first author Muzna Nanaa, PhD, and senior author Prof. Fiona Gilbert of the University of Cambridge in the U.K. found that at a high specificity threshold (96%), a commercial AI algorithm detected nearly one in four missed interval cancers (23.5%). It also correctly localized these cancers in roughly three out of four cases (76.9%).
“At lower specificity thresholds, more interval cancers (ICs) could be detected but at the expense of increased arbitration or recall rates,” they wrote in the article, which went live on 27 August.
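The trade-off the authors describe can be illustrated with a minimal sketch. The scores below are hypothetical and are not study data, and the helper names `cutoff_for_specificity` and `flagged_fraction` are made up for illustration; the study's actual thresholding method is not described in this article. The sketch assumes a case is flagged when its AI score exceeds a cutoff chosen so that a target share of normal mammograms stays unflagged.

```python
import math

def cutoff_for_specificity(normal_scores, target_pct):
    """Return the lowest observed score such that at least target_pct
    percent of normal cases score at or below it (and so are not flagged)."""
    ordered = sorted(normal_scores)
    k = math.ceil(target_pct * len(ordered) / 100) - 1
    return ordered[max(k, 0)]

def flagged_fraction(scores, cutoff):
    """Share of cases whose score is strictly above the cutoff."""
    return sum(s > cutoff for s in scores) / len(scores)

# Hypothetical AI scores (0-100), for illustration only.
normal_scores = [3, 7, 12, 18, 25, 31, 40, 52, 66, 85]
cancer_scores = [10, 20, 35, 45, 60, 70, 90, 95]

for target_pct in (90, 60):
    cutoff = cutoff_for_specificity(normal_scores, target_pct)
    specificity = 1 - flagged_fraction(normal_scores, cutoff)
    detected = flagged_fraction(cancer_scores, cutoff)
    print(f"target {target_pct}%: cutoff={cutoff}, "
          f"specificity={specificity:.2f}, cancers flagged={detected:.3f}")
```

In this toy data, relaxing the specificity target lowers the cutoff, so more cancers are flagged but more normal cases are flagged too, which is the pattern behind the increased arbitration or recall rates.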
In their study, the researchers sought to evaluate AI localizations of interval cancers by cancer category and histopathologic characteristics. They retrospectively applied a commercial AI algorithm, Insight MMG v 1.1.2.0 (Lunit), to 2,052 screening mammograms acquired between January 2011 and December 2018 and interpreted by two readers. Of these mammograms, 1,538 were normal and 514 had interval cancers.
The AI software analyzes two-view digital screening mammograms and provides a per-lesion, per-image, per-breast, and per-case cancer likelihood score as well as an overall risk score. It also classifies breast density and provides the locations of suspicious lesions via a heatmap.
AI performance by specificity threshold setting for detecting and localizing interval cancers

Measure | 89% specificity | 96% specificity |
---|---|---|
Correct flagging of interval cancers | 35.2% | 23.5% |
Correct localization of interval cancers | 73.5% | 76.9% |
False-positive heatmaps | 109 | 48 |

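
To put these percentages in terms of cases, the short sketch below converts them into approximate counts using the 514 interval cancers in the cohort. It assumes, based on the article's summary, that the localization percentage is a share of the flagged cancers rather than of all interval cancers; the resulting counts are rounded estimates, not figures quoted from the paper.

```python
TOTAL_INTERVAL_CANCERS = 514  # interval cancers in the study cohort

# Percentages as reported in the article, keyed by specificity setting.
reported = {
    "89% specificity": {"flagged_pct": 35.2, "localized_pct": 73.5},
    "96% specificity": {"flagged_pct": 23.5, "localized_pct": 76.9},
}

def approximate_counts(flagged_pct, localized_pct, total=TOTAL_INTERVAL_CANCERS):
    """Estimate how many cancers were flagged and, of those, correctly localized."""
    flagged = round(total * flagged_pct / 100)
    # Assumption: localization is reported as a share of flagged cancers.
    localized = round(flagged * localized_pct / 100)
    return flagged, localized

for setting, pcts in reported.items():
    flagged, localized = approximate_counts(**pcts)
    print(f"{setting}: ~{flagged} flagged, ~{localized} correctly localized")
```

Under that assumption, the estimates work out to roughly 181 flagged and 133 localized cancers at 89% specificity, and roughly 121 flagged and 93 localized at 96% specificity.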
The authors emphasized that false-positive heatmaps should be kept to a minimum so that they do not increase reading time, distract the reader from recognizing true cancer areas, or lead to unnecessary workups.
“Previous publications have shown that readers can suffer from ‘prompt fatigue’,” they wrote.
Although cancer localization performance did not vary by tumor histologic type, the software did have a higher median AI score for invasive cancers than for noninvasive cancers (p < 0.01), as well as for high-grade cancers compared with low-grade cancers (p = 0.02). The software correctly localized a lower proportion of true-negative interval cancers than of interval cancers with minimal signs of malignancy or false-negative interval cancers. It also localized a higher proportion of node-positive cancers than node-negative cancers.
In addition, the authors observed higher AI scores in false-negative cases compared with normal mammograms.
“However, none of the other cancer characteristics (invasive versus noninvasive or high-grade versus low-grade cancer) was associated with a median score above 80,” the authors wrote. “A threshold might help the reader decide whether to recall a woman for supplemental screening if her screening mammogram has a high AI score but no discernible signs of malignancy.”
These cases might otherwise be inappropriately dismissed as normal, they said.
“For true-negative mammograms on which ICs are not detected by the AI system, further studies are needed to examine whether these lesions are detectable by another screening method,” the authors concluded.