Dutch researchers have determined that computer-aided detection (CAD) software they developed could serve as a second reader in breast cancer screening. The CAD algorithm delivered diagnostic performance comparable to that of residents while keeping false-positive rates low.
The study, published online on 9 July in European Radiology, was first presented at the 2012 European Congress of Radiology (ECR). Lead author Dr. Rianne Hupse, formerly from the department of radiology at Radboud University Nijmegen Medical Center, and colleagues provided more detail in the publication.
CAD software is already widely used, but most applications have been developed to mark regions suspected of containing microcalcification clusters or masses so that radiologists do not overlook abnormalities, according to the authors.
Most systems operate at a high sensitivity, but CAD's specificity is relatively low, they added.
"In general, radiologists are positive about the use of CAD for the detection of microcalcifications," the group wrote. "However, for the detection of masses radiologists have less confidence in CAD, because current systems still show a relatively large number of false positives."
An example of a study case with an invasive cancer not detected in the screening program. The cancer was detected by most radiologists in the study and by CAD. Image courtesy of Dr. Nico Karssemeijer.
Many CAD systems also target only perceptual oversights, whereas misinterpretation of suspicious regions may be a more common cause of missed malignant masses, according to the authors. Hupse and colleagues therefore looked at alternative ways to use CAD that help radiologists interpret suspicious regions instead of focusing exclusively on avoiding oversight errors.
Another goal was to see if CAD could function as a second or third reader in countries such as the Netherlands, where all screening mammograms are read independently by two radiologists.
"It is evident that the potential benefit of CAD as decision support for the detection of malignant masses depends strongly on the quality of the CAD system that is used," they wrote. "In order to be used for decision support or as [an] independent reader in screening, it would be of great importance if CAD could operate at a sensitivity and specificity comparable to that of a human reader."
To accomplish that goal, the Dutch researchers developed a CAD algorithm that has good performance with low false-positive rates. Novel features that represent normal tissue context and the similarity between the mediolateral oblique and craniocaudal view were included. The system was also trained using a combined total of 200 normal and abnormal mammograms.
The researchers compared the performance of CAD to that of nine radiologists and three residents reading 200 digital screening mammograms without CAD. The performances were computed as the true-positive fraction (TPF) at a false-positive fraction of 0.05 and 0.2.
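To illustrate the metric, the following is a minimal Python sketch of how a true-positive fraction at a fixed false-positive fraction can be read off a set of case-level suspiciousness scores. It is an empirical illustration only and not the authors' analysis method; the function name `tpf_at_fpf`, the score scale, and the toy data are assumptions made for this example.

```python
import math


def tpf_at_fpf(scores, labels, target_fpf):
    """Highest true-positive fraction reachable with FPF <= target_fpf.

    scores: suspiciousness score per case (higher = more suspicious).
    labels: 1 for a cancer case, 0 for a normal case.
    """
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one cancer case and one normal case")

    # Number of normal cases that may be flagged at this operating point
    # (small epsilon guards against floating-point round-off).
    allowed_fp = math.floor(target_fpf * len(negatives) + 1e-9)

    # A case is flagged when its score is strictly above the threshold,
    # so at most `allowed_fp` normal cases can be flagged.
    if allowed_fp < len(negatives):
        threshold = negatives[allowed_fp]
    else:
        threshold = float("-inf")

    detected = sum(1 for s in positives if s > threshold)
    return detected / len(positives)


# Hypothetical toy data: four cancer cases and five normal cases.
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_tpf = tpf_at_fpf(toy_scores, toy_labels, target_fpf=0.2)
```

Fixing the false-positive fraction first and then reading off the TPF is what allows a CAD system and human readers to be compared at the same operating point.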
At a false-positive fraction of 0.05, CAD's TPF of 0.487 did not differ significantly from the radiologists' TPF of 0.518 (p = 0.17). At a false-positive fraction of 0.2, CAD's TPF of 0.620 was significantly lower than the radiologists' TPF of 0.736 (p < 0.001). Compared with the residents, however, CAD's performance was similar at all false-positive fractions.
"For most cases (190 out of 200) prior screening mammograms were available for the readers, so that they could judge if findings were already visible in a previous screening round," the authors wrote. "This presentation is similar to screening practice and can be valuable to detect growing lesions, which are more likely to be malignant. One should take into account that the CAD system we developed did not use any information from prior mammograms."
The researchers expect CAD results will improve when temporal features are included and when the presence of microcalcifications is used.
Although the study was retrospective, the researchers said they had no indication that readers working without CAD were less vigilant than they would have been reading the same cases during screening, because the average reading time per case was 44 seconds. They acknowledged, however, that large-scale, real-life prospective studies are necessary.
Compared to radiologists, standalone CAD performed better for noncancer cases that were referred in original screening and cancer cases that were missed in original screening, they found.
"This suggests that the performance difference between standalone CAD and radiologists is very dependent on the dataset analyzed," Hupse and colleagues wrote. "Because the dataset is so different from screening practice, it is hard to translate absolute performance differences found in this study to screening practice."
If these results are verified by large-scale, prospective studies, CAD has the potential to be used as an independent reader next to the screening radiologists.
"An application of standalone CAD can be to select a small percentage of mammograms that obtained a high CAD suspiciousness score but were not referred by double reading," the authors wrote. "These mammograms could be selected for additional reading by a third radiologist. We expect that this will improve cancer detection without a large increase in false positives because our results show that part of the cancer cases missed in screening were detected at a very low false-positive level."
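The selection step the authors describe could be sketched roughly as follows in Python. This is an illustrative assumption rather than the researchers' implementation: the record fields (`id`, `cad_score`, `referred`), the function name, and the default fraction are hypothetical, and the article does not specify what percentage would be selected or of which pool.

```python
def select_for_third_reading(cases, fraction=0.02):
    """Pick the highest-scoring unreferred exams for an extra reading.

    cases: list of dicts with hypothetical keys
        'id'        - exam identifier
        'cad_score' - CAD suspiciousness score (higher = more suspicious)
        'referred'  - True if double reading already referred the exam
    fraction: share of the unreferred exams to send to a third reader
        (assumed here to be measured against unreferred exams only).
    """
    not_referred = [c for c in cases if not c["referred"]]
    # Round down so the workload never exceeds the requested share.
    n_select = int(fraction * len(not_referred) + 1e-9)
    ranked = sorted(not_referred, key=lambda c: c["cad_score"], reverse=True)
    return [c["id"] for c in ranked[:n_select]]


# Hypothetical example: five exams, one already referred by double reading.
example_cases = [
    {"id": "A", "cad_score": 0.95, "referred": True},
    {"id": "B", "cad_score": 0.90, "referred": False},
    {"id": "C", "cad_score": 0.10, "referred": False},
    {"id": "D", "cad_score": 0.75, "referred": False},
    {"id": "E", "cad_score": 0.30, "referred": False},
]
third_reader_queue = select_for_third_reading(example_cases, fraction=0.5)
```

Ranking only the exams that double reading did not refer keeps the extra workload bounded by the chosen fraction, which is the trade-off the authors point to between added detections and added false positives.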
"We are continuing the development of CAD in ongoing research projects," said co-author Dr. Nico Karssemeijer in an interview with AuntMinnieEurope.com. "One of the topics is integration of modules for detection of suspicious microcalcification clusters."