CAD can't outpoint second human reader in chest radiography

Apr 13, 2014

While computer-aided detection (CAD) software can improve lung nodule detection on chest radiographs, the use of CAD as a second reader can't yet outperform double reading by two human readers, Dutch researchers have reported.

In a study involving 12 radiologists and 300 chest radiographs, a group led by Dr. Steven Schalekamp from Radboud University Medical Center in Nijmegen found that CAD increased mean reader sensitivity for lung nodules from 64% with single reading to an average of 67.8%. However, the combination of two readers yielded an average sensitivity of 73.1%.

"CAD as an independent second reader also significantly improves observer performance, although does not yield the result of combining the readings of two humans," he said. He presented the findings during a scientific session at ECR 2014 in Vienna.

Schalekamp noted that lung cancer is the most frequent and most deadly cancer worldwide, and that patient survival rates correlate strongly with catching the disease early. While chest radiography is the most common radiological exam and an important diagnostic tool for lung cancer, studies in the literature report that up to 26% of visible lung cancers are missed on these exams, he said.

While double reading has been shown to improve detection performance in mammography and chest radiography, the use of a second reader is costly and time-consuming.

"This might be the reason that it's never been considered for chest radiography," he said.

Better than double reading?

As a result, the researchers sought to assess whether a CAD system could replace a second human reader in a double-reading process. Using image data from a previous observer study, the team included 300 digital chest radiographs, 111 of which had a CT-proven solitary visible lung nodule. Nodules ranged in conspicuity from obvious to very subtle and had an average size of 16.2 mm, he said.

The remaining 189 were normal studies. Bone-suppressed images (via version 2.4 of Riverain Technologies' ClearRead software) were also provided. Six radiologists and six residents were asked to independently read the radiographs and the bone-suppressed images and provide localization and scoring of suspicious lesions on a scale from 1 to 100.

CAD software (ClearRead + Detect 5.2) was then applied to the same images. Designed to detect nodules between 9 mm and 30 mm, the software produces CAD marks with probability scores ranging from 1 to 100.

To simulate a double-reading process, the researchers averaged the scores of two readers for lesions identified within 1.5 cm of the same location. This was performed for all combinations of the 12 observers. In addition, CAD took the place of the second human reader to form a further combination with each observer.
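The score-averaging step can be illustrated with a minimal Python sketch. It is not the researchers' code. The mark format (x and y in millimeters plus a score), the function name combine_marks, the nearest-neighbor matching, and the choice to average an unmatched mark with a score of 0 are all assumptions made for illustration; the presentation specified only the 1.5-cm matching distance.

```python
from math import hypot

def combine_marks(reader_a, reader_b, radius_mm=15.0, missing_score=0.0):
    """Simulate double reading by averaging scores of nearby marks.

    Each mark is a tuple (x_mm, y_mm, score). Marks from the two readers
    that lie within radius_mm of each other are merged and their scores
    averaged. ASSUMPTION: a mark with no partner is averaged with
    missing_score, since the source does not say how these were handled.
    """
    combined = []
    unmatched_b = list(reader_b)
    for xa, ya, sa in reader_a:
        # Find the nearest still-unmatched mark of reader B within the radius.
        best, best_dist = None, radius_mm
        for mark in unmatched_b:
            dist = hypot(mark[0] - xa, mark[1] - ya)
            if dist <= best_dist:
                best, best_dist = mark, dist
        if best is not None:
            unmatched_b.remove(best)
            xb, yb, sb = best
            combined.append(((xa + xb) / 2, (ya + yb) / 2, (sa + sb) / 2))
        else:
            combined.append((xa, ya, (sa + missing_score) / 2))
    # Marks only reader B made.
    for xb, yb, sb in unmatched_b:
        combined.append((xb, yb, (sb + missing_score) / 2))
    return combined

reader_1 = [(100.0, 100.0, 80.0)]
reader_2 = [(105.0, 100.0, 60.0), (300.0, 300.0, 40.0)]
print(combine_marks(reader_1, reader_2))
```

To pair a reader with CAD under the same assumptions, the CAD marks would simply be passed as the second argument.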

For each combination of interpretations, the researchers performed location-based receiver operating characteristic (ROC) analysis using one score per image. False-positives were disregarded for abnormal images, and the highest-scoring false-positive location was used for normal images.
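The one-score-per-image rule might look like the following Python sketch. The function name image_score, the mark format, the 15-mm hit radius around the nodule, and the score of 0 for an image with no qualifying mark are assumptions for illustration; the presentation did not specify them.

```python
from math import hypot

def image_score(marks, nodule=None, hit_radius_mm=15.0):
    """Reduce the marks on one image to a single score for ROC analysis.

    marks:  list of (x_mm, y_mm, score) tuples.
    nodule: (x_mm, y_mm) of the proven nodule, or None for a normal image.
    ASSUMPTION: a mark counts as a hit if it lies within hit_radius_mm
    of the nodule, and an image without a qualifying mark scores 0.
    """
    if nodule is None:
        # Normal image: every mark is a false positive; keep the highest.
        return max((score for _, _, score in marks), default=0.0)
    # Abnormal image: disregard false positives, keep the best true hit.
    hits = [
        score
        for x, y, score in marks
        if hypot(x - nodule[0], y - nodule[1]) <= hit_radius_mm
    ]
    return max(hits, default=0.0)

marks = [(10.0, 10.0, 30.0), (200.0, 200.0, 90.0)]
print(image_score(marks, nodule=(12.0, 10.0)))  # abnormal image
print(image_score(marks))                       # normal image
```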

They also assessed mean sensitivity in a clinically relevant high specificity range of 80% to 100%. To determine double-reading performance, the researchers averaged all possible combinations with other readers, he said.
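One way to compute a mean sensitivity over a specificity range is to average the height of the empirical ROC curve between false-positive fractions of 0 and 0.2. The Python sketch below does this with straight-line interpolation between ROC points. The function name mean_sensitivity and the interpolation choice are assumptions; the researchers' exact method was not described.

```python
def mean_sensitivity(normal_scores, abnormal_scores, min_specificity=0.8):
    """Mean sensitivity of the empirical ROC curve for specificity
    between min_specificity and 1.

    Inputs are one score per image. ASSUMPTION: ROC points are joined
    by straight lines (trapezoidal rule); other fits would give
    somewhat different values.
    """
    max_fpr = 1.0 - min_specificity
    thresholds = sorted(set(normal_scores) | set(abnormal_scores), reverse=True)

    # Build ROC points (false-positive fraction, sensitivity), from (0, 0).
    # An image is called positive when its score is >= the threshold.
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(s >= t for s in normal_scores) / len(normal_scores)
        tpr = sum(s >= t for s in abnormal_scores) / len(abnormal_scores)
        points.append((fpr, tpr))

    # Integrate the curve from fpr = 0 up to max_fpr.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area / max_fpr

normal = [0.0] * 10
abnormal = [50.0, 50.0, 0.0, 0.0]
print(mean_sensitivity(normal, abnormal))
```

Dividing the partial area by the width of the range turns it into an average sensitivity, which is easier to compare with the single-threshold sensitivities quoted elsewhere.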

High specificity comes with low sensitivity

Standalone CAD reached a sensitivity of 81% at 1.9 false-positives per image on the dataset. However, its mean sensitivity dropped to 35% when restricted to a specificity of 80% or higher.

"That is still far less than the mean sensitivity of the single readers and also worse than the worst observer," he said.

	Minimum sensitivity	Maximum sensitivity	Mean sensitivity
Single reading	45.5%	78.2%	64%
Double reading	58.3%	83.8%	73%
Single reading + CAD	58.1%	81.1%	68%

The combination of single readers with CAD increased sensitivity over single reading alone (p = 0.02), but the mean sensitivity of double reading was higher still (p = 0.001).

Chart shows the mean sensitivity of each reader in a clinically relevant high specificity range (80% to 100%). All charts courtesy of Dr. Steven Schalekamp.

In general, the combination of two observers (double reading) was superior to a single observer and to a combination of observer and CAD. In addition, the combination of the human observer and CAD was significantly better than the human observer alone, Schalekamp said.

However, one observer had a drop in performance with double reading, while three observers experienced a decline in sensitivity when their interpretations were combined with CAD, he said.

"The best observer benefited more from being combined with CAD than with other observers," Schalekamp told AuntMinnieEurope.com.

Chart shows the mean sensitivity of each reader in a clinically relevant high specificity range (80% to 100%), but pairs each reader with the worst of the 12 observers.

Schalekamp also noted that because the range of observer performances is large, double-reading performance is very dependent on the quality of the two observers being combined.

"Although the CAD standalone performance is worse than the worst observer (mean sensitivity of 35% versus 46%), the combination of CAD and observer is superior to the combination with a weak observer," he said.