Running CT lung images through multiple computer-aided detection (CAD) applications delivers better results than any single algorithm alone, researchers are finding. This phenomenon is unleashing a minitrend that may be discomforting to CAD developers concerned about the success of their systems.
But they needn't worry, said image-analysis specialist Bram van Ginneken, PhD, from the Netherlands. Using multiple CAD applications on the same datasets could actually result in more opportunities for more software vendors to contribute to and profit from better systems, van Ginneken said in a presentation at the 2010 Computer Assisted Radiology and Surgery (CARS) meeting in Geneva.
By 2020, projections show that four of the top seven killers -- lung cancer, chronic obstructive pulmonary disease (COPD), lower respiratory infection, and tuberculosis -- will be lung diseases, van Ginneken said.
The large numbers of patients involved and the seriousness of lung disease in public health all but ensure that CAD will play an important role in many areas, including:
- Detection: nodules, pulmonary embolism
- Quantification: nodule volumetry, emphysema
- Diagnosis: benign/malignant classification, differential diagnosis of interstitial lung disease
- Phenotyping and clinical research support
But of all the emerging needs for CAD analysis of lung CT scans, nodule detection is by far the most vital because lung cancer is the leading cause of cancer-related deaths, van Ginneken said. The problem is that the performance of individual software applications is subpar.
"What is the main limitation of CAD software? The software is not good enough," said van Ginneken, who is an associate professor of radiology at the Radboud University Nijmegen in the Netherlands. "If the software works very well, I'm sure radiologists will use it in their practice."
Multiple studies, middling results
The new millennium has seen enormous interest and large numbers of studies published in the field of lung nodule detection with CAD schemes developed in academia, and several applications are already on the market, he said. And because their performance varies so widely, researchers have sought to compare the various techniques to each other.
By doing so "we can not only look at which system works best, we can see what happens when we start to combine various systems," he said. "Why should I only use one system? If I start to combine them I get better results, and performance is a crucial issue."
Toward that end, the Automatic Nodule Detection study (ANODE09), one of the first major efforts to compare the performance of various CAD systems for lung nodule detection, was presented at the 2009 Society of Photo-Optical Instrumentation Engineers (SPIE) meeting. Van Ginneken led the ANODE09 Internet challenge, in which interested developers tested their CAD algorithms on a series of selected cases from the Nederlands-Leuvens Longkanker Screenings Onderzoek (NELSON) lung cancer screening trial.
In his CARS presentation, van Ginneken discussed those results, to be published in Medical Image Analysis (December 2010, Vol. 14:6, pp. 707-722), stating that they unequivocally show the superiority of using multiple CAD systems.
ANODE09 data were open to developers willing to process a series of CT scans from the NELSON trial (Utrecht University, the Netherlands), which acquired low-dose noncontrast CT scans of the thorax at 16 x 0.75 mm, with images reconstructed at 1 mm.
The dataset consisted of 55 anonymized CT scans from the trial, five of which were used to train the algorithms, with the remaining 50 used for testing. Participants were invited to download the CT data and upload the locations of nodules and their conspicuity. There was no public reference standard, and all results were evaluated with the same protocol, van Ginneken said.
To combine very different systems, the researchers came up with a fairly simple blending method based on two factors: the degree of suspicion that a finding was actually a true lung nodule, and performance characteristics of the software.
Internally, every finding has a degree of suspicion (DOS), van Ginneken said. Thresholding the results of multiple systems together based on this degree of suspicion "sweeps out the [free-response receiver operating characteristic (FROC)] curve, and translates DOS to a probability that a finding with at least [that] DOS is a true positive," van Ginneken said.
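The translation from a degree of suspicion to a probability can be illustrated with a minimal Python sketch. It is not the study's code. It assumes a set of findings with known truth is available for each system, and it estimates the probability as the fraction of findings at or above a given degree of suspicion that turned out to be true positives. The function name `make_suspicion_to_probability` and the input layout are hypothetical.

```python
from bisect import bisect_left


def make_suspicion_to_probability(scored_findings):
    """Build a converter from degree of suspicion to a probability.

    scored_findings: list of (suspicion, is_true_positive) pairs from
    scans with known truth (an assumption of this sketch).
    The returned function gives the fraction of findings with at least
    the given suspicion that were true positives.
    """
    if not scored_findings:
        raise ValueError("need at least one scored finding")

    ordered = sorted(scored_findings, key=lambda f: f[0])
    suspicions = [s for s, _ in ordered]

    # suffix_tp[i] = number of true positives among ordered[i:]
    suffix_tp = [0] * (len(ordered) + 1)
    for i in range(len(ordered) - 1, -1, -1):
        suffix_tp[i] = suffix_tp[i + 1] + (1 if ordered[i][1] else 0)

    def to_probability(suspicion):
        # First index whose suspicion is >= the query value.
        i = bisect_left(suspicions, suspicion)
        # Above the highest observed value, reuse the top finding's estimate.
        i = min(i, len(ordered) - 1)
        return suffix_tp[i] / (len(ordered) - i)

    return to_probability


# Hypothetical example data, not taken from ANODE09.
example = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
to_probability = make_suspicion_to_probability(example)
print(to_probability(0.4))  # 3 of the 4 findings at or above 0.4 are true
```

Because each system gets its own converter, a weak system's high suspicion values map to lower probabilities than a strong system's, which puts the outputs of very different systems on a common scale.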
In other words, if two CAD systems have a high degree of suspicion that a finding is positive, chances are higher that it is, and the method boosts its probability. In the study, each finding from one CAD system was merged with nearby findings from the other systems, and the combined result was based on the average of their probabilities. If a system had no finding nearby, its probability for that location was set to zero, he said.
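The merging and averaging step might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the published method: the 5-mm merge radius, the greedy clustering, and the name `blend_findings` are all hypothetical, and each finding is assumed to already carry a probability rather than a raw degree of suspicion.

```python
import math


def blend_findings(per_system, merge_radius_mm=5.0):
    """Blend the findings of several CAD systems for one scan.

    per_system: dict mapping a system name to a list of
    (x, y, z, probability) tuples, coordinates in millimeters.
    merge_radius_mm is an assumed value; the article does not give one.
    Returns (x, y, z, blended_probability) tuples, most suspicious first.
    """
    n_systems = len(per_system)
    clusters = []  # each cluster: {"pos": (x, y, z), "probs": {system: p}}

    for system, findings in per_system.items():
        for x, y, z, p in findings:
            for cluster in clusters:
                if math.dist(cluster["pos"], (x, y, z)) <= merge_radius_mm:
                    # Keep one probability per system per location.
                    previous = cluster["probs"].get(system, 0.0)
                    cluster["probs"][system] = max(p, previous)
                    break
            else:
                clusters.append({"pos": (x, y, z), "probs": {system: p}})

    blended = []
    for cluster in clusters:
        # Systems with no nearby finding contribute zero to the average.
        average = sum(cluster["probs"].values()) / n_systems
        blended.append((*cluster["pos"], average))
    return sorted(blended, key=lambda f: f[3], reverse=True)


# Hypothetical findings from two systems on one scan.
scan = {
    "A": [(10.0, 10.0, 10.0, 0.8), (50.0, 50.0, 50.0, 0.4)],
    "B": [(11.0, 10.0, 10.0, 0.6)],
}
for x, y, z, p in blend_findings(scan):
    print((x, y, z), round(p, 2))
```

In this toy case the location reported by both systems ends up with a blended probability of about 0.7, while the location only system A reported drops to about 0.2, which is the boosting effect described above.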
The blending method uses only the findings (based on location coordinates and degree of suspicion for each finding) and information about the performance of individual systems, the authors wrote in Medical Image Analysis.
"It uses this performance information in such a way that systems with better performance are implicitly weighed more heavily in the combination," van Ginneken and colleagues wrote. "Without such knowledge, making a proper combination of systems with widely different performance levels is difficult."
The results showed considerable variation in the overall scores between the six CAD applications, labeled A through F. System E clearly outperformed the others, while the results for the different classes of nodules revealed more subtle differences between the systems. For example, one application scored much better for detecting larger nodules compared to smaller ones, while two other systems were superior for detecting smaller nodules.
"In general, isolated nodules seem easier to detect than perifissural and vascular nodules, and pleural nodules are the hardest," the authors wrote. "But for some systems, this general trend does not hold."
Better together
Combinations of any two software applications generally improved the results. For example, combining systems B and C, with individual scores of 0.291 and 0.254, yielded a system score of 0.437, an increase of 0.146 compared to B alone. An even larger improvement was obtained by combining systems C and D, which produced a score of 0.471.
"The results of this system ... [demonstrate] that for some categories of nodules performance almost doubles," the team wrote.
Combining all six systems produced an overall score of 0.685, even better than the 0.632 score obtained for the best-performing system (E) alone. Moreover, combining all the systems except E also delivered better performance than system E alone.
To be fair, van Ginneken said, it must be noted that system E benefited from a large training set that originated from the same NELSON lung cancer screening trial. Nevertheless, the results show something far more important than the "good" or "bad" performance of any individual CAD system, he said. Van Ginneken also noted that the blending method used in the study, while functional, needs to be optimized.
"The real value of this study lies in the demonstration that the combination of systems yields such spectacular improvements," the authors wrote. "As we noted, the methods have different strengths and weaknesses. The effect of combining systems reveals how complementary they are."
For example, system F delivered mediocre overall performance, and adding it to the best-performing system (E) led only to minor improvements in the score (from 0.632 to 0.634). However, when performance from all the systems was combined, leaving out system F resulted in decreased performance, from 0.685 to 0.668.
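The score comparisons reported above can be checked with a few lines of Python. The sketch uses only the scores quoted in the article; the dictionary keys and the `gain` helper are illustrative names. System D's individual score is not given in the article, so the C and D combination is compared with C alone.

```python
# Scores as quoted in the article.
scores = {
    "B": 0.291,
    "C": 0.254,
    "E": 0.632,
    "B+C": 0.437,
    "C+D": 0.471,
    "E+F": 0.634,
    "all six": 0.685,
    "all but F": 0.668,
}


def gain(combined, baseline):
    """Absolute score difference between a combination and a baseline."""
    return round(scores[combined] - scores[baseline], 3)


print(gain("B+C", "B"))              # 0.146
print(gain("C+D", "C"))              # 0.217
print(gain("E+F", "E"))              # 0.002
print(gain("all six", "E"))          # 0.053
print(gain("all six", "all but F"))  # 0.017
```

The last two lines show the point about complementarity: system F adds only 0.002 when paired with E, yet removing it from the six-system blend costs 0.017.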
Combining the findings of different CAD applications is a powerful way to improve their performance, the group concluded. By combining six CAD algorithms, the system was able to detect 80% of all lung nodules with only two false-positive detections per scan, and 65% of all nodules with only 0.5 false positives per scan.
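Operating points such as "80% of nodules at two false positives per scan" are read off an FROC curve. The Python sketch below shows one simple way such a point could be computed. It is not the ANODE09 evaluation protocol: the function name, the pooled input format, and the example data are assumptions, and each true nodule is assumed to appear at most once in the list.

```python
def sensitivity_at_fp_rate(findings, n_nodules, n_scans, max_fp_per_scan):
    """Best sensitivity reachable within a false-positive budget.

    findings: (probability, is_true_nodule) pairs pooled over all scans.
    Lowers the threshold step by step and returns the highest fraction of
    nodules detected while false positives per scan stay within budget.
    """
    ordered = sorted(findings, key=lambda f: f[0], reverse=True)
    best = 0.0
    true_pos = false_pos = 0
    i = 0
    while i < len(ordered):
        # Findings sharing a probability are accepted together,
        # because a threshold cannot separate them.
        j = i
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            if ordered[j][1]:
                true_pos += 1
            else:
                false_pos += 1
            j += 1
        if false_pos / n_scans > max_fp_per_scan:
            break  # lowering the threshold further only adds false positives
        best = true_pos / n_nodules
        i = j
    return best


# Hypothetical pooled findings from 2 scans containing 5 nodules in total.
pooled = [
    (0.9, True), (0.8, False), (0.7, True), (0.6, True),
    (0.5, False), (0.4, False), (0.3, True),
]
print(sensitivity_at_fp_rate(pooled, n_nodules=5, n_scans=2, max_fp_per_scan=0.5))
print(sensitivity_at_fp_rate(pooled, n_nodules=5, n_scans=2, max_fp_per_scan=2.0))
```

With this toy data the function returns 0.6 at a budget of 0.5 false positives per scan and 0.8 at two per scan. The toy data were picked to show the same kind of tradeoff as the study's operating points and are not its results.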
The results not only boost the possibility of finding lung cancer, they make CAD more interesting for developers, van Ginneken said in his presentation.
"Suppose you have a CAD system and you compare it to a state-of-the-art, commercially available system that has been investigated" in multiple studies, he said in his talk. "You try to publish a paper about it and [journal] reviewers would say it's not so interesting -- not state-of-the-art."
But it may be that combining your system with the well-tested commercial system ends up providing "something that's actually better than state-of-the-art -- so even if your system is not No. 1, it may still improve No. 1," van Ginneken said.
When you think about it, he added, it doesn't make a lot of sense to think that any single company should provide the "best" solution to one of the most difficult problems in medical imaging -- and lung nodule detection is even harder than mammography.
Another, related study has come to the same conclusion about combining lung nodule CAD schemes. In a paper published online earlier this month, researchers including van Ginneken combined data from ANODE09 with a second online study, ROC09, to examine multiple CAD systems for lung nodule detection. Once again, the authors concluded that combining the results of multiple CAD systems "provides a large and significant increase in performance when compared to the best individual CAD system" (IEEE Transactions on Medical Imaging, September 2, 2010).
On to nodule registration
Registration of CT images is another key area of research related to lung nodule assessment, van Ginneken said. For example, to determine if a nodule has grown in volume over time, radiologists need to compare nodules from different CT scans quickly and accurately.
A session at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference being held this week in Beijing will address this issue, van Ginneken said. Similar to the ANODE09 challenge, Evaluation of Methods for Pulmonary Image Registration 2010 (EMPIRE 10) will allow developers to test their best registration algorithms. Once again the organizers hope to advance the state of the art based on collaboration between some of the best minds in medical imaging analysis.
"Enthusiastic computer scientists tackled the most difficult problems and came up with algorithms that were not very good yet," van Ginneken said. "So the challenges really are about collaborating and not about competing."
By Eric Barnes
AuntMinnie.com staff writer
September 21, 2010