What's the sweet spot for batch reading mammograms?

May 10, 2016


What's the right number of cases to interpret when batch reading mammography studies? Too few can affect productivity, while too many can risk burning out your radiologists. Fortunately, researchers provide some guidance in a new study published in the May 10 issue of the Journal of the American Medical Association.

Let's face it: Interpreting screening mammograms can be repetitive work. And the longer a person performs the task in a single sitting, the higher the risk of missing something crucial, a phenomenon that's known to occur in jobs such as assembly line inspection, airport baggage screening, and military drone operation.


That's why U.K. researchers explored whether batch reading of mammograms lowered cancer detection performance and, if so, whether the effect could be countered by a double-read protocol in which the two readers interpret the batch in opposite orders. They also examined how many cases make up an optimal batch.

"We investigated whether ... changing the order in which the two experts examined the batch of mammograms could increase the cancer detection rate, through readers' experiencing peak vigilance at differing points within the reading batch," wrote the team led by Sian Taylor-Phillips, PhD, from the University of Warwick in Coventry, England.

Decreasing vigilance?

For the study, the researchers used data collected over the course of a year from 46 breast screening centers in England's National Health Service (NHS) Breast Screening Program. The study included 1.2 million screening mammograms read by 360 radiologists over the study time frame. In the U.K., two readers examine each mammogram independently (JAMA, May 10, 2016, Vol. 315:18, pp. 1956-1965).

Half of these cases were placed in an intervention group, in which the two interpreting radiologists reviewed the mammograms in opposite orders, with one reading the batch forward and the other reading it in reverse (both radiologists had the opportunity to read forward and in reverse). In the control group, the two radiologists read the batch in the same order as each other, again with the opportunity to read both forward and in reverse.
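To make the two arms concrete, here is a minimal sketch of where each case falls in each reader's sequence. The function name, the 1-based case numbering, and the fixed batch of 35 (the median size reported in the study) are assumptions for illustration only; the article does not describe how the screening centers actually ordered cases in software.

```python
def reading_positions(batch_size, reverse=False):
    """Map each case number (1-based) to the position at which a reader sees it.

    Hypothetical helper: 'reverse=True' models a reader working through
    the batch from the last case to the first.
    """
    if reverse:
        order = range(batch_size, 0, -1)
    else:
        order = range(1, batch_size + 1)
    return {case: position for position, case in enumerate(order, start=1)}


BATCH_SIZE = 35  # median batch size reported in the study

# Intervention arm: one reader goes forward, the other in reverse.
reader_a = reading_positions(BATCH_SIZE, reverse=False)
reader_b = reading_positions(BATCH_SIZE, reverse=True)

# Control arm: both readers use the same order.
control_a = reading_positions(BATCH_SIZE, reverse=False)
control_b = reading_positions(BATCH_SIZE, reverse=False)

for case in (1, 18, 35):
    print(case, reader_a[case], reader_b[case])
```

In the intervention arm, a case that one reader sees early is seen late by the other, and only the middle case of an odd-sized batch lands at the same position for both. In the control arm, every case sits at the same position for both readers.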

The median batch size was 35 cases, consistent with current U.K. practice, according to Taylor-Phillips and colleagues.

"The idea [in the intervention group] is that reversing the order for one reader results in high-vigilance states occurring for the two readers when examining different women's mammograms, so the cancers are detected by at least one of the experts," the team wrote.

But changing the reading order didn't have much of an effect. There was no statistically significant difference in cancer detection between the intervention and control groups. The intervention group's cancer detection rate was 0.88% (5,272 cancers out of 596,642 screenings), while the control group's rate was 0.87% (5,212 cancers out of 597,505 screenings).
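The reported percentages can be reproduced from the raw counts. The sketch below also runs a pooled two-proportion z-test as a rough check on the "no significant difference" result. That test is an assumption chosen for illustration and is not necessarily the analysis the study authors used; the function names are hypothetical.

```python
import math


def detection_rate(cancers, screenings):
    """Fraction of screenings in which cancer was detected."""
    return cancers / screenings


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts as reported in the article.
intervention_cancers, intervention_screens = 5272, 596642
control_cancers, control_screens = 5212, 597505

intervention_rate = detection_rate(intervention_cancers, intervention_screens)
control_rate = detection_rate(control_cancers, control_screens)
z, p_value = two_proportion_z_test(
    intervention_cancers, intervention_screens,
    control_cancers, control_screens,
)

print(f"intervention: {intervention_rate:.2%}")
print(f"control:      {control_rate:.2%}")
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

Under this simple test the p-value is well above the conventional 0.05 threshold, which is consistent with the article's statement that the difference was not statistically significant.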

The researchers also found that the intervention had no effect when cases were categorized into various subgroups such as the age of the women screened, the first and last five cases in the batch, and the first batch of the day for each reader.

In addition, on further analysis, Taylor-Phillips and colleagues found that the cancer detection rates for individual readers did not change with the amount of time these readers spent on task. The odds of detecting cancer on the first case were the same as for the 40th case, they wrote. And some readers found that switching up the case order had a downside, Taylor-Phillips told AuntMinnie.com via email.

"While most readers found the intervention no trouble at all, some found it inconvenient to reverse the order of their associated paperwork," she said.

Reducing recall

The study also showed that the intervention did not affect overall recall rates. However, the researchers did find that recall rates for individual readers decreased as time on task increased. Over the course of 40 cases, the mean recall rate fell from 6.4% to 4.6%, and this trend continued as batch size increased, up to about 60 cases.

"Breast screening readers seemed to get 'into the zone' and their performance improved with time on task," Taylor-Phillips said. "They recalled fewer women for further tests as they got nearer to the end of the batch, while cancer detection rates stayed constant."

These results jibe with real-world practices throughout England and should boost clinicians' confidence in batch-reading standards in that country and in other screening programs that practice similarly, according to Taylor-Phillips.

"Our main finding is actually very reassuring for clinicians, since we found no evidence that performance drops off with time on task when examining batches of around 35 mammograms," she said.

Differences with the U.S.

The results highlight both the differences and similarities between screening programs in the U.S. and U.K., according to a commentary accompanying the study by Dr. Elizabeth Burnside, Dr. Edward Sickles, and Stephen Duffy, PhD. In England, population-based screening is offered less frequently and over a narrower age range than in the U.S., resulting in fewer false positives but also fewer detected cancers.

As for similarities, both countries have higher cancer detection rates among women undergoing their first mammogram than among those undergoing subsequent mammograms. Also, recall rates are lower in both countries for women whose current exams can be compared with prior ones.

The study also reflects the desire in both countries to improve screening by applying evidence-based practice and quality improvement. The NHS has set up formal training, credentialing, and outcomes activities, supporting the performance of research such as the JAMA study for improving mammography outcomes -- which in the end could be the most important message of the new study.