Hybrid Reading Strategy For Screening Mammography Reduces Radiologist Workload

A hybrid reading strategy for screening mammography, developed by Dutch researchers and applied retrospectively to more than 40,000 exams, reduced radiologist workload by 38% without changing recall or cancer detection rates. The study, which emphasizes AI confidence, was published in Radiology, a journal of the Radiological Society of North America (RSNA).

"Although the overall performance of state-of-the-art AI models is very high, AI sometimes makes mistakes," said Sarah D. Verboom, M.Sc., a doctoral candidate in the Department of Medical Imaging at Radboud University Medical Center in the Netherlands. "Identifying exams in which AI interpretation is unreliable is crucial to allow for and optimize use of AI models in breast cancer screening programs."

The hybrid reading strategy combines radiologist readers with stand-alone AI interpretation of cases in which the AI model performs as well as, or better than, the radiologist.

"We can achieve this performance level if the AI model provides not only an assessment of the probability of malignancy (PoM) for a case but also a rating of its certainty of that assessment," Verboom said. "Unfortunately, the PoM itself is not always a good predictor of certainty because deep neural networks tend to be overconfident in their predictions."

To develop and evaluate a hybrid reading strategy, the researchers used a dataset of 41,469 screening mammography exams from 15,522 women (median age 59 years) with 332 screen-detected cancers and 34 interval cancers. The exams were performed between 2003 and 2018 in Utrecht, Netherlands, as part of the Dutch National Breast Cancer Screening Program.

The dataset was divided at the patient level into two equal groups with identical cancer detection, recall and interval cancer rates. The first group was used to determine the optimal thresholds for the hybrid reading strategy, while the second group was used to evaluate the reading strategies.
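
A patient-level split of this kind can be illustrated with a minimal sketch. The record layout (`patient_id` key), the function name and the random seed are assumptions made for illustration, not details from the study. The sketch only guarantees that all exams from one woman land in the same group; it does not reproduce the matching of cancer detection, recall and interval cancer rates that the researchers report.

```python
import random


def split_by_patient(exams, seed=0):
    """Split exam records into two groups so that no patient appears in both.

    `exams` is a list of dicts with a hypothetical "patient_id" key.
    Patients (not exams) are shuffled and divided in half.
    """
    patient_ids = sorted({exam["patient_id"] for exam in exams})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patient_ids)

    half = len(patient_ids) // 2
    development_ids = set(patient_ids[:half])

    development = [e for e in exams if e["patient_id"] in development_ids]
    evaluation = [e for e in exams if e["patient_id"] not in development_ids]
    return development, evaluation


# Toy data: four women, some with more than one screening exam.
toy_exams = [
    {"patient_id": "A", "year": 2005},
    {"patient_id": "A", "year": 2007},
    {"patient_id": "B", "year": 2006},
    {"patient_id": "C", "year": 2010},
    {"patient_id": "C", "year": 2012},
    {"patient_id": "D", "year": 2015},
]
development_set, evaluation_set = split_by_patient(toy_exams)
```

Splitting by patient rather than by exam prevents repeat exams from the same woman from appearing in both the threshold-setting group and the evaluation group.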

Of the uncertainty metrics evaluated by the researchers, the entropy of the mean PoM score of the most suspicious region produced a cancer detection rate of 6.6 per 1,000 cases and a recall rate of 23.7 per 1,000 cases, similar to rates of standard double-reading by radiologists.
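
As a rough illustration of what an entropy-based uncertainty score looks like, the sketch below computes the binary Shannon entropy of an averaged PoM. The article does not say which scores are averaged, so the list of per-region scores here (for example, from several model outputs) is an assumption, as are the function names.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a yes/no outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a probability of exactly 0 or 1 carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_of_mean_pom(pom_scores):
    """Average several PoM scores for one region, then take the entropy.

    `pom_scores` is a hypothetical list of probabilities for the most
    suspicious region; how the study obtains them is not described here.
    """
    mean_pom = sum(pom_scores) / len(pom_scores)
    return binary_entropy(mean_pom)


low_uncertainty = entropy_of_mean_pom([0.01, 0.03])   # mean far from 0.5
high_uncertainty = entropy_of_mean_pom([0.40, 0.60])  # mean at 0.5
```

Entropy is highest when the averaged probability sits at 0.5 and falls toward zero as the probability approaches 0 or 1, which is why it can serve as an uncertainty score.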

The final hybrid reading strategy involved AI evaluating each screening mammogram to produce two outputs: the PoM and an uncertainty estimate of that prediction. When AI determined with certainty that the PoM was below the established threshold, the case was considered normal. When AI detected a PoM above the established threshold, women were recalled for further testing, but only when that prediction was deemed confident. Otherwise, the exam was double-read by radiologists.
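
The decision rule described above can be sketched as a small triage function. The names, the use of a single uncertainty threshold for both normal and recall decisions, the handling of a PoM exactly at the threshold, and the example threshold values are all assumptions for illustration; the study's actual thresholds are not given in this article.

```python
from enum import Enum


class Decision(Enum):
    AI_NORMAL = "considered normal by AI alone"
    AI_RECALL = "recalled by AI alone"
    DOUBLE_READ = "double-read by radiologists"


def triage(pom, uncertainty, pom_threshold, uncertainty_threshold):
    """Route one exam according to the hybrid strategy.

    Assumption: one uncertainty threshold applies to both outcomes, and a
    PoM equal to the threshold counts as "above".
    """
    if uncertainty > uncertainty_threshold:
        # AI is not confident: defer to human readers.
        return Decision.DOUBLE_READ
    if pom >= pom_threshold:
        return Decision.AI_RECALL
    return Decision.AI_NORMAL


# Hypothetical thresholds, chosen only to exercise the three branches.
POM_T, UNC_T = 0.5, 0.3
confident_normal = triage(0.02, 0.10, POM_T, UNC_T)
confident_recall = triage(0.95, 0.20, POM_T, UNC_T)
uncertain_case = triage(0.45, 0.90, POM_T, UNC_T)
```

Checking uncertainty first means the PoM threshold only ever decides cases the model is confident about, which matches the "otherwise, double-read" fallback in the description.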

Although the majority of AI decisions were uncertain and deferred to a human reader, 38% were classified as certain and could be read solely by AI. Using the researchers' strategy reduced radiologist reading workload to 61.9% without changing recall (23.6‰ vs 23.9‰) or cancer detection (6.6‰ vs 6.7‰) rates, both of which are comparable to those of standard double-reading.

When the AI model was certain, the area under the curve (AUC) was higher (0.96 vs 0.87). Its sensitivity almost matched that of double radiologist reading (85.4% vs 88.9%). Younger women with dense breasts were more likely to have an uncertain AI score.

"The key point of our study isn't necessarily that this is the best way to split the workload, but that it's helpful to have uncertainty quantification built into AI models," Verboom said. "I hope commercial products integrate this into their models, because I think it's a very useful metric."

Verboom noted that if the study results occurred in clinical practice, the decision to recall 19% of women would be made by AI without the intervention of a radiologist.

"Several studies have shown that women participating in breast cancer screening programs have positive attitudes about the use of AI," she said. "However, most women prefer their mammogram to be read by at least one radiologist."

She said it may be more acceptable for radiologists to review exams deemed uncertain by AI, as well as AI recall cases.

"The use of AI with uncertainty quantification can be a possible solution for workforce shortages and could help build trust in the implementation of AI," Verboom said.

Verboom said further research, ideally a prospective trial, is needed to determine how the workload reduction achieved by the hybrid reading strategy could change radiologist reading time.

"I think in the future, we could get to a point where a portion of women are sent home without ever having a radiologist look at their mammogram because AI will determine that their exam is normal," she said. "We're not there yet, but I think we could get there with this uncertainty metric and quality control."

This study is part of the aiREAD project, which is financed by the Dutch Research Council, Dutch Cancer Society and Health Holland.

Journal reference:

Verboom, S. D., et al. (2025) AI Should Read Mammograms Only When Confident: A Hybrid Breast Cancer Screening Reading Strategy. Radiology. doi.org/10.1148/radiol.242594.
