Autonomous Chest Radiograph Reporting Using AI: Estimation of Clinical Impact

Published Online: https://doi.org/10.1148/radiol.222268

In a multicenter retrospective study of 1529 consecutive patients, reporting of 28% of normal posteroanterior chest radiographs, or 7.8% of all radiographs, could potentially have been safely automated by an artificial intelligence tool.

Background

Automated interpretation of normal chest radiographs could alleviate the workload of radiologists. However, the performance of such an artificial intelligence (AI) tool compared with clinical radiology reports has not been established.

Purpose

To perform an external evaluation of a commercially available AI tool with respect to (a) the number of chest radiographs reported autonomously, (b) the sensitivity of AI for the detection of abnormal chest radiographs, and (c) the performance of AI compared with that of clinical radiology reports.

Materials and Methods

In this retrospective study, consecutive posteroanterior chest radiographs obtained in January 2020 from adult patients at four hospitals in the capital region of Denmark were included, comprising images from emergency department patients, in-hospital patients, and outpatients. To establish the reference standard, three thoracic radiologists classified each chest radiograph on the basis of its findings into one of the following categories: critical, other remarkable, unremarkable, or normal (no abnormalities). The AI tool classified each chest radiograph as either high-confidence normal (normal) or not high-confidence normal (abnormal).
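To make the comparison concrete, the following minimal sketch in R (the statistical software cited by the study) shows how the four reference-standard categories collapse into the binary normal-versus-abnormal labels compared against the AI output. The collapsing rule, that every category other than normal counts as abnormal, is an assumption consistent with the reported counts (1100 abnormal plus 429 normal equals all 1529 patients), and the example rows are hypothetical, not study data.

    ## Collapse the four reference-standard categories into the binary
    ## labels compared against the AI output. Assumption: any category
    ## other than "normal" counts as abnormal (1100 + 429 = 1529).
    to_binary <- function(category) {
      ifelse(category == "normal", "normal", "abnormal")
    }

    ## Hypothetical example rows, not study data
    cases <- data.frame(
      reference = c("critical", "other remarkable", "unremarkable", "normal"),
      ai        = c("abnormal", "abnormal", "abnormal", "normal")
    )
    cases$reference_binary <- to_binary(cases$reference)

    ## 2 x 2 contingency table of AI output vs binarized reference standard
    table(AI = cases$ai, Reference = cases$reference_binary)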

Results

A total of 1529 patients were included in the analysis (median age, 69 years [IQR, 55–69 years]; 776 women). According to the reference standard, 1100 patients (72%) had abnormal radiographs, 617 (40%) had critical abnormal radiographs, and 429 (28%) had normal radiographs. For comparison, clinical radiology reports were classified on the basis of the report text, with insufficient reports excluded (n = 22). The sensitivity of AI was 99.1% (95% CI: 98.3, 99.6; 1090 of 1100 patients) for abnormal radiographs and 99.8% (95% CI: 99.1, 99.9; 616 of 617 patients) for critical radiographs. The corresponding sensitivities of the clinical radiology reports were 72.3% (95% CI: 69.5, 74.9; 779 of 1078 patients) and 93.5% (95% CI: 91.2, 95.3; 558 of 597 patients), respectively. The specificity of AI, and hence the potential autonomous reporting rate, was 28.0% of normal posteroanterior chest radiographs (95% CI: 23.8, 32.5; 120 of 429 patients), corresponding to 7.8% (120 of 1529 patients) of all posteroanterior chest radiographs.
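The reported proportions can be reproduced from the stated numerators and denominators. The abstract does not name the confidence interval method; the sketch below assumes exact (Clopper-Pearson) binomial intervals, the default of R's binom.test, which match the headline AI values after rounding.

    ## Reproduce a reported proportion and its 95% CI from the stated
    ## counts. Assumption: exact (Clopper-Pearson) intervals, the
    ## default of binom.test(); the abstract does not state the method.
    report <- function(label, x, n) {
      ci <- binom.test(x, n)$conf.int
      sprintf("%s: %.1f%% (95%% CI: %.1f, %.1f; %d of %d)",
              label, 100 * x / n, 100 * ci[1], 100 * ci[2], x, n)
    }

    report("AI sensitivity, abnormal", 1090, 1100)  # reported: 99.1% (98.3, 99.6)
    report("AI specificity, normal",    120,  429)  # reported: 28.0% (23.8, 32.5)

    ## Autonomous reporting rate among all radiographs: 120 of 1529
    sprintf("Overall autonomous rate: %.1f%%", 100 * 120 / 1529)  # 7.8%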

Conclusion

Of all normal posteroanterior chest radiographs, 28% were autonomously reported by AI, with a sensitivity for any abnormality greater than 99%. This corresponded to 7.8% of the entire posteroanterior chest radiograph volume.

© RSNA, 2023

Supplemental material is available for this article.

See also the editorial by Park in this issue.

Article History

Received: September 8, 2022
Revision requested: November 1, 2022
Revision received: January 5, 2023
Accepted: January 18, 2023
Published online: March 7, 2023