Deep Learning at Chest Radiography: Automated Classification of Pulmonary Tuberculosis by Using Convolutional Neural Networks


Deep learning with convolutional neural networks can accurately classify tuberculosis at chest radiography with an area under the curve of 0.99.


Purpose

To evaluate the efficacy of deep convolutional neural networks (DCNNs) for detecting tuberculosis (TB) on chest radiographs.

Materials and Methods

Four deidentified, HIPAA-compliant datasets, exempted from review by the institutional review board, were used in this study; together they comprised 1007 posteroanterior chest radiographs. The datasets were split into training (68.0%), validation (17.1%), and test (14.9%) sets. Two different DCNNs, AlexNet and GoogLeNet, were used to classify the images as having manifestations of pulmonary TB or as healthy. Both untrained networks and networks pretrained on ImageNet were used, as was dataset augmentation with multiple preprocessing techniques. Ensembling was performed on the best-performing algorithms. For cases in which the classifiers disagreed, an independent board-certified cardiothoracic radiologist blindly interpreted the images to evaluate a potential radiologist-augmented workflow. Model performance was assessed with receiver operating characteristic curves and areas under the curve (AUCs), with the DeLong method used for statistical comparison of receiver operating characteristic curves.


Results

The best-performing classifier, an ensemble of the AlexNet and GoogLeNet DCNNs, had an AUC of 0.99. The AUCs of the pretrained models were greater than those of the untrained models (P < .001). Augmenting the dataset further increased accuracy (P = .03 for AlexNet and P = .02 for GoogLeNet). The DCNNs disagreed in 13 of the 150 test cases; these were blindly reviewed by a cardiothoracic radiologist, who correctly interpreted all 13 (100%). This radiologist-augmented approach resulted in a sensitivity of 97.3% and a specificity of 100%.
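The radiologist-augmented workflow described above follows a simple decision rule: accept the networks' shared call when they agree, and defer disagreements to a blinded radiologist read. A minimal sketch of that rule, with hypothetical function names and a hypothetical 0.5 decision threshold (the study does not report one):

```python
def augmented_call(p_alex, p_goog, radiologist_call, threshold=0.5):
    """Return the final TB call (True = TB) for one radiograph.
    If the two classifiers agree at the given threshold, accept
    their shared call; otherwise defer to the radiologist."""
    alex_pos = p_alex >= threshold
    goog_pos = p_goog >= threshold
    if alex_pos == goog_pos:
        return alex_pos
    return radiologist_call

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fn), tn / (tn + fp)

# Agreement case: both networks call TB, no radiologist needed.
print(augmented_call(0.9, 0.8, radiologist_call=False))   # networks agree
# Disagreement case: the radiologist's blinded read decides.
print(augmented_call(0.9, 0.2, radiologist_call=True))
```

Because the adjudicated cases are exactly those in which the ensemble is least certain, this rule concentrates radiologist effort where the automated classifiers are weakest.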


Conclusion

Deep learning with DCNNs can accurately classify TB at chest radiography, with an AUC of 0.99. A radiologist-augmented approach for cases in which the classifiers disagreed further improved accuracy.

© RSNA, 2017



Article History

Received October 5, 2016; revision requested November 23; revision received December 12; accepted January 9, 2017; final version accepted January 19.
Published online: Apr 24 2017
Published in print: Aug 2017