Computer-aided Classification of Breast Masses: Performance and Interobserver Variability of Expert Radiologists versus Residents

Published Online: https://doi.org/10.1148/radiol.10081308

Our study demonstrates that a computer-aided diagnosis model can potentially provide accurate classification of breast lesions for both radiology residents and experienced breast imagers with 3–17 years of experience.

Purpose

To evaluate the interobserver variability in descriptions of breast masses by dedicated breast imagers and radiology residents and to determine how any differences in lesion description affect the performance of a computer-aided diagnosis (CAD) classification system.

Materials and Methods

Institutional review board approval was obtained for this HIPAA-compliant study, and the requirement to obtain informed consent was waived. Images of 50 breast lesions were individually interpreted by seven dedicated breast imagers and 10 radiology residents, yielding 850 lesion interpretations. Lesions were described with use of 11 descriptors from the Breast Imaging Reporting and Data System, and interobserver variability was calculated with the Cohen κ statistic. Those 11 features, along with patient age, were selected and combined by a linear discriminant analysis (LDA) classification model trained on 1005 previously existing cases. Variability in the recommendations of the computer model for different observers was also calculated with the Cohen κ statistic.
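As an illustration of the agreement measure named above, the following is a minimal sketch of the Cohen κ statistic for one pair of readers. The function name, the descriptor labels, and the six example readings are hypothetical and are not study data. The abstract does not state how κ was pooled across the 17 readers, so this sketch covers the two-reader case only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical margin descriptors for six lesions (illustrative only).
reader_1 = ["spiculated", "circumscribed", "circumscribed",
            "indistinct", "spiculated", "circumscribed"]
reader_2 = ["spiculated", "circumscribed", "indistinct",
            "indistinct", "spiculated", "spiculated"]

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

Here the readers agree on 4 of 6 lesions (observed agreement 0.67), while their label frequencies alone would produce agreement of 11/36 (about 0.31) by chance, so κ = 0.52.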

Results

A significant difference was observed for six lesion features, and radiology residents had greater interobserver variability in their selection of five of the six features than did dedicated breast imagers. The LDA model accurately classified lesions for both sets of observers (area under the receiver operating characteristic curve = 0.94 for residents and 0.96 for dedicated imagers). Sensitivity was maintained at 100% for residents and improved from 98% to 100% for dedicated breast imagers. For residents, the computer model could potentially improve the specificity from 20% to 40% (P < .01) and the κ value from 0.09 to 0.53 (P < .001). For dedicated breast imagers, the computer model could increase the specificity from 34% to 43% (P = .16) and the κ value from 0.21 to 0.61 (P < .001).
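The sensitivity and specificity figures above can be read as simple proportions over malignant and benign lesions, respectively. The sketch below shows that arithmetic on hypothetical biopsy recommendations; the function name and the ten example lesions are assumptions made for illustration and do not reproduce the study data or its operating threshold.

```python
def sensitivity_specificity(is_malignant, biopsy_recommended):
    """Return (sensitivity, specificity) for paired boolean lists.

    is_malignant[i] is the reference diagnosis for lesion i;
    biopsy_recommended[i] is the reader's or model's call for it.
    """
    pairs = list(zip(is_malignant, biopsy_recommended))
    tp = sum(1 for truth, call in pairs if truth and call)
    fn = sum(1 for truth, call in pairs if truth and not call)
    tn = sum(1 for truth, call in pairs if not truth and not call)
    fp = sum(1 for truth, call in pairs if not truth and call)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign lesion")
    # Sensitivity: malignant lesions correctly sent to biopsy.
    # Specificity: benign lesions correctly spared biopsy.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 4 malignant and 6 benign lesions.
truth = [True] * 4 + [False] * 6
reader_calls = [True] * 4 + [True, True, True, True, False, False]
model_calls = [True] * 4 + [True, True, True, False, False, False]

reader_sens, reader_spec = sensitivity_specificity(truth, reader_calls)
model_sens, model_spec = sensitivity_specificity(truth, model_calls)
print(f"reader: sensitivity={reader_sens:.2f}, specificity={reader_spec:.2f}")
print(f"model:  sensitivity={model_sens:.2f}, specificity={model_spec:.2f}")
```

In this toy case both sets of calls catch every malignant lesion, while the model spares three of six benign lesions instead of two, which is the kind of specificity gain at fixed sensitivity that the results describe.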

Conclusion

Among findings showing a significant difference, there was greater interobserver variability in lesion descriptions among residents; however, an LDA model using data from either dedicated breast imagers or residents yielded a consistently high performance in the differentiation of benign from malignant breast lesions, demonstrating potential for improving specificity and decreasing interobserver variability in biopsy recommendations.

© RSNA, 2010

Article History

Received July 29, 2008; revision requested August 29; revision received June 2, 2010; accepted June 18; final version accepted July 14.
Published online: Jan 2011
Published in print: Jan 2011