Evaluation of Reader Variability in the Interpretation of Follow-up CT Scans at Lung Cancer Screening

Published Online: https://doi.org/10.1148/radiol.10101254

In lung cancer screening, the presence or absence of a change in the size of noncalcified lung nodules appears to be the most important consideration in detecting change and making follow-up recommendations; reader agreement for those determinations seems acceptable but could be improved.

Purpose

To measure reader agreement in determining whether lung nodules detected at baseline screening computed tomography (CT) had changed at subsequent screening examinations and to evaluate the variability in recommendations for further follow-up.

Materials and Methods

All subjects were enrolled in the National Lung Screening Trial (NLST), and each participant consented to the use of their de-identified images for research purposes. The authors randomly selected 100 cases of nodules measuring at least 4.0 mm at 1-year screening CT that were considered by the original screening CT reader to be present on baseline CT scans; nodules considered by the original reader to have changed were oversampled. Selected images from each case showing the entire nodule at both examinations were preloaded on a picture archiving and communication system workstation. Nine radiologists served as readers. For each case, they evaluated whether the nodule was present at baseline and recorded the bidimensional measurements and nodule characteristics at each examination, the presence or absence of change, the screening CT result, and a follow-up recommendation (high-level follow-up, low-level follow-up, or no follow-up).

Results

On the basis of reviews during case selection, five nodules seen at follow-up were judged not to have been present at baseline; for 19 of the remaining 95 cases, at least one reader judged the nodule not to have been present at baseline. For the 76 nodules that were unanimously considered to have been present at baseline, 21%–47% (mean ± standard deviation, 30% ± 9) were judged to have grown. The κ values were similar for growth (κ = 0.55) and a positive screening result (κ = 0.51) and were lower for a change in margins and attenuation (κ = 0.27–0.31). The κ value in the recommendation of high- versus low-level follow-up was high (κ = 0.66).
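As an illustration of how a multireader κ of this kind can be computed, the sketch below implements Fleiss' κ for a fixed number of readers per case. The abstract does not state which κ variant or software the authors used, so the choice of Fleiss' κ is an assumption, and the function name `fleiss_kappa` and the toy ratings are hypothetical and are not study data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for multiple readers and nominal categories.

    counts[i][j] is the number of readers who assigned case i to category j.
    Every case must be rated by the same number of readers (at least two).
    """
    n_cases = len(counts)
    n_readers = sum(counts[0])
    n_categories = len(counts[0])
    if n_readers < 2:
        raise ValueError("at least two readers are required")
    if any(sum(row) != n_readers for row in counts):
        raise ValueError("every case must have the same number of ratings")

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_readers)
        for j in range(n_categories)
    ]
    # Per-case agreement: fraction of reader pairs that agree on that case.
    p_case = [
        (sum(c * c for c in row) - n_readers) / (n_readers * (n_readers - 1))
        for row in counts
    ]
    observed = sum(p_case) / n_cases      # mean observed agreement
    expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data (NOT from the study): nine readers, five nodules,
# two categories in the order [grown, not grown].
toy_counts = [
    [9, 0],
    [7, 2],
    [1, 8],
    [0, 9],
    [4, 5],
]
print(f"Fleiss kappa for toy data: {fleiss_kappa(toy_counts):.2f}")
```

A κ of 1 indicates perfect agreement and a κ of 0 indicates agreement no better than chance, which is why raw percentage agreement alone can overstate reader consistency when one category dominates.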

Conclusion

Reader agreement on nodule growth and screening result was moderate to substantial. Agreement on follow-up recommendations was lower.
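Descriptive labels such as "moderate" and "substantial" are commonly assigned using the Landis and Koch benchmark scale. The abstract does not say which scale the authors applied, so the thresholds below are an assumption, and the helper name `landis_koch_label` is hypothetical. The κ values are the ones reported in the Results.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch descriptive band."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values as reported in the Results; the 0.27 and 0.31 entries are the
# two ends of the range given for change in margins and attenuation.
reported = {
    "growth": 0.55,
    "positive screening result": 0.51,
    "margins/attenuation (low end of range)": 0.27,
    "margins/attenuation (high end of range)": 0.31,
    "high- versus low-level follow-up": 0.66,
}

for measure, kappa in reported.items():
    print(f"{measure}: kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```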

© RSNA, 2011

Supplemental material: http://radiology.rsna.org/lookup/suppl/doi:10.1148/radiol.10101254/-/DC1

Article History

Received June 28, 2010; revision requested July 30; revision received September 8; accepted October 14; final version accepted November 9.
Published online: Apr 2011
Published in print: Apr 2011