Interobserver Reproducibility of the PI-RADS Version 2 Lexicon: A Multicenter Study of Six Experienced Prostate Radiologists

Six experienced radiologists achieved moderate reproducibility for Prostate Imaging Reporting and Data System version 2 and neither required nor benefitted from a training session; agreement tended to be better in the peripheral zone than the transition zone, although it was weak for dynamic contrast–enhanced imaging in the peripheral zone.

Purpose

To determine the interobserver reproducibility of the Prostate Imaging Reporting and Data System (PI-RADS) version 2 lexicon.

Materials and Methods

This retrospective HIPAA-compliant study was institutional review board–approved. Six radiologists from six separate institutions, all experienced in prostate magnetic resonance (MR) imaging, assessed prostate MR imaging examinations performed at a single center by using the PI-RADS lexicon. Readers were provided screen captures that denoted the location of one specific lesion per case. Analysis entailed two sessions (40 and 80 examinations per session) and an intersession training period for individualized feedback and group discussion. Percent agreement (fraction of pairwise reader combinations with concordant readings) was compared between sessions. κ coefficients were computed.
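The percent agreement defined above (the fraction of pairwise reader combinations with concordant readings) can be illustrated with a minimal sketch. The abstract does not state which κ variant was computed, so the Fleiss κ shown here is an assumption chosen for illustration. The function names and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations


def pairwise_percent_agreement(ratings):
    """Fraction of reader pairs, pooled over all cases, that gave the same rating.

    ratings: list of cases; each case is a list with one rating per reader.
    """
    concordant = 0
    total = 0
    for case in ratings:
        for a, b in combinations(case, 2):
            concordant += (a == b)
            total += 1
    return concordant / total


def fleiss_kappa(ratings):
    """Chance-corrected multi-reader agreement (Fleiss kappa).

    Assumes every case was rated by the same number of readers.
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_cases = len(ratings)
    n_readers = len(ratings[0])
    ordered_pairs = n_readers * (n_readers - 1)
    category_totals = Counter()
    observed = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Per-case agreement: concordant ordered pairs over all ordered pairs.
        observed += sum(c * (c - 1) for c in counts.values()) / ordered_pairs
    observed /= n_cases
    n_ratings = n_cases * n_readers
    # Agreement expected by chance from the pooled category proportions.
    expected = sum((t / n_ratings) ** 2 for t in category_totals.values())
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 4 lesions, 6 readers, 1 = PI-RADS score of 4 or greater.
toy = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
agreement = pairwise_percent_agreement(toy)
kappa = fleiss_kappa(toy)
```

With six readers, each case contributes 15 reader pairs, so a single dissenting reader already lowers that case's agreement to 10 of 15. Because κ subtracts the agreement expected by chance, it can be much lower than the raw percent agreement on the same readings.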

Results

No substantial difference in interobserver agreement was observed between sessions, and the sessions were subsequently pooled. Agreement for PI-RADS score of 4 or greater was 0.593 in peripheral zone (PZ) and 0.509 in transition zone (TZ). In PZ, reproducibility was moderate to substantial for features related to diffusion-weighted imaging (κ = 0.535–0.619); fair to moderate for features related to dynamic contrast material–enhanced (DCE) imaging (κ = 0.266–0.439); and fair for definite extraprostatic extension on T2-weighted images (κ = 0.289). In TZ, reproducibility for features related to lesion texture and margins on T2-weighted images ranged from 0.136 (moderately hypointense) to 0.529 (encapsulation). Among 63 lesions that underwent targeted biopsy, classification as PI-RADS score of 4 or greater by a majority of readers yielded tumor with a Gleason score of 3+4 or greater in 45.9% (17 of 37), without missing any tumor with a Gleason score of 3+4 or greater.
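The targeted-biopsy figures above reduce to simple arithmetic, shown here as a worked check. The labels PPV, TP, and FN are introduced for illustration and are not terms used in the abstract. TP counts lesions scored 4 or greater by a majority of readers that yielded tumor with a Gleason score of 3+4 or greater, and FN = 0 follows from the statement that no such tumor was missed. Both values apply only to the 63 lesions that underwent targeted biopsy.

```latex
\mathrm{PPV} = \frac{\mathrm{TP}}{\text{lesions scored} \ge 4} = \frac{17}{37} \approx 0.459,
\qquad
\mathrm{sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} = \frac{17}{17 + 0} = 1
```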

Conclusion

Experienced radiologists achieved moderate reproducibility for PI-RADS version 2 and neither required nor benefitted from a training session. Agreement tended to be better in PZ than in TZ, although it was weak for DCE imaging in PZ. The findings may help guide future PI-RADS lexicon updates.

© RSNA, 2016

Online supplemental material is available for this article.

Article History

Received November 18, 2015; revision requested December 22; revision received January 8, 2016; accepted January 19; final version accepted January 28.
Published online: Apr 01 2016
Published in print: Sept 2016