MRI-based Bosniak Classification of Cystic Renal Masses, Version 2019: Interobserver Agreement, Impact of Readers’ Experience, and Diagnostic Performance
Abstract
Background
The 2019 Bosniak classification (version 2019) of cystic renal masses (CRMs) provides a systematic update to the currently used 2005 Bosniak classification (version 2005). Further validation is required before widespread application.
Purpose
To evaluate the interobserver agreement of MRI criteria, the impact of readers’ experience, and the diagnostic performance of version 2019 compared with version 2005.
Materials and Methods
Consecutive patients with CRM who had undergone renal MRI and surgical-pathologic examination from January 2009 to December 2018 were included in this retrospective study. On the basis of version 2019 and version 2005, all CRMs were independently classified by eight radiologists with different levels of experience. By using multirater κ statistics, interobserver agreement was evaluated with comparisons between classifications and between senior and junior radiologists. After classes I–IV were dichotomized into lower (I–IIF) and higher (III–IV) classes, diagnostic performance was compared between classifications by using the McNemar test. P < .05 was considered to indicate a statistically significant difference.
Results
A total of 207 patients (mean age ± standard deviation, 49 years ± 12; 139 male and 68 female patients) with CRMs were included. Overall, interobserver agreement was higher with version 2019 than with version 2005 (weighted κ = 0.64 vs 0.50, respectively; P < .001). Interobserver agreement did not differ between senior and junior radiologists with version 2019 (weighted κ = 0.65 vs 0.64, respectively; P = .71) but was higher for senior than for junior radiologists with version 2005 (weighted κ = 0.54 vs 0.46; P = .001). Diagnostic specificity for malignancy was higher with version 2019 than with version 2005 (83% [92 of 111] vs 68% [75 of 111], respectively; P < .001), without any difference in sensitivity (89% [85 of 96] vs 84% [81 of 96]; P = .34).
Conclusion
In the updated Bosniak classification, interobserver agreement improved and was unaffected by observers’ experience. The diagnostic performance with version 2019 was superior to that with version 2005, with higher specificity.
Published under a CC BY 4.0 license.
Online supplemental material is available for this article.
See also the editorial by Choyke in this issue.
Summary
With the updated 2019 Bosniak classification of cystic renal masses, interobserver agreement substantially improved and was unaffected by readers’ experience; diagnostic specificity for malignancy was superior to that with the 2005 classification.
Key Results
■ Interobserver agreement improved with the 2019 Bosniak classification of cystic renal masses compared with the currently used 2005 classification (weighted κ = 0.64 vs 0.50, respectively; P < .001).
■ Interobserver agreement between senior and junior radiologists differed with the 2005 classification (weighted κ = 0.54 vs 0.46; P = .001), but not the 2019 classification (weighted κ = 0.65 vs 0.64; P = .71).
■ Higher diagnostic specificity for malignancy was noted for the 2019 versus 2005 classification (83% [92 of 111] vs 68% [75 of 111]; P < .001).
Introduction
The Bosniak classification of cystic renal masses (CRMs) has contributed substantially to the stratification of malignancy risk in the 3 decades since it was proposed (1). As a living system, it was refined in 1993 and 2005 (version 2005) (2–4). Several shortcomings of the current version (version 2005) have been noted in both clinical practice and scientific research.
A systematic review has suggested that interreader variability for the Bosniak classification is large, ranging from 6% to 75% (5), especially for Bosniak classes II, IIF, and III. This variability is partly explained by relatively subjective classification criteria. Moreover, the reported risk for malignancy of each class of CRM varies widely. For example, the likelihood of malignancy for Bosniak classes II, IIF, III, and IV is, respectively, 9% (range, 5%–14%), 18% (range, 12%–26%), 51% (range, 42%–61%), and 86% (range, 81%–89%) (5). The high prevalence of a benign finding among Bosniak class III CRMs (approximately 49%) (6) is also a concern because unnecessary surgery may cause harm without clinical benefit.
Bosniak version 2005 was established on the basis of CT findings. Although it has been applied to MRI, several studies have suggested that MRI can provide more details regarding the wall or septa of CRMs and thus lead to upgrades of CRMs (7–9). It has also been reported that more experienced senior readers obtain a higher level of agreement than the junior readers when using version 2005 with MRI (10).
To address the shortcomings with version 2005, the 2019 Bosniak classification (hereafter, version 2019) included some essential updates (6). In addition to revisions of CT imaging criteria, version 2019 provided specific criteria for the imaging features of wall and septa, structure, and enhancement, which lacked a clear definition in version 2005. It also formally incorporated MRI criteria for classification that require further validation.
The purpose of our study was to evaluate the interobserver agreement for MRI criteria, the impact of readers’ experience, and the diagnostic performance with version 2019 compared with version 2005.
Materials and Methods
Patients
The institutional review board of our hospital approved this retrospective study and waived the need for informed consent. Two radiologists (X.B. and S.M.S., with 5 and 15 years of experience, respectively) searched the local picture archiving and communication system for examinations performed from January 2009 to June 2019 and collated consecutive patients with CRM who underwent preoperative renal MRI. Inclusion criteria were as follows: (a) CRM diagnosed in accordance with version 2019 (enhancing tissue <25% of the renal lesion), (b) CRM with surgical-pathology reports, and (c) complete renal MRI examinations performed within 6 months before surgery. Exclusion criteria were as follows: (a) insufficient MRI scan quality, including images with obvious artifacts, incomplete MRI coverage, and incomplete renal MRI protocol; (b) CRM with infection, inflammation, or vascular disease; (c) polycystic kidney disease, an inherited disorder in which clusters of cysts develop primarily within kidneys; and (d) suspected or confirmed syndrome related to renal cell carcinoma, such as von Hippel-Lindau syndrome. The flowchart of patient enrollment (including the prestudy training cohort and formal evaluation cohort) is provided in Figure 1.

Figure 1: Flowchart of the study cohort. CRM = cystic renal mass, PACS = picture archiving and communication system.
MRI Acquisition
MRI was performed at 1.5 T and 3.0 T. Renal MRI protocols included axial fat-suppressed T2-weighted, coronal fat-suppressed T2-weighted, axial T1-weighted, axial multiphase contrast material–enhanced T1-weighted, and coronal contrast-enhanced T1-weighted sequences. MRI machines and protocols used are listed in Tables E1–E4 (online).
Pathologic Analysis
All pathologic results of CRMs (Figs 2–6) were retrospectively reviewed by one uropathologist (A.T.G., with 20 years of experience) according to the 2016 World Health Organization Classification of Tumors of the Urinary System and Male Genital Organs (11).

Figure 2: Images in a 64-year-old woman with left-sided cystic renal mass. (a–c) Lesion shows cerebrospinal fluid–like signal intensity and no septa on axial T2-weighted MRI scan (a), low signal intensity on T1-weighted MRI scan (b), and an enhancing smooth wall 2 mm or less in width (arrowhead) on axial nephrographic-phase contrast-enhanced T1-weighted image (c). (d) Photomicrograph (hematoxylin-eosin stain; original magnification, ×5.5) helps confirm the diagnosis of a renal cyst. The lesion was classified as class I according to Bosniak classification, version 2019, and as class IIF according to Bosniak classification, version 2005, because of the thicker-than-hairline wall with perceived enhancement.


Figure 3: Images in a 54-year-old man with a right-sided cystic renal mass. (a–c) The lesion shows two septa (arrowhead) on T2-weighted MRI scan (a), homogeneous hypointensity on T1-weighted MRI scan (b), and 1-mm-wide smooth wall with no enhancement of septa on axial nephrographic-phase contrast-enhanced T1-weighted MRI scan (c). (d) Photomicrograph (hematoxylin-eosin stain; original magnification, ×7.1) helps confirm the diagnosis of a renal cyst. The lesion was classified as class II according to Bosniak classification, versions 2019 and 2005.


Figure 4: Images in a 41-year-old man with a left-sided cystic renal mass. (a–c) The lesion shows a few septa on T2-weighted MRI scan (a), hypointensity with focal hyperintensity (mean signal intensity, 1048) (circle) on T1-weighted MRI scan (b), and enhancement of 3-mm-wide septa (mean signal intensity, 1620) (arrowhead) and nonenhancing area (mean signal intensity, 1057) (circle) on axial nephrographic-phase contrast-enhanced T1-weighted image (c). (d) Photomicrograph (hematoxylin-eosin stain; original magnification, ×6.3) helps confirm the diagnosis of mixed epithelial and stromal tumor. The lesion was classified as class IIF according to Bosniak classification, version 2019, because of its minimally thickened enhancing septum and as class III according to Bosniak classification, version 2005, because of the measurable enhancement of the thickened septum.


Figure 5: Images in a 47-year-old woman with a right-sided cystic renal mass. (a–c) The lesion shows multiple (four or more) septa on T2-weighted MRI scan (a), heterogeneous hypointensity on T1-weighted MRI scan (b), and irregular enhancing septa (arrowhead) on axial nephrographic-phase contrast-enhanced T1-weighted MRI scan (c). (d) Photomicrograph (hematoxylin-eosin stain; original magnification, ×11.6) helps confirm a clear cell renal cell carcinoma. The lesion was classified as class III according to Bosniak classification, version 2019, because of the irregular enhancing septa and as class IIF according to Bosniak classification, version 2005, because of more than a few septa with perceived enhancement.


Figure 6: Images in a 54-year-old man with a right-sided cystic renal mass. (a–c) The lesion shows a nodule on the wall with slight hyperintensity on T2-weighted MRI scan (a), heterogeneous signal intensity (mean, 480) (circle) on T1-weighted MRI scan (b), and enhancement of the nodule (signal intensity, 1403) (circle) on axial nephrographic-phase contrast-enhanced T1-weighted image (c). (d) Photomicrograph (hematoxylin-eosin stain; original magnification, ×11.6) helps confirm a clear cell renal cell carcinoma. The lesion was classified as class IV according to Bosniak classification, version 2019, because of the enhancing nodule and as class IV according to Bosniak classification, version 2005, because of the soft-tissue component with measurable enhancement.

Diagnostic Performance with Version 2019 and Version 2005 Bosniak Classifications
Two junior radiologists (X.B. and S.M.S.) were responsible for the image anonymization and randomization of all CRM cases in the formal evaluation cohort. Two senior abdominal radiologists (H.Y.Y. and H.Y.W., with 30 and 20 years of experience, respectively) in our hospital studied both versions of the Bosniak classification (version 2019 and version 2005) (3,4,6,12) and translated the CT criteria of version 2005 into MRI criteria (Table E5 [online]) for our study.
Blinded to the clinical and pathologic information, these two senior radiologists then classified in consensus all the CRMs in the formal evaluation cohort into classes I–IV according to versions 2019 and 2005, with an interval of 1 month between evaluations. These results acted as the reference standard for CRM classification for the subsequent clinical diagnostic performance and observer studies.
On the basis of the pathologic result for each CRM, the malignancy rate for each class of the consensus clinical classification was also calculated for versions 2019 and 2005. Furthermore, according to the different clinical management (nonsurgery for classes I, II, and IIF and potential surgery for classes III and IV), all five classes of CRMs were dichotomized into two groups (lower [I–IIF] and higher [III–IV] classes), and the diagnostic performance of the Bosniak classification for malignancy was investigated and compared between version 2019 and version 2005.
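The dichotomization described above can be illustrated with a minimal sketch. The function names, the string labels for the classes, and the example cases are hypothetical and are not taken from the study; the sketch only assumes that a higher class (III or IV) is treated as a positive test for malignancy.

```python
# Illustrative sketch only; names and example data are hypothetical.
LOWER_CLASSES = {"I", "II", "IIF"}   # managed without surgery
HIGHER_CLASSES = {"III", "IV"}       # potential surgery


def dichotomize(bosniak_class):
    """Map a Bosniak class label to the 'lower' or 'higher' group."""
    if bosniak_class in LOWER_CLASSES:
        return "lower"
    if bosniak_class in HIGHER_CLASSES:
        return "higher"
    raise ValueError(f"unknown Bosniak class: {bosniak_class!r}")


def two_by_two(cases):
    """Tally a 2x2 table from (bosniak_class, is_malignant) pairs.

    A 'higher' class counts as a positive test for malignancy.
    """
    table = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for bosniak_class, is_malignant in cases:
        positive = dichotomize(bosniak_class) == "higher"
        if positive and is_malignant:
            table["tp"] += 1
        elif positive:
            table["fp"] += 1
        elif is_malignant:
            table["fn"] += 1
        else:
            table["tn"] += 1
    return table


# Made-up example cases, not study data.
example_cases = [("I", False), ("IIF", True), ("III", False), ("IV", True)]
example_table = two_by_two(example_cases)
print(example_table)
```

Sensitivity is then tp / (tp + fn) and specificity is tn / (tn + fp).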
Interobserver Study
Prestudy training.—Eight board-certified radiologists from eight different hospitals were invited to evaluate CRM classification in this study: four senior radiologists (H.Y.L., Q.G.L.G., G.C.L., and Y.Q.J., with 16–21 years of experience) and four junior radiologists (L.L.L., P.W., Q.R.W., and S.L.C., with 1–10 years of experience). The two senior abdominal radiologists (H.Y.Y. and H.Y.W.) acted as trainers in this study. In the month before the formal evaluation, the trainers gave the eight readers three lectures, including a review of version 2005, introduction to version 2019, and further details of version 2019, and they performed one simulation evaluation using the prestudy training cohort.
Formal evaluation.—Blinded to clinical and pathologic information, all eight radiologists independently reviewed all MRI scans of the CRMs in the formal evaluation cohort during a 1-week period, with 1 month between the version 2019 and version 2005 readings. Readers were asked to document the status of critical MRI features of CRMs and make a final classification. The laterality of the kidney of CRMs and the slice number of the maximum diameter of CRMs in the MRI sequences were provided to help readers locate the targeted lesion.
For version 2005, the readers documented the number of septa (none, few, and more than a few), enhancement (no enhancement, perceptual and measurable enhancement), thickness (hairline thin, minimally thickened, and thickened), and morphologic characteristics (smooth, irregular, and soft-tissue components) of the cystic wall or septa.
Correspondingly, for version 2019, the readers documented the enhancement (nonenhancing and enhancing) of wall or septa, the number (none, one to three, and four or more) of enhancing septa, the thickness (thin [≤2 mm], minimally thickened [3 mm], and thickened [≥4 mm]), and the morphologic characteristics (smooth, irregular [displaying ≤3-mm obtusely margined convex protrusion], and nodule [≥4-mm convex protrusion with obtuse margins, or a convex protrusion of any size that has acute margins]) of the enhancing wall or septa.
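The version 2019 thickness cutoffs listed above lend themselves to a short sketch. The function name is hypothetical, and the sketch assumes that thickness has already been recorded in whole millimeters; how readers rounded their measurements is not described here.

```python
def thickness_category(thickness_mm):
    """Categorize enhancing wall or septal thickness (whole millimeters).

    Cutoffs follow the version 2019 categories listed in the text:
    thin (<=2 mm), minimally thickened (3 mm), thickened (>=4 mm).
    Assumption: the input is an integer number of millimeters.
    """
    if not isinstance(thickness_mm, int):
        raise TypeError("thickness_mm must be a whole number of millimeters")
    if thickness_mm < 0:
        raise ValueError("thickness_mm must be non-negative")
    if thickness_mm <= 2:
        return "thin"
    if thickness_mm == 3:
        return "minimally thickened"
    return "thickened"


for mm in (1, 2, 3, 4):
    print(mm, thickness_category(mm))
```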
Interobserver analysis.—Interobserver agreement of the Bosniak classification and the critical imaging features of CRMs among the eight readers was evaluated and compared between version 2019 and version 2005. Comparison of the interobserver agreement between version 2019 and version 2005 was also conducted for a cohort that included only class II, IIF, and III. To analyze the impact of readers’ work experience on interobserver agreement, the interobserver agreement among senior and junior readers was evaluated and compared, respectively, in version 2019 and version 2005.
Statistical Analysis
Statistical analyses were performed by a statistician (L.L., with 20 years of experience) using Stata/MP 14.0 (StataCorp, College Station, Tex) for Windows. The specificity and sensitivity for malignancy were compared between version 2019 and version 2005 by using the McNemar test. Interobserver agreement among different readers was evaluated by using multirater κ analysis, and weighted κ values were calculated and compared between version 2019 and version 2005 or between senior and junior readers. A method introduced by Gwet (13) for testing the statistical difference of correlated agreement coefficients, which is similar to the paired t test, was applied for the comparison of weighted κ values on the same set of CRMs. For the agreement analysis, the weighted κ statistic was interpreted as follows: 0.20 or less, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; 0.81–1.00, almost perfect agreement (14). P < .05 was considered to indicate a statistically significant difference.
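Two of the steps above can be sketched briefly: the interpretation bands for κ, and a McNemar comparison of paired classifications. The function names are hypothetical. The study ran its analyses in Stata and does not state which variant of the McNemar test was used, so the exact binomial form below is an assumption, and the discordant-pair counts in the usage line are made up.

```python
from math import comb


def interpret_kappa(kappa):
    """Return the agreement label for a kappa value, per the bands in the text."""
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar P value from the two discordant-pair counts.

    b: cases correct under classification A only
    c: cases correct under classification B only
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(interpret_kappa(0.64), interpret_kappa(0.50))
print(mcnemar_exact_p(10, 0))  # hypothetical discordant counts, not study data
```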
Results
Patient Demographic Characteristics
In the prestudy training cohort, 13 patients (eight men and five women; mean age ± standard deviation, 56 years ± 25; age range, 23–77 years) with 13 pathology-proven CRMs (benign cysts, 54% [seven of 13]; mixed epithelial and stromal tumors, 23% [three of 13]; clear cell renal cell carcinoma, 23% [three of 13]) were included from January to June 2019.
In the formal evaluation cohort, 207 patients (139 male and 68 female patients; mean age, 49 years ± 12; age range, 16–75 years) with 207 pathology-proven CRMs (benign cysts, 40% [83 of 207]; benign tumors, 14% [28 of 207]; malignant tumors, 46% [96 of 207]) were included from January 2009 to December 2018. Among them, benign tumors were composed of mixed epithelial and stromal tumors (46% [13 of 28]), cystic nephroma (50% [14 of 28]), and angiomyolipoma with epithelial cysts (4% [one of 28]) (15); malignant tumors were composed of multilocular cystic renal neoplasm of low malignant potential (13% [12 of 96]), clear cell renal cell carcinoma (86% [83 of 96]), and papillary renal cell carcinoma (1% [one of 96]). Clinical characteristics for patients with CRMs are shown in Table 1.
Diagnostic Performance with Version 2019 and Version 2005 Bosniak Classifications
The classification results for Bosniak I, II, IIF, III, and IV CRMs with version 2019 and version 2005 (Figs 2–6) are shown in Table 2. The malignancy rates of each class are reported in Table 3. With version 2019, 83% (92 of 111) of benign lesions were classified into the lower class (class I–IIF) and 89% (85 of 96) of malignant tumors were classified into the higher class (class III or IV). With version 2005, 68% (75 of 111) of benign lesions were classified into the lower class and 84% (81 of 96) of malignant tumors were classified into the higher class. The specificity and sensitivity for malignancy were 83% (95% confidence interval [CI]: 75%, 89%) and 89% (95% CI: 80%, 94%), respectively, with version 2019 and 68% (95% CI: 58%, 76%) and 84% (95% CI: 76%, 91%), respectively, with version 2005. Bosniak version 2019 had a higher diagnostic specificity for malignancy than did version 2005 (83% [92 of 111] vs 68% [75 of 111], respectively; P < .001), with no difference in sensitivity (89% [85 of 96] vs 84% [81 of 96]; P = .34).
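The reported percentages follow directly from the counts given above. As a minimal arithmetic check, with hypothetical variable names and using only those counts, specificity is the share of benign lesions placed in the lower class and sensitivity is the share of malignant tumors placed in the higher class:

```python
# Counts as reported in the text: benign lesions placed in the lower class
# (true negatives) and malignant tumors placed in the higher class
# (true positives).
reported_counts = {
    "version 2019": {"tn": 92, "benign": 111, "tp": 85, "malignant": 96},
    "version 2005": {"tn": 75, "benign": 111, "tp": 81, "malignant": 96},
}


def percent(numerator, denominator):
    """Return the proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


summary = {
    version: {
        "specificity_pct": percent(c["tn"], c["benign"]),
        "sensitivity_pct": percent(c["tp"], c["malignant"]),
    }
    for version, c in reported_counts.items()
}

for version, values in summary.items():
    print(version, values)
```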
Comparison of Interobserver Agreement between Version 2019 and Version 2005 for All Classes of CRMs
The distribution of CRMs in each class is listed in Table E6 (online). For all classes of CRMs in the final cohort, the interobserver agreement among the eight readers was substantial with version 2019 (weighted κ = 0.64 [95% CI: 0.60, 0.68]) and moderate with version 2005 (weighted κ = 0.50 [95% CI: 0.46, 0.55]). The interobserver agreement with version 2019 was significantly higher than that with version 2005 (P < .001) (Table 4).
Comparison of Interobserver Agreement between Version 2019 and Version 2005 for Critical Imaging Features
Interobserver agreement of the critical imaging features with version 2019 was higher than the corresponding features with version 2005 (enhancement of wall or septa: weighted κ = 0.54 [95% CI: 0.48, 0.61] vs 0.41 [95% CI: 0.37, 0.46], respectively, P < .001; number of enhancing septa, weighted κ = 0.68 [95% CI: 0.63, 0.72] vs 0.58 [95% CI: 0.53, 0.63], P < .001; morphologic characteristics of enhancing wall or septa: weighted κ = 0.58 [95% CI: 0.53, 0.64] vs 0.49 [95% CI: 0.44, 0.54], P < .001) with the exception of the thickness of the enhancing wall or septa (weighted κ = 0.37 [95% CI: 0.32, 0.42] vs 0.39 [95% CI: 0.34, 0.44], P = .54).
Comparison of Interobserver Agreement between Version 2019 and Version 2005 with Only Class II, IIF, and III Included
With only class II, IIF, and III included, there were 147 CRMs for version 2019 and 123 CRMs for version 2005. For the 147 CRMs based on version 2019, the weighted κ values among the eight readers were 0.53 (95% CI: 0.48, 0.58) and 0.37 (95% CI: 0.32, 0.41) with version 2019 and version 2005, respectively (P < .001). For the 123 CRMs based on version 2005, the weighted κ values among the eight readers were 0.56 (95% CI: 0.52, 0.61) and 0.35 (95% CI: 0.31, 0.40) with version 2019 and version 2005, respectively (P < .001) (Table 4).
Impact of Observers’ Experience on Classification Agreement
With version 2019, the senior and junior readers achieved similar interobserver agreement (weighted κ = 0.65 [95% CI: 0.60, 0.69] vs 0.64 [95% CI: 0.59, 0.68], respectively; P = .71), whereas with version 2005, the senior readers showed higher agreement than the junior readers (weighted κ = 0.54 [95% CI: 0.49, 0.59] vs 0.46 [95% CI: 0.41, 0.51]; P = .001) (Table E7 [online]).
Discussion
To overcome the shortcomings of the current Bosniak classification (version 2005), an update (version 2019) was proposed in June 2019 (6). Our study aimed to test whether the update improved the clarity and specificity of classification. On the basis of MRI scans of 207 pathologically confirmed cystic renal masses, our study revealed higher interobserver agreement among eight readers from different centers with version 2019 (weighted κ = 0.64) than with version 2005 (weighted κ = 0.50).
The interobserver agreement with version 2019 was superior to that in previous studies (κ = 0.51–0.52) based on MRI (16) or CT (17) scans, probably because of more objective and precise classification criteria. However, agreement was lower than that of other previous studies (κ = 0.66–0.85) (10,18–20), partially because of the relatively small number of malignant tumors, fewer readers, higher rate of class I and IV CRMs, or the bias in the inclusion of CRMs and establishment of the study cohort in those studies.
Among the critical imaging features, all except the thickness of the enhancing wall or septa achieved higher interobserver agreement, to varying degrees, with version 2019 than with version 2005, which could partly explain the overall improvement in interobserver agreement for version 2019. As noted by the authors of version 2019, the measurement of thickness will vary among readers, and the quantitative criteria are intended to “serve as guideposts rather than absolute expressions” (6). Nevertheless, the overall increase in interobserver agreement of imaging features with version 2019 confirmed the necessity and feasibility of redefining specific features.
The inclusion of clear-cut Bosniak I and IV CRMs may contribute to higher interobserver agreement (10,21,22). After exclusion of class I and IV CRMs from our study cohort, interobserver agreement was lower with both version 2019 (weighted κ = 0.53 vs 0.64) and version 2005 (weighted κ = 0.35 vs 0.50), which confirms that Bosniak I and IV CRMs influence interobserver agreement. However, the degree of interobserver agreement remained higher with version 2019 than with version 2005, further confirming the validity of the updates.
The level of radiologists’ work experience may also affect the classification of CRMs (10,18,20). Consistent with previous studies, we found a difference in interobserver agreement between senior and junior readers with version 2005 (weighted κ = 0.54 vs 0.46, P < .001). However, with version 2019, interobserver agreement was similar (weighted κ = 0.65 vs 0.64, P = .71), which implies that the level of experience is less critical when the critical points of classification are clearly understood.
We also report the malignancy rates of each class of CRMs with versions 2019 and 2005 from a consensus read by two experienced readers. The malignancy rates of classes I, II, IIF, and IV were similar between versions 2019 and 2005 and were also similar to those in a previous systematic review (5). However, the malignancy rate of class III tended to be higher with version 2019 than with version 2005 (83% vs 64%) and was also higher than the average of 51% reported in the literature (5). Moreover, when the five-class classification system was dichotomized into lower (I–IIF) and higher (III–IV) classes, the diagnostic specificity of version 2019 for malignancy was higher than that of version 2005 (83% vs 68%, P < .001) and was also superior to the pooled specificity in previous studies (67%) (23). These improvements can be explained by the correct downgrading of more pathologically proven benign CRMs into the lower classes, which correspondingly increased the specificity for malignancy with version 2019.
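The sensitivity and specificity percentages can be reproduced from the counts given in the abstract (92 of 111 and 75 of 111 benign CRMs placed in the lower classes; 85 of 96 and 81 of 96 malignant CRMs placed in the higher classes). The sketch below is only an arithmetic check; the helper name `percent` and the dictionary layout are illustrative choices, not part of the study.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return a proportion as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# Counts taken from the abstract. Classes III-IV count as test positive.
# Specificity = benign CRMs classified I-IIF / all benign CRMs.
# Sensitivity = malignant CRMs classified III-IV / all malignant CRMs.
counts = {
    "version 2019": {"specificity": (92, 111), "sensitivity": (85, 96)},
    "version 2005": {"specificity": (75, 111), "sensitivity": (81, 96)},
}

for version, metrics in counts.items():
    for name, (num, den) in metrics.items():
        print(f"{version} {name}: {percent(num, den)}% ({num} of {den})")
```

The 17 additional benign CRMs placed in the lower classes with version 2019 (92 vs 75 of 111) account for the 15-percentage-point gain in specificity.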
Our study had several limitations. First, the reference standard of CRMs was set by two experienced radiologists in consensus; although this approach was acceptable, the results might have differed if additional observers had been involved. Second, in this retrospective analysis, subtraction images from T1-weighted imaging were not provided, which could have lowered readers’ confidence in determining the enhancement of CRMs and influenced interobserver agreement. Third, because of the potential difference in cyst complexity between 1.5 T and 3.0 T (16), the use of images from different MRI scanners and two field strengths may have influenced the classification results and interobserver agreement. Fourth, our study focused on MRI classification criteria. Although CT and MRI findings were similar in most CRMs (2), whether our results can be generalized to CT should be further investigated.
In conclusion, with respect to MRI classification, interobserver agreement with Bosniak version 2019 was substantially improved compared to that with version 2005 and was not affected by the level of reader experience. Moreover, the diagnostic specificity for malignancy in version 2019 was higher, which would facilitate the management of cystic renal masses in clinical practice.
Acknowledgment
We thank Xiutang Cao, PhD, the dean of the Department of Medical Statistics, Institute for Hospital Management Research, Chinese PLA General Hospital, for his guidance on the statistical analysis in this study.
Author Contributions
Author contributions: Guarantors of integrity of entire study, X.B., H.Y.W.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; agrees to ensure any questions related to the work are appropriately resolved, all authors; literature research, X.B., S.M.S., H.H.K., H.Y.W.; clinical studies, X.B., S.M.S., W.X., H.H.K., Y.Q.J., Q.G.L.G., G.C.L., H.Y.L., L.L.L., S.L.C., Q.R.W., P.W., A.T.G., Q.B.H., X.J.Z., H.Y.Y.; statistical analysis, X.B., L.L.; and manuscript editing, X.B., H.Y.Y., H.Y.W.
Supported by the National Natural Science Foundation of China (grant 81971580) and the Medical Big Data Research and Development Project supported by Chinese PLA General Hospital (grant 2018MBD-023).
References
- 1. The current radiological approach to renal cysts. Radiology 1986;158(1):1–10.
- 2. An update of the Bosniak renal cyst classification system. Urology 2005;66(3):484–488.
- 3. The Bosniak renal cyst classification: 25 years later. Radiology 2012;262(3):781–785.
- 4. Problems in the radiologic diagnosis of renal parenchymal tumors. Urol Clin North Am 1993;20(2):217–230.
- 5. Bosniak classification for complex renal cysts reevaluated: a systematic review. J Urol 2017;198(1):12–21.
- 6. Bosniak classification of cystic renal masses, version 2019: an update proposal and needs assessment. Radiology 2019;292(2):475–488.
- 7. MRI evaluation of complex renal cysts using the Bosniak classification: a comparison to CT. Abdom Radiol (NY) 2016;41(10):2011–2019.
- 8. Magnetic resonance imaging as an adjunct diagnostic tool in computed tomography defined Bosniak IIF-III renal cysts: a multicenter study. World J Urol 2018;36(6):905–911.
- 9. Imaging of cystic renal masses. Radiol Clin North Am 2017;55(2):259–277.
- 10. Inter-rater agreement in the characterization of cystic renal lesions on contrast-enhanced MRI. Abdom Imaging 2014;39(6):1267–1273.
- 11. The 2016 WHO classification of tumours of the urinary system and male genital organs: part A—renal, penile, and testicular tumours. Eur Urol 2016;70(1):93–105.
- 12. MR imaging of cystic renal masses. Magn Reson Imaging Clin N Am 2004;12(3):403–412, v.
- 13. Testing the difference of correlated agreement coefficients for statistical significance. Educ Psychol Meas 2016;76(4):609–637.
- 14. The measurement of observer agreement for categorical data. Biometrics 1977;33(1):159–174.
- 15. Contemporary update on imaging of cystic renal masses with histopathological correlation and emphasis on patient management. Clin Radiol 2019;74(2):83–94.
- 16. Complex cystic renal masses: comparison of cyst complexity and Bosniak classification between 1.5 T and 3 T MRI. Eur J Radiol 2014;83(3):503–508.
- 17. Enhancing component on CT to predict malignancy in cystic renal masses and interobserver agreement of different CT features. AJR Am J Roentgenol 2006;186(3):665–672.
- 18. Characterization of atypical cystic renal masses with MDCT: comparison of 5-mm axial images and thin multiplanar reconstructed images. AJR Am J Roentgenol 2010;195(3):693–700.
- 19. Malignant renal cysts: diagnostic performance and strong predictors at MDCT. Acta Radiol 2010;51(5):590–598.
- 20. Bosniak classification system: inter-observer and intra-observer agreement among experienced uroradiologists. Acta Radiol 2015;56(3):374–383.
- 21. Interpersonal variability and present diagnostic dilemmas in Bosniak classification system. Scand J Urol Nephrol 2011;45(4):239–244.
- 22. Comparison of contrast-enhanced sonography with unenhanced sonography and contrast-enhanced CT in the diagnosis of malignancy in complex cystic renal masses. AJR Am J Roentgenol 2008;191(4):1239–1249.
- 23. Malignancy rates and diagnostic performance of the Bosniak classification for the diagnosis of cystic renal lesions in computed tomography: a systematic review and meta-analysis. Eur Radiol 2017;27(6):2239–2247.
Article History
Received: Feb 14 2020
Revision requested: Apr 14 2020
Revision received: July 9 2020
Accepted: Aug 4 2020
Published online: Sept 22 2020
Published in print: Dec 2020