Reducing the Number of Measurements in Liver Point Shear-Wave Elastography: Factors that Influence the Number and Reliability of Measurements in Assessment of Liver Fibrosis in Clinical Practice 1

Radiology: Volume 287: Number 3, June 2018 (radiology.rsna.org). 1 From the Department of Radiology (C.F., O.S.J., G.T.Y., E.K., D.J.Q., P.S.S.) and Institute of Liver Studies (K.A., A.Q.), King’s College Hospital, Denmark Hill, London, SE5 9RS, England. Received September 8, 2017; revision requested October 30; revision received December 8; accepted January 3, 2018. Address correspondence to C.F. (e-mail: chengfang@nhs.net).

Ultrasound in Medicine and Biology suggest that at least 10 measurements should be used for liver fibrosis measurements and discuss the potential role of the interquartile range (IQR)-to-median ratio as a reliability indicator (21). However, the practice of acquiring 10 measurements appears to be a common convention rather than determined by systematic evaluation.
The purpose of this study was to identify the minimum number of measurements required for the noninvasive assessment of liver fibrosis by using pSWE and to determine whether the use of a reliability indicator such as IQR-to-median ratio will affect diagnostic performance.

Materials and Methods
This retrospective analysis was performed with local ethics approval for several prospective SWE studies performed at our institution (12). Written informed consent was obtained from all patients and healthy volunteers who participated in each study.

Study Population
Between October 2009 and July 2015, 232 patients and healthy volunteers who underwent pSWE were retrospectively reviewed. pSWE measurements from 11 healthy volunteers were used in a previous study (12); this previous study focused on the reproducibility of two-dimensional SWE by comparing it with pSWE. There was no overlap with our analysis.
The patient cohort, including 221 patients suspected of having chronic liver disease who underwent pSWE to evaluate the degree of liver fibrosis, has not been previously reported. The demographic characteristics of the patients and healthy volunteers are listed in Table 1. The patients attended the Radiology Department Day Case Unit to undergo liver biopsy to assess their underlying disease; the pSWE examination was performed on the same day. The causes of liver disease included hepatitis

Liver biopsy is a so-called reference standard in the evaluation of the degree of liver fibrosis in patients with chronic liver disease (1). However, noninvasive methods for assessment of liver fibrosis including shear-wave elastography (SWE) have gained increasing acceptance, obviating the need for a liver biopsy, a diagnostic procedure associated with substantial morbidity and mortality (2,3). SWE involves mechanical excitation of tissue with a short duration of an acoustic push pulse, which generates localized displacement in tissue resulting in shear-wave propagation. The speed of shear-wave propagation within the tissue measured in meters per second is related to the stiffness of the tissue (4,5). Previous studies (6-13) demonstrated that point SWE (pSWE) is a reliable and accurate technique in assessing the stage of liver fibrosis in both healthy volunteers and patients with chronic liver disease. This noninvasive diagnostic method is included in current clinical guidelines for the initial assessment of liver fibrosis or cirrhosis, with liver biopsy reserved for patients in whom there is uncertainty about potential additional etiologic causes of liver disease (14,15).
The majority of previous studies that evaluated pSWE to stage liver fibrosis used 10 valid measurements and reported median or mean values, with some studies using a median of five (16), six (17,18
In addition, Wilcoxon signed-rank tests were used to determine whether there was a significant difference in accuracy between the median of the first five and the median of the last five measurements. The minimum number of measurements with an ICC threshold of greater than 0.95 (ie, an error of less than 5%) versus that from all 10 measurements was determined. Bland-Altman 95% confidence intervals were calculated to assess the agreement between the median value of the minimum number of measurements and the median of 10 measurements. Any difference in continuous variables between patients with an IQR-to-median ratio of 30% or less or an IQR-to-median ratio of greater than 30% was analyzed by using Mann-Whitney U test (ie, nonparametric data) or t test (ie, parametric data). The χ² test or Fisher exact test was used for categorical variables. All variables that were significantly associated with IQR-to-median ratio greater than 30% at univariate analysis were analyzed by using multivariable logistic regression analysis.
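The comparison of a reduced number of measurements against the 10-measurement reference can be illustrated with a minimal Python sketch. It computes the median of the first k measurements for k = 2 to 10 for a single participant. The function name `prefix_medians` and the SWV values are hypothetical and are not study data; the ICC and Bland-Altman steps are not reproduced here.

```python
from statistics import median


def prefix_medians(measurements):
    """Return {k: median of the first k values} for k = 2 .. len(measurements)."""
    if len(measurements) < 2:
        raise ValueError("need at least two measurements")
    return {k: median(measurements[:k]) for k in range(2, len(measurements) + 1)}


# Hypothetical SWV readings (m/sec) for one participant; NOT study data.
swv = [1.20, 1.35, 1.10, 1.28, 1.22, 1.31, 1.18, 1.25, 1.40, 1.15]

medians = prefix_medians(swv)
reference = medians[10]                     # median of all 10 measurements
diff_six_vs_ten = medians[6] - reference    # per-participant difference
```

Repeating this for every participant gives the paired values (median of k versus median of 10) on which an agreement statistic would then be computed.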
Correlation between the SWV measurements and fibrosis stages was assessed by using Spearman rank correlation coefficient (r). The diagnostic performance of the median value of the minimum number of pSWE measurements and the reference median value was assessed by receiver operating characteristic curves. Differences between the areas under the receiver operating characteristic curve were compared by using a DeLong test (24). The optimal cutoff value was determined by using the Youden index (25). Subgroup diagnostic performance analyses were performed by using participants with measurements placed in the right lobe of the liver at a depth between 2 and 6 cm, avoiding biliary ducts and large vessels by using an intercostal scanning approach (Fig 1). The tissue elasticity analysis was performed by using software (Virtual Touch Quantification; Siemens) that quantified liver stiffness as shear-wave velocity (SWV) in meters per second during brief breath-holding in a neutral position (5 seconds). Ten consecutive valid SWV measurements were recorded without discarding any results. These measurements were divided into those that had an IQR-to-median ratio of greater than 30% and 30% or less; the latter were considered to indicate reliable results. Failure of pSWE measurement was defined as the inability to obtain a valid measurement (displayed on the screen as X.XX) after 10 attempts.
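The reliability split by IQR-to-median ratio can be sketched in Python as follows. The text does not state which quartile convention was used, so the "inclusive" method of `statistics.quantiles` is an assumption here, as are the function names and the example values (which are not study data).

```python
from statistics import median, quantiles


def iqr_to_median_ratio(values):
    """IQR divided by the median for one participant's SWV measurements."""
    # Quartile convention is an assumption; other methods give slightly
    # different IQRs for small samples such as 10 measurements.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / median(values)


def is_reliable(values, threshold=0.30):
    """True when the IQR-to-median ratio is 30% or less."""
    return iqr_to_median_ratio(values) <= threshold


# Hypothetical SWV readings (m/sec); NOT study data.
spread_out = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
tight = [1.2] * 10
```

With these values, `spread_out` has a ratio of about 0.31 and falls in the lower reliability group, whereas `tight` has a ratio of 0 and is classed as reliable.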

Histologic Evaluation
Core liver biopsy samples were taken from the right lobe of the liver by using an 18-gauge needle (Bard Marquee Disposable Core Biopsy Instrument; Bard Biopsy Systems, Tempe, Ariz) and prepared per standard departmental protocol. Biopsy fragments of 15 mm or greater were considered appropriate for pathologic evaluation. Biopsies were examined by a single hepatobiliary pathologist (A.Q., with >20 years of experience) who was blinded to the SWV measurements. The Ishak fibrosis staging system was used to evaluate pathologic liver fibrosis and inflammatory activity (Table E1 [online]) (22). Significant fibrosis was defined as Ishak stage 3 and above, and severe fibrosis or cirrhosis was defined as Ishak stage 5 and above. The fibrosis stages of healthy volunteers were Ishak stage 0. The degree of steatosis was classified by visual estimation of the biopsy sample according to the percentage of hepatocytes with fatty changes: none (0% of hepatocytes), mild (<33%), moderate (33%-66%), or marked (>66%) (23). The presence of steatosis was assumed to be none or mild for the healthy volunteers. Steatosis was grouped into no or mild steatosis and moderate or marked steatosis for the statistical analysis.
measurements that demonstrated IQR-to-median ratio of greater than 30%, the range of ICC was between 0.859 (95% confidence interval: 0.790, 0.907; two measurements) and 0.991 (95% confidence interval: 0.986, 0.994; nine measurements), and the median of the first eight measurements and 10 measurements demonstrated an ICC value above 0.95. There was no significant difference between the ICC by using the median of the first five measurements (ie, measurements from one to five) versus the median of the second five measurements (ie, measurements from six to 10) (P = .562). In the subgroup analysis of patients with no or mild steatosis (n = 162), or moderate or marked steatosis (n = 24), the minimum number of measurements required was eight for both subgroups.

Comparison of Performance for the Fibrosis Assessment by Using Median of Different Number of Measurements
The median SWVs increased with higher Ishak stage of fibrosis (1.176 m/sec confidence interval: 0.873, 0.922) to 0.995 (95% confidence interval: 0.993, 0.996) (Table 2). The ICC between the median of the first six measurements and median of all 10 measurements was 0.966 (95% confidence interval: 0.957, 0.974), which is the minimum number of measurements that demonstrate an ICC above the cutoff value of 0.95. The Bland-Altman analysis shows that the mean difference between median of six and median of 10 measurements is -0.002 m/sec (95% confidence interval: -0.33, 0.32) with no statistically significant systematic bias (P = .82) (Fig 2).
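A Bland-Altman summary of the kind reported above can be sketched in Python. The sketch assumes the usual convention of mean difference plus or minus 1.96 standard deviations for the 95% limits of agreement; the function name and the paired values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must be paired and contain at least two values")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)      # mean difference (systematic bias)
    sd = stdev(diffs)       # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical per-participant medians (m/sec); NOT study data.
median_of_six = [1.0, 2.0, 3.0, 4.0]
median_of_ten = [1.1, 1.9, 3.2, 3.8]

bias, lower, upper = bland_altman(median_of_six, median_of_ten)
```

A bias close to zero with narrow limits, as in the reported -0.002 m/sec with limits of -0.33 and 0.32, indicates that the two medians can be used interchangeably within that margin.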
The ICC for the 150 patients and healthy volunteers with IQR-to-median ratio 30% or less ranged from 0.939 (95% confidence interval: 0.916, 0.955; two repeated measurements) to 0.998 (95% confidence interval: 0.997, 0.999; nine repeated measurements), and the ICC between the median of the first three repeated measurements and 10 measurements demonstrated an ICC value above 0.95. In the cohort of patients and volunteers with
that demonstrated an IQR-to-median ratio greater than 30% and IQR-to-median ratio of 30% or less. A two-sided P value of less than .05 was considered to indicate statistical significance.
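The Youden index used to select the optimal cutoff is J = sensitivity + specificity - 1, maximized over candidate cutoffs. The following Python sketch is illustrative only: the function name, the rule that a value at or above the cutoff is positive, and the example values are assumptions, not the authors' implementation or data.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    labels are True when fibrosis is present at histologic evaluation;
    a value at or above the cutoff is treated as a positive test.
    """
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, lab in zip(values, labels) if lab and v >= cutoff)
        tn = sum(1 for v, lab in zip(values, labels) if not lab and v < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:          # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical median SWVs (m/sec) and histologic labels; NOT study data.
swv = [1.0, 1.1, 1.2, 1.5, 1.3, 1.6, 1.8, 2.0]
fibrosis = [False, False, False, False, True, True, True, True]

cutoff, j = youden_cutoff(swv, fibrosis)
```

For these values the sketch selects a cutoff of 1.3 m/sec with J = 0.75 (sensitivity 100%, specificity 75%).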

Study Participant Characteristics
The characteristics of the study patients and healthy volunteers are listed in Table 1. Histologic grading of liver fibrosis was available from 221 patients. The 11 healthy volunteers did not undergo liver biopsy. SWV measurements were obtained in all 232 patients and healthy volunteers who underwent pSWE. SWV measurements with lower reliability (IQR-to-median ratio, >30%) were observed in 82 of the enrolled participants (35.3%).

ICCs between Median Measurements
In the overall cohort, the ICC between a median of two to a median of nine measurements and a median of 10 measurements ranged from 0.901 (95% 4). Of the additional three patients with discordance, one patient had an IQR-to-median ratio greater than 30%. The number of patients with discordance between pSWE and histologic results for classification of severe fibrosis or cirrhosis (Ishak stage, the subgroups with an IQR-to-median ratio greater than 30% and an IQR-to-median ratio of 30% or less (Table 3).
The number of patients with discordance between pSWE and histologic results that classified significant fibrosis (Ishak stage ≥3) was 48 (20.7%) with a median of 10 measurements and 51
The median of 10 SWV measurements differentiated significant fibrosis (Ishak stage 0-2 vs 3-6) and severe fibrosis or cirrhosis (Ishak stage 0-4 vs 5-6) with an area under the receiver operating characteristic curve of 0.839 (95% confidence interval: 0.786, 0.884) and 0.969 (95% confidence interval: 0.937, 0.987), respectively. The median of six SWV measurements differentiated significant fibrosis (Ishak stage 0-2 vs 3-6) and severe fibrosis or cirrhosis (Ishak stage 0-4 vs 5-6), with an area under the receiver operating characteristic curve of 0.828 (95% confidence interval: 0.773, 0.874) and 0.953 (95% confidence interval: 0.918, 0.977), respectively. There was no statistically significant difference in the area under the receiver operating characteristic curve between the median of six SWV and 10 SWV measurements in diagnosing both significant fibrosis (P = .487) and severe fibrosis or cirrhosis (P = .145). Similarly, there was no statistically significant difference in differentiating significant or severe fibrosis between a median of six and median of 10 measurements in
Note.-Data in parentheses are 95% confidence intervals. High reliability is an interquartile range-to-median ratio of 30% or less; low reliability is an interquartile range-to-median ratio of greater than 30%.
* Represents intraclass correlation coefficient of 0.95 or greater between the number of repetitions and median value of 10 measurements. Steatosis data were available in 186 participants.
5) was 11 (4.7%) with a median of 10 measurements and 17 (7.3%) with six measurements. Of the additional six patients with discordance for the classification of severe fibrosis or cirrhosis, four patients had an IQR-to-median ratio greater than 30% (Table 4). When a median of 10 measurements was used, the percentage of patients who demonstrated discordance between the pSWE and histologic analysis increased from 14.7% (IQR-to-median ratio, ≤30%) to 31.7% (IQR-to-median ratio, >30%) and 2.0% (IQR-to-median ratio, ≤30%) to 9.8% (IQR-to-median ratio, >30%) in differentiating significant fibrosis and severe fibrosis or cirrhosis, respectively.
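The subgroup percentages can be reconciled with the overall discordance figures by simple arithmetic, sketched below in Python. The subgroup sizes (150 and 82) and the overall counts (48 and 11 of 232) are taken from the text; the per-subgroup counts (22, 26, 3 and 8) are back-calculated from the reported percentages and are therefore an assumption, not values quoted from Table 4.

```python
def rate_pct(discordant, total):
    """Discordance rate as a percentage, rounded to one decimal place."""
    return round(100.0 * discordant / total, 1)


RELIABLE_N, UNRELIABLE_N = 150, 82   # IQR-to-median ratio <=30% and >30%

# Back-calculated counts for the median of 10 measurements (assumption).
significant = {"reliable": 22, "unreliable": 26}   # Ishak stage 3 or greater
severe = {"reliable": 3, "unreliable": 8}          # Ishak stage 5 or greater

significant_rates = (
    rate_pct(significant["reliable"], RELIABLE_N),
    rate_pct(significant["unreliable"], UNRELIABLE_N),
    rate_pct(sum(significant.values()), RELIABLE_N + UNRELIABLE_N),
)
severe_rates = (
    rate_pct(severe["reliable"], RELIABLE_N),
    rate_pct(severe["unreliable"], UNRELIABLE_N),
    rate_pct(sum(severe.values()), RELIABLE_N + UNRELIABLE_N),
)
```

Under these assumed counts the subgroup rates of 14.7% and 31.7% combine to the overall 20.7% (48 of 232), and 2.0% and 9.8% combine to 4.7% (11 of 232).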

Discussion
In our study, a median of six SWV measurements was the minimum number required that resulted in an error rate of less than 5% compared with a median of 10 measurements. In addition, the 95% agreement limit from the Bland-Altman analysis was also comparable to the 95% agreement limit from interobserver variability previously described (26). The area under the receiver operating characteristic curve
GASTROINTESTINAL IMAGING: Reducing the Number of Measurements in Liver Point Shear-Wave Elastography Fang et al
rate with histologic evaluation for differentiating between significant and severe fibrosis or cirrhosis, respectively, compared with patients with higher reliability measurements (IQR-to-median ratio, ≤30%). However, the reduction of measurements from 10 to six resulted in 1.1-fold and 1.7-fold increase in the discordance rate with histologic evaluation in differentiating significant and severe fibrosis, respectively. In our study, the percentage of discordance with histologic evaluation was more affected by IQR-to-median ratio than by the number of measurements. IQR-to-median ratio can only be calculated retrospectively because, to our knowledge, no reliability indicator is available at the time of acquiring the measurement. In view of the predefined threshold of a less than 5% measurement error, the number of measurements required compared with 10 measurements will depend on the proportion of measurements with an unreliable result (ie, IQR-to-median ratio of >30%). The presence of significant fibrosis was shown to be an independent predictor for an unreliable result, contrary to previous evidence showing that cirrhosis does not influence the rate of reliable pSWE measurements (34). Unlike findings from a previous study by Bota et al (34), age and sex were not independent predictors for unreliable results. This may be because the proportion of unreliable results in the previous publication also included measurements with a success rate of less than 60% and the proportion of patients with an unreliable result (6.4%) was significantly lower than in the current study (35.3% of participants with 10 measurements). This is comparable to the 42.4% figure published by Ferraioli et al (29) in a cohort of 255 patients by using pSWE with the ElastPQ technique (Philips, Eindhoven). However, although measurements in over a third of participants were considered unreliable, our results demonstrated that it is still possible to assess liver fibrosis with acceptable accuracy (area under the receiver operating characteristic curve for significant fibrosis, 0.822; area under the receiver operating characteristic curve
same approach by comparing the ICC for the median of the initial two to nine measurements with the median of 10 measurements. Other studies (28,29) that compared pSWE (ElastPQ, Philips, Eindhoven; and Virtual Touch Quantification) with transient elastography (Fibroscan, Paris, France) showed that five measurements could be used instead of 10 without a significant effect on diagnostic performance. Similar findings have been observed with use of transient elastography where the diagnostic accuracy of detecting cirrhosis was not affected by using five valid measurements instead of 10 with histologic evaluation as the comparison (30).
IQR-to-median ratio is a method for measuring data variability, and is recommended as a reliability indicator for transient elastography by the manufacturer (Echosens, Paris, France) (21). The importance of IQR-to-median ratio as a quality control measure for pSWE was observed in several studies (31,32). Increased accuracy for classifying fibrosis was noted when measurements that showed lower reliability (IQR-to-median ratio, >30%) were excluded (29,31-33). Our results showed that use of higher reliability measurements (IQR-to-median ratio, ≤30%) can further reduce the number of measurements required from six to three while maintaining an ICC greater than 0.95 with 10 measurements. Patients with lower reliability measurements (IQR-to-median ratio, >30%) showed a 2.2-fold and 4.9-fold increase in discordance
for differentiating significant fibrosis and severe fibrosis or cirrhosis by using the median of six measurements was slightly lower than the area under the receiver operating characteristic curve when the median of 10 measurements was used, but the difference was not statistically significant. The area under the receiver operating characteristic curve value by using the median of six measurements was within the range of published data (9,27). In our cohort of 232 participants, the use of six measurements instead of 10 only misclassified three patients with significant fibrosis and six patients with severe fibrosis or cirrhosis.
An increase in the number of measurements (ie, from six to 10) resulted in greater reduction in the percentage of discordant patients with severe fibrosis or cirrhosis (7.3% from six measurements vs 4.7% from 10 measurements) than for those with significant fibrosis (22.0% from six measurements vs 20.7% from 10 measurements).
Previous studies have used the median or the mean of 10 measurements as routine practice. Our data suggest that acquiring fewer measurements will have a limited effect on the diagnostic performance of pSWE with only limited disagreement (nine of 232 [3.9%] cases) versus histologic evaluation, potentially saving clinic time. Yoon et al (18) also suggested that the optimal minimum number of SWE measurements required was six in their cohort of 86 patients; they used the
for severe fibrosis, 0.943) in these patients and it is not necessary to exclude measurements with IQR-to-median ratio greater than 30%.
We did not identify any predictors for lower reliability (IQR-to-median ratio, >30%) that could be assessed prior to obtaining the pSWE measurements because the existence of significant fibrosis was unknown at the time of imaging. Recently, other pSWE techniques by using the Acoustic Radiation Force Impulse principle (35-37) were introduced in which a reliability indicator was displayed at the point of measurement, for example the SWV measurement efficacy rate. This potentially allows selective reduction of the number of measurements required for the noninvasive assessment of liver fibrosis at the point of imaging.
Despite the large number of pSWE studies reported, few have evaluated the number of measurements required and their effect on diagnostic accuracy. The strength of our study compared with previous studies is that the minimum number of measurements was determined not only by the ICC against the conventional practice of the median of 10 measurements but also against liver biopsy results; in previous studies (18,29,38), diagnostic performances were compared with transient elastography and the minimum number of measurements required was determined only by ICC or correlation against a fixed number of measurements.
This study has limitations. First, it was a single-center study and the data were analyzed retrospectively. Therefore, certain parameters such as body mass index were not recorded at the time of study. Second, the study population contained a mixed etiologic cause and the optimum cutoff values for assessing fibrosis may vary, depending upon the underlying cause. Finally, the use of Ishak fibrosis stage from liver biopsy, although it provides a reference standard when the diagnostic accuracy of pSWE is assessed, is subject to sampling error because of heterogeneity of the disease process in cirrhosis.

pSWE Examination
All patients and healthy volunteers fasted overnight and were scanned in a supine position with arms extended over their head, in suspended respiration, in accordance with guidelines (21). The pSWE examination was performed by trained radiologists (C.F., O.S.J., G.T.Y., E.K.) who had at least 2 years of ultrasonographic (US) experience and training in the pSWE technique (each of them performed at least 200 previous liver elastography examinations per the European Federation of Societies for Ultrasound in Medicine and Biology competency requirement). pSWE was performed with a quantitative software technique (Virtual Touch Quantification; Siemens Healthcare, Mountain View, Calif) by using a US imaging system (Acuson S2000/S3000; Siemens), equipped with a convex broadband (6C1, 1.5 MHz-6 MHz) transducer. A region-of-interest box was
measurements and quality measures required. Recent guidelines from the European Federation of Societies for
Abbreviations: ICC = intraclass correlation coefficient, IQR = interquartile range, pSWE = point SWE, SWE = shear-wave elastography, SWV = shear-wave velocity
Author contributions: Guarantors of integrity of entire study, C.F., E.K., P.S.S.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; agrees to ensure any questions related to the work are appropriately resolved, all authors; literature research, C.F., O.S.J., G.T.Y., E.K., D.J.Q., P.S.S.; clinical studies, C.F., O.S.J., G.T.Y., E.K., D.J.Q., P.S.S.; experimental studies, C.F.; statistical analysis, C.F., G.T.Y.; and manuscript editing, C.F., O.S.J., G.T.Y., D.J.Q., K.A., A.Q. Conflicts of interest are listed at the end of this article.

Figure 1: Point shear-wave elastographic examination of the liver performed by using an S2000 US machine (Siemens Healthcare). The liver capsule and region of interest are shown.

Figure 2: Bland-Altman plot of the differences in the shear-wave velocity (SWV) between median of six and median of 10 measurements. The solid line represents the mean of the difference in SWV; the dashed lines represent the 95% upper and lower limits of agreement. SD = standard deviation.

Table 1
Note.-Median variables had nonnormal distribution. Unless otherwise indicated, data in parentheses are percentages. ALD = alcoholic liver disease, ALP = alkaline phosphatase, ALT = alanine aminotransferase, AST = aspartate aminotransferase, GGT = γ-glutamyl transferase, INR = international normalized ratio, NAFLD = nonalcoholic fatty liver disease, PSC = primary sclerosing cholangitis, UC = ulcerative colitis.
* Data are median; data in parentheses are range.

Table 2
Intraclass Correlation Coefficients between the Median Shear-Wave Velocity Value of Repeated Shear-Wave Elastography Measurements and Median of 10 Measurements

Table 3
Diagnostic Performance of Point Shear-Wave Elastography
Note.-Data in parentheses are 95% confidence intervals. Significant fibrosis refers to Ishak stage 3 or greater; severe fibrosis or cirrhosis refers to Ishak stage 5 or greater. AUROC = area under the receiver operating characteristic curve, IQR = interquartile range, pSWE = point shear-wave elastography.

Table 4
Discordance between Point Shear-Wave Elastography and Histologic Analysis in the Overall Cohort and Subgroups
Note.-Data are number of patients; data in parentheses are percentages. Significant fibrosis refers to Ishak stage 3 or greater; severe fibrosis or cirrhosis refers to Ishak stage 5 or greater. IQR = interquartile range, pSWE = point shear-wave elastography.