Use of Volumetry for Lung Nodule Management: Theory and Practice
Abstract
A consistent feature of many lung nodule management guidelines is the recommendation to evaluate nodule size by using diameter measurements and electronic calipers. Traditionally, the use of nodule volumetry applications has primarily been reserved for certain lung cancer screening trials rather than clinical practice. However, even before the first nodule management guidelines were published more than a decade ago, research has been ongoing into the use of nodule volumetry as a means of measuring nodule size, and this research has accelerated in recent years. This article aims to provide radiologists with an up-to-date review of the most recent literature on volumetry and volume doubling times in lung nodule management, outlining their benefits and drawbacks. A brief technical review of typical volumetry applications is also provided.
© RSNA, 2017
Introduction
Historically, lung nodule management guidelines have recommended that nodule size (specifically diameter) be measured by using electronic calipers (1–3), a method that has been widely adopted in clinical practice. In contrast to routine practice, a number of lung cancer screening trials have instead incorporated nodule volume assessment (referred to as volumetry) within their protocols (4–8). Furthermore, more recently published management guidelines for incidentally detected nodules from the Fleischner Society and British Thoracic Society have also recognized the role of volumetry as a tool to aid nodule measurement and management (9,10). The rationale for relying on volume rather than diameter is threefold: First, for diameter to accurately reflect lung nodule size and growth, it has to be assumed that nodules are perfectly spherical and grow in a symmetrical fashion. By contrast, volume measurements may better encapsulate the three-dimensional nature of a pulmonary nodule. Second, volume estimation allows for calculation of the volume doubling time (VDT), a parameter that is proposed to more reliably define nodule growth. Finally, there is the observation that diameter measurements are subject to inconsistency both within and between observers.
In the past few years there has been an abundance of literature on the reliability of volumetry applications. As volumetry packages become more widespread and readily accessible in radiology departments, radiologists may wish to better understand their benefits and drawbacks. This article reviews the advantages and limitations of nodule volumetry and discusses whether and how it can be used for both screening programs and incidentally detected lung nodules. The article begins with a brief review of the technical aspects of volumetry.
Technical Aspects of Nodule Volumetry: Basic Principles
Central to establishing nodule volume is the requirement for accurate nodule segmentation. The basic principle underlying the vast majority of nodule segmentation algorithms is that of a “region growing” procedure from a seed-point usually placed by a reader. The process of region growing connects all voxels above a threshold somewhere between tissue attenuation and lung parenchyma attenuation (around −500 HU for solid nodules). Since solid lung nodules are of tissue attenuation and are usually surrounded by lung parenchyma, inevitably there is a high level of contrast between a lung nodule and its surroundings. As a result, most intraparenchymal nodules can be reliably segmented by using this method.
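The region-growing step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of any commercial package: the function name `region_grow`, the nested-list volume layout, the 6-connected neighborhood, and the toy attenuation values are all hypothetical; only the threshold of around −500 HU comes from the text.

```python
from collections import deque

def region_grow(volume_hu, seed, threshold_hu=-500.0):
    """Return the set of (z, y, x) voxels connected to `seed` with HU >= threshold.

    volume_hu is a nested list indexed as volume_hu[z][y][x] (assumed layout).
    Connectivity is limited to the six face neighbors (an assumption).
    """
    nz, ny, nx = len(volume_hu), len(volume_hu[0]), len(volume_hu[0][0])
    sz, sy, sx = seed
    if volume_hu[sz][sy][sx] < threshold_hu:
        return set()  # the seed does not lie in tissue-attenuation material
    segmented = {seed}
    queue = deque([seed])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
            if not inside or n in segmented:
                continue
            if volume_hu[n[0]][n[1]][n[2]] >= threshold_hu:
                segmented.add(n)
                queue.append(n)
    return segmented

# Toy 5 x 5 x 5 volume: parenchyma at -850 HU with a 3 x 3 x 3 block at 30 HU.
toy = [[[-850.0 for _ in range(5)] for _ in range(5)] for _ in range(5)]
for z in range(1, 4):
    for y in range(1, 4):
        for x in range(1, 4):
            toy[z][y][x] = 30.0

nodule_voxels = region_grow(toy, seed=(2, 2, 2))
voxel_volume_mm3 = 1.0  # assumes 1-mm isotropic voxels
nodule_volume_mm3 = len(nodule_voxels) * voxel_volume_mm3
```

In this sketch any attached structure above the threshold, such as a vessel or the pleura, would be swept into the same connected region, which is the limitation discussed next.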
However, the main limitation of such an approach is that many nodules are directly connected to other high-attenuation structures, in particular to vessels and the pleura. Image processing steps that exploit morphologic criteria to remove attached structures are therefore a prerequisite for successful nodule segmentation (11). For example, it is known that nodules are generally rounded, while attached vessels are cylindrical and the pleura is convex. Such information can be used to remove attached structures, though the exact detail of how this is achieved is beyond the scope of this review and is far from trivial. Methods differ considerably in their design choices and underlying assumptions. A particularly challenging scenario is one in which the diameter of the nodule is smaller than the cross-sectional diameter of an attached vessel. In this setting, most currently available segmentation methods will fail to remove attached vessels (Fig 1).
Another basic principle that differs between software analysis packages is two-dimensional versus three-dimensional processing. Older methods that use two-dimensional processing analyze data section by section, producing nodule boundaries that substantially differ from one section to the next. The end product is often a pixelated image that may be considered obviously wrong to the human eye. With the wide use of multidetector CT producing isotropic images, three-dimensional image processing is a preferable approach to nodule segmentation (12).
A further important consideration in nodule segmentation is how to deal with the partial volume effect. Especially for smaller nodules, a surprisingly large part of the volume of the nodule will be contained in voxels that only partly consist of nodule. For example, consider a CT scan with an isotropic resolution of 1 mm and a nodule that is a perfect sphere of 5 mm diameter, with its center in the middle of a voxel. Only 27 voxels, representing 41% of the total volume of this nodule, will purely contain nodule. For the remaining voxels, one can estimate how much “nodule” they contain from their Hounsfield unit value, and to do this properly, one must make an estimate of the air content of the tissue surrounding the nodule by analyzing the attenuation of nearby voxels. Such a sophisticated “partial volume correction” procedure is a very important component of ensuring accurate nodule volume assessment, though the precise methods of how this is achieved will vary between volumetry packages (11). Figure 2 illustrates the typical steps involved in the segmentation of solid nodules. It should be added that while existing volumetry products rely on image processing pathways such as those outlined above, there may in the future be an increasing role for machine learning–based segmentation, although such algorithms are still at an early stage of development (13).
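One simple way to picture partial volume correction is a linear two-compartment mixing model, sketched below. This is an assumption made for illustration and is not the method of any particular volumetry package: each voxel's nodule fraction is taken as its position between a locally estimated parenchyma attenuation and the attenuation of the nodule core. The function names and all attenuation values are hypothetical.

```python
def nodule_fraction(voxel_hu, parenchyma_hu, nodule_hu):
    """Fraction of a voxel occupied by nodule under a linear mixing model."""
    if nodule_hu <= parenchyma_hu:
        raise ValueError("nodule attenuation must exceed parenchyma attenuation")
    fraction = (voxel_hu - parenchyma_hu) / (nodule_hu - parenchyma_hu)
    # Clamp values pushed outside [0, 1] by image noise.
    return min(1.0, max(0.0, fraction))

def corrected_volume_mm3(voxels_hu, parenchyma_hu, nodule_hu, voxel_volume_mm3=1.0):
    """Sum of per-voxel nodule fractions, scaled by the voxel volume."""
    total_fraction = sum(
        nodule_fraction(hu, parenchyma_hu, nodule_hu) for hu in voxels_hu
    )
    return voxel_volume_mm3 * total_fraction

# Hypothetical segmentation: 20 core voxels at nodule attenuation and
# 10 boundary voxels exactly halfway between parenchyma and nodule.
core = [40.0] * 20
boundary = [-405.0] * 10
volume = corrected_volume_mm3(core + boundary, parenchyma_hu=-850.0, nodule_hu=40.0)
```

A plain voxel count would report 30 mm³ for this toy case, whereas the weighted sum credits each boundary voxel with only half its volume. The result depends directly on the parenchyma estimate, which is why the text stresses analyzing nearby voxels.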
Volumetry for subsolid nodules is more challenging than for solid lesions. The main reason is the reduced difference in attenuation between the ground-glass component of a subsolid nodule and the surrounding lung parenchyma. Nevertheless, methods have been proposed and validated for subsolid volumetry that are variations of approaches that have been shown to work well for solid nodules (14). Satisfactory segmentation can be obtained by identifying a lower limit attenuation threshold that separates ground-glass opacity from lung parenchyma. This can be predefined (eg, −750 HU), or an optimal value can be calculated by the volumetry application based on histogram analysis. However, this approach still does not overcome all problems associated with subsolid nodule segmentation. Subsolid lesions are often larger than typical solid nodules, and these lesions are not uncommonly in contact with vessels or may even completely contain vessels. Separating vessels from the ground-glass component of a subsolid lesion is difficult, and differentiating vessels from a solid core is even harder. Traditionally, research studies have resorted to manual segmentation to analyze these complex lesions (15), although more recently presented data suggest these challenges can be successfully overcome (16).
Evaluating Nodule Volumetry Reliability: Basic Principles
Assessing Volumetry Accuracy: How Close to True Volume?
Establishing the true volume of a lung nodule is not straightforward because the reference standard is measurement of volume after nodule excision. Consequently there is an inevitable bias toward larger nodules in studies that rely on this method. Furthermore, other factors such as pathology handling techniques and differences in the degree of lung inflation may produce differences in the in vivo and ex vivo nodule size.
To overcome these issues, most studies examining volumetry accuracy use phantom-embedded synthetic nodules. However, a number of issues limit phantom studies. First, synthetic nodules are often smooth and spherical, not replicating in vivo lung nodules, which may demonstrate irregularity or nonsphericity. Second, nodule volume measurements in vivo are affected by nodule attachment to other structures, such as vessels. Suturing synthetic nodules to vessels within the phantom has been used to overcome this. Third, although synthetic nodules with variable densities can be used to replicate differences in solid and subsolid nodules in vivo, the heterogeneity in density of in vivo subsolid nodules has yet to be perfectly matched by synthetic nodules. Finally, there are some parameters such as breathing artifacts that are challenging to replicate in phantom studies. Despite these challenges, phantom studies do offer the potential to study measurement variability in large numbers of nodules without ethical issues of radiation exposure to patients.
Assessing Volumetry Reproducibility and Repeatability: How Consistent under Varying and Identical Conditions?
One of the principal roles of volumetry is to detect nodule growth, rather than absolute size per se. The use of pathologically confirmed specimens to provide true baseline and follow-up values is obviously impossible. Instead, a commonly used approach is one in which the ability of the volumetry application to detect an absence of growth over time is tested (often referred to as “coffee-break” experiments or zero change datasets). In these studies, the patient is imaged twice, at separate points on the same day. Hence, any difference in volume measurements between CT studies (typically illustrated as the mean difference ± 1.96 standard deviations) can be ascribed to interscan variability. Even so, there is a limit to which factors such as patient inspiratory effort, the time spent lying supine, or cardiovascular pulsation artifact can be controlled for.
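The "mean difference ± 1.96 standard deviations" summary of a coffee-break experiment can be computed as in the sketch below. The volumes are invented for illustration, and two choices are assumptions rather than anything specified in the text: the relative difference uses the mean of the two scans as its denominator (studies differ on this), and the standard deviation is the sample standard deviation.

```python
import statistics

def relative_difference_percent(first_mm3, second_mm3):
    """Relative volume difference (%) between two scans of the same nodule.

    Denominator is the mean of the two measurements (an assumed convention).
    """
    mean_volume = (first_mm3 + second_mm3) / 2.0
    return 100.0 * (second_mm3 - first_mm3) / mean_volume

def limits_of_agreement(differences):
    """Return (lower, mean, upper) as mean difference -/+ 1.96 sample SD."""
    mean_diff = statistics.mean(differences)
    sd = statistics.stdev(differences)
    return (mean_diff - 1.96 * sd, mean_diff, mean_diff + 1.96 * sd)

# Hypothetical same-day repeat measurements (mm^3) for five nodules.
scan_1 = [100.0, 200.0, 150.0, 80.0, 320.0]
scan_2 = [110.0, 190.0, 150.0, 84.0, 300.0]

diffs = [relative_difference_percent(a, b) for a, b in zip(scan_1, scan_2)]
lower, mean_diff, upper = limits_of_agreement(diffs)
```

Because no true growth can occur between the two scans, a measured change at follow-up that falls inside the interval from `lower` to `upper` cannot be distinguished from interscan variability.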
Wormanns et al (17) and Gietema et al (18) were among the first to provide the typical study designs for coffee-break experiments. In both studies, patients with small pulmonary metastases were imaged twice on the same day. All other factors were kept constant. Both studies found similar 95% confidence intervals for relative volume difference between scans in the order of ±25%. The figure of 25% has been used in some screening studies as the percentage volume change required to signify true growth (4,5,7).
The assessment of measurement consistency over multiple acquisitions of the same nodule under identical or near-identical conditions is clearly more suited to phantom studies. Li et al, for example, recently reported nodule volume measurement consistency in 61 synthetic nodules imaged 10 times using 72 different permutations of acquisition and reconstruction parameters, illustrating the potential of phantom studies (19).
Limitations of Electronic Caliper–based Diameter Measurements
Prior to examining the reliability of volumetry measurements, it is worth examining the deficiencies of electronic caliper–based diameter measurements. In this setting, measurement variation occurs because readers are typically required to (a) select an axial section for which the nodule is estimated to be at its largest and (b) manually place cursors at the boundaries of the nodule. In a widely cited publication, Revel and colleagues found that in a study of 54 nodules with a mean diameter of 8.5 mm, the interreader variability for diameter measurements was 1.73 mm (20) (20% of average diameter). Diameter variation as a percentage of nodule size may be less for larger nodules. In a coffee-break experiment by Oxnard et al, of 30 nodules between 1 and 3 cm, a diameter variation of 10% or less was found in 84% of measurements (21). However, overall variation still ranged from −23% to +31%. It is important to stress that a 20% variation in average diameter equates with a variation in calculated volume closer to 100%, given that volume is proportional to diameter cubed (Fig 3). Studies directly comparing diameter and volume measurement variability in the same cohort of nodules, especially small nodules, are lacking.
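The cubic relationship between diameter and volume can be made concrete with a short calculation. The sketch below assumes a perfectly spherical nodule, so that volume scales exactly with diameter cubed; the function name is hypothetical.

```python
def volume_change_percent(diameter_change_percent):
    """Percentage change in sphere volume for a given percentage change in diameter."""
    ratio = 1.0 + diameter_change_percent / 100.0
    return (ratio ** 3 - 1.0) * 100.0

overestimate = volume_change_percent(20.0)    # diameter read 20% too large
underestimate = volume_change_percent(-20.0)  # diameter read 20% too small
spread = overestimate - underestimate         # total spread in percentage points
```

Under this assumption a 20% overestimate of diameter inflates the calculated volume by about 73% (1.2³ ≈ 1.73), and a 20% underestimate reduces it by about 49% (0.8³ ≈ 0.51), so the two extremes are separated by more than 100 percentage points.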
Impact of Acquisition, Reconstruction, Patient, and Reader Factors on Volumetry Reliability
Nodule follow-up thoracic CT examinations are in many institutions performed by using dedicated protocols that guarantee consistency of acquisition and reconstruction parameters. Such protocols are mandatory in the setting of lung cancer screening. However, there are instances when follow-up CT examinations may be performed by using different protocols from those used in the CT studies in which the nodule was first detected. Therefore, it is important for radiologists to be aware of the impact of these variables on the accuracy and precision of the applications that they use (Table 1).
Nodule Analysis Factors
Reader interaction.—In volumetry, readers are often required to place a seed point within the nodule, which can in theory be a source of variation. However, in one of the largest investigations of volumetry repeatability, Wang et al demonstrated identical repeat volume measurements in 86% of over 4000 nodules (with relative volume difference > 15% in only 4% of nodules), suggesting that seed point placement was not an important consideration (22).
Volumetry can sometimes fail to accurately segment a nodule. The reported frequency is approximately 6% of intraparenchymal nodules in the Dutch-Belgian lung cancer screening trial (the NELSON study) (7) and 28% of nodules in a report of selected cases from the Danish Lung Cancer Screening Trial (23). There is no widely accepted definition of what constitutes nodule segmentation failure, but some investigators have suggested that a mismatch of greater than 20%–30% (24,25) (as judged visually) could be regarded as unsatisfactory segmentation (Fig 1). In this scenario, radiologists can override the volumetry software and fall back on subjective assessment or diameter measurements. Alternatively, options exist in some systems to allow readers to manually edit cases, although this may produce greater interobserver variation (26).
Choice of volumetry software package.—In one study, de Hoop et al compared six different packages and found that overall variability (taken as the upper limit of the 95% confidence interval) between packages was 16%–22% (24). More recently, Zhao and colleagues showed that the difference in volumes between three different packages may be as great as 50% in a study of nodules smaller than 100 mm³ (27). Of importance, however, these variations persisted at follow-up CT, and so differences in growth assessment (based on VDT) were less stark. These results have important implications: Packages from different vendors should not be used interchangeably, and even when using the same package, it is important not to assume that different software versions are compatible.
Some volumetry packages also offer readers the ability to vary segmentation algorithms. Ashraf et al showed that by choosing different algorithms within the same package, a greater than 25% difference in volume could be generated in more than 80% of cases (23). This result highlights the importance of standardizing all aspects of volumetry reporting within radiology departments.
CT Acquisition Factors
Radiation exposure and iterative reconstruction techniques.—The reproducibility and repeatability of volumetry measurements have been shown to be good regardless of radiation exposure (28,29). Recent investigations have reinforced these observations. In an in vivo study of small to medium-sized nodules, Hein et al (30) investigated the effect of a “standard dose” (52.2 effective mAs) compared with an “ultra low dose” (3.5 effective mAs) protocol. They reported interscan variation ranging between −25.1% and −28.9%. The authors concluded that radiation dose was not a driver of interscan variability.
Whether the same is true for CT scans reconstructed with various iterative reconstruction (IR) techniques versus filtered back projection (FBP) has also recently been investigated. A small number of recent, mainly phantom-based, investigations have demonstrated that volumetry derived from IR is at least as accurate as FBP. In one in vivo study, Willemink et al found no meaningful differences in solid nodule volumes regardless of the reconstruction methods and also regardless of the strength of IR (31). Doo et al, in a phantom study of 5–12-mm nodules, did find IR to be associated with significantly lower absolute percentage errors compared with FBP, but the difference was of marginal clinical importance (32).
Equivalent or modestly superior accuracy and precision was also found in another study of IR compared with FBP, regardless of dose (33). However, of importance, these results were not uniformly replicated among different volumetry software applications, which implies that some commercially available systems may not be suited for analysis of IR images.
Results from two phantom studies have suggested that IR may substantially improve accuracy of size evaluation in ground-glass nodules (32,34). This may be because the contrast between lung parenchyma and nodule is increased as a result of IR-induced noise reduction. Although improved accuracy is welcome, it may not necessarily be appropriate to compare ground-glass nodule volumes generated by using IR technology with CT scans reconstructed by using FBP. In this regard, a recent in vivo investigation of subsolid nodules found 95% confidence intervals for volume variability of ±23.7% between reconstruction methods (35).
An important practical point that has been identified in some studies is that IR can produce a small number of outlier results whereby volume measurements can be substantially under- or overestimated (31). It has been suggested that noise reduction may in some cases cause inappropriate segmentation by including adjacent structures (33).
Number and type of detectors.—In early phantom studies it was shown by Das et al, using a single semiautomated volumetry package, that significant differences in volume were produced between four- versus 16-section CT scans but not between 16- and 64-section CT scans (36).
Das et al did, however, report in a phantom study that volume accuracy varied significantly according to the CT vendor (28). Even so, the variations could be regarded as clinically unimportant, with absolute percentage errors (mean ± standard deviation) ranging from 7.5% ± 7.2 to 14.3% ± 11.1. Furthermore, acceptable reproducibility (±18.2) between two different manufacturers of 64-section CT scanners has also been reported in a phantom study for 5–8 mm nodules (37).
Intravenous contrast material.—Increasing the attenuation value of the nodule, particularly at its periphery, would accentuate the contrast difference between that nodule and its surrounding parenchyma. Consequently, the segmentation may incorporate a greater proportion of the periphery of a nodule. In a study of patients with thoracic malignancies and large volume nodules, Honda and colleagues (38) found that the postcontrast volume was higher than the precontrast volume for 88% of patients. The median increase was only approximately 5%, but some outliers existed with percentage volume increases of greater than 20% in 8% of patients. Rampinelli et al (39) also demonstrated similar increases in postcontrast volume (4%–7%), independent of nodule diameter. However, comparison between contrast-enhanced and unenhanced CT scans should be avoided for subsolid nodule assessment. In this setting, intravenous contrast material may give the false impression of increased attenuation within a subsolid nodule.
Pitch.—CT studies acquired at low pitch could hypothetically result in respiratory artifacts, and thus affect volumetry; on the other hand, a lower pitch normally results in improved z-axis resolution and decreased partial volume effects and thus likely increases volumetry accuracy. Despite these competing considerations, Way et al (29) found that volume segmentation variability in a phantom study was not significantly influenced by three different pitches (0.53, 0.969, and 1.375) on a 16-section scanner, while another study also found negligible differences in nodule volume between high- and low-pitch protocols by using a 128-section scanner (40).
Reconstruction Parameters
Section thickness and reconstruction overlap.—A number of studies have reported increasing variability of nodule volumes with increasing reconstruction thickness (41,42). As section thickness increases, the mean attenuation of the surface voxels of a nodule decreases due to partial volume averaging, and the total volume of those voxels simultaneously increases, producing nodules with blurred margins. Petrou et al (43) found that thick-section (5-mm) images produced nodule volumes that ranged from being 40% greater to 40% less than those obtained from thin-section (1.25-mm) CT images. Furthermore, Li et al in their phantom study examining the effects of multiple parameters found that increasing section thickness was one of the most important contributors to measurement variability (19).
Therefore, section thicknesses that are appropriate to the minimum nodule size thresholds under investigation (ideally 1.25 mm or less) should be selected and kept constant if volumetric growth analysis is performed.
Besides section thickness, the degree of overlap between sections may affect volumetric accuracy since overlap leads to decreased partial volume averaging and better z-axis resolution, but investigators differ as to exactly how important this parameter is. Honda et al demonstrated that volume measurements were significantly larger on nonoverlapping 5-mm sections compared with 50% overlap reconstructions, but this effect was not seen at 1.25 mm thickness (44). Ravenel et al thus concluded that overlapping reconstructions were unnecessary at section thicknesses less than or equal to 1.25 mm (42). However, recently Gavrielides and colleagues have shown that absolute percent bias in volume estimation can be decreased by about 16% by using 50% overlap even on very thin sections, for artificial 5-mm nodules (45). This finding reiterates that it is small nodules that are most susceptible to slight variation in reconstruction parameters.
Reconstruction kernel.—Different studies have reported conflicting effects of reconstruction kernels on volume accuracy and repeatability. Earlier studies have shown high-frequency bone algorithms to both increase and decrease nodule volume compared with low-frequency soft-tissue algorithms (38,44). A recent study by Christe et al (46) found that larger volumes were generated on soft, low-frequency reconstruction algorithm images than high frequency “bone” algorithms (Fig 4). However, the extent to which this occurred depended on the software package in use (variations between 1.6% and 11.2% for mean volume measurements), highlighting the interactions between reconstruction algorithm and analysis package variables.
Arguably, for practical purposes the repeatability of volume measurements on a particular algorithm is most relevant. In this regard, Wang et al found soft-tissue reconstructions provided more repeatable measurements than a sharp kernel, although the study compared images reconstructed at 2 mm thickness and used one analysis package (47).
Nodule Factors
Size.—Phantom studies have demonstrated increasing volume measurement error with decreasing nodule size (48,49), with one study reporting a tendency to underestimate absolute nodule volume by an average of 40% for small nodules (49).
Reproducibility studies specifically examining small nodules in vivo are limited. In one clinical study of more than 200 nodules, de Hoop et al (24) found that interscan variability was size dependent; nodules smaller than 8 mm had more variation (range, 18%–26%) than nodules larger than 8 mm in diameter (range, 13%–17%). Goodman et al (50), however, reported similar variability (approximately 30% standard deviation of mean relative volume difference) for nodules in the smaller than 6-mm range as for those in the 6–9-mm range, although the numbers of nodules in each group were relatively small.
Location and morphology.—Nodule location is probably more influential on volumetry reliability than nodule size. On occasions, volumetry may fail to appropriately segment juxtastructural nodules, by including a large component of vasculature or pleura. Predicting in advance which juxtastructural nodules might be susceptible to this phenomenon is problematic, and indeed the same nodule can be appropriately or inaccurately segmented depending on the placing of the seed point by the reader (50).
Juxtavascular nodules are particularly prone to poor segmentation, with satisfactory segmentation reported in 52%–85% of such nodules, depending on the volumetry package used (27). In the large study by Wang et al (22) assessing 4225 solid nodules, juxtavascular nodules were four times more likely (and juxtapleural nodules twice as likely) than purely intraparenchymal nodules to cause significant volume variability. Consequently some lung cancer screening trial protocols have stipulated using caliper measurements for juxtapleural nodules (4,7).
Nodule outline is another important factor when considering volumetry reliability. Spiculated nodules are subject to greater interreader measurement variation compared with smooth nodules (22,43). Furthermore, Xie et al recently showed that the actual volume of synthetic nodules with irregular shapes was underestimated by 39% ± 21, compared with less than 10% in artificial smooth nodules of a similar size (49). These data need to be considered in the context of the known difficulties in diameter measurements in spiculated and irregular nodules.
Density.—Automated volume measurements of subsolid lesions are more difficult than for solid lesions for a number of reasons. As described in the earlier section, differences between the attenuation of subsolid nodules and that of the adjacent lung parenchyma are more subtle, and so defining their margins is challenging. Furthermore, subsolid nodules may contain air bronchograms, as well as traversing vessels, and can have multiple bands of pleural tethering, all of which can be difficult to model for reliable semiautomatic quantification. Although recent technical solutions have been presented to address some of these problems (14,51), screening trials that have used semiautomated volumetry in subsolid nodules have reserved it for measuring the solid component in selected cases, with caliper measurements used for the ground-glass component.
However, the measurement of subsolid nodule diameter using electronic calipers, in particular for part-solid nodules, can be problematic. The solid component within a part-solid nodule is not infrequently multifocal and amorphous with indistinct boundaries. Also it is well recognized that the size of the ground-glass component in part-solid nodules may paradoxically decrease as the nodule becomes more solid.
As such, a new metric of nodule mass (a product of nodule volume and adjusted nodule density) has been proposed as a more reliable predictor of growth in subsolid nodules (52). Recent studies have demonstrated interscan variability of nodule mass to be similar to that of volume, though the results have varied quite substantially (95% confidence intervals were −17.7% to 18.6% in one study [53] and −34.7% to 21.8% in another [54]).
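Nodule mass as a product of volume and adjusted density can be illustrated as follows. The text does not spell out the adjustment; the sketch assumes a commonly described convention in which mean attenuation is converted to physical density as (HU + 1000)/1000 g/cm³, so that air maps to 0 and water to 1. The function name and the example values are hypothetical.

```python
def nodule_mass_mg(volume_mm3, mean_hu):
    """Nodule mass in mg from volume (mm^3) and mean attenuation (HU).

    Assumes density = (HU + 1000) / 1000 in g/cm^3, which equals mg/mm^3.
    """
    density_mg_per_mm3 = (mean_hu + 1000.0) / 1000.0
    return volume_mm3 * density_mg_per_mm3

# Hypothetical subsolid nodule that shrinks slightly while becoming denser.
baseline_mass = nodule_mass_mg(volume_mm3=500.0, mean_hu=-600.0)
follow_up_mass = nodule_mass_mg(volume_mm3=450.0, mean_hu=-500.0)
```

In this invented example the volume falls from 500 to 450 mm³ while the mass rises from 200 to 225 mg, showing how mass can register growth when an increase in attenuation accompanies a stable or smaller outline.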
Patient Factors
Inspiration.—Lung volumes from consecutive CT scans in the same patient, acquired at full inspiration, have been shown to vary, albeit modestly, from scan to scan even when patients are provided with the same breathing instructions (ranging from −12% to +16% in one study [18] and from −19% to +26% in another [24]). In theory, reduced lung volumes might be expected to increase nodule volume because of increased lung parenchymal attenuation surrounding a nodule. Moreover, nodules are more likely to be in close proximity to other structures such as vessels at expiration, which may inadvertently lead to unsatisfactory segmentation and increased nodule volumes. These effects, however, have not been convincingly demonstrated in a limited number of in vivo studies. De Hoop et al (24) and Gietema et al (18) both found lung volume to have minimal impact on nodule volumes when CT scans are acquired at maximum inspiration. In another investigation, Petkovska et al (55) actually showed that CT scans acquired at residual volume could produce both an increase and reduction in nodule volume compared with CT scans obtained at maximum inspiration. By contrast, in a small study of 23 nodules, opposing results were found whereby the majority of nodules (19 of 23) exhibited increased volumes on expiration (56). It seems therefore from limited data that the impact of lung volume on nodule volume reproducibility is unpredictable.
Emphysema and other comorbidities.—While it is known that emphysema can impair the correct subjective classification of pulmonary nodules as either benign or malignant (due to the great degree of overlapping morphologic characteristics between benign and malignant nodules in emphysema), the influence of emphysema on volumetry variability remains underinvestigated. Intuitively, it could be expected that the reduction of attenuation in emphysematous lung parenchyma surrounding a nodule would, if anything, increase the delineation of nodule margins and decrease volumetry variability. However, no such effect was found in one in vivo study (56).
Similarly, the influence of concurrent interstitial abnormalities on volumetry reliability has not received attention. Some form of interstitial lung abnormality has been reported in up to 20% of patients in lung cancer screening (57). Such abnormalities may decrease the distinctness of nodule boundaries and attenuation thresholds and in our experience can interfere with both segmentation success and reliability.
Impact of parameter variation on the clinical utility of volumetry.—There are inevitably instances when it will not be possible to ensure parameter consistency between baseline and follow-up CT. For example, incidental nodules may be detected on contrast-enhanced standard-dose CT scans, whereas nodule follow-up CT scans are typically obtained at reduced dose without contrast material. Sometimes it is necessary to compare CT scans acquired at different institutions with differing protocols. Translating the evidence described above into a handful of practical instructions on how to approach such scenarios is difficult. Instead, radiologists need to exercise judgement on whether it is appropriate to rely on volumetry measurements, based on the variable in question as well as the degree and direction of change in measured nodule volume. Table 1 summarizes the impact of technical factors on nodule volumetry reliability and also provides an assessment of the importance of the variable in question. For instance, it would usually be inappropriate to compare nodule volumes generated from CT scans reconstructed at variable section thicknesses. However, it may be entirely reasonable to compare larger-sized nodules identified on low-dose and standard-dose scans. Ultimately, in instances in which there is doubt as to the influence of parameters on measured nodule volume, radiologists should revert to the existing practice of using electronic calipers.
Clinical Applicability of Volumetry in Nodule Management
VDT and Patterns of Growth
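A fundamental part of using volumetry in clinical practice is the evaluation of VDT, which has long been advocated as a surrogate marker for the likelihood of malignancy (58). The VDT can be calculated in days using the following formula:
$$\mathrm{VDT} = \frac{\Delta t \cdot \ln 2}{\ln\left(V_2 / V_1\right)}$$
where $V_1$ and $V_2$ are the nodule volumes at the baseline and follow-up examinations and $\Delta t$ is the interval between them in days.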
This formula assumes a so-called exponential growth rate for lung cancer, whereby malignant cells divide at a constant rate, and thus lesion volume increases exponentially with time (58).
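As a minimal sketch of the calculation under the exponential growth model, the following Python function computes a VDT from two volume measurements and the interval between scans, using the standard relationship VDT = t × ln 2 / ln(V2/V1). The function name and parameter names are illustrative assumptions and do not correspond to any particular volumetry application.

```python
import math


def volume_doubling_time(v1_mm3: float, v2_mm3: float, interval_days: float) -> float:
    """Return the volume doubling time in days, assuming exponential growth.

    v1_mm3 and v2_mm3 are the nodule volumes at the first and second scan;
    interval_days is the time between the two scans.
    """
    if v1_mm3 <= 0 or v2_mm3 <= 0:
        raise ValueError("volumes must be positive")
    if interval_days <= 0:
        raise ValueError("interval must be positive")
    if v2_mm3 == v1_mm3:
        # No measured change: the doubling time is unbounded.
        return math.inf
    # A shrinking nodule (v2 < v1) yields a negative value.
    return interval_days * math.log(2) / math.log(v2_mm3 / v1_mm3)


# A nodule that grows from 100 to 200 mm3 in one year has doubled once,
# so the result is approximately 365 days.
print(volume_doubling_time(100.0, 200.0, 365.0))
```

A negative result indicates a decrease in measured volume and should not be interpreted as a doubling time.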
The use of VDT based on the exponential model of growth has been challenged (59). In a study of 18 malignant lung nodules imaged with CT at multiple time points, growth rates were found to vary substantially between time points, and in some instances no growth was seen at all. However, this study used caliper measurements rather than volumetry. In another recent study of 12 lung cancers, volumetry at multiple time points demonstrated a growth pattern that best fitted exponential growth, although this was not universal among all nodules (60). It is certainly plausible that growth rates may diminish as lesions become larger and outstrip their vascular supply (61). The opposite phenomenon has also been described, whereby slow-growing tumors suddenly and dramatically increase in size (62). However, there is little evidence to suggest that this is common.
VDT in Screening Trials and Clinical Practice Studies
Lung nodule volumetry and VDT have been used in the nodule management protocols of the NELSON (7), Danish Lung Cancer Screening Trial (6), UK Lung Cancer Screening Trial (4), Multicentric Italian Lung Detection (5), and Pittsburgh Lung Screening (8) studies. The NELSON, UK Lung Cancer Screening Trial, and Danish Lung Cancer Screening Trial studies all use VDTs of less than 400 days as the trigger for further investigation in indeterminate lung nodules. The Continuous Observation of Smoking study (63) also used a 400-day VDT cutoff, but this was based on volumes calculated from two-dimensional diameter measurements. The figure of 400 days is also often cited as the upper limit for malignant lung lesions in clinical practice guidelines (2). While this figure originates from chest radiograph series (64), CT studies have found comparable results with regard to the range of VDTs encountered in routinely diagnosed lung cancers (61). Importantly, however, many studies also consistently report that a small proportion of cancers demonstrate a VDT substantially greater than 400 days (61,65,66). While very slow-growing tumors undoubtedly exist, the value of designing nodule management algorithms principally to capture these less frequent entities can be questioned.
It is important to be aware that in trials such as the NELSON study (7) and the UK Lung Cancer Screening Trial study (4), a VDT of less than 400 days is only regarded as clinically important if it is accompanied by a minimum percentage increase of 25% in tumor volume to take into account the inherent interscan variability in volumetry. This caveat is of particular importance when the interval between scans is short. For example, a 25% increase in tumor volume over a 3-month period corresponds with a VDT of approximately 280 days. In other words, as the duration between scans reduces, the VDT cutoff used to reliably trigger further investigation is also lowered, and hence its utility is reduced. Furthermore, the specificity for malignancy is substantially lower when short-term follow-up scans are used to calculate VDT. Specificity was 15% when VDT was calculated by using 3-month follow-up scans versus 50% when using 12-month scans in the NELSON screening trial (67).
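The interaction between the 25% minimum growth requirement and the scan interval can be worked through numerically. The sketch below assumes the exponential growth model and a 30-day month; the function names are hypothetical and are used only for illustration. It computes the longest VDT that can still satisfy a given minimum volume increase at a given interval, and conversely the shortest interval at which a given VDT cutoff can be met.

```python
import math

MIN_GROWTH_FRACTION = 0.25  # 25% minimum volume increase, as described in the text


def longest_confirmable_vdt(interval_days: float,
                            min_growth: float = MIN_GROWTH_FRACTION) -> float:
    """Longest VDT (days) that still yields at least `min_growth` over the interval."""
    return interval_days * math.log(2) / math.log(1.0 + min_growth)


def shortest_interval_for_cutoff(vdt_cutoff_days: float,
                                 min_growth: float = MIN_GROWTH_FRACTION) -> float:
    """Shortest interval (days) at which a nodule growing exactly at the VDT
    cutoff reaches the minimum volume increase."""
    return vdt_cutoff_days * math.log(1.0 + min_growth) / math.log(2)


for months in (3, 6, 12):
    days = months * 30  # assumption: 30-day months
    print(f"{months:>2} months: VDT up to about {longest_confirmable_vdt(days):.0f} days")

print(f"400-day cutoff needs about {shortest_interval_for_cutoff(400):.0f} days between scans")
```

Under these assumptions, a 3-month interval gives roughly 280 days, in line with the figure quoted above, and a nodule growing at exactly the 400-day cutoff needs a little over 4 months to accumulate a 25% volume increase.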
Implications for Nodule Management: Follow-up Scans
The wealth of data emerging from screening trials offers opportunities to enhance surveillance algorithms for lung nodules. While extrapolating data from high-risk screening participants to lower-risk patients with incidentally detected nodules has been questioned (68), it needs to be borne in mind that studies of nodules detected in routine practice equivalent in number to large-scale screening trials may not be forthcoming.
The NELSON trial recently published data on the risks of developing lung cancer based on lung nodule volume and VDT, from an analysis of more than 9000 nodules (69). In this study, patients with nodules in the 100–300-mm³ range demonstrated a risk of lung cancer that varied significantly based on the VDT. VDTs of less than 400 days, 400–600 days, and greater than 600 days corresponded with lung cancer risks of 9.9%, 4.0%, and 0.8%, respectively. Importantly, the study also reported that patients with very small nodules (< 100 mm³) and large nodules (> 300 mm³) at baseline CT had very low and moderately high risks of malignancy (0.6% and 16.9%), respectively, regardless of VDT. These results suggest that VDT may have a role in stratifying nodules only within a specific size range.
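The stratification reported above can be expressed as a simple lookup. The following sketch only encodes the percentages quoted in the text; it is not a validated risk model or a clinical decision tool, and the function name, the handling of boundary values, and the treatment of missing or non-positive VDTs are assumptions made for illustration.

```python
from typing import Optional


def nelson_quoted_risk_percent(volume_mm3: float,
                               vdt_days: Optional[float] = None) -> Optional[float]:
    """Return the lung cancer risk (%) quoted in the text for a baseline
    nodule volume and, for 100-300 mm3 nodules, a VDT category.

    Returns None when the quoted categories do not cover the input.
    """
    if volume_mm3 < 100:
        return 0.6   # very small nodules, regardless of VDT
    if volume_mm3 > 300:
        return 16.9  # large nodules, regardless of VDT
    # Intermediate nodules: risk depends on the VDT category.
    if vdt_days is None or vdt_days <= 0:
        # Assumption: no follow-up, or a stable/shrinking nodule, is treated
        # here as outside the quoted VDT categories.
        return None
    if vdt_days < 400:
        return 9.9
    if vdt_days <= 600:
        return 4.0
    return 0.8


print(nelson_quoted_risk_percent(200, 350))
print(nelson_quoted_risk_percent(200, 900))
print(nelson_quoted_risk_percent(50))
```

The lookup makes the point of the paragraph explicit: the VDT argument changes the result only when the baseline volume lies between 100 and 300 mm³.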
While the reliability of volumetry measurements has been well studied, what is less well known is how it compares to subjective assessment (radiologists’ “eyeballs”) and how it impacts the decision-making process in nodule management. By placing CT images side by side, it is certainly possible for early and subtle growth to be detected subjectively by scrutinizing, for example, the proximity of the nodule surfaces to local anatomic structures.
Singh et al investigated the consistency of readers in identifying nodule growth in 100 nodules scanned 12 months apart and also examined variability in nodule management decisions in selected scenarios (70). Readers used electronic calipers to identify growth but also subjectively expressed the level of confidence for identifying growth. Observer agreement for identifying growth was moderate (κ value for presence or absence of growth, 0.55). Specific scenarios in which agreement on nodule management substantially varied included nodules for which growth was marginal and not clear cut.
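For readers less familiar with the κ statistic, the sketch below computes Cohen's κ for two readers making a binary growth call on the same nodules. This is an illustrative assumption about the form of the statistic: multireader studies such as the one cited may use a different variant (for example, a multirater κ), and the ratings shown are invented purely to demonstrate the arithmetic.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reader_a: Sequence[Hashable], reader_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two readers corrected for chance."""
    n = len(reader_a)
    if n == 0 or n != len(reader_b):
        raise ValueError("need two non-empty rating lists of equal length")
    # Proportion of cases on which the readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's label frequencies.
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls for 10 nodules (1 = growth, 0 = no growth).
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
print(round(cohen_kappa(reader_1, reader_2), 2))
```

In this invented example the readers agree on 8 of 10 nodules, but because half of that agreement would be expected by chance, κ is 0.6 rather than 0.8.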
Similar studies examining the impact of volumetry on the decision-making process in nodule management are lacking. In theory, using the VDT may at least produce more consistent management plans in cases in which nodule growth is equivocal or cases in which growth is clear, but the rate of growth is uncertain (Fig 5).
Implications for Nodule Management: Baseline Scans
The 2005 Fleischner Society guidelines recommended that small solid nodules between 4 mm and 8 mm are followed up with CT (3). However, in a study involving 16 radiologists and 135 participants from the National Lung Screening Trial, Gierada et al (71) found that only when the average reader-measured diameter reached 6 mm did all 16 radiologists agree that the nodule was greater than 4 mm. The 2017 Fleischner guidelines recommend raising the nodule size cutoff for further investigation to 6 mm except in some high-risk patients (10). Since the majority of incidental nodules identified at CT are in the 4–7-mm range (72), it can be speculated that variation in distinguishing whether nodules are above or below this critical size is only likely to increase when using manual linear measurements.
Interestingly, when automated volumetry applications are used to measure nodule diameter, consistency has also been shown to improve. In one study, κ values for interobserver agreement of follow-up recommendations increased from 0.54 to 0.67 (73).
The importance of accurate (not just reproducible) nodule size assessment at baseline has also been highlighted in studies evaluating risk prediction models. Mehta et al reported that using volumetry-based nodule volume significantly increased the predictive ability of a validated lung cancer risk prediction model compared with when using diameter measurements (74). It was postulated that this was because volumetry was able to more accurately capture the entirety of a malignant nodule.
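The diameter cutoffs discussed in this section can be related to volume under the simplifying assumption of a perfectly spherical nodule, which real nodules rarely satisfy. The sketch below, with hypothetical function names, converts diameter to sphere volume and shows how small a diameter change corresponds to a given relative change in volume.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of a sphere with the given diameter: V = pi * d^3 / 6."""
    return math.pi * diameter_mm ** 3 / 6.0


def diameter_growth_for_volume_growth(volume_growth_fraction: float) -> float:
    """Fractional diameter increase of a sphere for a fractional volume increase.

    Volume scales with the cube of diameter, so diameter scales with the
    cube root of volume.
    """
    return (1.0 + volume_growth_fraction) ** (1.0 / 3.0) - 1.0


for d in (4.0, 6.0, 8.0):
    print(f"{d:.0f} mm sphere: about {sphere_volume_mm3(d):.0f} mm3")

growth = diameter_growth_for_volume_growth(0.25)
print(f"25% volume increase: about {growth * 100:.1f}% diameter increase")
print(f"For a 6 mm sphere: about {6.0 * growth:.2f} mm")
```

Under the spherical assumption, a 25% increase in volume changes the diameter of a 6-mm nodule by less than 0.5 mm, which helps explain why caliper measurements struggle to resolve changes that volumetry is designed to detect.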
Insights into Overdiagnosis and Prognosis
The topic of overdiagnosis and overtreatment is important in lung cancer screening. While CT screening has been shown to significantly reduce lung cancer mortality in the National Lung Screening Trial, it has been estimated that approximately 20%–25% of screening-detected lung cancers may be overdiagnosed (75). A better understanding of overdiagnosis is therefore vital, because patients with overdiagnosed tumors would have been unlikely to die of lung cancer had the tumor never been discovered.
Robust methods for identifying overdiagnosed tumors prospectively do not currently exist, though the VDT has been suggested as an appropriate tool to differentiate indolent from aggressive tumors (76). It could be speculated, for example, that tumors with VDTs greater than 1000 days may fall into the former category (8). In reality, such a strategy would probably only apply to lung cancers characterized by small solid nodules or in patients with clinically important comorbidities. Once screening-detected solid nodules breach a certain size (eg, 15 mm), most practitioners would likely invoke diagnostic or therapeutic intervention rather than advise continuous monitoring to verify or refute overdiagnosis.
There is increasing recognition that very slow-growing lung cancers are detected in clinical practice, not just screening, in particular lesions characterized by subsolid nodules with adenocarcinoma spectrum histologic findings (62). It has been suggested that VDT or mass doubling time estimation could be used to identify such lesions and allow tailored approaches such as sublobar resection in specific patients. At the other end of the spectrum, there is also evidence to suggest that cancers with rapid VDTs may be associated with a poorer prognosis (77).
Volumetry: Remaining Uncertainties and Conclusions
A large amount of evidence has been published in recent years on the subject of lung nodule volumetry, much of which comes from lung cancer screening trials, with less from the setting of clinical practice. One study reporting the experience of a pulmonary nodule clinic that used volumetry found that its impact on the decision-making process (eg, deciding on biopsy versus conservative management) was modest (78). Nevertheless, with the view that nodule clinical practice studies on the scale of the National Lung Screening Trial or the NELSON study are unlikely to be forthcoming in the near future, nodule management guidelines from the British Thoracic Society have recently been published advocating volumetry as the preferred method of nodule size assessment in clinical practice for solid segmentable nodules (9). The Fleischner Society 2017 nodule management guidelines also acknowledge a role for volumetry in evaluating absolute size, though no specific recommendations for VDT cutoffs are made (10).
The influence of volumetry on detecting earlier stage lung cancers and thus outcomes in clinical practice is also unknown. However, rather than simply relying on volumetry to detect early and imperceptible growth, the evidence suggests that it can also be used to more confidently detect stability and hence benignity, for solid nodules. Specific clinical scenarios in which volumetry may have a role are summarized in Table 2.
At present there is no widely accepted method of testing the reliability of nodule volumetry applications. Evaluating applications on nodule datasets with known nodule outcomes may be one way of achieving this.
A further practical limitation that needs to be addressed is the current lack of integration between commercially available products and radiology information systems. Widespread use of volumetry is only likely if solutions are provided that enable efficient transfer of volumetry and VDT data into radiology information system reports. The seamless transfer to the picture archiving and communication system (PACS) of volume-rendered images demonstrating the adequacy of nodule segmentation and volume measurement is also an important requirement if pulmonologists and other clinicians are to have confidence in radiologists’ measurements and recommendations for follow-up based on VDT.
Essentials
■ There is growing interest in the use of volumetry applications for the management of lung nodules, as measuring nodule size with this method is more reproducible and subject to less intra- and interobserver variation than electronic calipers.
■ The calculated volume doubling time can guide nodule management by differentiating stable from growing nodules and by distinguishing rapidly growing and slow-growing nodules.
■ Where possible, consistency of acquisition, reconstruction, and patient factors should be ensured when comparing nodule volume at baseline and follow-up CT, especially section thickness and reconstruction algorithm.
■ Reliable volumetry measurement first and foremost requires successful nodule segmentation; juxtavascular nodules and subsolid nodules are especially challenging types of nodules to accurately segment.
References
- 1. . Evaluation of patients with pulmonary nodules: when is it lung cancer? ACCP evidence-based clinical practice guidelines (2nd edition). Chest 2007;132(3 Suppl):108S–130S.
- 2. . Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest 2013;143(5 Suppl):e93S–120S.
- 3. . Guidelines for management of small pulmonary nodules detected on CT scans: a statement from the Fleischner Society. Radiology 2005;237(2):395–400.
- 4. . UK Lung Cancer RCT Pilot Screening Trial: baseline findings from the screening arm provide evidence for the potential implementation of lung cancer screening. Thorax 2016;71(2):161–170.
- 5. . Annual or biennial CT screening versus observation in heavy smokers: 5-year results of the MILD trial. Eur J Cancer Prev 2012;21(3):308–315.
- 6. . CT screening for lung cancer brings forward early disease: the randomised Danish Lung Cancer Screening Trial—status after five annual screening rounds with low-dose CT. Thorax 2012;67(4):296–301.
- 7. . Management of lung nodules detected by volume CT scanning. N Engl J Med 2009;361(23):2221–2229.
- 8. . Doubling times and CT screen–detected lung cancers in the Pittsburgh Lung Screening Study. Am J Respir Crit Care Med 2012;185(1):85–89.
- 9. . British Thoracic Society guidelines for the investigation and management of pulmonary nodules. Thorax 2015;70(Suppl 2):ii1–ii54. [Published correction appears in Thorax 2015;70(12):1188.]
- 10. . Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017. Radiology 2017;284(1):228–243.
- 11. . Morphological segmentation and partial volume analysis for volumetry of solid pulmonary lesions in thoracic CT scans. IEEE Trans Med Imaging 2006;25(4):417–434.
- 12. . A comparative study for 2D and 3D computer-aided diagnosis methods for solitary pulmonary nodules. Comput Med Imaging Graph 2008;32(4):270–276.
- 13. . Segmentation of pulmonary nodules in computed tomography using a regression neural network approach and its application to the Lung Image Database Consortium and Image Database Resource Initiative dataset. Med Image Anal 2015;22(1):48–62.
- 14. . Robust semi-automatic segmentation of pulmonary subsolid nodules in chest computed tomography scans. Phys Med Biol 2015;60(3):1307–1323.
- 15. . Usefulness of texture analysis in differentiating transient from persistent part-solid nodules (PSNs): a retrospective study. PLoS One 2014;9(1):e85167.
- 16. . Software performance in segmenting ground-glass and solid components of subsolid nodules in pulmonary adenocarcinomas. Eur Radiol 2016;26(12):4465–4474.
- 17. . Volumetric measurements of pulmonary nodules at multi-row detector CT: in vivo reproducibility. Eur Radiol 2004;14(1):86–92.
- 18. . Pulmonary nodules: interscan variability of semiautomated volume measurements with multisection CT— influence of inspiration level, nodule size, and segmentation performance. Radiology 2007;245(3):888–894.
- 19. . Statistical analysis of lung nodule volume measurements with CT in a large-scale phantom study. Med Phys 2015;42(7):3932–3947.
- 20. . Are two-dimensional CT measurements of small noncalcified pulmonary nodules reliable? Radiology 2004;231(2):453–458.
- 21. . Variability of lung tumor measurements on repeat computed tomography scans taken within 15 minutes. J Clin Oncol 2011;29(23):3114–3119.
- 22. . Effect of nodule characteristics on variability of semiautomated volume measurements in pulmonary nodules detected in a lung cancer screening program. Radiology 2008;248(2):625–631.
- 23. . Lung nodule volumetry: segmentation algorithms within the same software package cannot be used interchangeably. Eur Radiol 2010;20(8):1878–1885.
- 24. . A comparison of six software packages for evaluation of solid lung nodules using semi-automated volumetry: what is the minimum increase in size to detect growth in repeated CT examinations. Eur Radiol 2009;19(4):800–808.
- 25. . A comparison of two commercial volumetry software programs in the analysis of pulmonary ground-glass nodules: segmentation capability and measurement accuracy. Korean J Radiol 2013;14(4):683–691.
- 26. . The Lung Image Database Consortium (LIDC): a comparison of different size metrics for pulmonary nodule measurements. Acad Radiol 2007;14(12):1475–1485.
- 27. . Comparison of three software systems for semi-automatic volumetry of pulmonary nodules on baseline and follow-up CT examinations. Acta Radiol 2014;55(6):691–698.
- 28. . Accuracy of automated volumetry of pulmonary nodules across different multislice CT scanners. Eur Radiol 2007;17(8):1979–1984.
- 29. . Effect of CT scanning parameters on volumetric measurements of pulmonary nodules by 3D active contour segmentation: a phantom study. Phys Med Biol 2008;53(5):1295–1312.
- 30. . Variability of semiautomated lung nodule volumetry on ultralow-dose CT: comparison with nodule volumetry on standard-dose CT. J Digit Imaging 2010;23(1):8–17.
- 31. . The effects of computed tomography with iterative reconstruction on solid pulmonary nodule volume quantification. PLoS One 2013;8(2):e58053.
- 32. . Accuracy of lung nodule volumetry in low-dose CT with iterative reconstruction: an anthropomorphic thoracic phantom study. Br J Radiol 2014;87(1041):20130644.
- 33. . Volumetric quantification of lung nodules in CT with iterative reconstruction (ASiR and MBIR). Med Phys 2013;40(11):111902.
- 34. . Impact of the adaptive statistical iterative reconstruction technique on image quality in ultra-low-dose CT. Clin Radiol 2013;68(9):902–908.
- 35. . Comparison of the effects of model-based iterative reconstruction and filtered back projection algorithms on software measurements in pulmonary subsolid nodules. Eur Radiol 2017;27(8):3266–3274.
- 36. . Automated volumetry of solid pulmonary nodules in a phantom: accuracy across different CT scanner technologies. Invest Radiol 2007;42(5):297–302.
- 37. . Inter- and intrascanner variability of pulmonary nodule volumetry on low-dose 64-row CT: an anthropomorphic phantom study. Br J Radiol 2013;86(1029):20130160.
- 38. . Pulmonary nodules: 3D volumetric measurement with multidetector CT–effect of intravenous contrast medium. Radiology 2007;245(3):881–887.
- 39. . Pulmonary nodules: Contrast-enhanced volumetric variation at different CT scan delays. AJR Am J Roentgenol 2010;195(1):149–154.
- 40. . Effect of the high-pitch mode in dual-source computed tomography on the accuracy of three-dimensional volumetry of solid pulmonary nodules: a phantom study. Korean J Radiol 2015;16(3):641–647.
- 41. . Imprecision in automated volume measurements of pulmonary nodules and its effect on the level of uncertainty in volume doubling time estimation. Chest 2009;135(6):1580–1587.
- 42. . Pulmonary nodule volume: effects of reconstruction parameters on automated measurements—a phantom study. Radiology 2008;247(2):400–408.
- 43. . Pulmonary nodule volumetric measurement variability as a function of CT slice thickness and nodule morphology. AJR Am J Roentgenol 2007;188(2):306–312.
- 44. . Computer-assisted lung nodule volumetry from multi-detector row CT: influence of image reconstruction parameters. Eur J Radiol 2007;62(1):106–113.
- 45. . Benefit of overlapping reconstruction for improving the quantitative assessment of CT lung nodule volume. Acad Radiol 2013;20(2):173–180.
- 46. . Volumetric analysis of lung nodules in computed tomography (CT): comparison of two different segmentation algorithm softwares and two different reconstruction filters on automated volume calculation. Acta Radiol 2014;55(1):54–61.
- 47. . Volumetric measurement of pulmonary nodules at low-dose chest CT: effect of reconstruction setting on measurement variability. Eur Radiol 2010;20(5):1180–1187.
- 48. . Volumetric measurement of synthetic lung nodules with multi-detector row CT: effect of various image reconstruction parameters and segmentation thresholds on measurement accuracy. Radiology 2005;235(3):850–856.
- 49. . Small irregular pulmonary nodules in low-dose CT: observer detection sensitivity and volumetry accuracy. AJR Am J Roentgenol 2014;202(3):W202–W209.
- 50. . Inherent variability of CT lung nodule measurements in vivo using semiautomated volumetric measurements. AJR Am J Roentgenol 2006;186(4):989–994.
- 51. . Solid, part-solid, or non-solid?: classification of pulmonary nodules in low-dose chest computed tomography by a computer-aided diagnosis system. Invest Radiol 2015;50(3):168–173.
- 52. . Pulmonary ground-glass nodules: increase in mass as an early indicator of growth. Radiology 2010;255(1):199–206.
- 53. . Pure and part-solid pulmonary ground-glass nodules: measurement variability of volume and mass in nodules with a solid portion less than or equal to 5 mm. Radiology 2013;269(2):585–593.
- 54. . Interscan variation of semi-automated volumetry of subsolid pulmonary nodules. Eur Radiol 2015;25(4):1040–1047.
- 55. . The effect of lung volume on nodule size on CT. Acad Radiol 2007;14(4):476–485.
- 56. . In vivo repeatability of automated volume calculations of small pulmonary nodules with CT. AJR Am J Roentgenol 2009;192(6):1657–1661.
- 57. . Interstitial lung abnormalities in a CT lung cancer screening population: prevalence and progression rate. Radiology 2013;268(2):563–571.
- 58. . Observations on growth rates of human tumors. Am J Roentgenol Radium Ther Nucl Med 1956;76(5):988–1000.
- 59. . Five-year lung cancer screening experience: CT appearance, growth rate, location, and histologic features of 61 lung cancers. Radiology 2007;242(2):555–562.
- 60. . A retrospective study of volume doubling time in surgically resected non-small cell lung cancer. Respirology 2014;19(5):755–762.
- 61. . Turning gray: the natural history of lung cancer over time. J Thorac Oncol 2008;3(7):781–792.
- 62. . Slow-growing lung cancer as an emerging entity: from screening to clinical management. Eur Respir J 2013;42(6):1706–1722.
- 63. . Lung cancer screening with low-dose computed tomography: a non-invasive diagnostic protocol for baseline lung nodules. Lung Cancer 2008;61(3):340–349.
- 64. . The natural history of lung cancer: a review based on rates of tumour growth. Br J Dis Chest 1979;73(1):1–17.
- 65. . Lung cancers diagnosed at annual CT screening: volume doubling times. Radiology 2012;263(2):578–583.
- 66. . Growth rate of lung cancer recognized as small solid nodule on initial CT findings. Eur J Radiol 2012;81(4):e548–e553.
- 67. . Smooth or attached solid indeterminate nodules detected at baseline CT screening in the NELSON study: cancer risk during 1 year of follow-up. Radiology 2009;250(1):264–272.
- 68. . Lung cancer screening: what is the effect of using a larger nodule threshold size to determine who is assigned to short-term CT follow-up? Radiology 2014;273(2):326–327.
- 69. . Lung cancer probability in patients with CT-detected pulmonary nodules: a prespecified analysis of data from the NELSON trial of low-dose CT screening. Lancet Oncol 2014;15(12):1332–1341.
- 70. . Evaluation of reader variability in the interpretation of follow-up CT scans at lung cancer screening. Radiology 2011;259(1):263–270.
- 71. . Lung cancer: interobserver agreement on interpretation of pulmonary findings at low-dose CT screening. Radiology 2008;246(1):265–272.
- 72. . Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011;365(5):395–409.
- 73. . Computer-aided nodule detection and volumetry to reduce variability between radiologists in the interpretation of lung nodules at low-dose screening computed tomography. Invest Radiol 2012;47(8):457–461.
- 74. . The utility of nodule volume in the context of malignancy prediction for small pulmonary nodules. Chest 2014;145(3):464–472.
- 75. . Overdiagnosis in low-dose computed tomography screening for lung cancer. JAMA Intern Med 2014;174(2):269–274.
- 76. . Avoiding overdiagnosis in lung cancer screening: the volume doubling time strategy. Eur Respir J 2013;42(6):1459–1463.
- 77. . Estimating overdiagnosis in low-dose computed tomography screening for lung cancer: a cohort study. Ann Intern Med 2012;157(11):776–784.
- 78. . The utility of automated volumetric growth analysis in a dedicated pulmonary nodule clinic. J Thorac Cardiovasc Surg 2011;142(2):372–377.
Article History
Received May 18, 2015; revision requested June 30; revision received August 4; accepted September 2; final version accepted September 25; final review and update by authors June 4, 2017.
Published online: Sept 15 2017
Published in print: Sept 2017