Accelerating Whole-Body Diffusion-weighted MRI with Deep Learning–based Denoising Image Filters

Purpose To use deep learning to improve the image quality of subsampled images (number of acquisitions = 1 [NOA1]) to reduce whole-body diffusion-weighted MRI (WBDWI) acquisition times. Materials and Methods Both retrospective and prospective patient groups were used to develop a deep learning–based denoising image filter (DNIF) model. For initial model training and validation, 17 patients with metastatic prostate cancer with acquired WBDWI NOA1 and NOA9 images (acquisition period, 2015–2017) were retrospectively included. An additional 22 prospective patients with advanced prostate cancer, myeloma, and advanced breast cancer were used for model testing (2019), and the radiologic quality of DNIF-processed NOA1 (NOA1-DNIF) images was compared with that of NOA1 images and clinical NOA16 images by using a three-point Likert scale (good, average, or poor; statistical significance was calculated by using a Wilcoxon signed rank test). The model was also retrained and tested in 28 patients with malignant pleural mesothelioma (MPM) who underwent lung MRI (2015–2017) to demonstrate feasibility in other body regions. Results The model visually improved the quality of NOA1 images in all test patients, with the majority of NOA1-DNIF and NOA16 images being graded as either "average" or "good" across all image-quality criteria. From validation data, the mean apparent diffusion coefficient (ADC) values within NOA1-DNIF images of bone disease deviated from those within NOA9 images by an average of 1.9% (range, 1.1%–2.6%). The model was also successfully applied in the context of MPM; the mean ADCs from NOA1-DNIF images of MPM deviated from those measured by using clinical-standard images (NOA12) by 3.7% (range, 0.2%–10.6%). Conclusion Clinical-standard images were generated from subsampled images by using a DNIF.
Keywords: Image Postprocessing, MR-Diffusion-weighted Imaging, Neural Networks, Oncology, Whole-Body Imaging, Supervised Learning, MR-Functional Imaging, Metastases, Prostate, Lung Supplemental material is available for this article. Published under a CC BY 4.0 license.

three different b values, at three orthogonal diffusion-encoding directions without averaging, and the individual direction images were retained (number of acquisitions = 1 [NOA1]). This acquisition was repeated three times, and a "trace-weighted" image (NOA9) was computed for each b value to derive the clinical-quality images (method illustrated in Fig 1). Data were randomly split into training (n = 14) and validation (n = 3) sets. These data were used in a previous publication investigating the utility of multiple image acquisitions (NOA1) for estimating whole-body ADCs through weighted least-squares approximation, along with voxel-wise characterization of the uncertainty in the derived ADCs (21).
Test WBDWI dataset.-WBDWI data were prospectively acquired after the acquisition of the training WBDWI dataset over a 2-month period (May and June 2019) in a separate sample of 22 consecutive patients with advanced prostate cancer (n = 17, all men), myeloma (n = 3, all men), and advanced breast cancer (n = 2, all women) who required clinical evaluation for suspected metastatic disease (age range, 39-84 years). Inclusion criteria were any patient undergoing whole-body MRI for clinical management of secondary bone disease who was deemed fit by the referring radiologist for an additional 5 minutes of imaging time; the exclusion criteria were any contraindication to MRI, including patient claustrophobia. For each patient, images were acquired by using two WBDWI protocols within the same study (the patient remained on the couch between protocols): the first protocol was the same as that performed for the training dataset, except that only a single acquisition at a single diffusion-encoding direction (NOA1) was obtained; the second protocol was an institutional clinical protocol (NOA16; parameters are presented in Table 1). The approximate acquisition times for these protocols were 5 minutes and 22-25 minutes, respectively. Patients also underwent whole-body Dixon imaging and sagittal T1-weighted and T2-weighted anatomic spine imaging as per standard clinical care (1,8). Images were acquired by using a 1.5-T Siemens Aera system.
Mesothelioma dataset.-To demonstrate the feasibility of our approach for smaller field-of-view imaging, we retrospectively evaluated data from a sample of 28 patients (four women and 24 men; age range, 52-85 years) imaged for the presence of MPM as part of a single-center study investigating the value of DWI in MPM (February 2015 to November 2017). Patients underwent lung MRI with a 1.5-T Siemens Avanto system. Imaging parameters are provided in Table 1; data were randomly split into a training dataset of 20 and a validation dataset of eight.

Deep Learning Architecture
We developed our quickDWI method by training a deep learning-based denoising image filter (DNIF) model to generate clinical-grade diffusion-weighted images (NOA9) from images acquired by using one diffusion-encoding direction and one signal average with b values of 50, 600, or 900 sec/mm² independently (DNIF-processed NOA1 [NOA1-DNIF] images). Deep learning has previously been applied to a range of MRI tasks (13), including image reconstruction (14), quantitative susceptibility mapping (15), artifact reduction (16), and image denoising (17,18). We trained our model on a sample of patients with advanced prostate cancer and subsequently tested it on a separate prospective sample of patients with advanced prostate cancer, advanced breast cancer, and myeloma. In addition, to test the feasibility of the technique for diffusion-weighted MRI (DWI) acquisitions obtained over a smaller field of view, we retrospectively analyzed a sample of patients with malignant pleural mesothelioma (MPM) (19).
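As a concrete illustration, the kind of U-Net-style regression network described in the Methods can be sketched in Keras as follows. The depth, filter counts, and the `build_dnif`/`conv_block` names are our own illustrative assumptions; the 256 × 208 single-channel input, linear output activation, ReLU hidden activations, He normal initialization, max-norm weight constraint of 3, and Adam optimizer with a learning rate of 0.001 follow the text.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # two 3x3 convolutions: ReLU activations, He normal initialization,
    # and a max-norm weight constraint of 3, as described in the Methods
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu",
                          kernel_initializer="he_normal",
                          kernel_constraint=tf.keras.constraints.MaxNorm(3))(x)
    return x

def build_dnif(input_shape=(256, 208, 1)):
    inputs = layers.Input(shape=input_shape)
    c1 = conv_block(inputs, 32)                      # encoder level 1
    c2 = conv_block(layers.MaxPooling2D(2)(c1), 64)  # encoder level 2
    bottom = conv_block(layers.MaxPooling2D(2)(c2), 128)
    u2 = layers.Concatenate()([layers.UpSampling2D(2)(bottom), c2])
    d2 = conv_block(u2, 64)                          # decoder level 2
    u1 = layers.Concatenate()([layers.UpSampling2D(2)(d2), c1])
    d1 = conv_block(u1, 32)                          # decoder level 1
    # linear final activation: the network solves a regression problem
    outputs = layers.Conv2D(1, 1, activation="linear")(d1)
    return Model(inputs, outputs)

model = build_dnif()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mae")
```

A network of this shape maps a noisy NOA1 image to a denoised image of identical size, which is why the input dimensions must be divisible by the pooling factors (208 = 16 × 13 survives two halvings cleanly).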

Patient Population and Imaging
These studies were reviewed and approved by our local research ethics committee. The ethics committee waived the requirement of written informed consent for participation.
Training WBDWI dataset.-WBDWI was performed with a 1.5-T Siemens Aera system at three b values (50, 600, and 900 sec/mm²) (3) in 17 men with suspected advanced prostate cancer over four to five axial imaging stations (October 2015 to September 2017; parameters are presented in Table 1). This retrospective sample included consecutive patients (age range, 49-82 years) with metastatic prostate cancer that required clinical evaluation of known metastatic bone disease by using WBDWI. For each section position, images were acquired at each of the three b values along three orthogonal diffusion-encoding directions without averaging (NOA1).

Abbreviations: ADC = apparent diffusion coefficient, DNIF = deep learning-based denoising image filter, DWI = diffusion-weighted MRI, MAE = mean absolute error, MPM = malignant pleural mesothelioma, MSE = mean-squared error, NOA = number of acquisitions, NOA1-DNIF = DNIF-processed NOA1, PSNR = peak SNR, RDM = relative difference of means, SNR = signal-to-noise ratio, SSIM = structural similarity, WBDWI = whole-body DWI

Summary: A developed model, called quickDWI, enabled accelerated acquisition protocols for whole-body diffusion-weighted MRI of metastatic prostate, breast, and myeloma bone disease by using deep learning, resulting in images that were comparable with clinical-standard images.

Key Points
• A U-Net-based architecture can successfully reduce the magnitude of noise present in diffusion-weighted MR images; the average mean absolute error of all validation images acquired at b values of 50, 600, and 900 sec/mm² was reduced from 0.87 × 10⁻³ to 0.53 × 10⁻³.
• The algorithm significantly improved the radiologic image quality of fast but noisy whole-body MRI data in 22 patients with bone disease (P < .01).

Originally acquired images are referred to as NOA1 images, as illustrated in Figure 1. For this purpose, we adapted a convolutional neural network based on the U-Net architecture (11), which was modified to solve regression problems. A NOA1 image of 256 × 208 pixels in size was provided as input into the network (after linear interpolation) and was grayscale normalized from a range of 0-4095 to a range of 0-1. After empiric experimentation, a linear activation was used for the last layer, whereas a rectified linear unit activation function was used in all preceding layers. We constrained the weights incident to each hidden unit to have a norm value of less than or equal to 3, the weights of the layers were randomly initialized by using He normal initialization (22), and the network was trained with a batch size of 36 for 15 epochs and optimized by using the Adam algorithm (23) with a learning rate of 0.001. The network was trained by using a Tesla P100 PCIe, 16-GB graphics processing unit card (Nvidia), and the trained algorithm was applied by using a MacBook Pro laptop (Apple) with a 2.9-GHz Intel Core i7 central processing unit (16 GB of 2133-MHz LPDDR3 random access memory).

We experimented with three cost functions that measured the similarity between the NOA1-DNIF images and the clinical-grade (NOA9 and NOA12) images used as the ground truth: the mean-squared error (MSE) (24), the mean absolute error (MAE), and a combination of the MAE and the structural similarity (SSIM) index (25):

L_MAE/SSIM = α · L_MAE + (1 − α) · L_SSIM,

where α is the weight of each loss function and L_MAE/SSIM is the combined loss. We empirically set α to 0.7 after experimentation with different values. The training WBDWI dataset provided a total of 59 400 training images: (14 patients; three directions × three acquisitions × three b values) × [(80 sections × one patient with acquisition at only abdomen or pelvis stations) + (160 sections × 12 patients) + (200 sections × one patient)]. This dataset also provided a total of 15 120 validation images (three patients). The mesothelioma dataset provided 43 200 training images (20 patients) and 15 120 validation images (eight patients). The images were normalized from a range of 0-939 to a range of 0-1 prior to input into the model. All code was written in Python (version 3.6.5) by using the Keras and/or TensorFlow libraries.

Note (Table 1).-All images were acquired axially by using a spin-echo planar technique. GRAPPA = generalized autocalibrating partial parallel acquisition, NSA = number of signal averages. *When individual acquisitions are retained, the NSA is displayed as the "number of repeat acquisitions" × the "NSA per acquisition." All acquired gradient directions were retained individually such that for the training dataset, there were three directions × three acquisitions = nine images per b value. †Numbers in parentheses represent the final image dimensions following interpolation (when applicable). ‡The resolution is presented following image interpolation (when applicable).

Data Analysis

Training WBDWI dataset.-As a measure of similarity to the NOA9 images, the MSE, SSIM, and peak SNR (PSNR) were computed for the NOA1-DNIF and NOA1 images across all b-value images from all three validation patients (calculated by using scikit-learn version 0.14.2). A monoexponential, least-squares fitting algorithm was used to calculate ADC maps by using data from all three b values for the NOA1, NOA9, and NOA1-DNIF images. A radiologist delineated regions of bone disease on the NOA9 images by using an in-house semiautomatic segmentation tool for WBDWI studies of advanced prostate cancer (26) for all validation patient images, and the resulting regions of interest were copied onto the derived ADC maps. The mean ADCs within regions of bone disease were compared across the three imaging schemes by calculating the relative difference of means (RDM) between NOA1-DNIF or NOA1 ADC maps and NOA9 ADC maps:

RDM = (ADC_1-DNIF/NOA1 − ADC_NOA9) / ADC_NOA9,

where ADC_1-DNIF/NOA1 represents the mean ADC within the defined regions of interest for the NOA1-DNIF or NOA1 images, respectively. Furthermore, we calculated the coefficient of variation as the standard deviation divided by the average ADC, and the mean absolute voxel-wise difference between the NOA1-DNIF or NOA1 ADC maps and the NOA9 ADC maps. The distributions of ADC measurements within disease were compared for all methods by using violin plots; negative calculated ADCs were included in this analysis because they convey important information regarding the distribution of imaging noise.

Test WBDWI dataset.-The DNIF was directly applied to the test WBDWI dataset without further retraining. Two radiologists with 1 year (A.C.) and 10 years (D.M.K.) of experience with using WBDWI for the assessment of metastatic disease reviewed the NOA16, NOA1, and NOA1-DNIF images of all 22 patients (readers were blinded to patient clinical details, and images were presented in random order). In each case, radiologists had access to all b-value images (50, 600, and 900 sec/mm²), and the ADC maps were calculated offline by using a monoexponential, least-squares fitting algorithm. Anatomic images were not provided to ensure a blinded reading. The radiologists qualitatively scored the contrast-to-noise ratio, SNR, and image artifacts of the b = 900 sec/mm² images and the ADC maps independently by using a three-point Likert scale (1 = poor, 2 = adequate, and 3 = good). To assist in the qualitative assessment of the SNR and contrast-to-noise ratio metrics, the radiologists reported the average pixel values within regions of interest around a single site of disease surrounded by healthy tissue and background air on b = 900 sec/mm² images.

Mesothelioma dataset.-We compared two versions of the DNIF model: a version incorporating direct application of the WBDWI dataset model without updating of the model parameters (WBDWI model) and a version that was retrained from scratch with 20 of the patients with MPM (lung model). The MSE, SSIM, and PSNR scores were calculated for all eight validation patients, as they had been for the training WBDWI dataset. Regions of disease were delineated on axial b = 50 sec/mm² images for all eight validation patients by using 3D Slicer (27) and were then copied onto ADC maps calculated from NOA12, NOA1, and NOA1-DNIF images. The mean ADCs within disease were compared across all four imaging schemes by using the same RDM, coefficient of variation, and mean absolute voxel-wise difference scores that were used for the training WBDWI dataset; ADC distributions were compared by using violin plots (including negative ADC values).

Figure 2 (caption excerpt): … images. In addition, difference maps are shown between the clinical-standard images and the NOA1-DNIF or NOA1 images (NOA1-DNIF − NOA9, for example). All equivalent images are displayed by using the same windowing settings. (Right) Violin plots of the ADC distributions within segmented bone disease for the same three patients (example segmentation regions are displayed as red contours on NOA9 ADC maps). There is a clear reduction in the range of ADCs resulting from DNIF-processed images. Furthermore, ADCs are shown to be equivalent, as indicated by the relative difference of ADC means from NOA9 measurements (displayed as a percentage above the NOA1 and NOA1-DNIF violin plots).
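The combined MAE/SSIM cost described in the Methods can be sketched in plain NumPy. The `ssim_global` helper below is a deliberately simplified SSIM computed from global image statistics (the reference SSIM uses local sliding windows), so treat this as an illustration of the α-weighting scheme rather than the exact training loss; α = 0.7 as stated in the text.

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    # simplified SSIM from global image statistics; the reference
    # implementation (25) uses local windows, so this is a stand-in
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def combined_loss(pred, target, alpha=0.7):
    # L_MAE/SSIM = alpha * L_MAE + (1 - alpha) * L_SSIM, with L_SSIM
    # taken as (1 - SSIM) so that a perfect match gives a loss of 0
    mae = np.abs(pred - target).mean()
    return alpha * mae + (1 - alpha) * (1.0 - ssim_global(pred, target))
```

For identical images the MAE term vanishes and the SSIM term is 1, so the combined loss is 0; any mismatch raises both terms.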

Statistical Analysis
For the test WBDWI dataset, we calculated the statistical significance of differences between radiologist ratings of image quality for NOA1-DNIF compared with NOA1 images and for NOA16 images compared with NOA1 images by using a Wilcoxon signed rank test. Comparisons were made for each image-quality metric, each observer, and for b = 900 sec/mm² images and ADC maps independently. We used the "wilcoxon" function in the SciPy Python package (version 1.2.1) to perform our evaluations, assuming a two-sided alternative hypothesis. Calculated P values were corrected for multiple comparisons by using the Benjamini-Hochberg procedure, and a P value of less than .05 was chosen to indicate significance.

The network required 8 hours of training on the WBDWI data when using a Tesla P100 PCIe, 16-GB graphics processing unit card. In terms of computational efficiency, the trained network requires approximately 1 second to process a single low-SNR image on our MacBook Pro laptop with a 2.9-GHz Intel Core i7 central processing unit (16 GB of 2133-MHz LPDDR3 random access memory).
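The statistical pipeline above can be sketched as follows. The paired quality scores are hypothetical, and `bh_correct` is our own minimal Benjamini-Hochberg implementation; SciPy's `wilcoxon` is the function named in the text.

```python
import numpy as np
from scipy.stats import wilcoxon

def bh_correct(pvals, alpha=0.05):
    """Benjamini-Hochberg procedure: returns a boolean array marking
    which hypotheses are rejected at false-discovery rate alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # largest k with p_(k) <= alpha * k / m; reject p_(1)..p_(k)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        reject[order[:np.nonzero(below)[0].max() + 1]] = True
    return reject

# hypothetical paired three-point quality scores for ten studies
dnif_scores = [3, 2, 3, 3, 2, 3, 2, 3, 3, 2]
noa1_scores = [1, 1, 2, 1, 1, 2, 1, 1, 2, 1]
stat, p = wilcoxon(dnif_scores, noa1_scores)  # two-sided by default
```

Each (metric, observer, image-type) comparison yields one P value; the full set is then passed to `bh_correct` before declaring significance.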

Model Performance on the Validation WBDWI Sample
After initial training of the denoising model on the 14 patients with prostate cancer, the model was assessed on the three patients in the validation dataset. An example of the DNIF being applied to each of the three validation patients from this sample (b = 900 sec/mm² images and ADC maps) is illustrated in Figure 2; the DNIF was able to reduce the influence of imaging noise in the output image compared with the input NOA1 image, resulting in superior image quality in the subsequently calculated ADC maps. The NOA1-DNIF images had improved quantitative metrics compared with the original NOA1 images for the MSE (5.8 × 10⁻⁶ vs 7.7 × 10⁻⁶; P < .001), SSIM (0.994 vs 0.992; P < .001), and PSNR (55.7 vs 53.2; P < .001) (Table 2).

Figure 4: Bar plots for the observer rating study of the test whole-body diffusion-weighted MRI dataset for each image-quality criterion: the signal-to-noise ratio (SNR), the contrast-to-noise ratio (CNR), and image artifacts. Results are shown for b = 900 sec/mm² images and apparent diffusion coefficient (ADC) maps separately. In all cases, the majority of fast-acquisition (number of acquisitions = 1 [NOA1]) datasets received a "poor" quality score for both b = 900 sec/mm² images and ADC maps, whereas for the NOA16 dataset, the majority of cases received an "average" or "good" score. The use of the deep learning-based denoising image filter (DNIF) consistently increases the number of cases scoring as average or good for datasets obtained through just one acquisition. A significant difference in the image-quality scores is observed in all cases when comparing NOA16 images with NOA1 images and when comparing DNIF-processed NOA1 (NOA1-DNIF) images with NOA1 images. P† = pairwise comparison of NOA16 scores minus NOA1 scores by two-tailed Wilcoxon signed rank test, P‡ = pairwise comparison of NOA1-DNIF scores minus NOA1 scores by two-tailed Wilcoxon signed rank test.
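The MSE and PSNR similarity metrics reported above are straightforward to reproduce; a minimal NumPy sketch (assuming, as in the Methods, images normalized to a 0-1 range, so the peak value `data_range` is 1):

```python
import numpy as np

def mse(ref, img):
    # mean-squared error between a ground-truth and a test image
    return float(np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2))

def psnr(ref, img, data_range=1.0):
    # peak SNR in decibels; data_range is the maximum possible intensity
    return 10.0 * np.log10(data_range ** 2 / mse(ref, img))
```

A uniform intensity error of 0.1 on a 0-1 image gives an MSE of 0.01 and hence a PSNR of 20 dB; the higher PSNR of the NOA1-DNIF images therefore reflects a lower residual error relative to the ground truth.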
For all three validation patients within this sample, violin plots of ADCs within segmented regions demonstrated the ability of the DNIF model to reduce the range of calculated ADC measurements as a result of improving the SNR; the mean ADCs measured within bone disease from NOA1-DNIF images deviated from the mean ADC calculated by using NOA9 images by an average RDM of 1.9% (range, 1.1%-2.6%) (within previously reported repeatability limits for mean ADC measurements [28]). The NOA1-DNIF images also had a smaller average difference from the ground truth ADC coefficient of variation than did the NOA1 images (3.5% vs 9.0%), and the mean absolute voxel-wise difference was also smaller (123.4 vs 136.7). Detailed results are presented in Table 2.
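The ADC values behind these comparisons come from a monoexponential least-squares fit over the three b values, followed by the RDM computation. A minimal voxel-wise sketch is shown below; `fit_adc` uses a log-linear least-squares slope, which is one common way to implement a monoexponential fit (the paper's exact fitting routine is not specified), and `rdm` is our own helper name.

```python
import numpy as np

def fit_adc(signals, b_values):
    """Voxel-wise monoexponential fit S = S0 * exp(-b * ADC) via a
    log-linear least-squares slope. signals: (..., nb); b_values: (nb,)."""
    b = np.asarray(b_values, dtype=float)
    y = np.log(np.clip(np.asarray(signals, dtype=float), 1e-6, None))  # avoid log(0)
    bc = b - b.mean()
    slope = (bc * (y - y.mean(axis=-1, keepdims=True))).sum(axis=-1) / (bc ** 2).sum()
    return -slope  # ADC in mm^2/sec when b is in sec/mm^2

def rdm(adc_test, adc_ref):
    # relative difference of mean ADCs within a region, as a percentage
    return 100.0 * (np.mean(adc_test) - np.mean(adc_ref)) / np.mean(adc_ref)
```

With noiseless synthetic signals the fit recovers the true ADC exactly; on NOA1 data, noise in the high-b-value images is what broadens the recovered ADC distribution that the DNIF narrows.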

Model Performance on the Test WBDWI Dataset
The model was then assessed on a test dataset of 22 patients with advanced prostate cancer, advanced breast cancer, or myeloma-related bone disease. Application of the DNIF was successful in all patients. Visual improvements in image quality in terms of the contrast-to-noise ratio for high-b-value images and the resulting ADC maps were observed for all patients; results for six selected patients are illustrated in Figure 3, and examples from all patients are presented in Appendix E1 (supplement). Radiologist review of these images is summarized in Figure 4. The majority of NOA1 images (both b = 900 sec/mm² images and ADC maps) were graded as "poor" by both radiologists across all quality criteria, whereas the majority of NOA16 and NOA1-DNIF images were graded as either "average" or "good." Statistically significant differences were observed in all comparisons (NOA16 vs NOA1 images and NOA1-DNIF vs NOA1 images) for all quality metrics and for both radiologists independently. The average quality scores (± the standard error from the three-point quality scale) of the ADC maps obtained from NOA1-DNIF images were higher than the scores of the ADC maps obtained from NOA1 images (SNR, 2. …). Table 3 presents the percentage of images defined to be clinically usable (average or good) by either radiologist; the majority of images were defined to be clinically usable for NOA16 and NOA1-DNIF images, whereas this was not the case for NOA1 images.

Note (Table 3).-Data are shown as proportions with percentages in parentheses. These results indicate the ability of the DNIF to provide clinically usable images from images with clinically suboptimal quality, as the majority of DNIF images are considered to be usable, in contrast with NOA1 images, of which the majority are considered "not usable." ADC = apparent diffusion coefficient, CNR = contrast-to-noise ratio, DNIF = deep learning-based denoising image filter, NOA = number of acquisitions, NOA1-DNIF = DNIF-processed NOA1, R = reader, SNR = signal-to-noise ratio.

Performance of the Pretrained WBDWI and Retrained Lung Model on the Mesothelioma Dataset
Next, two different models were assessed on the mesothelioma dataset: the original pretrained WBDWI model and the model retrained on a subset of patients from the mesothelioma dataset (lung model). Figure 5 compares results for three of the validation patient datasets from the mesothelioma dataset, demonstrating NOA1 images filtered by using both the WBDWI model and the lung model. The lung model improved all three quantitative metrics (MSE, SSIM, and PSNR) in all eight test patients (Table 4). Analyzing the ADC distributions from all imaging techniques (Fig 6 and Table 4) revealed low RDM scores, with average values of 2.0% (range, 0.4%-8.4%) for NOA1 images and 3.7% (range, 0.2%-10.6%) and 4.0% (range, 0.1%-11.2%) for NOA1-DNIF images derived from the lung model and the WBDWI model, respectively. In one patient (patient 3), the mean ADC of disease from NOA1-DNIF images deviated from the mean ADC from NOA12 images by approximately 11%. However, a similar variation was observed for NOA1 ADC maps, indicating that this deviation was not due to the application of the DNIFs. In all cases, application of the DNIFs (NOA1-DNIF images) reduced the presence of ADC measurement outliers in filtered images compared with NOA1 images. The NOA1-DNIF ADC maps also had a smaller average difference from the ground truth ADC coefficient of variation and had a smaller mean voxel-wise difference in most cases (Table 4).

Figure 5 (caption excerpt): … clinical-standard (NOA12) images, the fast-acquisition (NOA1) images, and the deep learning-based denoising image filter (DNIF)-processed NOA1 (NOA1-DNIF) images from the pretrained whole-body diffusion-weighted MRI (WBDWI) model and the retrained lung model (which was retrained by using data acquired specifically in patients with malignant pleural mesothelioma). In addition, difference maps are shown between the NOA12 images and the NOA1-DNIF or NOA1 images (NOA1-DNIF − NOA12, for example). All equivalent images are displayed by using the same windowing settings. Although a clear improvement in image quality is observed when using the pretrained WBDWI model, a further improvement is seen from the lung model. In particular, improved disease contrast can be observed in high-b-value images and ADC maps, with sharper tissue boundaries (green and orange arrows, respectively) being demonstrated. In a few cases, some bias is observed in the ADC calculations obtained by using the DNIF lung model (red arrow); this occurs in regions of motion (eg, near the diaphragm), where the NOA12 image signal will average out in regions that move (effective acquisition time on the order of minutes), whereas NOA1 images represent more of a snapshot in time (acquisition time on the order of tens of milliseconds).

Discussion
Our DNIF improved image quality in subsampled WBDWI acquisitions, as demonstrated within our test datasets of images from patients with metastatic prostate, breast, or myeloma-related bone disease. Initial results indicate that ADC measurements made by using DNIF-processed images fall within the typical limits of repeatability for mean extracranial ADC measurements (28) and are therefore comparable with those made by using fully sampled WBDWI images (in tumors for which isotropic water diffusion can be assumed). This indicates that DNIF-derived ADC estimates in bone disease might have a level of clinical image quality that is sufficient for monitoring the treatment response (26,29); repeat baseline measurements acquired by using our method would be required to fully test this hypothesis. In our blinded study based only on anatomic images from an independent set of 22 patients, two expert radiologists deemed the majority of DNIF-processed images "usable" for the clinical setting, whereas the original noisy images from which they were derived were mostly "not usable." A major advantage of our approach is that the acquisition of training data needed for deriving the DNIF can be adopted by any imaging center, providing adaptable solutions that are trained to a particular manufacturer and/or imager. We have demonstrated that our method can be adapted to other diseases investigated by using DWI, such as MPM. Although the WBDWI-trained DNIF can be used to improve the image quality of single-acquisition DWI images obtained in the context of MPM, the technique can be improved by acquiring disease-specific training data.
Understanding the inner workings of any deep learning algorithm is critical if such technologies are to be embraced in the health care sector, and this understanding is required to support application for medical regulatory approval. In Appendix E1 (supplement), we provide some evidence for how our DNIF may be working; we provide preliminary evidence that the DNIF is nonlinear, spatially variant, nonlocal, and edge preserving. We posit on the basis of these results that the DNIF is learning about the complex relationships among pixels within the image in terms of their relative position and relative intensity. Moreover, we suggest that the DNIF learns about anatomic position to tune the degree of smoothing it performs at a particular body location. This is evidenced by the improvements observed when retraining the DNIF for our MPM data; because of respiratory motion within the thoracic cage, the algorithm tended to oversmooth images in this region when using the WBDWItrained DNIF.
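The filter properties mentioned above (nonlinearity, spatial variance, edge preservation) can be probed numerically for any image-to-image filter. Below is a minimal sketch of a linearity probe, demonstrated on stand-in SciPy filters rather than the trained DNIF; `is_linear` is our own helper name.

```python
import numpy as np
from scipy.ndimage import median_filter, uniform_filter

def is_linear(filt, shape=(32, 32), trials=3, seed=0):
    """Numerically probe linearity: a linear filter must satisfy
    filt(a + 2b) == filt(a) + 2 * filt(b) for any images a and b."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a, b = rng.random(shape), rng.random(shape)
        if not np.allclose(filt(a + 2 * b), filt(a) + 2 * filt(b), atol=1e-8):
            return False
    return True

# a 3x3 moving average is linear; a 3x3 median filter is not
print(is_linear(lambda x: uniform_filter(x, size=3)))  # True
print(is_linear(lambda x: median_filter(x, size=3)))   # False
```

Applying the same probe to a trained DNIF (by passing the network's predict call as `filt`) is one simple way to generate the kind of evidence of nonlinearity described here; analogous checks with shifted inputs probe spatial variance.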
During training, the neural network minimizes a cost function that measures the similarity between the DNIF-processed images and the clinical-standard images used as the ground truth. The correct assessment of image similarity by algorithms is an ongoing problem in the computer vision field. The default choice, the MSE, is predominantly used for its simplicity and well-understood properties but has limitations, including the assumption that noise has a Gaussian distribution and is not dependent on local image characteristics (30). Furthermore, although valid for other applications, this metric correlates poorly with human perception of image quality (two images with a very low MSE can look quite different to a human observer) (24). In this study, we investigated the MAE and combined it with a metric that more closely resembles human perception, the SSIM (25). In our future studies, we aim to further explore other approaches, such as the use of a perceptual loss (as deep features have been shown to correlate better with human perception than do manual metrics [31,32]) and generative adversarial network architectures (33), while also comparing these approaches with traditional denoising algorithms (34,35). The encouraging findings of our proof-of-concept study warrant further investigation through multicenter studies comprising larger patient populations to understand the effect of the technique on diagnostic accuracy. Deep neural networks typically benefit from the addition of training data from other institutions, MRI vendors, and different protocols; such data could yield a filter that produces clinical-quality images from any WBDWI study. Our approach could also exploit the concept of "transfer learning": by using the weights from our DNIF as an initialization, an individual site may not need to acquire much data to train a network specific to that site. Future studies could also investigate the value of working directly with acquired raw k-space data for improving single-shot WBDWI image quality by using contemporary methods in machine learning, such as Automated Transform by Manifold Approximation (36,37). In a few patients, we found some differences between the calculated ADCs from DNIF images and the calculated ADCs from clinical images, especially for ADCs greater than 2 × 10⁻³ mm²/sec. This appears to be due to the fact that the DNIF images capture a snapshot in time (tens of milliseconds per b-value image), whereas the clinical images comprise an average of nine or 12 repeat acquisitions obtained over approximately 5 minutes, thus averaging out motion effects. In some respects, this is encouraging, because it warrants further exploration of the use of the DNIF for fast-acquisition, breath-hold ADC measurements in the abdomen and chest.

Note (Table 4).-Data are shown as the median values with ranges in parentheses calculated over all b = 50 sec/mm², b = 600 sec/mm², and b = 900 sec/mm² images for each patient. ADC = apparent diffusion coefficient, CoV = difference from the ground truth ADC coefficient of variation, DNIF = deep learning-based denoising image filter, MAVD = mean absolute voxel-wise difference, MSE = mean-squared error, NOA = number of acquisitions, NOA1-DNIF = DNIF-processed NOA1, PSNR = peak signal-to-noise ratio, RDM = relative difference of means, SSIM = structural similarity index, WBDWI = whole-body diffusion-weighted MRI.

Zormpas-Petridis et al

Figure caption excerpt: … Figure 5. Some differences were observed in these distributions, particularly at ADCs greater than 2 × 10⁻³ mm²/sec (patients 3 and 4, for example). Further investigation revealed that this was likely due to bulk motion, because NOA1 (and hence deep learning-based denoising image filter [DNIF]-processed NOA1 [NOA1-DNIF]) images are effectively snapshots in time (acquisition time on the order of tens of milliseconds), whereas the NOA12 image signal averages out motion over the 12 repeat measurements. In regions of pleural mesothelioma, where bulk free water flows as a result of convection from one imaging section to another, this could result in incomplete T1 relaxation of the water as it flows from one section to the next, leading to regions of spurious signal suppression on each section excitation. WBDWI = whole-body diffusion-weighted MRI.
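The transfer-learning idea mentioned above can be sketched as follows. The tiny Sequential model is a hypothetical stand-in for the DNIF, used only to show the weight-initialization and layer-freezing pattern a new site might use; the layer shapes and learning rate are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Sequential

def make_model():
    # tiny stand-in for the DNIF; the real network is far deeper
    return Sequential([
        tf.keras.Input(shape=(None, None, 1)),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.Conv2D(1, 1, activation="linear"),
    ])

pretrained = make_model()                          # stands in for the WBDWI-trained DNIF
site_model = make_model()
site_model.set_weights(pretrained.get_weights())   # initialize from the pretrained weights

# freeze all but the final layer before fine-tuning on the new site's data
for layer in site_model.layers[:-1]:
    layer.trainable = False
site_model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mae")
```

Starting from transferred weights and fine-tuning only the later layers is what would let an individual site adapt the filter with far less local training data than training from scratch.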
We conclude that deep learning methods, such as our quick-DWI approach, are able to improve the quality of WBDWI images from subsampled data, potentially reducing acquisition times by a significant amount (from approximately 25 minutes to 5 minutes in our test study). Such time savings would reduce imaging costs, rendering WBDWI appropriate for screening studies and reducing patient imaging time and/or discomfort, which could aid in the widespread adoption of WBDWI.