Reviews and Commentary

The Need for Medical Artificial Intelligence That Incorporates Prior Images

Published Online: https://doi.org/10.1148/radiol.212830

Abstract

The use of artificial intelligence (AI) has grown dramatically in the past few years in the United States and worldwide, with more than 300 AI-enabled devices approved by the U.S. Food and Drug Administration (FDA). Most of these AI-enabled applications focus on helping radiologists with detection, triage, and prioritization of tasks by using data from a single point, but clinical practice often encompasses a dynamic scenario wherein physicians make decisions on the basis of longitudinal information. Unfortunately, benchmark data sets incorporating clinical and radiologic data from several points are scarce, and, therefore, the machine learning community has not focused on developing methods and architectures suitable for these tasks. Current AI algorithms are not suited to tackle key image interpretation tasks that require comparisons to previous examinations. Focusing on the curation of data sets and algorithm development that allow for comparisons at different points will be required to advance the range of relevant tasks covered by future AI-enabled FDA-cleared devices.

© RSNA, 2022

Summary

Medical artificial intelligence–enabled devices would benefit from the curation of data sets and the development of algorithms that allow comparison with prior examinations, which is crucial for several key image interpretation tasks in everyday clinical practice.

Essentials

  ■ Most artificial intelligence (AI)–enabled devices that are approved by the U.S. Food and Drug Administration (FDA) and are available to date address tasks by considering only a single point.

  ■ Clinical tasks involve dynamic scenarios, and diagnostic and prognostic decisions often rely on the combination of prior and current information.

  ■ The development of benchmark data sets and algorithms that leverage prior examinations has the potential to improve the range of tasks covered by FDA-approved medical AI devices.

Introduction

The use of artificial intelligence (AI) in health care has grown remarkably in the last few years. More than 300 AI-enabled devices have been approved by the U.S. Food and Drug Administration (FDA) to date (1,2).

Many of these algorithms were developed to help clinicians with detection, triage, and prioritization of tasks at a single point. Clinical care, however, represents a dynamic scenario wherein the treating clinicians, other health professionals, and patients have clinical encounters across time. Preventive, diagnostic, prognostic, and therapeutic interventions happen within this continuous relationship, and clinicians often use information from multiple points to improve diagnostic and prognostic accuracy. This is particularly true for imaging tasks, where comparisons with a baseline scan are often performed during the same hospitalization and during long-term follow-up to depict changes and/or understand the disease state at any point (3).

In this report, we argue that current AI algorithms are not suited to tackle key image interpretation tasks that require comparisons to prior examinations by using FDA-cleared algorithms as guiding examples. Future AI algorithms that address this scenario will require a focus on the curation of data sets and algorithm development that allow for comparisons to prior examinations.

Current Machine Learning–based Models and Imaging Tasks

Although several machine learning algorithms and AI-enabled devices have been developed and some have been approved by the FDA, few address the longitudinal comparison tasks routinely performed by clinicians (Fig 1).

Figure 1: Image interpretation tasks that benefit from comparison to prior examinations. (A) Comparison to baseline points (eg, in patients with ischemic stroke and hemorrhagic transformation). (B) Assessment of disease progression and treatment response (eg, in patients with brain tumors). Adapted, under a CC BY license, from reference 4. (C) Disease burden quantification (cumulative probability of abnormalities in brain MRI from patients with multiple sclerosis), and disease subtyping (multiple sclerosis subtypes cortical-led disease, normal-appearing white matter [WM], and lesion-led subtypes). Adapted, under a CC BY license, from reference 5.

Comparison with previous radiologic examinations is a key component of reporting radiology results. As a result, it is emphatically taught to radiology residents and has been incorporated into practical guidelines (6,7). Failure to compare current studies with previous examinations is classified as a specific type of reporting error (8,9), which might result in missed pathologic findings and delayed diagnoses (10), with important ethical and legal considerations (11). In previous trials, radiologists have been found negligent for missing malignancies on chest radiographs because they did not review all previous examinations (12,13). Reviewing previous images is paramount to appropriately evaluating incidental findings, assessing treatment response (particularly tumor response to therapy), ascertaining disease progression, and suggesting or confirming the suspected diagnosis or staging. Comparison with previous images has been suggested as a way to avoid recommending unneeded additional imaging studies (14) and to improve the screening and diagnostic yield of mammography (15–17).

The evaluation of patients in critical care and stroke units is a prime example of everyday comparison tasks performed by clinicians. In many clinical scenarios, patients undergo regular imaging (eg, chest radiography) to screen for common complications, assess physiologic states, and evaluate response to treatment (18). However, current AI-enabled devices focus only on a single point to detect pathologic abnormalities or triage images (19). Arguably, leveraging prior information can potentially improve performance. Similarly, several AI algorithms have been approved by the FDA for the detection of stroke, a condition that requires fast and accurate diagnosis and treatment (20). However, beyond detection, other imaging-related tasks worth targeting require the analysis of longitudinally acquired neuroimaging. For example, patients with ischemic stroke who are treated with thrombolytic therapy are assessed at CT after 24 hours to check for possible hemorrhagic transformation and infarct extension (Fig 2) (21). In addition, patients with intracranial hemorrhage are routinely followed up with head imaging at 24 hours and sometimes 72 hours (22). The application of machine learning to these tasks has not been explored widely, mostly because of the lack of open-source benchmark data sets with well-organized longitudinal imaging data.

Figure 2: Comparison of the current paradigm versus medical artificial intelligence (AI) by using prior images. With the current paradigm, head CT images would be processed by the AI model at each point, outputting the probability of ischemic or hemorrhagic stroke accordingly. By leveraging prior images, AI models would be better positioned to arrive at the correct diagnosis of ischemic stroke with hemorrhagic transformation.

Beyond applications focused on acute clinical settings, the radiologic evaluation of disease burden and progression is one of the cornerstones of clinical monitoring in patients with chronic conditions, requiring frequent comparisons with previous examinations. Several FDA-cleared algorithms help quantify multiple sclerosis lesion burden and brain volume, and these algorithms have been used for more than 10 years (23). However, these methods usually require images to be obtained with the same MRI scanner by using the same acquisition protocol (24). Chronic cerebrovascular disease burden is a strong risk factor for recurrent stroke events and cognitive decline (25). Precise quantification of change over time (eg, in white matter hyperintensity volume, cerebral microbleeds, lacunes, and enlarged perivascular spaces) may constitute a useful tool for selecting appropriate therapeutic options. Disease quantification by using machine learning models better captures pain in patients with osteoarthritis than previously used scales do (26). The addition of prior information to these models helps to identify subtypes of disease trajectories that may benefit from different therapeutic strategies. AI-enabled systems help detect tuberculosis on chest radiographs (27). Imaging follow-up after completion of the treatment plan determines response to therapy, possible radiographic worsening, and the presence of persistent lesions. All of these tasks may be augmented by AI-enabled devices, allowing for better resource management, particularly in underserved areas where medical expertise is not widely available.

Evaluation of prior images is also important for identifying, assessing the treatment response of, and managing primary and metastatic tumors. Follow-up in patients with lung nodules would benefit from the longitudinal evaluation of chest radiographs or chest CT scans because comparison across multiple points helps in identifying malignant lung tumors (28). Many currently available FDA-approved devices assist with lung nodule detection and characterization. These applications synchronize prior and current scans to simplify comparisons made by radiologists (29). A deep learning model has been shown to predict complete pathologic response in rectal cancer from MRI scans obtained before and after therapy (30). Similarly, the evaluation of longitudinal mammography scans, already performed in clinical practice, would undoubtedly benefit from AI models that can leverage prior information to make predictions (31).

Beyond these common problems, most other oncologic conditions require regular imaging comparisons with prior points. Fully automated implementation of the Response Evaluation Criteria in Solid Tumors (RECIST) to detect change (32) would greatly improve efficiency and decrease the lesion measurement variability of current approaches, particularly in cancers with tumor heterogeneity (33,34). Researchers have already proposed the use of deep learning to automatically locate and measure relevant metrics required for objective and robust longitudinal evaluation of tumors (35). As another example, patients with glioblastoma undergo routine follow-up imaging at several points after the initiation of treatment (36,37). Several nontrivial tasks for radiologists, such as differentiating progression from pseudoprogression, may benefit from considering information from several points (38). Emerging research has examined algorithms for these scenarios in other oncologic diseases (39,40). More recent work has focused on predicting treatment response in brain tumors by analyzing an arbitrary number of previous images (41).
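To make concrete what an automated RECIST comparison would compute, the sketch below encodes the core RECIST 1.1 decision rules on the sum of target-lesion longest diameters across time points. The function name and interface are illustrative, not part of any FDA-cleared product; a real system would derive the diameters from automated lesion detection and tracking.

```python
def recist_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Classify response per RECIST 1.1 from sums of target-lesion longest diameters (mm).

    baseline_sum_mm: sum at the pretreatment baseline examination
    nadir_sum_mm:    smallest sum observed at any prior point (the nadir)
    current_sum_mm:  sum at the current follow-up examination
    """
    if current_sum_mm == 0:
        return "CR"  # complete response: all target lesions disappeared
    # Progressive disease: >=20% increase over the nadir AND >=5 mm absolute increase
    if nadir_sum_mm > 0 and (current_sum_mm - nadir_sum_mm) / nadir_sum_mm >= 0.20 \
            and (current_sum_mm - nadir_sum_mm) >= 5:
        return "PD"
    # Partial response: >=30% decrease from baseline
    if (baseline_sum_mm - current_sum_mm) / baseline_sum_mm >= 0.30:
        return "PR"
    return "SD"  # stable disease: neither PR nor PD criteria met
```

Note that the classification is inherently longitudinal: the same current measurement can mean partial response or progression depending on the baseline and nadir values, which is exactly the prior information a single-point model cannot use.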

Building Data Sets for Comparison across Points

There are many challenges to the curation of data sets for comparison tasks. The first necessary but insufficient requisite is that these images are stored. Although this is usually true in practice, the images are often stored within complex electronic health record systems that are optimized for the clinical workflow, and making them available for research is not a trivial task. Another important challenge to building these data sets is the variable location of the images. Although follow-up images are usually ordered by and reported to the same clinician in charge of the patient, multiple radiology practices exist, and both patients and insurers may choose to undergo or cover the required imaging at different institutions.

Some efforts have already been made toward the development of these longitudinal imaging data sets. A clear example is the UK Biobank study, which has acquired images from more than 40 000 participants, a subset of whom (around 5000) have already undergone follow-up imaging (42,43). However, individuals enrolled in the UK Biobank who consent to imaging are mostly healthy participants; therefore, comparison tasks can be developed for phenotypes such as small vessel disease burden, brain volume, and atrophy, but follow-up of specific diseases is limited.

Image registration is an essential preprocessing step in medical imaging that enables automated comparisons between distinct medical images. Registration involves geometrically aligning two images into one coordinate system, thereby correctly overlaying the same anatomic structures or region of interest across images (44). Registration across different points is an important consideration when building and deploying these data sets. The most common approach used in practice is to coregister images to the original space of the first point. An alternative approach is to register all images to a global template (ie, a template brain MRI scan constructed by averaging hundreds of images) (45). Importantly, both of these approaches have advantages and drawbacks, and, when possible, data sets should provide both original and registered images to improve their usability. As an example, the UK Biobank provides the different MRI sequences coregistered to the original T1 space, the raw original images, and the transformation matrices used by FSL (ie, a well-known software package for medical image processing) (46).
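Once an affine registration matrix has been estimated (eg, by a tool such as FSL's FLIRT), applying it amounts to mapping each baseline voxel coordinate into the follow-up image and sampling there. The NumPy sketch below illustrates this resampling step with nearest-neighbor interpolation; the function name is our own, and the matrix convention (baseline voxel coordinates to follow-up voxel coordinates) is an assumption stated in the docstring, since different packages store the forward or inverse transform.

```python
import numpy as np

def resample_to_baseline(follow_up, affine, shape):
    """Map a follow-up volume into baseline voxel space with nearest-neighbor sampling.

    `affine` is a 4x4 homogeneous matrix sending baseline voxel coordinates
    to follow-up voxel coordinates; `shape` is the baseline grid shape.
    """
    out = np.zeros(shape, dtype=follow_up.dtype)
    # Homogeneous coordinates of every baseline voxel, enumerated in C order
    idx = np.indices(shape).reshape(3, -1)
    coords = np.vstack([idx, np.ones((1, idx.shape[1]))])
    # Transform into the follow-up grid and round to the nearest voxel
    src = np.rint(affine @ coords)[:3].astype(int)
    # Keep only coordinates that fall inside the follow-up volume
    valid = np.all((src >= 0) & (src < np.array(follow_up.shape)[:, None]), axis=0)
    out.reshape(-1)[valid] = follow_up[tuple(src[:, valid])]
    return out
```

In production pipelines this step would use a dedicated library (eg, trilinear or spline interpolation rather than nearest neighbor), but the principle is the same: after resampling, voxel (i, j, k) refers to the same anatomic location at both points, which is what makes voxelwise comparison meaningful.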

Machine Learning Models and Architectures Optimized for Comparison Tasks

Compared with the abundance of machine learning models and architectures designed for classification, detection, and segmentation at a single point, models that address comparison tasks have received limited attention. With the introduction of standard benchmarks focused on these problems, we expect that the number of algorithms designed for this particular application will increase in the near future.

Several AI approaches have been proposed for comparison or change detection tasks. Deep learning architectures used include convolutional neural networks, recurrent neural networks, Siamese neural networks, pulse-coupled neural networks, generative adversarial networks, and attention-based models (47). However, only a few of these have been tested within the radiology field.

The most common and straightforward approach involves calculating changes in segmentation masks, derived region diameters, or volumes across different points (48). This approach is already used by some medical AI devices currently approved by the FDA. As an example, icobrain (Icometrix) and Quantib Brain (Quantib) calculate multiple sclerosis–related white matter lesion volume and raw brain volume across several points and provide feedback about percentage changes (49). View LCS (Coreline Soft), NinesMeasure (Nines), and Philips Lung Nodule Assessment and Comparison Option (Philips Medical Systems) provide measurements and simplify comparisons of lung nodules across multiple points. Research has also supported other similar use cases, including aortic measurements (50) and treatment response assessment and follow-up of brain tumors (51). This approach can also be easily extended to four-dimensional physiologic imaging, in which models that are able to segment images from these four-dimensional studies can potentially enhance clinical workflows (52).
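The segmentation-based comparison described above reduces to simple arithmetic once masks are available for coregistered scans. The minimal sketch below computes the percentage change in lesion volume between two time points; the function name and mask format (nested lists of 0/1 voxel labels) are illustrative, not the API of any device named in the text.

```python
def percent_volume_change(mask_prior, mask_current, voxel_volume_ml=0.001):
    """Percentage change in segmented lesion volume between two time points.

    Masks are nested lists of 0/1 voxel labels; in practice they would come
    from a segmentation model applied to coregistered scans, and the voxel
    volume would be read from the image header.
    """
    def volume(mask):
        # Lesion volume = number of positive voxels x volume per voxel
        return sum(v for row in mask for v in row) * voxel_volume_ml

    prior, current = volume(mask_prior), volume(mask_current)
    if prior == 0:
        return None  # no baseline lesion: percentage change is undefined
    return 100.0 * (current - prior) / prior
```

The simplicity is the appeal: the heavy lifting (registration and segmentation) happens upstream, and the longitudinal output is a single interpretable number, which is essentially what the volumetric devices cited above report.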

Although this approach is simple and useful, more complicated tasks are not amenable to it. For example, comparing disease stages in chronic conditions such as osteoarthritis is often a qualitative, time-consuming task that requires interpretation beyond the quantification of segmented structures. A few other approaches have been tested for comparison tasks in medical imaging. Siamese networks use identical subnetworks to encode two different points into two vectors that are then compared and analyzed by a final classification layer. These models have been used to interpret treatment response in rectal cancer (40) and to automate treatment response analysis from fluorodeoxyglucose PET/CT scans (39). Researchers have used hybrid architectures that extract features by using convolutional neural networks and then leverage recurrent neural networks to detect changes across time with reasonable performance (53,54). Another group also leveraged convolutional neural networks to extract imaging features and constructed geometric correlation maps to classify images as having change or no change (55).
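The defining property of a Siamese network is that both time points pass through the same encoder weights, so the downstream classifier sees only the difference between the two embeddings. The untrained toy below makes that structure explicit in NumPy (random weights, a single linear encoder instead of a convolutional one); it is a sketch of the architecture, not a reimplementation of the cited models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared (Siamese) encoder weights: both time points use the SAME matrix
W_enc = rng.standard_normal((16, 64)) * 0.1   # 64-pixel image -> 16-dim embedding
w_clf = rng.standard_normal(16) * 0.1         # classifier over the embedding difference

def encode(image):
    """Shared encoder applied identically to prior and current images."""
    return np.tanh(W_enc @ image.ravel())

def change_probability(prior, current):
    """Siamese comparison: encode both scans, compare embeddings, classify."""
    diff = np.abs(encode(prior) - encode(current))  # element-wise embedding distance
    logit = w_clf @ diff
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid -> probability of change
```

Because the encoder is shared, two identical inputs always yield a zero difference vector and hence a logit of 0 (probability 0.5 in this untrained toy); training pushes genuinely changed pairs toward 1 and unchanged pairs toward 0. In the cited work, `encode` is a convolutional network and the whole system is trained end to end.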

In the future, we expect that architectures with attention mechanisms will be able to incorporate both spatial and temporal context and therefore provide a powerful tool for tasks that use information from multiple points (41). Importantly, several challenges arise when designing end-to-end machine learning systems aimed at performing the entire follow-up assessment of disease. These algorithms need to address not only technical variability (eg, vendor, scanner, and scan protocol) between the follow-up examinations but also clinical heterogeneity (because different providers may use different follow-up timing and strategy). These and other potential challenges may limit the generalizability of these models and need to be considered early in the development process.

Prospects

AI systems that incorporate information from previous points could facilitate tedious and time-consuming comparisons performed by radiologists for frequent studies such as chest radiography and mammography. This would potentially improve the standard of care by avoiding delayed diagnoses and reducing costs associated with unnecessary recommendations for follow-up examinations.

By using AI for longitudinal comparison, we will be able to quantify a patient's improvement in detail. This may lead to more granular decisions regarding treatment regimens, including the doses and timing of therapeutic interventions. Additionally, we would be able to better estimate which patients will benefit from treatment by observing small changes between images at different points, which would be useful both in clinical practice and as end points in randomized clinical trials.

Detailed quantification of disease burden would also allow us to characterize patient trajectories over time and identify phenotypic subgroups. This approach has been applied to sepsis subgrouping (56) and endometriosis subtyping (57) by using unsupervised learning. Following this approach may allow us to find subtypes of diseases currently understood to be heterogeneous entities. As an example, imaging subtypes of mild cognitive impairment may predict whether patients will or will not develop any type of dementia, such as Alzheimer disease or vascular dementia. Rare and heterogeneous diseases such as primary vasculitis of the central nervous system may also be amenable to subgrouping with these approaches. Finally, subtypes of patients who respond to different therapeutic options may be identified for other conditions, including neuroimmunologic diseases such as multiple sclerosis (23).

In summary, even though physicians routinely perform comparisons with prior examinations when interpreting images in clinical practice, only a few artificial intelligence (AI) algorithms currently available are able to incorporate information from more than one point to help in these critical tasks. The curation of high-quality data sets with longitudinal clinical and imaging data, and the development of AI algorithms capable of solving a wider range of problems, will be essential to provide meaningful improvements in clinical workflows.

Disclosures of conflicts of interest: J.N.A. No relevant relationships. G.J.F. No relevant relationships. P.R. No relevant relationships.

J.N.A. supported by the American Heart Association Bugher Fellowship in hemorrhagic stroke research.

References

  • 1. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med 2020;3(1):118.
  • 2. Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. Food and Drug Administration. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices. Accessed October 14, 2021.
  • 3. Sistrom CL, Dreyer KJ, Dang PP, et al. Recommendations for additional imaging in radiology reports: multifactorial analysis of 5.9 million examinations. Radiology 2009;253(2):453–461.
  • 4. Ko CC, Yeh LR, Kuo YT, Chen JH. Imaging biomarkers for evaluating tumor response: RECIST and beyond. Biomark Res 2021;9(1):52.
  • 5. Eshaghi A, Young AL, Wijeratne PA, et al. Identifying multiple sclerosis subtypes using unsupervised machine learning and MRI data. Nat Commun 2021;12(1):2078. [Published correction appears in Nat Commun 2021;12(1):3169.]
  • 6. European Society of Radiology (ESR). Good practice for radiological reporting. Guidelines from the European Society of Radiology (ESR). Insights Imaging 2011;2(2):93–96.
  • 7. Hartung MP, Bickle IC, Gaillard F, Kanne JP. How to Create a Great Radiology Report. RadioGraphics 2020;40(6):1658–1670.
  • 8. Bruno MA, Walker EA, Abujudeh HH. Understanding and Confronting Our Mistakes: The Epidemiology of Error in Radiology and Strategies for Error Reduction. RadioGraphics 2015;35(6):1668–1676.
  • 9. Onder O, Yarasir Y, Azizova A, Durhan G, Onur MR, Ariyurek OM. Errors, discrepancies and underlying bias in radiology with case examples: a pictorial review. Insights Imaging 2021;12(1):51.
  • 10. Kim YW, Mansfield LT. Fool me twice: delayed diagnoses in radiology with emphasis on perpetuated errors. AJR Am J Roentgenol 2014;202(3):465–470.
  • 11. Berlin L. Reporting the "missed" radiologic diagnosis: medicolegal and ethical considerations. Radiology 1994;192(1):183–187.
  • 12. Berlin L. Comparing new radiographs with those obtained previously. AJR Am J Roentgenol 1999;172(1):3–6.
  • 13. Berlin L. Must new radiographs be compared with all previous radiographs, or only with the most recently obtained radiographs? AJR Am J Roentgenol 2000;174(3):611–615.
  • 14. Doshi AM, Kiritsy M, Rosenkrantz AB. Strategies for Avoiding Recommendations for Additional Imaging Through a Comprehensive Comparison With Prior Studies. J Am Coll Radiol 2015;12(7):657–663.
  • 15. Roelofs AAJ, Karssemeijer N, Wedekind N, et al. Importance of comparison of current and prior mammograms in breast cancer screening. Radiology 2007;242(1):70–77.
  • 16. Yankaskas BC, May RC, Matuszewski J, Bowling JM, Jarman MP, Schroeder BF. Effect of observing change from comparison mammograms on performance of screening mammography in a large community-based population. Radiology 2011;261(3):762–770.
  • 17. Burnside ES, Sickles EA, Sohlich RE, Dee KE. Differential value of comparison with previous examinations in diagnostic versus screening mammography. AJR Am J Roentgenol 2002;179(5):1173–1177.
  • 18. Ganapathy A, Adhikari NKJ, Spiegelman J, Scales DC. Routine chest x-rays in intensive care units: a systematic review and meta-analysis. Crit Care 2012;16(2):R68.
  • 19. Sjoding MW, Taylor D, Motyka J, et al. Deep learning to detect acute respiratory distress syndrome on chest radiographs: a retrospective study with external validation. Lancet Digit Health 2021;3(6):e340–e348.
  • 20. Campbell BCV, Khatri P. Stroke. Lancet 2020;396(10244):129–142.
  • 21. Powers WJ, Rabinstein AA, Ackerson T, et al. Guidelines for the Early Management of Patients With Acute Ischemic Stroke: 2019 Update to the 2018 Guidelines for the Early Management of Acute Ischemic Stroke: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association. Stroke 2019;50(12):e344–e418.
  • 22. Hemphill JC 3rd, Greenberg SM, Anderson CS, et al. Guidelines for the Management of Spontaneous Intracerebral Hemorrhage: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association. Stroke 2015;46(7):2032–2060.
  • 23. Vrenken H, Jenkinson M, Pham DL, et al. Opportunities for Understanding MS Mechanisms and Progression With MRI Using Large-Scale Data Sharing and Artificial Intelligence. Neurology 2021;97(21):989–999.
  • 24. Jain S, Ribbens A, Sima DM, et al. Two Time Point MS Lesion Segmentation in Brain MRI: An Expectation-Maximization Framework. Front Neurosci 2016;10:576.
  • 25. Koton S, Schneider ALC, Windham BG, Mosley TH, Gottesman RF, Coresh J. Microvascular Brain Disease Progression and Risk of Stroke: The ARIC Study. Stroke 2020;51(11):3264–3270.
  • 26. Pierson E, Cutler DM, Leskovec J, Mullainathan S, Obermeyer Z. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nat Med 2021;27(1):136–140.
  • 27. Lee S, Yim JJ, Kwak N, et al. Deep Learning to Determine the Activity of Pulmonary Tuberculosis on Chest Radiographs. Radiology 2021;301(2):435–442.
  • 28. Song YS, Park CM, Park SJ, Lee SM, Jeon YK, Goo JM. Volume and mass doubling times of persistent pulmonary subsolid nodules detected in patients without known malignancy. Radiology 2014;273(1):276–284.
  • 29. Hwang EJ, Goo JM, Kim HY, Yi J, Yoon SH, Kim Y. Implementation of the cloud-based computerized interpretation system in a nationwide lung cancer screening with low-dose CT: comparison with the conventional reading system. Eur Radiol 2021;31(1):475–485.
  • 30. Zhang XY, Wang L, Zhu HT, et al. Predicting Rectal Cancer Response to Neoadjuvant Chemoradiotherapy Using Deep Learning of Diffusion Kurtosis MRI. Radiology 2020;296(1):56–64.
  • 31. Mandelblatt JS, Stout NK, Schechter CB, et al. Collaborative Modeling of the Benefits and Harms Associated With Different U.S. Breast Cancer Screening Strategies. Ann Intern Med 2016;164(4):215–225.
  • 32. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 2009;45(2):228–247.
  • 33. Chen DT, Chan W, Thompson ZJ, et al. Utilization of target lesion heterogeneity for treatment efficacy assessment in late stage lung cancer. PLoS One 2021;16(7):e0252041. [Published correction appears in PLoS One 2021;16(7):e0255429.]
  • 34. Bi WL, Hosny A, Schabath MB, et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J Clin 2019;69(2):127–157.
  • 35. Cai J, Tang Y, Yan K, et al. Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies. arXiv preprint arXiv:2012.04872. https://arxiv.org/abs/2012.04872. Posted December 9, 2020. Accessed October 20, 2021.
  • 36. Macdonald DR, Cascino TL, Schold SC Jr, Cairncross JG. Response criteria for phase II studies of supratentorial malignant glioma. J Clin Oncol 1990;8(7):1277–1280.
  • 37. Wen PY, Macdonald DR, Reardon DA, et al. Updated response assessment criteria for high-grade gliomas: response assessment in neuro-oncology working group. J Clin Oncol 2010;28(11):1963–1972.
  • 38. Taal W, Brandsma D, de Bruin HG, et al. Incidence of early pseudo-progression in a cohort of malignant glioma patients treated with chemoirradiation with temozolomide. Cancer 2008;113(2):405–410.
  • 39. Joshi A, Eyuboglu S, Huang SC, et al. OncoNet: Weakly Supervised Siamese Network to automate cancer treatment response assessment between longitudinal FDG PET/CT examinations. arXiv preprint arXiv:2108.02016. https://arxiv.org/abs/2108.02016. Posted August 3, 2021. Accessed October 20, 2021.
  • 40. Jin C, Yu H, Ke J, et al. Predicting treatment response from longitudinal images using multi-task deep learning. Nat Commun 2021;12(1):1851.
  • 41. Petersen J, Isensee F, Köhler G, et al. Continuous-Time Deep Glioma Growth Models. arXiv preprint arXiv:2106.12917. http://arxiv.org/abs/2106.12917. Posted June 23, 2021. Accessed October 24, 2021.
  • 42. Bycroft C, Freeman C, Petkova D, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 2018;562(7726):203–209.
  • 43. Littlejohns TJ, Holliday J, Gibson LM, et al. The UK Biobank imaging enhancement of 100,000 participants: rationale, data collection, management and future directions. Nat Commun 2020;11(1):2624.
  • 44. Toga AW, Thompson PM. The role of image registration in brain mapping. Image Vis Comput 2001;19(1-2):3–24.
  • 45. Mazziotta J, Toga A, Evans A, et al. A probabilistic atlas and reference system for the human brain: International Consortium for Brain Mapping (ICBM). Philos Trans R Soc Lond B Biol Sci 2001;356(1412):1293–1322.
  • 46. Smith SM, Jenkinson M, Woolrich MW, et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 2004;23(Suppl 1):S208–S219.
  • 47. Shi W, Zhang M, Zhang R, Chen S, Zhan Z. Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges. Remote Sens 2020;12(10):1688.
  • 48. McKinley R, Wepfer R, Grunder L, et al. Automatic detection of lesion load change in Multiple Sclerosis using convolutional neural networks with segmentation confidence. Neuroimage Clin 2020;25:102104.
  • 49. Lysandropoulos AP, Absil J, Metens T, et al. Quantifying brain volumes for Multiple Sclerosis patients follow-up in clinical practice - comparison of 1.5 and 3 Tesla magnetic resonance imaging. Brain Behav 2016;6(2):e00422.
  • 50. Bratt A, Blezek DJ, Ryan WJ, et al. Deep Learning Improves the Temporal Reproducibility of Aortic Measurement. J Digit Imaging 2021;34(5):1183–1189.
  • 51. Kickingereder P, Isensee F, Tursunova I, et al. Automated quantitative tumour response assessment of MRI in neuro-oncology with artificial neural networks: a multicentre, retrospective study. Lancet Oncol 2019;20(5):728–740.
  • 52. Berhane H, Scott M, Elbaz M, et al. Fully automated 3D aortic segmentation of 4D flow MRI for hemodynamic analysis using deep learning. Magn Reson Med 2020;84(4):2204–2218.
  • 53. Santeramo R, Withey S, Montana G. Longitudinal Detection of Radiological Abnormalities with Time-Modulated LSTM. In: Stoyanov D, Taylor Z, Carneiro G, et al, eds. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. DLMIA 2018, ML-CDS 2018. Lecture Notes in Computer Science, vol 11045. Cham, Switzerland: Springer, 2018; 326–333.
  • 54. Xu Y, Hosny A, Zeleznik R, et al. Deep Learning Predicts Lung Cancer Treatment Response from Serial Medical Imaging. Clin Cancer Res 2019;25(11):3266–3275.
  • 55. Oh DY, Kim J, Lee KJ. Longitudinal Change Detection on Chest X-rays Using Geometric Correlation Maps. In: Shen D, Liu T, Peters TM, et al, eds. Medical Image Computing and Computer Assisted Intervention – MICCAI 2019. MICCAI 2019. Lecture Notes in Computer Science, vol 11769. Cham, Switzerland: Springer, 2019; 748–756.
  • 56. Seymour CW, Kennedy JN, Wang S, et al. Derivation, Validation, and Potential Treatment Implications of Novel Clinical Phenotypes for Sepsis. JAMA 2019;321(20):2003–2017.
  • 57. Urteaga I, McKillop M, Elhadad N. Learning endometriosis phenotypes from patient-generated data. NPJ Digit Med 2020;3:88.

Article History

Received: Nov 9 2021
Revision requested: Dec 13 2021
Revision received: Jan 25 2022
Accepted: Jan 28 2022
Published online: Apr 19 2022
Published in print: Aug 2022