-
BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023
Authors:
Anahita Fathi Kazerooni,
Nastaran Khalili,
Xinyang Liu,
Debanjan Haldar,
Zhifan Jiang,
Anna Zapaishchykova,
Julija Pavaine,
Lubdha M. Shah,
Blaise V. Jones,
Nakul Sheth,
Sanjay P. Prabhu,
Aaron S. McAllister,
Wenxin Tu,
Khanak K. Nandolia,
Andres F. Rodriguez,
Ibraheem Salman Shaikh,
Mariana Sanchez Montano,
Hollie Anne Lai,
Maruf Adewole,
Jake Albrecht,
Udunna Anazodo,
Hannah Anderson,
Syed Muhammed Anwar,
Alejandro Aristizabal,
Sina Bagheri
, et al. (54 additional authors not shown)
Abstract:
Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 challenge, the first Brain Tumor Segmentation (BraTS) challenge focused on pediatric brain tumors. This challenge utilized data acquired from multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. BraTS-PEDs 2023 aimed to evaluate volumetric segmentation algorithms for pediatric brain gliomas from magnetic resonance imaging using standardized quantitative performance evaluation metrics employed across the BraTS 2023 challenges. The top-performing AI approaches for pediatric tumor analysis included ensembles of nnU-Net and Swin UNETR, Auto3DSeg, or nnU-Net with a self-supervised framework. The BraTS-PEDs 2023 challenge fostered collaboration between clinicians (neuro-oncologists, neuroradiologists) and AI/imaging scientists, promoting faster data sharing and the development of automated volumetric analysis techniques. These advancements could significantly benefit clinical trials and improve the care of children with brain tumors.
Submitted 11 July, 2024;
originally announced July 2024.
-
QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge
Authors:
Hongwei Bran Li,
Fernando Navarro,
Ivan Ezhov,
Amirhossein Bayat,
Dhritiman Das,
Florian Kofler,
Suprosanna Shit,
Diana Waldmannstetter,
Johannes C. Paetzold,
Xiaobin Hu,
Benedikt Wiestler,
Lucas Zimmer,
Tamaz Amiranashvili,
Chinmay Prabhakar,
Christoph Berger,
Jonas Weidner,
Michelle Alonso-Basant,
Arif Rashid,
Ujjwal Baid,
Wesam Adel,
Deniz Ali,
Bhakti Baheti,
Yingbin Bai,
Ishaan Bhatt,
Sabri Can Cetindag
, et al. (55 additional authors not shown)
Abstract:
Uncertainty in medical image segmentation tasks, especially inter-rater variability arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation, which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions (2D vs. 3D). A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of the ensemble models, as well as the need for further research to develop efficient methods for uncertainty quantification in 3D segmentation tasks.
Submitted 24 June, 2024; v1 submitted 19 March, 2024;
originally announced May 2024.
-
Brain Tumor Segmentation (BraTS) Challenge 2024: Meningioma Radiotherapy Planning Automated Segmentation
Authors:
Dominic LaBella,
Katherine Schumacher,
Michael Mix,
Kevin Leu,
Shan McBurney-Lin,
Pierre Nedelec,
Javier Villanueva-Meyer,
Jonathan Shapey,
Tom Vercauteren,
Kazumi Chia,
Omar Al-Salihi,
Justin Leu,
Lia Halasz,
Yury Velichko,
Chunhao Wang,
John Kirkpatrick,
Scott Floyd,
Zachary J. Reitman,
Trey Mullikin,
Ulas Bagci,
Sean Sachdev,
Jona A. Hattangadi-Gluth,
Tyler Seibert,
Nikdokht Farid,
Connor Puett
, et al. (45 additional authors not shown)
Abstract:
The 2024 Brain Tumor Segmentation Meningioma Radiotherapy (BraTS-MEN-RT) challenge aims to advance automated segmentation algorithms using the largest known multi-institutional dataset of radiotherapy planning brain MRIs with expert-annotated target labels for patients with intact or post-operative meningioma that underwent either conventional external beam radiotherapy or stereotactic radiosurgery. Each case includes a defaced 3D post-contrast T1-weighted radiotherapy planning MRI in its native acquisition space, accompanied by a single-label "target volume" representing the gross tumor volume (GTV) and any at-risk post-operative site. Target volume annotations adhere to established radiotherapy planning protocols, ensuring consistency across cases and institutions. For pre-operative meningiomas, the target volume encompasses the entire GTV and associated nodular dural tail, while for post-operative cases, it includes at-risk resection cavity margins as determined by the treating institution. Case annotations were reviewed and approved by expert neuroradiologists and radiation oncologists. Participating teams will develop, containerize, and evaluate automated segmentation models using this comprehensive dataset. Model performance will be assessed using the lesion-wise Dice Similarity Coefficient and the 95% Hausdorff distance. The top-performing teams will be recognized at the Medical Image Computing and Computer Assisted Intervention Conference in October 2024. BraTS-MEN-RT is expected to significantly advance automated radiotherapy planning by enabling precise tumor segmentation and facilitating tailored treatment, ultimately improving patient outcomes.
Submitted 28 May, 2024;
originally announced May 2024.
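The 95% Hausdorff distance named in the abstract above can be illustrated with a small sketch. This is not the challenge's evaluation code: it assumes each segmentation has already been reduced to a list of surface point coordinates (for example in millimetres), it uses a linearly interpolated percentile, and it takes the larger of the two directed 95th-percentile distances. The function names `hd95` and `percentile` are hypothetical.

```python
import math


def directed_distances(source, target):
    """For each point in source, the distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def percentile(values, q):
    """q-th percentile with linear interpolation between closest ranks (an assumption)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hd95(points_a, points_b):
    """Symmetric 95th-percentile Hausdorff distance between two point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    forward = percentile(directed_distances(points_a, points_b), 95)
    backward = percentile(directed_distances(points_b, points_a), 95)
    return max(forward, backward)


# Toy example: the second set has one outlying point 3 units from the first set.
surface_a = [(0, 0, 0), (1, 0, 0)]
surface_b = [(0, 0, 0), (1, 0, 0), (4, 0, 0)]
print(hd95(surface_a, surface_b))
```

Using the 95th percentile instead of the maximum makes the measure less sensitive to a few outlying surface points; other implementations may define the percentile or the symmetric combination differently.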
-
The 2024 Brain Tumor Segmentation (BraTS) Challenge: Glioma Segmentation on Post-treatment MRI
Authors:
Maria Correia de Verdier,
Rachit Saluja,
Louis Gagnon,
Dominic LaBella,
Ujjwall Baid,
Nourel Hoda Tahon,
Martha Foltyn-Dumitru,
Jikai Zhang,
Maram Alafif,
Saif Baig,
Ken Chang,
Gennaro D'Anna,
Lisa Deptula,
Diviya Gupta,
Muhammad Ammar Haider,
Ali Hussain,
Michael Iv,
Marinos Kontzialis,
Paul Manning,
Farzan Moodi,
Teresa Nunes,
Aaron Simon,
Nico Sollmann,
David Vu,
Maruf Adewole
, et al. (60 additional authors not shown)
Abstract:
Gliomas are the most common malignant primary brain tumors in adults and one of the deadliest types of cancer. There are many challenges in treatment and monitoring due to the genetic diversity and high intrinsic heterogeneity in appearance, shape, histology, and treatment response. Treatments include surgery, radiation, and systemic therapies, with magnetic resonance imaging (MRI) playing a key role in treatment planning and post-treatment longitudinal assessment. The 2024 Brain Tumor Segmentation (BraTS) challenge on post-treatment glioma MRI will provide a community standard and benchmark for state-of-the-art automated segmentation models based on the largest expert-annotated post-treatment glioma MRI dataset. Challenge competitors will develop automated segmentation models to predict four distinct tumor sub-regions consisting of enhancing tissue (ET), surrounding non-enhancing T2/fluid-attenuated inversion recovery (FLAIR) hyperintensity (SNFH), non-enhancing tumor core (NETC), and resection cavity (RC). Models will be evaluated on separate validation and test datasets using standardized performance metrics utilized across the BraTS 2024 cluster of challenges, including lesion-wise Dice Similarity Coefficient and Hausdorff Distance. Models developed during this challenge will advance the field of automated MRI segmentation and contribute to their integration into clinical practice, ultimately enhancing patient care.
Submitted 28 May, 2024;
originally announced May 2024.
-
Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge
Authors:
Dominic LaBella,
Ujjwal Baid,
Omaditya Khanna,
Shan McBurney-Lin,
Ryan McLean,
Pierre Nedelec,
Arif Rashid,
Nourel Hoda Tahon,
Talissa Altes,
Radhika Bhalerao,
Yaseen Dhemesh,
Devon Godfrey,
Fathi Hilal,
Scott Floyd,
Anastasia Janas,
Anahita Fathi Kazerooni,
John Kirkpatrick,
Collin Kent,
Florian Kofler,
Kevin Leu,
Nazanin Maleki,
Bjoern Menze,
Maxence Pajot,
Zachary J. Reitman,
Jeffrey D. Rudie
, et al. (96 additional authors not shown)
Abstract:
We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning automated segmentation models using image data from the largest multi-institutional systematically expert annotated multilabel multi-sequence meningioma MRI dataset to date, which included 1000 training set cases, 141 validation set cases, and 283 hidden test set cases. Each case included T2, T2/FLAIR, T1, and T1Gd brain MRI sequences with associated tumor compartment labels delineating enhancing tumor, non-enhancing tumor, and surrounding non-enhancing T2/FLAIR hyperintensity. Participant automated segmentation models were evaluated and ranked based on a scoring system evaluating lesion-wise metrics including Dice similarity coefficient (DSC) and 95% Hausdorff Distance. The top ranked team had a lesion-wise median DSC of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor, respectively, and a corresponding average DSC of 0.899, 0.904, and 0.871. These results serve as state-of-the-art benchmarks for future pre-operative meningioma automated segmentation algorithms. Additionally, we found that 1286 of 1424 cases (90.3%) had at least 1 compartment voxel abutting the edge of the skull-stripped image, which requires further investigation into optimal pre-processing face anonymization steps.
Submitted 15 May, 2024;
originally announced May 2024.
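For readers unfamiliar with the Dice similarity coefficient reported in the abstract above, the following minimal sketch computes a plain voxel-wise DSC, 2|A ∩ B| / (|A| + |B|), on two binary masks. It is not the challenge's lesion-wise scoring, which additionally evaluates each lesion separately; the flat 0/1 list representation and the convention for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Voxel-wise Dice similarity coefficient for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 predicted voxels, 3 reference voxels, 2 in common.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
print(dice_coefficient(predicted, reference))  # 2 * 2 / (3 + 3)
```

A DSC of 1.0 means the masks overlap exactly and 0.0 means they share no voxels, which is why a median of 0.976 indicates near-complete overlap for most lesions.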
-
The Brain Tumor Segmentation in Pediatrics (BraTS-PEDs) Challenge: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs)
Authors:
Anahita Fathi Kazerooni,
Nastaran Khalili,
Xinyang Liu,
Deep Gandhi,
Zhifan Jiang,
Syed Muhammed Anwar,
Jake Albrecht,
Maruf Adewole,
Udunna Anazodo,
Hannah Anderson,
Ujjwal Baid,
Timothy Bergquist,
Austin J. Borja,
Evan Calabrese,
Verena Chung,
Gian-Marco Conte,
Farouk Dako,
James Eddy,
Ivan Ezhov,
Ariana Familiar,
Keyvan Farahani,
Andrea Franson,
Anurag Gottipati,
Shuvanjan Haldar,
Juan Eugenio Iglesias
, et al. (46 additional authors not shown)
Abstract:
Pediatric tumors of the central nervous system are the most common cause of cancer-related death in children. The five-year survival rate for high-grade gliomas in children is less than 20%. Due to their rarity, the diagnosis of these entities is often delayed, their treatment is mainly based on historic treatment concepts, and clinical trials require multi-institutional collaborations. Here we present the CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs challenge, focused on pediatric brain tumors with data acquired across multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. The CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs challenge brings together clinicians and AI/imaging scientists to lead to faster development of automated segmentation techniques that could benefit clinical trials, and ultimately the care of children with brain tumors.
Submitted 11 July, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
A Robust Ensemble Algorithm for Ischemic Stroke Lesion Segmentation: Generalizability and Clinical Utility Beyond the ISLES Challenge
Authors:
Ezequiel de la Rosa,
Mauricio Reyes,
Sook-Lei Liew,
Alexandre Hutton,
Roland Wiest,
Johannes Kaesmacher,
Uta Hanning,
Arsany Hakim,
Richard Zubal,
Waldo Valenzuela,
David Robben,
Diana M. Sima,
Vincenzo Anania,
Arne Brys,
James A. Meakin,
Anne Mickan,
Gabriel Broocks,
Christian Heitkamp,
Shengbo Gao,
Kongming Liang,
Ziji Zhang,
Md Mahfuzur Rahman Siddiquee,
Andriy Myronenko,
Pooya Ashtari,
Sabine Van Huffel
, et al. (33 additional authors not shown)
Abstract:
Diffusion-weighted MRI (DWI) is essential for stroke diagnosis, treatment decisions, and prognosis. However, image and disease variability hinder the development of generalizable AI algorithms with clinical value. We address this gap by presenting a novel ensemble algorithm derived from the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge. ISLES'22 provided 400 patient scans with ischemic stroke from various medical centers, facilitating the development of a wide range of cutting-edge segmentation algorithms by the research community. Through collaboration with leading teams, we combined top-performing algorithms into an ensemble model that overcomes the limitations of individual solutions. Our ensemble model achieved superior ischemic lesion detection and segmentation accuracy on our internal test set compared to individual algorithms. This accuracy generalized well across diverse image and disease variables. Furthermore, the model excelled in extracting clinical biomarkers. Notably, in a Turing-like test, neuroradiologists consistently preferred the algorithm's segmentations over manual expert efforts, highlighting increased comprehensiveness and precision. Validation using a real-world external dataset (N=1686) confirmed the model's generalizability. The algorithm's outputs also demonstrated strong correlations with clinical scores (admission NIHSS and 90-day mRS) on par with or exceeding expert-derived results, underlining its clinical relevance. This study offers two key findings. First, we present an ensemble algorithm (https://github.com/Tabrisrei/ISLES22_Ensemble) that detects and segments ischemic stroke lesions on DWI across diverse scenarios on par with expert (neuro)radiologists. Second, we show the potential for biomedical challenge outputs to extend beyond the challenge's initial objectives, demonstrating their real-world clinical applicability.
Submitted 3 April, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting
Authors:
Alicia Durrer,
Julia Wolleb,
Florentin Bieder,
Paul Friedrich,
Lester Melie-Garcia,
Mario Ocampo-Pineda,
Cosmin I. Bercea,
Ibrahim E. Hamamci,
Benedikt Wiestler,
Marie Piraud,
Özgür Yaldizli,
Cristina Granziera,
Bjoern H. Menze,
Philippe C. Cattin,
Florian Kofler
Abstract:
Monitoring diseases that affect the brain's structural integrity requires automated analysis of magnetic resonance (MR) images, e.g., for the evaluation of volumetric changes. However, many of the evaluation tools are optimized for analyzing healthy tissue. To enable the evaluation of scans containing pathological tissue, it is therefore required to restore healthy tissue in the pathological areas. In this work, we explore and extend denoising diffusion models for consistent inpainting of healthy 3D brain tissue. We modify state-of-the-art 2D, pseudo-3D, and 3D methods working in the image space, as well as 3D latent and 3D wavelet diffusion models, and train them to synthesize healthy brain tissue. Our evaluation shows that the pseudo-3D model performs best regarding the structural-similarity index, peak signal-to-noise ratio, and mean squared error. To emphasize the clinical relevance, we fine-tune this model on data containing synthetic MS lesions and evaluate it on a downstream brain tissue segmentation task, whereby it outperforms the established FMRIB Software Library (FSL) lesion-filling method.
Submitted 21 March, 2024;
originally announced March 2024.
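Two of the image-quality measures mentioned in the abstract above, mean squared error and peak signal-to-noise ratio, are related by PSNR = 10 log10(MAX^2 / MSE). The sketch below shows that relationship on flat lists of intensities. It assumes intensities scaled to a known range (`data_range`, a hypothetical parameter name) and does not reproduce how the authors restricted or normalized their evaluation.

```python
import math


def mean_squared_error(image_a, image_b):
    """Mean of squared intensity differences between two equally sized images."""
    if len(image_a) != len(image_b):
        raise ValueError("images must contain the same number of voxels")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def psnr(image_a, image_b, data_range=1.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    error = mean_squared_error(image_a, image_b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / error)


# Toy example: one of four voxels differs by 0.2, so MSE is 0.01 and PSNR is 20 dB.
reference = [0.0, 0.5, 1.0, 0.25]
inpainted = [0.0, 0.5, 0.8, 0.25]
print(mean_squared_error(reference, inpainted))
print(psnr(reference, inpainted))
```

Lower MSE and higher PSNR both indicate that the synthesized tissue is closer to the reference intensities.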
-
A Learnable Prior Improves Inverse Tumor Growth Modeling
Authors:
Jonas Weidner,
Ivan Ezhov,
Michal Balcerak,
Marie-Christin Metz,
Sergey Litvinov,
Sebastian Kaltenbach,
Leonhard Feiner,
Laurin Lux,
Florian Kofler,
Jana Lipkova,
Jonas Latz,
Daniel Rueckert,
Bjoern Menze,
Benedikt Wiestler
Abstract:
Biophysical modeling, particularly involving partial differential equations (PDEs), offers significant potential for tailoring disease treatment protocols to individual patients. However, the inverse problem-solving aspect of these models presents a substantial challenge, either due to the high computational requirements of model-based approaches or the limited robustness of deep learning (DL) methods. We propose a novel framework that leverages the unique strengths of both approaches in a synergistic manner. Our method incorporates a DL ensemble for initial parameter estimation, facilitating efficient downstream evolutionary sampling initialized with this DL-based prior. We showcase the effectiveness of integrating a rapid deep-learning algorithm with a high-precision evolution strategy in estimating brain tumor cell concentrations from magnetic resonance images. The DL-Prior plays a pivotal role, significantly constraining the effective sampling-parameter space. This reduction results in a fivefold convergence acceleration and a Dice-score of 95%.
Submitted 7 March, 2024;
originally announced March 2024.
-
SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation
Authors:
Hendrik Möller,
Robert Graf,
Joachim Schmitt,
Benjamin Keinert,
Matan Atad,
Anjany Sekuboyina,
Felix Streckenbach,
Hanna Schön,
Florian Kofler,
Thomas Kroencke,
Stefanie Bette,
Stefan Willich,
Thomas Keil,
Thoralf Niendorf,
Tobias Pischon,
Beate Endemann,
Bjoern Menze,
Daniel Rueckert,
Jan S. Kirschke
Abstract:
Purpose. To present SPINEPS, an open-source deep learning approach for semantic and instance segmentation of 14 spinal structures (ten vertebra substructures, intervertebral discs, spinal cord, spinal canal, and sacrum) in whole body T2w MRI.
Methods. During this HIPAA-compliant, retrospective study, we utilized the public SPIDER dataset (218 subjects, 63% female) and a subset of the German National Cohort (1423 subjects, mean age 53, 49% female) for training and evaluation. We combined CT and T2w segmentations to train models that segment 14 spinal structures in T2w sagittal scans both semantically and instance-wise. Performance evaluation metrics included Dice similarity coefficient, average symmetrical surface distance, panoptic quality, segmentation quality, and recognition quality. Statistical significance was assessed using the Wilcoxon signed-rank test. An in-house dataset was used to qualitatively evaluate out-of-distribution samples.
Results. On the public dataset, our approach outperformed the baseline (instance-wise vertebra dice score 0.929 vs. 0.907, p-value<0.001). Training on auto-generated annotations and evaluating on manually corrected test data from the GNC yielded global dice scores of 0.900 for vertebrae, 0.960 for intervertebral discs, and 0.947 for the spinal canal. Incorporating the SPIDER dataset during training increased these scores to 0.920, 0.967, 0.958, respectively.
Conclusions. The proposed segmentation approach offers robust segmentation of 14 spinal structures in T2w sagittal images, including the spinal cord, spinal canal, intervertebral discs, endplate, sacrum, and vertebrae. The approach yields both a semantic and instance mask as output, thus being easy to utilize. This marks the first publicly available algorithm for whole spine segmentation in sagittal T2w MR imaging.
Submitted 22 April, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA
Authors:
Kaiyuan Yang,
Fabio Musio,
Yihui Ma,
Norman Juchler,
Johannes C. Paetzold,
Rami Al-Maskari,
Luciano Höher,
Hongwei Bran Li,
Ibrahim Ethem Hamamci,
Anjany Sekuboyina,
Suprosanna Shit,
Houjing Huang,
Chinmay Prabhakar,
Ezequiel de la Rosa,
Diana Waldmannstetter,
Florian Kofler,
Fernando Navarro,
Martin Menten,
Ivan Ezhov,
Daniel Rueckert,
Iris Vos,
Ynte Ruigrok,
Birgitta Velthuis,
Hugo Kuijf,
Julien Hämmerli
, et al. (59 additional authors not shown)
Abstract:
The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neuro-vascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited public datasets with annotations on CoW anatomy, especially for CTA. Therefore we organized the TopCoW Challenge in 2023 with the release of an annotated CoW dataset. The TopCoW dataset was the first public dataset with voxel-level annotations for thirteen possible CoW vessel components, enabled by virtual-reality (VR) technology. It was also the first large dataset with paired MRA and CTA from the same patients. TopCoW challenge formalized the CoW characterization problem as a multiclass anatomical segmentation task with an emphasis on topological metrics. We invited submissions worldwide for the CoW segmentation task, which attracted over 140 registered participants from four continents. The top performing teams managed to segment many CoW components to Dice scores around 90%, but with lower scores for communicating arteries and rare variants. There were also topological mistakes for predictions with high Dice scores. Additional topological analysis revealed further areas for improvement in detecting certain CoW components and matching CoW variant topology accurately. TopCoW represented a first attempt at benchmarking the CoW anatomical segmentation task for MRA and CTA, both morphologically and topologically.
Submitted 29 April, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
-
Panoptica -- instance-wise evaluation of 3D semantic and instance segmentation maps
Authors:
Florian Kofler,
Hendrik Möller,
Josef A. Buchner,
Ezequiel de la Rosa,
Ivan Ezhov,
Marcel Rosier,
Isra Mekki,
Suprosanna Shit,
Moritz Negwer,
Rami Al-Maskari,
Ali Ertürk,
Shankeeth Vinayahalingam,
Fabian Isensee,
Sarthak Pati,
Daniel Rueckert,
Jan S. Kirschke,
Stefan K. Ehrlich,
Annika Reinke,
Bjoern Menze,
Benedikt Wiestler,
Marie Piraud
Abstract:
This paper introduces panoptica, a versatile and performance-optimized package designed for computing instance-wise segmentation quality metrics from 2D and 3D segmentation maps. panoptica addresses the limitations of existing metrics and provides a modular framework that complements the original intersection over union-based panoptic quality with other metrics, such as the distance metric Average Symmetric Surface Distance. The package is open-source, implemented in Python, and accompanied by comprehensive documentation and tutorials. panoptica employs a three-step metrics computation process to cover diverse use cases. The efficacy of panoptica is demonstrated on various real-world biomedical datasets, where an instance-wise evaluation is instrumental for an accurate representation of the underlying clinical task. Overall, we envision panoptica as a valuable tool facilitating in-depth evaluation of segmentation methods.
Submitted 5 December, 2023;
originally announced December 2023.
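The intersection-over-union-based panoptic quality that the abstract above refers to is commonly defined as PQ = (sum of IoU over matched pairs) / (TP + 0.5 FP + 0.5 FN), which factors into segmentation quality (SQ) times recognition quality (RQ). The sketch below computes these three numbers from already-matched instances. It is an illustration of the standard formula only and does not use or mirror the panoptica package's API; the function name and its inputs are hypothetical, and the instance matching step is assumed to have been done beforehand.

```python
def panoptic_quality(matched_ious, num_predicted, num_reference):
    """Return (PQ, SQ, RQ) given the IoUs of matched prediction/reference pairs.

    matched_ious: IoU of each true-positive match (matching is assumed done upstream).
    num_predicted / num_reference: total instance counts in each segmentation.
    """
    true_positives = len(matched_ious)
    false_positives = num_predicted - true_positives
    false_negatives = num_reference - true_positives
    denominator = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    if denominator == 0:
        # Assumption: no instances on either side yields zero scores.
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / true_positives if true_positives else 0.0
    rq = true_positives / denominator
    return sq * rq, sq, rq


# Toy example: 2 matches, 1 unmatched prediction, 2 missed reference instances.
pq, sq, rq = panoptic_quality([0.8, 0.6], num_predicted=3, num_reference=4)
print(pq, sq, rq)
```

Separating SQ from RQ shows whether a low score comes from poorly delineated instances or from missed and spurious ones.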
-
Framing image registration as a landmark detection problem for label-noise-aware task representation (HitR)
Authors:
Diana Waldmannstetter,
Ivan Ezhov,
Benedikt Wiestler,
Francesco Campi,
Ivan Kukuljan,
Stefan Ehrlich,
Shankeeth Vinayahalingam,
Bhakti Baheti,
Satrajit Chakrabarty,
Ujjwal Baid,
Spyridon Bakas,
Julian Schwarting,
Marie Metz,
Jan S. Kirschke,
Daniel Rueckert,
Rolf A. Heckemann,
Marie Piraud,
Bjoern H. Menze,
Florian Kofler
Abstract:
Accurate image registration is pivotal in biomedical image analysis, where selecting suitable registration algorithms demands careful consideration. While numerous algorithms are available, the evaluation metrics to assess their performance have remained relatively static. This study addresses this challenge by introducing a novel evaluation metric termed Landmark Hit Rate (HitR), which focuses on the clinical relevance of image registration accuracy. Unlike traditional metrics such as Target Registration Error, which emphasize subresolution differences, HitR considers whether registration algorithms successfully position landmarks within defined confidence zones. This paradigm shift acknowledges the inherent annotation noise in medical images, allowing for more meaningful assessments. To equip HitR with label-noise-awareness, we propose defining these confidence zones based on an Inter-rater Variance analysis. Consequently, hit rate curves are computed for varying landmark zone sizes, enabling performance measurement for a task-specific level of accuracy. Our approach offers a more realistic and meaningful assessment of image registration algorithms, reflecting their suitability for clinical and biomedical applications.
Submitted 1 July, 2024; v1 submitted 31 July, 2023;
originally announced August 2023.
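The idea behind the Landmark Hit Rate in the abstract above, counting a registered landmark as a hit when it falls inside a confidence zone and sweeping the zone size to obtain a curve, can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes spherical zones with one shared radius per evaluation, whereas the paper derives zone sizes from an inter-rater variance analysis. The function names are hypothetical.

```python
import math


def hit_rate(registered, reference, radius):
    """Fraction of registered landmarks lying within `radius` of their reference position."""
    if len(registered) != len(reference) or not reference:
        raise ValueError("need equally sized, non-empty landmark lists")
    hits = sum(1 for p, r in zip(registered, reference) if math.dist(p, r) <= radius)
    return hits / len(reference)


def hit_rate_curve(registered, reference, radii):
    """Hit rate evaluated for each candidate zone radius."""
    return [(radius, hit_rate(registered, reference, radius)) for radius in radii]


# Toy example: three landmarks with registration errors of 0, 1, and 5 units.
registered = [(0, 0, 0), (1, 0, 0), (5, 0, 0)]
reference = [(0, 0, 0), (0, 0, 0), (0, 0, 0)]
for radius, rate in hit_rate_curve(registered, reference, [0.5, 1.0, 5.0]):
    print(radius, rate)
```

Unlike a mean Target Registration Error, the curve does not reward differences smaller than the zone size, which is the property the abstract highlights.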
-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Glioma Segmentation in Sub-Saharan Africa Patient Population (BraTS-Africa)
Authors:
Maruf Adewole,
Jeffrey D. Rudie,
Anu Gbadamosi,
Oluyemisi Toyobo,
Confidence Raymond,
Dong Zhang,
Olubukola Omidiji,
Rachel Akinola,
Mohammad Abba Suwaid,
Adaobi Emegoakor,
Nancy Ojo,
Kenneth Aguh,
Chinasa Kalaiwo,
Gabriel Babatunde,
Afolabi Ogunleye,
Yewande Gbadamosi,
Kator Iorpagher,
Evan Calabrese,
Mariam Aboian,
Marius Linguraru,
Jake Albrecht,
Benedikt Wiestler,
Florian Kofler,
Anastasia Janas,
Dominic LaBella
, et al. (26 additional authors not shown)
Abstract:
Gliomas are the most common type of primary brain tumors. Although gliomas are relatively rare, they are among the deadliest types of cancer, with a survival rate of less than 2 years after diagnosis. Gliomas are challenging to diagnose, hard to treat and inherently resistant to conventional therapy. Years of extensive research to improve diagnosis and treatment of gliomas have decreased mortality rates across the Global North, while chances of survival among individuals in low- and middle-income countries (LMICs) remain unchanged and are significantly worse in Sub-Saharan Africa (SSA) populations. Long-term survival with glioma is associated with the identification of appropriate pathological features on brain MRI and confirmation by histopathology. Since 2012, the Brain Tumor Segmentation (BraTS) Challenge has evaluated state-of-the-art machine learning methods to detect, characterize, and classify gliomas. However, it is unclear if the state-of-the-art methods can be widely implemented in SSA given the extensive use of lower-quality MRI technology, which produces poor image contrast and resolution, and, more importantly, the propensity for late presentation of disease at advanced stages as well as the unique characteristics of gliomas in SSA (i.e., suspected higher rates of gliomatosis cerebri). Thus, the BraTS-Africa Challenge provides a unique opportunity to include brain MRI glioma cases from SSA in global efforts through the BraTS Challenge to develop and evaluate computer-aided-diagnostic (CAD) methods for the detection and characterization of glioma in resource-limited settings, where the potential for CAD tools to transform healthcare is greater.
Submitted 30 May, 2023;
originally announced May 2023.
-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs)
Authors:
Anahita Fathi Kazerooni,
Nastaran Khalili,
Xinyang Liu,
Debanjan Haldar,
Zhifan Jiang,
Syed Muhammed Anwar,
Jake Albrecht,
Maruf Adewole,
Udunna Anazodo,
Hannah Anderson,
Sina Bagheri,
Ujjwal Baid,
Timothy Bergquist,
Austin J. Borja,
Evan Calabrese,
Verena Chung,
Gian-Marco Conte,
Farouk Dako,
James Eddy,
Ivan Ezhov,
Ariana Familiar,
Keyvan Farahani,
Shuvanjan Haldar,
Juan Eugenio Iglesias,
Anastasia Janas
, et al. (48 additional authors not shown)
Abstract:
Pediatric tumors of the central nervous system are the most common cause of cancer-related death in children. The five-year survival rate for high-grade gliomas in children is less than 20%. Due to their rarity, the diagnosis of these entities is often delayed, their treatment is mainly based on historic treatment concepts, and clinical trials require multi-institutional collaborations. The MICCAI Brain Tumor Segmentation (BraTS) Challenge is a landmark community benchmark event with a successful history of 12 years of resource creation for the segmentation and analysis of adult glioma. Here we present the CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs 2023 challenge, which represents the first BraTS challenge focused on pediatric brain tumors with data acquired across multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. The BraTS-PEDs 2023 challenge focuses on benchmarking the development of volumetric segmentation algorithms for pediatric brain glioma through standardized quantitative performance evaluation metrics utilized across the BraTS 2023 cluster of challenges. Models gaining knowledge from the BraTS-PEDs multi-parametric structural MRI (mpMRI) training data will be evaluated on separate validation and unseen test mpMRI data of high-grade pediatric glioma. The CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs 2023 challenge brings together clinicians and AI/imaging scientists to lead to faster development of automated segmentation techniques that could benefit clinical trials, and ultimately the care of children with brain tumors.
Submitted 23 May, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Brain MR Image Synthesis for Tumor Segmentation (BraSyn)
Authors:
Hongwei Bran Li,
Gian Marco Conte,
Syed Muhammad Anwar,
Florian Kofler,
Ivan Ezhov,
Koen van Leemput,
Marie Piraud,
Maria Diaz,
Byrone Cole,
Evan Calabrese,
Jeff Rudie,
Felix Meissen,
Maruf Adewole,
Anastasia Janas,
Anahita Fathi Kazerooni,
Dominic LaBella,
Ahmed W. Moawad,
Keyvan Farahani,
James Eddy,
Timothy Bergquist,
Verena Chung,
Russell Takeshi Shinohara,
Farouk Dako,
Walter Wiggins,
Zachary Reitman
, et al. (43 additional authors not shown)
Abstract:
Automated brain tumor segmentation methods have become well-established and reached performance levels offering clear clinical utility. These methods typically rely on four input magnetic resonance imaging (MRI) modalities: T1-weighted images with and without contrast enhancement, T2-weighted images, and FLAIR images. However, some sequences are often missing in clinical practice due to time constraints or image artifacts, such as patient motion. Consequently, the ability to substitute missing modalities and gain segmentation performance is highly desirable and necessary for the broader adoption of these algorithms in the clinical routine. In this work, we present the establishment of the Brain MR Image Synthesis Benchmark (BraSyn) in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2023. The primary objective of this challenge is to evaluate image synthesis methods that can realistically generate missing MRI modalities when multiple available images are provided. The ultimate aim is to facilitate automated brain tumor segmentation pipelines. The image dataset used in the benchmark is diverse and multi-modal, created through collaboration with various hospitals and research institutions.
Submitted 28 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
The Brain Tumor Segmentation (BraTS) Challenge 2023: Local Synthesis of Healthy Brain Tissue via Inpainting
Authors:
Florian Kofler,
Felix Meissen,
Felix Steinbauer,
Robert Graf,
Eva Oswald,
Ezequiel de da Rosa,
Hongwei Bran Li,
Ujjwal Baid,
Florian Hoelzl,
Oezguen Turgut,
Izabela Horvath,
Diana Waldmannstetter,
Christina Bukas,
Maruf Adewole,
Syed Muhammad Anwar,
Anastasia Janas,
Anahita Fathi Kazerooni,
Dominic LaBella,
Ahmed W Moawad,
Keyvan Farahani,
James Eddy,
Timothy Bergquist,
Verena Chung,
Russell Takeshi Shinohara,
Farouk Dako
, et al. (43 additional authors not shown)
Abstract:
A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with a scan that is already pathological. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantees for images featuring lesions. Examples include but are not limited to algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS 2023 inpainting challenge. Here, the participants' task is to explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later it will be updated to summarize the findings of the challenge. The challenge is organized as part of the BraTS 2023 challenge hosted at the MICCAI 2023 conference in Vancouver, Canada.
Submitted 9 August, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
The ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge 2023: Intracranial Meningioma
Authors:
Dominic LaBella,
Maruf Adewole,
Michelle Alonso-Basanta,
Talissa Altes,
Syed Muhammad Anwar,
Ujjwal Baid,
Timothy Bergquist,
Radhika Bhalerao,
Sully Chen,
Verena Chung,
Gian-Marco Conte,
Farouk Dako,
James Eddy,
Ivan Ezhov,
Devon Godfrey,
Fathi Hilal,
Ariana Familiar,
Keyvan Farahani,
Juan Eugenio Iglesias,
Zhifan Jiang,
Elaine Johanson,
Anahita Fathi Kazerooni,
Collin Kent,
John Kirkpatrick,
Florian Kofler
, et al. (35 additional authors not shown)
Abstract:
Meningiomas are the most common primary intracranial tumor in adults and can be associated with significant morbidity and mortality. Radiologists, neurosurgeons, neuro-oncologists, and radiation oncologists rely on multiparametric MRI (mpMRI) for diagnosis, treatment planning, and longitudinal treatment monitoring; yet automated, objective, and quantitative tools for non-invasive assessment of meningiomas on mpMRI are lacking. The BraTS meningioma 2023 challenge will provide a community standard and benchmark for state-of-the-art automated intracranial meningioma segmentation models based on the largest expert annotated multilabel meningioma mpMRI dataset to date. Challenge competitors will develop automated segmentation models to predict three distinct meningioma sub-regions on MRI including enhancing tumor, non-enhancing tumor core, and surrounding nonenhancing T2/FLAIR hyperintensity. Models will be evaluated on separate validation and held-out test datasets using standardized metrics utilized across the BraTS 2023 series of challenges including the Dice similarity coefficient and Hausdorff distance. The models developed during the course of this challenge will aid in incorporation of automated meningioma MRI segmentation into clinical practice, which will ultimately improve care of patients with meningioma.
Submitted 12 May, 2023;
originally announced May 2023.
-
Primitive Simultaneous Optimization of Similarity Metrics for Image Registration
Authors:
Diana Waldmannstetter,
Benedikt Wiestler,
Julian Schwarting,
Ivan Ezhov,
Marie Metz,
Spyridon Bakas,
Bhakti Baheti,
Satrajit Chakrabarty,
Daniel Rueckert,
Jan S. Kirschke,
Rolf A. Heckemann,
Marie Piraud,
Bjoern H. Menze,
Florian Kofler
Abstract:
Even though simultaneous optimization of similarity metrics is a standard procedure in the field of semantic segmentation, surprisingly, this is much less established for image registration. To help close this gap in the literature, we investigate in a complex multi-modal 3D setting whether simultaneous optimization of registration metrics, here implemented by means of primitive summation, can benefit image registration. We evaluate two challenging datasets containing collections of pre- to post-operative and pre- to intra-operative MR images of glioma. Employing the proposed optimization, we demonstrate improved registration accuracy in terms of TRE on expert neuroradiologists' landmark annotations.
Submitted 12 October, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Why is the winner the best?
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Sharib Ali,
Vincent Andrearczyk,
Marc Aubreville,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano,
Jorge Bernal,
Sebastian Bodenstedt,
Alessandro Casella,
Veronika Cheplygina,
Marie Daum,
Marleen de Bruijne,
Adrien Depeursinge,
Reuben Dorent,
Jan Egger,
David G. Ellis,
Sandy Engelhardt,
Melanie Ganz
, et al. (100 additional authors not shown)
Abstract:
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Submitted 30 March, 2023;
originally announced March 2023.
-
Understanding metric-related pitfalls in image analysis validation
Authors:
Annika Reinke,
Minu D. Tizabi,
Michael Baumgartner,
Matthias Eisenmann,
Doreen Heckmann-Nötzel,
A. Emre Kavur,
Tim Rädsch,
Carole H. Sudre,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Veronika Cheplygina,
Jianxu Chen,
Evangelia Christodoulou,
Beth A. Cimini,
Gary S. Collins,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken
, et al. (53 additional authors not shown)
Abstract:
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
Submitted 23 February, 2024; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Approaching Peak Ground Truth
Authors:
Florian Kofler,
Johannes Wahle,
Ivan Ezhov,
Sophia Wagner,
Rami Al-Maskari,
Emilia Gryska,
Mihail Todorov,
Christina Bukas,
Felix Meissen,
Tingying Peng,
Ali Ertürk,
Daniel Rueckert,
Rolf Heckemann,
Jan Kirschke,
Claus Zimmer,
Benedikt Wiestler,
Bjoern Menze,
Marie Piraud
Abstract:
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing similarity with such. Especially in the biomedical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect one interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better real-world model performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, four categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
Submitted 18 March, 2023; v1 submitted 31 December, 2022;
originally announced January 2023.
-
Where is VALDO? VAscular Lesions Detection and segmentatiOn challenge at MICCAI 2021
Authors:
Carole H. Sudre,
Kimberlin Van Wijnen,
Florian Dubost,
Hieab Adams,
David Atkinson,
Frederik Barkhof,
Mahlet A. Birhanu,
Esther E. Bron,
Robin Camarasa,
Nish Chaturvedi,
Yuan Chen,
Zihao Chen,
Shuai Chen,
Qi Dou,
Tavia Evans,
Ivan Ezhov,
Haojun Gao,
Marta Girones Sanguesa,
Juan Domingo Gispert,
Beatriz Gomez Anson,
Alun D. Hughes,
M. Arfan Ikram,
Silvia Ingala,
H. Rolf Jaeger,
Florian Kofler
, et al. (24 additional authors not shown)
Abstract:
Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and interrater variability. Automated rating may benefit biomedical research, as well as clinical assessment, but diagnostic reliability of existing algorithms is unknown. Here, we present the results of the VAscular Lesions DetectiOn and Segmentation (Where is VALDO?) challenge that was run as a satellite event at the international conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2) and lacunes of presumed vascular origin (Task 3) while leveraging weak and noisy labels. Overall, 12 teams participated in the challenge proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds and 6 for Task 3 - Lacunes). Multi-cohort data was used in both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds and not practically useful results yet for Task 3 - Lacunes. It also highlighted the performance inconsistency across cases that may deter use at an individual level, while still proving useful at a population level.
Submitted 15 August, 2022;
originally announced August 2022.
-
ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset
Authors:
Moritz Roman Hernandez Petzsche,
Ezequiel de la Rosa,
Uta Hanning,
Roland Wiest,
Waldo Enrique Valenzuela Pinilla,
Mauricio Reyes,
Maria Ines Meyer,
Sook-Lei Liew,
Florian Kofler,
Ivan Ezhov,
David Robben,
Alexander Hutton,
Tassilo Friedrich,
Teresa Zarth,
Johannes Bürkle,
The Anh Baran,
Bjoern Menze,
Gabriel Broocks,
Lukas Meyer,
Claus Zimmer,
Tobias Boeckh-Behrens,
Maria Berndt,
Benno Ikenberg,
Benedikt Wiestler,
Jan S. Kirschke
Abstract:
Magnetic resonance imaging (MRI) is a central modality for stroke imaging. It is used upon patient admission to make treatment decisions such as selecting patients for intravenous thrombolysis or endovascular therapy. MRI is later used during the hospital stay to predict outcome by visualizing infarct core size and location. Furthermore, it may be used to characterize stroke etiology, e.g. differentiation between (cardio)-embolic and non-embolic stroke. Computer-based automated medical image processing is increasingly finding its way into clinical routine. Previous iterations of the Ischemic Stroke Lesion Segmentation (ISLES) challenge have helped identify benchmark methods for acute and sub-acute ischemic stroke lesion segmentation. Here we introduce an expert-annotated, multicenter MRI dataset for segmentation of acute to subacute stroke lesions. This dataset comprises 400 multi-vendor MRI cases with high variability in stroke lesion size, quantity and location. It is split into a training dataset of n=250 and a test dataset of n=150. All training data will be made publicly available. The test dataset will be used for model validation only and will not be released to the public. This dataset serves as the foundation of the ISLES 2022 challenge with the goal of finding algorithmic methods to enable the development and benchmarking of robust and accurate segmentation algorithms for ischemic stroke.
Submitted 14 June, 2022;
originally announced June 2022.
-
Metrics reloaded: Recommendations for image analysis validation
Authors:
Lena Maier-Hein,
Annika Reinke,
Patrick Godau,
Minu D. Tizabi,
Florian Buettner,
Evangelia Christodoulou,
Ben Glocker,
Fabian Isensee,
Jens Kleesiek,
Michal Kozubek,
Mauricio Reyes,
Michael A. Riegler,
Manuel Wiesenfarth,
A. Emre Kavur,
Carole H. Sudre,
Michael Baumgartner,
Matthias Eisenmann,
Doreen Heckmann-Nötzel,
Tim Rädsch,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Arriel Benis,
Matthew Blaschko
, et al. (49 additional authors not shown)
Abstract:
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, our large international expert consortium created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output. Based on the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as a classification task at image, object or pixel level, namely image-level classification, object detection, semantic segmentation, and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool, which also provides a point of access to explore weaknesses, strengths and specific recommendations for the most common validation metrics. The broad applicability of our framework across domains is demonstrated by an instantiation for various biological and medical image analysis use cases.
Submitted 23 February, 2024; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Deep Quality Estimation: Creating Surrogate Models for Human Quality Ratings
Authors:
Florian Kofler,
Ivan Ezhov,
Lucas Fidon,
Izabela Horvath,
Ezequiel de la Rosa,
John LaMaster,
Hongwei Li,
Tom Finck,
Suprosanna Shit,
Johannes Paetzold,
Spyridon Bakas,
Marie Piraud,
Jan Kirschke,
Tom Vercauteren,
Claus Zimmer,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Human ratings are abstract representations of segmentation quality. To approximate human quality ratings on scarce expert data, we train surrogate quality estimation models. We evaluate on a complex multi-class segmentation problem, specifically glioma segmentation, following the BraTS annotation protocol. The training data features quality ratings from 15 expert neuroradiologists on a scale ranging from 1 to 6 stars for various computer-generated and manual 3D annotations. Even though the networks operate on 2D images and with scarce training data, we can approximate segmentation quality within a margin of error comparable to human intra-rater reliability. Segmentation quality prediction has broad applications. While an understanding of segmentation quality is imperative for successful clinical translation of automatic segmentation quality algorithms, it can play an essential role in training new segmentation models. Due to the split-second inference times, it can be directly applied within a loss function or as a fully-automatic dataset curation mechanism in a federated learning setting.
Submitted 30 August, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
blob loss: instance imbalance aware loss functions for semantic segmentation
Authors:
Florian Kofler,
Suprosanna Shit,
Ivan Ezhov,
Lucas Fidon,
Izabela Horvath,
Rami Al-Maskari,
Hongwei Li,
Harsharan Bhatia,
Timo Loehr,
Marie Piraud,
Ali Erturk,
Jan Kirschke,
Jan C. Peeken,
Tom Vercauteren,
Claus Zimmer,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Deep convolutional neural networks (CNN) have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Dice coefficient (DSC). By design, DSC can tackle class imbalance; however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor instances and still produce a satisfactory DSC. Nevertheless, detecting tiny instances is crucial for many applications, such as disease monitoring. For example, it is imperative to locate and surveil small-scale lesions in the follow-up of multiple sclerosis patients. We propose a novel family of loss functions, blob loss, primarily aimed at maximizing instance-level detection metrics, such as F1 score and sensitivity. Blob loss is designed for semantic segmentation problems where detecting multiple instances matters. We extensively evaluate a DSC-based blob loss in five complex 3D semantic segmentation tasks featuring pronounced instance heterogeneity in terms of texture and morphology. Compared to soft Dice loss, we achieve 5% improvement for MS lesions, 3% improvement for liver tumor, and an average 2% improvement for microscopy segmentation tasks considering F1 score.
Submitted 6 June, 2023; v1 submitted 17 May, 2022;
originally announced May 2022.
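The instance-imbalance argument in the blob loss abstract above can be illustrated with a small numerical sketch. This is not the authors' loss function; it only shows, on a hypothetical 2D toy mask with hand-picked instance locations, how a prediction that misses a small instance entirely can still reach a high Dice coefficient while instance-level sensitivity drops.

```python
import numpy as np

def dice(pred, ref):
    """Volumetric Dice coefficient between two boolean masks."""
    denom = pred.sum() + ref.sum()
    return 2.0 * np.logical_and(pred, ref).sum() / denom if denom else 1.0

# Toy reference with two foreground instances (locations are illustrative assumptions).
instances = [
    (slice(5, 45), slice(5, 45)),    # large instance: 40 x 40 = 1600 pixels
    (slice(55, 58), slice(55, 58)),  # small instance: 3 x 3 = 9 pixels
]
ref = np.zeros((64, 64), dtype=bool)
for region in instances:
    ref[region] = True

# Prediction that segments the large instance perfectly and misses the small one.
pred = np.zeros_like(ref)
pred[instances[0]] = True

dsc = dice(pred, ref)

# Instance-level sensitivity: fraction of reference instances hit by the prediction.
detected = sum(bool(pred[region].any()) for region in instances)
sensitivity = detected / len(instances)

print(f"Dice: {dsc:.4f}, instance sensitivity: {sensitivity:.2f}")
```

Here the Dice coefficient stays above 0.99 although only one of the two instances is detected, which is the kind of gap an instance-aware loss is meant to address.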
-
A for-loop is all you need. For solving the inverse problem in the case of personalized tumor growth modeling
Authors:
Ivan Ezhov,
Marcel Rosier,
Lucas Zimmer,
Florian Kofler,
Suprosanna Shit,
Johannes Paetzold,
Kevin Scibilia,
Leon Maechler,
Katharina Franitza,
Tamaz Amiranashvili,
Martin J. Menten,
Marie Metz,
Sailesh Conjeti,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Solving the inverse problem is the key step in evaluating the capacity of a physical model to describe real phenomena. In medical image computing, it aligns with the classical theme of image-based model personalization. Traditionally, a solution to the problem is obtained by performing either sampling or variational inference based methods. Both approaches aim to identify a set of free physical model parameters that results in a simulation best matching an empirical observation. When applied to brain tumor modeling, one of the instances of image-based model personalization in medical image computing, the overarching drawback of the methods is the time complexity for finding such a set. In a clinical setting with limited time between imaging and diagnosis or even intervention, this time complexity may prove critical. As the history of quantitative science is the history of compression, we align in this paper with the historical tendency and propose a method compressing complex traditional strategies for solving an inverse problem into a simple database query task. We evaluated different ways of performing the database query task, assessing the trade-off between accuracy and execution time. On the exemplary task of brain tumor growth modeling, we show that the proposed method achieves a speed-up of one order of magnitude compared to existing approaches for solving the inverse problem. The resulting compute time offers critical means for relying on more complex and, hence, realistic models, for integrating image preprocessing and inverse modeling even deeper, or for implementing the current model into a clinical workflow.
Submitted 11 July, 2022; v1 submitted 9 May, 2022;
originally announced May 2022.
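The idea in the abstract above of recasting an inverse problem as a database query can be sketched roughly as follows. Everything here is an illustrative assumption rather than the paper's implementation: `toy_forward` is a hypothetical stand-in for a tumor growth simulation, the parameter grid is arbitrary, and Dice overlap is just one possible similarity measure for the query.

```python
import numpy as np

GRID = np.arange(50)  # 1D toy "image" domain (assumption for illustration)

def toy_forward(center, radius):
    """Hypothetical forward model: returns a binary 1D 'tumor' mask."""
    return np.abs(GRID - center) <= radius

def dice(a, b):
    """Dice overlap between two boolean masks."""
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# Offline step: precompute simulations over a grid of model parameters.
database = [
    ((center, radius), toy_forward(center, radius))
    for center in range(10, 41, 5)
    for radius in range(2, 11, 2)
]

# Online step: given an observation, loop over the database and keep the best match.
observation = toy_forward(25, 6)
best_params, best_score = None, -1.0
for params, simulated in database:
    score = float(dice(simulated, observation))
    if score > best_score:
        best_params, best_score = params, score

print(best_params, best_score)
```

The expensive simulations run once, offline; the per-patient work reduces to a similarity search, which is where the accuracy versus execution time trade-off mentioned in the abstract would arise.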
-
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer
, et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
A Dempster-Shafer approach to trustworthy AI with application to fetal brain MRI segmentation
Authors:
Lucas Fidon,
Michael Aertsen,
Florian Kofler,
Andrea Bink,
Anna L. David,
Thomas Deprest,
Doaa Emam,
Frédéric Guffens,
András Jakab,
Gregor Kasprian,
Patric Kienast,
Andrew Melbourne,
Bjoern Menze,
Nada Mufti,
Ivana Pogledic,
Daniela Prayer,
Marlene Stuempflen,
Esther Van Elslander,
Sébastien Ourselin,
Jan Deprest,
Tom Vercauteren
Abstract:
Deep learning models for medical image segmentation can fail unexpectedly and spectacularly for pathological cases and images acquired at different centers than training images, with labeling errors that violate expert knowledge. Such errors undermine the trustworthiness of deep learning models for medical image segmentation. Mechanisms for detecting and correcting such failures are essential for safely translating this technology into clinics and are likely to be a requirement of future regulations on artificial intelligence (AI). In this work, we propose a trustworthy AI theoretical framework and a practical system that can augment any backbone AI system using a fallback method and a fail-safe mechanism based on Dempster-Shafer theory. Our approach relies on an actionable definition of trustworthy AI. Our method automatically discards the voxel-level labels predicted by the backbone AI that violate expert knowledge and relies on a fallback for those voxels. We demonstrate the effectiveness of the proposed trustworthy AI approach on the largest reported annotated dataset of fetal MRI, consisting of 540 manually annotated fetal brain 3D T2w MRIs from 13 centers. Our trustworthy AI method improves the robustness of a state-of-the-art backbone AI for fetal brain MRIs acquired across various centers and for fetuses with various brain abnormalities.
Submitted 17 January, 2024; v1 submitted 5 April, 2022;
originally announced April 2022.
-
The Brain Tumor Sequence Registration (BraTS-Reg) Challenge: Establishing Correspondence Between Pre-Operative and Follow-up MRI Scans of Diffuse Glioma Patients
Authors:
Bhakti Baheti,
Satrajit Chakrabarty,
Hamed Akbari,
Michel Bilello,
Benedikt Wiestler,
Julian Schwarting,
Evan Calabrese,
Jeffrey Rudie,
Syed Abidi,
Mina Mousa,
Javier Villanueva-Meyer,
Brandon K. K. Fields,
Florian Kofler,
Russell Takeshi Shinohara,
Juan Eugenio Iglesias,
Tony C. W. Mok,
Albert C. S. Chung,
Marek Wodzinski,
Artur Jurgas,
Niccolo Marini,
Manfredo Atzori,
Henning Muller,
Christoph Grobroehmer,
Hanna Siebert,
Lasse Hansen
, et al. (48 additional authors not shown)
Abstract:
Registration of longitudinal brain MRI scans containing pathologies is challenging due to dramatic changes in tissue appearance. Although there has been progress in developing general-purpose medical image registration techniques, they have not yet attained the requisite precision and reliability for this task, highlighting its inherent complexity. Here we describe the Brain Tumor Sequence Registration (BraTS-Reg) challenge, the first public benchmark environment for deformable registration algorithms focusing on estimating correspondences between pre-operative and follow-up scans of the same patient diagnosed with a diffuse brain glioma. The BraTS-Reg data comprise de-identified multi-institutional multi-parametric MRI (mpMRI) scans, curated for size and resolution according to a canonical anatomical template, and divided into training, validation, and testing sets. Clinical experts annotated ground truth (GT) landmark points of anatomical locations distinct across the temporal domain. Quantitative evaluation and ranking were based on the Median Euclidean Error (MEE), Robustness, and the determinant of the Jacobian of the displacement field. The top-ranked methodologies yielded similar performance across all evaluation metrics and shared several methodological commonalities, including pre-alignment, deep neural networks, inverse consistency analysis, and test-time instance optimization on a per-case basis as a post-processing step. The top-ranked method attained an MEE at or below the inter-rater variability for approximately 60% of the evaluated landmarks, underscoring the scope for further accuracy and robustness improvements, especially relative to human experts. The aim of BraTS-Reg is to continue to serve as an active resource for research, with the data and online evaluation tools accessible at https://bratsreg.github.io/.
Submitted 17 April, 2024; v1 submitted 13 December, 2021;
originally announced December 2021.
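As a rough illustration of the Median Euclidean Error named in the BraTS-Reg abstract above, the sketch below computes the median of the per-landmark Euclidean distances between predicted and reference landmark coordinates. The function name and the toy coordinates are assumptions for illustration; the challenge's own evaluation code may differ in details such as units or coordinate conventions.

```python
import numpy as np

def median_euclidean_error(predicted, reference):
    """Median over landmarks of the Euclidean distance between matched points.

    Both inputs are (n_landmarks, 3) arrays in the same coordinate system.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    distances = np.linalg.norm(predicted - reference, axis=1)
    return float(np.median(distances))

# Toy example with three landmarks (values are made up).
reference = np.array([[10.0, 20.0, 30.0],
                      [40.0, 50.0, 60.0],
                      [70.0, 80.0, 90.0]])
offsets = np.array([[3.0, 4.0, 0.0],   # distance 5
                    [0.0, 0.0, 1.0],   # distance 1
                    [0.0, 0.0, 0.0]])  # distance 0
predicted = reference + offsets

mee = median_euclidean_error(predicted, reference)
print(mee)
```

Using the median rather than the mean keeps a single badly placed landmark from dominating the score for a case.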
-
FedCostWAvg: A new averaging for better Federated Learning
Authors:
Leon Mächler,
Ivan Ezhov,
Florian Kofler,
Suprosanna Shit,
Johannes C. Paetzold,
Timo Loehr,
Benedikt Wiestler,
Bjoern Menze
Abstract:
We propose a simple new aggregation strategy for federated learning that won the MICCAI Federated Tumor Segmentation Challenge 2021 (FETS), the first ever challenge on Federated Learning in the Machine Learning community. Our method addresses the problem of how to aggregate multiple models that were trained on different data sets. Conceptually, we propose a new way to choose the weights when averaging the different models, thereby extending the current state of the art (FedAvg). Empirical validation demonstrates that our approach reaches a notable improvement in segmentation performance compared to FedAvg.
Submitted 16 November, 2021;
originally announced November 2021.
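The FedCostWAvg abstract above describes the method as a new way of choosing the weights when averaging models, extending FedAvg. The abstract does not spell out the new weighting, so the sketch below shows only the generic weighted-averaging step with the weights left as an input, using dataset-size weights in the usual FedAvg style as the example. The function name and the dictionary-of-arrays model format are assumptions for illustration.

```python
import numpy as np

def weighted_average(models, weights):
    """Average per-site model parameters with normalized, non-negative weights.

    models  : list of dicts mapping parameter name -> np.ndarray
    weights : one weight per model (e.g. local dataset sizes for FedAvg)
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one
    return {
        name: sum(wi * model[name] for wi, model in zip(w, models))
        for name in models[0]
    }

# Two hypothetical sites with a single parameter tensor each.
site_models = [{"w": np.array([1.0, 1.0])}, {"w": np.array([3.0, 3.0])}]
site_sizes = [100, 300]  # FedAvg-style weights: number of local samples

global_model = weighted_average(site_models, site_sizes)
print(global_model["w"])
```

An alternative aggregation strategy would change only how `weights` is computed; the averaging step itself stays the same.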
-
Learn-Morph-Infer: a new way of solving the inverse problem for brain tumor modeling
Authors:
Ivan Ezhov,
Kevin Scibilia,
Katharina Franitza,
Felix Steinbauer,
Suprosanna Shit,
Lucas Zimmer,
Jana Lipkova,
Florian Kofler,
Johannes Paetzold,
Luca Canalini,
Diana Waldmannstetter,
Martin Menten,
Marie Metz,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Current treatment planning of patients diagnosed with a brain tumor, such as glioma, could significantly benefit from access to the spatial distribution of tumor cell concentration. Existing diagnostic modalities, e.g. magnetic resonance imaging (MRI), contrast areas of high cell density sufficiently well. In gliomas, however, they do not portray areas of low cell concentration, which can often serve as a source for the secondary appearance of the tumor after treatment. To estimate tumor cell densities beyond the visible boundaries of the lesion, numerical simulations of tumor growth could complement imaging information by providing estimates of full spatial distributions of tumor cells. In recent years, a corpus of literature on medical image-based tumor modeling has been published. It includes different mathematical formalisms describing the forward tumor growth model. Alongside, various parametric inference schemes were developed to perform an efficient tumor model personalization, i.e. solving the inverse problem. However, the unifying drawback of all existing approaches is the time complexity of the model personalization, which prohibits a potential integration of the modeling into clinical settings. In this work, we introduce a deep learning based methodology for inferring the patient-specific spatial distribution of brain tumors from T1Gd and FLAIR MRI medical scans. Coined Learn-Morph-Infer, the method achieves real-time performance in the order of minutes on widely available hardware, and the compute time is stable across tumor models of different complexity, such as reaction-diffusion and reaction-advection-diffusion models. We believe the proposed inverse solution approach not only paves the way for clinical translation of brain tumor personalization but can also be adapted to other scientific and engineering domains.
Submitted 25 October, 2022; v1 submitted 7 November, 2021;
originally announced November 2021.
-
Semi-Implicit Neural Solver for Time-dependent Partial Differential Equations
Authors:
Suprosanna Shit,
Ivan Ezhov,
Leon Mächler,
Abinav R.,
Jana Lipkova,
Johannes C. Paetzold,
Florian Kofler,
Marie Piraud,
Bjoern H. Menze
Abstract:
Fast and accurate solutions of time-dependent partial differential equations (PDEs) are of pivotal interest to many research fields, including physics, engineering, and biology. Generally, implicit/semi-implicit schemes are preferred over explicit ones to improve stability and correctness. However, existing semi-implicit methods are usually iterative and employ a general-purpose solver, which may be sub-optimal for a specific class of PDEs. In this paper, we propose a neural solver to learn an optimal iterative scheme in a data-driven fashion for any class of PDEs. Specifically, we modify a single iteration of a semi-implicit solver using a deep neural network. We provide theoretical guarantees for the correctness and convergence of neural solvers analogous to conventional iterative solvers. In addition to the commonly used Dirichlet boundary condition, we adopt a diffuse domain approach to incorporate diverse types of boundary conditions, e.g., Neumann. We show that the proposed neural solver can go beyond linear PDEs and applies to a class of non-linear PDEs, where the non-linear component is non-stiff. We demonstrate the efficacy of our method in 2D and 3D scenarios. To this end, we show how our model generalizes to parameter settings that differ from those seen in training and achieves faster convergence than semi-implicit schemes.
Submitted 3 September, 2021;
originally announced September 2021.
-
Common Limitations of Image Processing Metrics: A Picture Story
Authors:
Annika Reinke,
Minu D. Tizabi,
Carole H. Sudre,
Matthias Eisenmann,
Tim Rädsch,
Michael Baumgartner,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Peter Bankhead,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Jianxu Chen,
Veronika Cheplygina,
Evangelia Christodoulou,
Beth Cimini,
Gary S. Collins,
Sandy Engelhardt,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken
, et al. (68 additional authors not shown)
Abstract:
While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This dynamic, living document is intended to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
Submitted 6 December, 2023; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Are we using appropriate segmentation metrics? Identifying correlates of human expert perception for CNN training beyond rolling the DICE coefficient
Authors:
Florian Kofler,
Ivan Ezhov,
Fabian Isensee,
Fabian Balsiger,
Christoph Berger,
Maximilian Koerner,
Beatrice Demiray,
Julia Rackerseder,
Johannes Paetzold,
Hongwei Li,
Suprosanna Shit,
Richard McKinley,
Marie Piraud,
Spyridon Bakas,
Claus Zimmer,
Nassir Navab,
Jan Kirschke,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Metrics optimized in complex machine learning tasks are often selected in an ad-hoc manner. It is unknown how they align with human expert perception. We explore the correlations between established quantitative segmentation quality metrics and qualitative evaluations by professionally trained human raters. To this end, we conduct psychophysical experiments for two complex biomedical semantic segmentation problems. We discover that current standard metrics and loss functions correlate only moderately with the segmentation quality assessment of experts. Importantly, this effect is particularly pronounced for clinically relevant structures, such as the enhancing tumor compartment of glioma in brain magnetic resonance and grey matter in ultrasound imaging. It is often unclear how to optimize abstract metrics, such as human expert perception, in convolutional neural network (CNN) training. To cope with this challenge, we propose a novel strategy employing techniques of classical statistics to create complementary compound loss functions to better approximate human expert perception. Across all rating experiments, human experts consistently scored computer-generated segmentations better than the human-curated reference labels. Our results, therefore, strongly question many current practices in medical image segmentation and provide meaningful cues for future research.
Submitted 2 May, 2023; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Geometry-aware neural solver for fast Bayesian calibration of brain tumor models
Authors:
Ivan Ezhov,
Tudor Mot,
Suprosanna Shit,
Jana Lipkova,
Johannes C. Paetzold,
Florian Kofler,
Fernando Navarro,
Chantal Pellegrini,
Marcel Kollovieh,
Marie Metz,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Modeling of brain tumor dynamics has the potential to advance therapeutic planning. Current modeling approaches resort to numerical solvers that simulate the tumor progression according to a given differential equation. Using highly efficient numerical solvers, a single forward simulation takes up to a few minutes of compute. At the same time, clinical applications of tumor modeling often imply solving an inverse problem, requiring up to tens of thousands of forward model evaluations when used for a Bayesian model personalization via sampling. This results in a total inference time that is prohibitively expensive for clinical translation. While recent data-driven approaches have become capable of emulating physics simulation, they tend to fail in generalizing over the variability of the boundary conditions imposed by the patient-specific anatomy. In this paper, we propose a learnable surrogate for simulating tumor growth which maps the biophysical model parameters directly to simulation outputs, i.e. the local tumor cell densities, whilst respecting patient geometry. We test the neural solver on Bayesian tumor model personalization for a cohort of glioma patients. Bayesian inference using the proposed surrogate yields estimates analogous to those obtained by solving the forward model with a regular numerical solver. The near-real-time computation cost renders the proposed method suitable for clinical settings. The code is available at https://github.com/IvanEz/tumor-surrogate.
Submitted 14 April, 2021; v1 submitted 9 September, 2020;
originally announced September 2020.
-
A distance-based loss for smooth and continuous skin layer segmentation in optoacoustic images
Authors:
Stefan Gerl,
Johannes C. Paetzold,
Hailong He,
Ivan Ezhov,
Suprosanna Shit,
Florian Kofler,
Amirhossein Bayat,
Giles Tetteh,
Vasilis Ntziachristos,
Bjoern Menze
Abstract:
Raster-scan optoacoustic mesoscopy (RSOM) is a powerful, non-invasive optical imaging technique for functional, anatomical, and molecular skin and tissue analysis. However, both the manual and the automated analysis of such images are challenging, because the RSOM images have very low contrast, poor signal-to-noise ratio, and systematic overlaps between the absorption spectra of melanin and hemoglobin. Nonetheless, the segmentation of the epidermis layer is a crucial step for many downstream medical and diagnostic tasks, such as vessel segmentation or monitoring of cancer progression. We propose a novel, shape-specific loss function that overcomes discontinuous segmentations and achieves smooth segmentation surfaces while preserving the same volumetric Dice and IoU. Further, we validate our epidermis segmentation through the sensitivity of vessel segmentation. We found a 20% improvement in Dice for vessel segmentation tasks when the epidermis mask is provided as additional information to the vessel segmentation network.
Submitted 10 July, 2020;
originally announced July 2020.
-
Red-GAN: Attacking class imbalance via conditioned generation. Yet another perspective on medical image synthesis for skin lesion dermoscopy and brain tumor MRI
Authors:
Ahmad B Qasim,
Ivan Ezhov,
Suprosanna Shit,
Oliver Schoppe,
Johannes C Paetzold,
Anjany Sekuboyina,
Florian Kofler,
Jana Lipkova,
Hongwei Li,
Bjoern Menze
Abstract:
Exploiting learning algorithms under scarce data regimes is a limitation and a reality of the medical imaging field. In an attempt to mitigate the problem, we propose a data augmentation protocol based on generative adversarial networks. We condition the networks on pixel-level information (segmentation mask) and on global-level information (acquisition environment or lesion type). Such conditioning provides immediate access to the image-label pairs while controlling the global class-specific appearance of the synthesized images. To stimulate synthesis of the features relevant for the segmentation task, an additional passive player in the form of a segmentor is introduced into the adversarial game. We validate the approach on two medical datasets: BraTS and ISIC. By controlling the class distribution through injection of synthetic images into the training set, we achieve control over the accuracy levels of the datasets' classes.
Submitted 27 March, 2021; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Neural parameters estimation for brain tumor growth modeling
Authors:
Ivan Ezhov,
Jana Lipkova,
Suprosanna Shit,
Florian Kofler,
Nore Collomb,
Benjamin Lemasson,
Emmanuel Barbier,
Bjoern Menze
Abstract:
Understanding the dynamics of brain tumor progression is essential for optimal treatment planning. Cast in a mathematical formulation, it is typically viewed as evaluation of a system of partial differential equations, wherein the physiological processes that govern the growth of the tumor are considered. To personalize the model, i.e. find a relevant set of parameters, with respect to the tumor dynamics of a particular patient, the model is informed from empirical data, e.g., medical images obtained from diagnostic modalities, such as magnetic resonance imaging. Existing model-observation coupling schemes require a large number of forward integrations of the biophysical model and rely on simplifying assumptions about the functional form linking the output of the model with the image information. In this work, we propose a learning-based technique for the estimation of tumor growth model parameters from medical scans. The technique allows for explicit evaluation of the posterior distribution of the parameters by sequentially training a mixture-density network, relaxing the constraint on the functional form and reducing the number of samples necessary to propagate through the forward model for the estimation. We test the method on synthetic and real scans of rats injected with brain tumors to calibrate the model and to predict tumor progression.
Submitted 9 January, 2020; v1 submitted 1 July, 2019;
originally announced July 2019.
-
DiamondGAN: Unified Multi-Modal Generative Adversarial Networks for MRI Sequences Synthesis
Authors:
Hongwei Li,
Johannes C. Paetzold,
Anjany Sekuboyina,
Florian Kofler,
Jianguo Zhang,
Jan S. Kirschke,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Synthesizing MR imaging sequences is highly relevant in clinical practice, as single sequences are often missing or are of poor quality (e.g. due to motion). Naturally, the idea arises that a target modality would benefit from multi-modal input, as proprietary information of individual modalities can be synergistic. However, existing methods fail to scale up to multiple non-aligned imaging modalities, facing common drawbacks of complex imaging sequences. We propose a novel, scalable and multi-modal approach called DiamondGAN. Our model is capable of performing flexible non-aligned cross-modality synthesis and data infill, when given multiple modalities or any of their arbitrary subsets, learning structured information in an end-to-end fashion. We synthesize two MRI sequences with clinical relevance (i.e., double inversion recovery (DIR) and contrast-enhanced T1 (T1-c)), reconstructed from three common sequences. In addition, we perform a multi-rater visual evaluation experiment and find that trained radiologists are unable to distinguish synthetic DIR images from real ones.
Submitted 26 July, 2019; v1 submitted 29 April, 2019;
originally announced April 2019.
-
The Liver Tumor Segmentation Benchmark (LiTS)
Authors:
Patrick Bilic,
Patrick Christ,
Hongwei Bran Li,
Eugene Vorontsov,
Avi Ben-Cohen,
Georgios Kaissis,
Adi Szeskin,
Colin Jacobs,
Gabriel Efrain Humpire Mamani,
Gabriel Chartrand,
Fabian Lohöfer,
Julian Walter Holch,
Wieland Sommer,
Felix Hofmann,
Alexandre Hostettler,
Naama Lev-Cohain,
Michal Drozdzal,
Michal Marianne Amitai,
Refael Vivantik,
Jacob Sosna,
Ivan Ezhov,
Anjany Sekuboyina,
Fernando Navarro,
Florian Kofler,
Johannes C. Paetzold
, et al. (84 additional authors not shown)
Abstract:
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances with various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
Submitted 25 November, 2022; v1 submitted 13 January, 2019;
originally announced January 2019.