-
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA
Authors:
Kaiyuan Yang,
Fabio Musio,
Yihui Ma,
Norman Juchler,
Johannes C. Paetzold,
Rami Al-Maskari,
Luciano Höher,
Hongwei Bran Li,
Ibrahim Ethem Hamamci,
Anjany Sekuboyina,
Suprosanna Shit,
Houjing Huang,
Chinmay Prabhakar,
Ezequiel de la Rosa,
Diana Waldmannstetter,
Florian Kofler,
Fernando Navarro,
Martin Menten,
Ivan Ezhov,
Daniel Rueckert,
Iris Vos,
Ynte Ruigrok,
Birgitta Velthuis,
Hugo Kuijf,
Julien Hämmerli, et al. (59 additional authors not shown)
Abstract:
The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neuro-vascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited public datasets with annotations on CoW anatomy, especially for CTA. Therefore, we organized the TopCoW Challenge in 2023 with the release of an annotated CoW dataset. The TopCoW dataset was the first public dataset with voxel-level annotations for thirteen possible CoW vessel components, enabled by virtual-reality (VR) technology. It was also the first large dataset with paired MRA and CTA from the same patients. The TopCoW challenge formalized the CoW characterization problem as a multiclass anatomical segmentation task with an emphasis on topological metrics. We invited submissions worldwide for the CoW segmentation task, which attracted over 140 registered participants from four continents. The top-performing teams managed to segment many CoW components to Dice scores around 90%, but with lower scores for communicating arteries and rare variants. There were also topological mistakes for predictions with high Dice scores. Additional topological analysis revealed further areas for improvement in detecting certain CoW components and matching CoW variant topology accurately. TopCoW represented a first attempt at benchmarking the CoW anatomical segmentation task for MRA and CTA, both morphologically and topologically.
Submitted 29 April, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
-
Panoptica -- instance-wise evaluation of 3D semantic and instance segmentation maps
Authors:
Florian Kofler,
Hendrik Möller,
Josef A. Buchner,
Ezequiel de la Rosa,
Ivan Ezhov,
Marcel Rosier,
Isra Mekki,
Suprosanna Shit,
Moritz Negwer,
Rami Al-Maskari,
Ali Ertürk,
Shankeeth Vinayahalingam,
Fabian Isensee,
Sarthak Pati,
Daniel Rueckert,
Jan S. Kirschke,
Stefan K. Ehrlich,
Annika Reinke,
Bjoern Menze,
Benedikt Wiestler,
Marie Piraud
Abstract:
This paper introduces panoptica, a versatile and performance-optimized package designed for computing instance-wise segmentation quality metrics from 2D and 3D segmentation maps. panoptica addresses the limitations of existing metrics and provides a modular framework that complements the original intersection over union-based panoptic quality with other metrics, such as the distance metric Average Symmetric Surface Distance. The package is open-source, implemented in Python, and accompanied by comprehensive documentation and tutorials. panoptica employs a three-step metrics computation process to cover diverse use cases. The efficacy of panoptica is demonstrated on various real-world biomedical datasets, where an instance-wise evaluation is instrumental for an accurate representation of the underlying clinical task. Overall, we envision panoptica as a valuable tool facilitating in-depth evaluation of segmentation methods.
Submitted 5 December, 2023;
originally announced December 2023.
-
Approaching Peak Ground Truth
Authors:
Florian Kofler,
Johannes Wahle,
Ivan Ezhov,
Sophia Wagner,
Rami Al-Maskari,
Emilia Gryska,
Mihail Todorov,
Christina Bukas,
Felix Meissen,
Tingying Peng,
Ali Ertürk,
Daniel Rueckert,
Rolf Heckemann,
Jan Kirschke,
Claus Zimmer,
Benedikt Wiestler,
Bjoern Menze,
Marie Piraud
Abstract:
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing that similarity. Especially in the biomedical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect one interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of peak ground truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better real-world model performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, four categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
Submitted 18 March, 2023; v1 submitted 31 December, 2022;
originally announced January 2023.
-
blob loss: instance imbalance aware loss functions for semantic segmentation
Authors:
Florian Kofler,
Suprosanna Shit,
Ivan Ezhov,
Lucas Fidon,
Izabela Horvath,
Rami Al-Maskari,
Hongwei Li,
Harsharan Bhatia,
Timo Loehr,
Marie Piraud,
Ali Erturk,
Jan Kirschke,
Jan C. Peeken,
Tom Vercauteren,
Claus Zimmer,
Benedikt Wiestler,
Bjoern Menze
Abstract:
Deep convolutional neural networks (CNNs) have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Dice coefficient (DSC). By design, DSC can tackle class imbalance; however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor instances and still produce a satisfactory DSC. Nevertheless, detecting tiny instances is crucial for many applications, such as disease monitoring. For example, it is imperative to locate and surveil small-scale lesions in the follow-up of multiple sclerosis patients. We propose a novel family of loss functions, blob loss, primarily aimed at maximizing instance-level detection metrics, such as F1 score and sensitivity. Blob loss is designed for semantic segmentation problems where detecting multiple instances matters. We extensively evaluate a DSC-based blob loss in five complex 3D semantic segmentation tasks featuring pronounced instance heterogeneity in terms of texture and morphology. Compared to soft Dice loss, we achieve 5% improvement for MS lesions, 3% improvement for liver tumor, and an average 2% improvement for microscopy segmentation tasks considering F1 score.
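The claim that a large foreground instance can dominate the DSC can be checked on a toy example. The sketch below is not the authors' blob loss; it only illustrates the instance-imbalance problem the abstract describes. The mask layout (one 20x20 instance and three 2x2 instances) and the helper name `dice` are assumptions made for illustration.

```python
import numpy as np


def dice(pred, ref):
    """Dice coefficient between two binary masks."""
    pred = pred.astype(bool)
    ref = ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denom


# Hypothetical reference mask: one large instance and three small ones.
instances = [
    (slice(5, 25), slice(5, 25)),    # large instance, 400 pixels
    (slice(40, 42), slice(10, 12)),  # small instance, 4 pixels
    (slice(40, 42), slice(30, 32)),  # small instance, 4 pixels
    (slice(40, 42), slice(50, 52)),  # small instance, 4 pixels
]
ref = np.zeros((64, 64), dtype=bool)
for inst in instances:
    ref[inst] = True

# Hypothetical prediction: only the large instance is segmented.
pred = np.zeros_like(ref)
pred[instances[0]] = True

# Volumetric score over the whole mask.
overall_dice = dice(pred, ref)

# Instance-level sensitivity: fraction of reference instances
# that overlap the prediction in at least one pixel.
hits = sum(bool(pred[inst].any()) for inst in instances)
sensitivity = hits / len(instances)

print(f"Dice: {overall_dice:.3f}, instance sensitivity: {sensitivity:.2f}")
```

Under these assumptions the Dice score is 800/812, roughly 0.985, even though three of the four instances are missed entirely, so the instance-level sensitivity is only 0.25.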
Submitted 6 June, 2023; v1 submitted 17 May, 2022;
originally announced May 2022.
-
METGAN: Generative Tumour Inpainting and Modality Synthesis in Light Sheet Microscopy
Authors:
Izabela Horvath,
Johannes C. Paetzold,
Oliver Schoppe,
Rami Al-Maskari,
Ivan Ezhov,
Suprosanna Shit,
Hongwei Li,
Ali Ertuerk,
Bjoern H. Menze
Abstract:
Novel multimodal imaging methods are capable of generating extensive, super high resolution datasets for preclinical research. Yet, a massive lack of annotations prevents the broad use of deep learning to analyze such data. So far, existing generative models fail to mitigate this problem because of frequent labeling errors. In this paper, we introduce a novel generative method which leverages real anatomical information to generate realistic image-label pairs of tumours. We construct a dual-pathway generator, for the anatomical image and label, trained in a cycle-consistent setup, constrained by an independent, pretrained segmentor. The generated images yield significant quantitative improvement compared to existing methods. To validate the quality of synthesis, we train segmentation networks on a dataset augmented with the synthetic data, substantially improving the segmentation over baseline.
Submitted 23 April, 2021; v1 submitted 22 April, 2021;
originally announced April 2021.