subscribe to arXiv mailings

doi 10.1038/s41598-024-54164-z

Assessment of Deep Learning Segmentation for Real-Time Free-Breathing Cardiac Magnetic Resonance Imaging at Rest and Under Exercise Stress

Authors: Martin Schilling, Christina Unterberg-Buchwald, Joachim Lotz, Martin Uecker

Abstract: In recent years, a variety of deep learning networks for cardiac MRI (CMR) segmentation have been developed and analyzed. However, nearly all of them are focused on cine CMR under breathold. In this work, accuracy of deep learning methods is assessed for volumetric analysis (via segmentation) of the left ventricle in real-time free-breathing CMR at rest and under exercise stress. Data from healthy… ▽ More In recent years, a variety of deep learning networks for cardiac MRI (CMR) segmentation have been developed and analyzed. However, nearly all of them are focused on cine CMR under breathold. In this work, accuracy of deep learning methods is assessed for volumetric analysis (via segmentation) of the left ventricle in real-time free-breathing CMR at rest and under exercise stress. Data from healthy volunteers (n=15) for cine and real-time free-breathing CMR at rest and under exercise stress were analyzed retrospectively. Segmentations of a commercial software (comDL) and a freely available neural network (nnU-Net), were compared to a reference created via the manual correction of comDL segmentation. Segmentation of left ventricular endocardium (LV), left ventricular myocardium (MYO), and right ventricle (RV) is evaluated for both end-systolic and end-diastolic phases and analyzed with Dice's coefficient (DC). The volumetric analysis includes LV end-diastolic volume (EDV), LV end-systolic volume (ESV), and LV ejection fraction (EF). For cine CMR, nnU-Net and comDL achieve a DC above 0.95 for LV and 0.9 for MYO, and RV. For real-time CMR, the accuracy of nnU-Net exceeds that of comDL overall. For real-time CMR at rest, nnU-Net achieves a DC of 0.94 for LV, 0.89 for MYO, and 0.90 for RV; mean absolute differences between nnU-Net and reference are 2.9mL for EDV, 3.5mL for ESV and 2.6% for EF. For real-time CMR under exercise stress, nnU-Net achieves a DC of 0.92 for LV, 0.85 for MYO, and 0.83 for RV; mean absolute differences between nnU-Net and reference are 11.4mL for EDV, 2.9mL for ESV and 3.6% for EF. Deep learning methods designed or trained for cine CMR segmentation can perform well on real-time CMR. For real-time free-breathing CMR at rest, the performance of deep learning methods is comparable to inter-observer variability in cine CMR and is usable or fully automatic segmentation. △ Less

Submitted 9 February, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

Comments: Martin Schilling and Christina Unterberg-Buchwald contributed equally to this work

Journal ref: Scientific Reports 2024;14:3754

arXiv:2311.09847 [pdf, other]

Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model

Authors: Raphael Schäfer, Till Nicke, Henning Höfener, Annkristin Lange, Dorit Merhof, Friedrich Feuerhake, Volkmar Schulz, Johannes Lotz, Fabian Kiessling

Abstract: Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more heterogeneous datasets common in biomedical imaging. Here, we propose a multi-task learning strategy that decouples the number of training tasks from memory requ… ▽ More Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more heterogeneous datasets common in biomedical imaging. Here, we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a Universal bioMedical PreTrained model (UMedPT) on a multi-task database including tomographic, microscopic, and X-ray images, with various labelling strategies such as classification, segmentation, and object detection. The UMedPT foundational model outperformed ImageNet pretraining and the previous state-of-the-art models. For tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For out-of-domain tasks it required not more than 50% of the original training data. In an external independent validation imaging features extracted using UMedPT proved to be a new standard for cross-center transferability. △ Less

Submitted 16 November, 2023; originally announced November 2023.

Comments: 29 pages, 5 figures

arXiv:2311.00522 [pdf, other]

Text Rendering Strategies for Pixel Language Models

Authors: Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott

Abstract: Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling. However, recent approaches use text renderers that produce a large set of almost-equivalent input patches, which may prove sub-optimal for downstream tasks, due to redundancy in the input representations. In this paper, we inve… ▽ More Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling. However, recent approaches use text renderers that produce a large set of almost-equivalent input patches, which may prove sub-optimal for downstream tasks, due to redundancy in the input representations. In this paper, we investigate four approaches to rendering text in the PIXEL model (Rust et al., 2023), and find that simple character bigram rendering brings improved performance on sentence-level tasks without compromising performance on token-level or multilingual tasks. This new rendering strategy also makes it possible to train a more compact model with only 22M parameters that performs on par with the original 86M parameter model. Our analyses show that character bigram rendering leads to a consistently better model but with an anisotropic patch embedding space, driven by a patch frequency bias, highlighting the connections between image patch- and tokenization-based language models. △ Less

Submitted 1 November, 2023; originally announced November 2023.

Comments: EMNLP 2023

arXiv:2305.18033 [pdf]

doi 10.1038/s41597-023-02422-6

The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue

Authors: Philippe Weitz, Masi Valkonen, Leslie Solorzano, Circe Carr, Kimmo Kartasalo, Constance Boissin, Sonja Koivukoski, Aino Kuusela, Dusan Rasic, Yanbo Feng, Sandra Sinius Pouplier, Abhinav Sharma, Kajsa Ledesma Eriksson, Stephanie Robertson, Christian Marzahl, Chandler D. Gatenbee, Alexander R. A. Anderson, Marek Wodzinski, Artur Jurgas, Niccolò Marini, Manfredo Atzori, Henning Müller, Daniel Budelmann, Nick Weiss, Stefan Heldmann , et al. (16 additional authors not shown)

Abstract: The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration… ▽ More The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration dataset to date, including 4,212 WSIs from 1,152 breast cancer patients. The challenge objective was to align WSIs of tissue that was stained with routine diagnostic immunohistochemistry to its H&E-stained counterpart. We compare the performance of eight WSI registration algorithms, including an investigation of the impact of different WSI properties and clinical covariates. We find that conceptually distinct WSI registration methods can lead to highly accurate registration performances and identify covariates that impact performances across methods. These results establish the current state-of-the-art in WSI registration and guide researchers in selecting and developing methods. △ Less

Submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.03610 [pdf, other]

The Role of Data Curation in Image Captioning

Authors: Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott

Abstract: Image captioning models are typically trained by treating all samples equally, neglecting to account for mismatched or otherwise difficult data points. In contrast, recent work has shown the effectiveness of training models by scheduling the data using curriculum learning strategies. This paper contributes to this direction by actively curating difficult samples in datasets without increasing the… ▽ More Image captioning models are typically trained by treating all samples equally, neglecting to account for mismatched or otherwise difficult data points. In contrast, recent work has shown the effectiveness of training models by scheduling the data using curriculum learning strategies. This paper contributes to this direction by actively curating difficult samples in datasets without increasing the total number of samples. We explore the effect of using three data curation methods within the training process: complete removal of an sample, caption replacement, or image replacement via a text-to-image generation model. Experiments on the Flickr30K and COCO datasets with the BLIP and BEiT-3 models demonstrate that these curation methods do indeed yield improved image captioning models, underscoring their efficacy. △ Less

Submitted 2 February, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

arXiv:2207.06991 [pdf, other]

Language Modelling with Pixels

Authors: Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott

Abstract: Language models are defined over a finite set of inputs, which creates a vocabulary bottleneck when we attempt to scale the number of supported languages. Tackling this bottleneck results in a trade-off between what can be represented in the embedding matrix and computational issues in the output layer. This paper introduces PIXEL, the Pixel-based Encoder of Language, which suffers from neither of… ▽ More Language models are defined over a finite set of inputs, which creates a vocabulary bottleneck when we attempt to scale the number of supported languages. Tackling this bottleneck results in a trade-off between what can be represented in the embedding matrix and computational issues in the output layer. This paper introduces PIXEL, the Pixel-based Encoder of Language, which suffers from neither of these issues. PIXEL is a pretrained language model that renders text as images, making it possible to transfer representations across languages based on orthographic similarity or the co-activation of pixels. PIXEL is trained to reconstruct the pixels of masked patches instead of predicting a distribution over tokens. We pretrain the 86M parameter PIXEL model on the same English data as BERT and evaluate on syntactic and semantic tasks in typologically diverse languages, including various non-Latin scripts. We find that PIXEL substantially outperforms BERT on syntactic and semantic processing tasks on scripts that are not found in the pretraining data, but PIXEL is slightly weaker than BERT when working with Latin scripts. Furthermore, we find that PIXEL is more robust than BERT to orthographic attacks and linguistic code-switching, further confirming the benefits of modelling language with pixels. △ Less

Submitted 26 April, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

Comments: ICLR 2023

arXiv:2203.00449 [pdf, other]

Deep Learning based Prediction of MSI using MMR Markers in Colorectal Cancer

Authors: Ruqayya Awan, Mohammed Nimir, Shan E Ahmed Raza, Mohsin Bilal, Johannes Lotz, David Snead, Andrew Robinson, Nasir Rajpoot

Abstract: The accurate diagnosis and molecular profiling of colorectal cancers are critical for planning the best treatment options for patients. Microsatellite instability (MSI) or mismatch repair (MMR) status plays a vital role in appropriate treatment selection, has prognostic implications and is used to investigate the possibility of patients having underlying genetic disorders (Lynch syndrome). NICE re… ▽ More The accurate diagnosis and molecular profiling of colorectal cancers are critical for planning the best treatment options for patients. Microsatellite instability (MSI) or mismatch repair (MMR) status plays a vital role in appropriate treatment selection, has prognostic implications and is used to investigate the possibility of patients having underlying genetic disorders (Lynch syndrome). NICE recommends that all CRC patients should be offered MMR/MSI testing. Immunohistochemistry is commonly used to assess MMR status with subsequent molecular testing performed as required. This incurs significant extra costs and requires additional resources. The introduction of automated methods that can predict MSI or MMR status from a target image could substantially reduce the cost associated with MMR testing. Unlike previous studies on MSI prediction involving training a CNN using coarse labels (MSI vs Microsatellite Stable (MSS)), we have utilised fine-grain MMR labels for training purposes. In this paper, we present our work on predicting MSI status in a two-stage process using a single target slide either stained with CK8/18 or H&E. First, we trained a multi-headed convolutional neural network model where each head was responsible for predicting one of the MMR protein expressions. To this end, we performed the registration of MMR stained slides to the target slide as a pre-processing step. In the second stage, statistical features computed from the MMR prediction maps were used for the final MSI prediction. Our results demonstrated that MSI classification can be improved by incorporating fine-grained MMR labels in comparison to the previous approaches in which only coarse labels were utilised. △ Less

Submitted 26 April, 2022; v1 submitted 24 February, 2022; originally announced March 2022.

arXiv:2202.09971 [pdf, other]

Deep Feature based Cross-slide Registration

Authors: Ruqayya Awan, Shan E Ahmed Raza, Johannes Lotz, Nick Weiss, Nasir Rajpoot

Abstract: Cross-slide image analysis provides additional information by analysing the expression of different biomarkers as compared to a single slide analysis. These biomarker stained slides are analysed side by side, revealing unknown relations between them. During the slide preparation, a tissue section may be placed at an arbitrary orientation as compared to other sections of the same tissue block. The… ▽ More Cross-slide image analysis provides additional information by analysing the expression of different biomarkers as compared to a single slide analysis. These biomarker stained slides are analysed side by side, revealing unknown relations between them. During the slide preparation, a tissue section may be placed at an arbitrary orientation as compared to other sections of the same tissue block. The problem is compounded by the fact that tissue contents are likely to change from one section to the next and there may be unique artefacts on some of the slides. This makes registration of each section to a reference section of the same tissue block an important pre-requisite task before any cross-slide analysis. We propose a deep feature based registration (DFBR) method which utilises data-driven features to estimate the rigid transformation. We adopted a multi-stage strategy for improving the quality of registration. We also developed a visualisation tool to view registered pairs of WSIs at different magnifications. With the help of this tool, one can apply a transformation on the fly without the need to generate transformed source WSI in a pyramidal form. We compared the performance of data-driven features with that of hand-crafted features on the COMET dataset. Our approach can align the images with low registration errors. Generally, the success of non-rigid registration is dependent on the quality of rigid registration. To evaluate the efficacy of the DFBR method, the first two steps of the ANHIR winner's framework are replaced with our DFBR to register challenge provided image pairs. The modified framework produces comparable results to that of challenge winning team. △ Less

Submitted 25 April, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

arXiv:2106.13150 [pdf, other]

doi 10.1117/1.JMI.10.6.067501

Comparison of Consecutive and Re-stained Sections for Image Registration in Histopathology

Authors: Johannes Lotz, Nick Weiss, Jeroen van der Laak, Stefan Heldmann

Abstract: Purpose: In digital histopathology, virtual multi-staining is important for diagnosis and biomarker research. Additionally, it provides accurate ground-truth for various deep-learning tasks. Virtual multi-staining can be obtained using different stains for consecutive sections or by re-staining the same section. Both approaches require image registration to compensate tissue deformations, but litt… ▽ More Purpose: In digital histopathology, virtual multi-staining is important for diagnosis and biomarker research. Additionally, it provides accurate ground-truth for various deep-learning tasks. Virtual multi-staining can be obtained using different stains for consecutive sections or by re-staining the same section. Both approaches require image registration to compensate tissue deformations, but little attention has been devoted to comparing their accuracy. Approach: We compare variational image registration of consecutive and re-stained sections and analyze the effect of the image resolution which influences accuracy and required computational resources. We present a new hybrid dataset of re-stained and consecutive sections (HyReCo, 81 slide pairs, approx. 3000 landmarks) that we made publicly available and compare its image registration results to the automatic non-rigid histological image registration (ANHIR) challenge data (230 consecutive slide pairs). Results: We obtain a median landmark error after registration of 7.1 μm (HyReCo) and 16.0 μm (ANHIR) between consecutive sections. Between re-stained sections, the median registration error is 2.3 μm and 0.9 μm in the two subsets of the HyReCo dataset. We observe that deformable registration leads to lower landmark errors than affine registration in both cases, though the effect is smaller in re-stained sections. Conclusion: Deformable registration of consecutive and re-stained sections is a valuable tool for the joint analysis of different stains. Significance: While the registration of re-stained sections allows nucleus-level alignment which allows for a direct analysis of interacting biomarkers, consecutive sections only allow the transfer of region-level annotations. The latter can be achieved at low computational cost using coarser image resolutions. △ Less

Submitted 29 September, 2022; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: submitted, data available at https://dx.doi.org/10.21227/pzj5-bs61

arXiv:2003.07801 [pdf, other]

Virtual staining for mitosis detection in Breast Histopathology

Authors: Caner Mercan, Germonda Reijnen-Mooij, David Tellez Martin, Johannes Lotz, Nick Weiss, Marcel van Gerven, Francesco Ciompi

Abstract: We propose a virtual staining methodology based on Generative Adversarial Networks to map histopathology images of breast cancer tissue from H&E stain to PHH3 and vice versa. We use the resulting synthetic images to build Convolutional Neural Networks (CNN) for automatic detection of mitotic figures, a strong prognostic biomarker used in routine breast cancer diagnosis and grading. We propose seve… ▽ More We propose a virtual staining methodology based on Generative Adversarial Networks to map histopathology images of breast cancer tissue from H&E stain to PHH3 and vice versa. We use the resulting synthetic images to build Convolutional Neural Networks (CNN) for automatic detection of mitotic figures, a strong prognostic biomarker used in routine breast cancer diagnosis and grading. We propose several scenarios, in which CNN trained with synthetically generated histopathology images perform on par with or even better than the same baseline model trained with real images. We discuss the potential of this application to scale the number of training samples without the need for manual annotations. △ Less

Submitted 17 March, 2020; originally announced March 2020.

Comments: 5 pages, 4 figures. Accepted for publication at the IEEE International Symposium on Biomedical Imaging (ISBI), 2020

arXiv:1911.12604 [pdf, other]

doi 10.1007/978-3-030-50371-0_51

Eigen-AD: Algorithmic Differentiation of the Eigen Library

Authors: Patrick Peltzer, Johannes Lotz, Uwe Naumann

Abstract: In this work we present useful techniques and possible enhancements when applying an Algorithmic Differentiation (AD) tool to the linear algebra library Eigen using our in-house AD by overloading (AD-O) tool dco/c++ as a case study. After outlining performance and feasibility issues when calculating derivatives for the official Eigen release, we propose Eigen-AD, which enables different optimizati… ▽ More In this work we present useful techniques and possible enhancements when applying an Algorithmic Differentiation (AD) tool to the linear algebra library Eigen using our in-house AD by overloading (AD-O) tool dco/c++ as a case study. After outlining performance and feasibility issues when calculating derivatives for the official Eigen release, we propose Eigen-AD, which enables different optimization options for an AD-O tool by providing add-on modules for Eigen. The range of features includes a better handling of expression templates for general performance improvements, as well as implementations of symbolically derived expressions for calculating derivatives of certain core operations. The software design allows an AD-O tool to provide specializations to automatically include symbolic operations and thereby keep the look and feel of plain AD by overloading. As a showcase, dco/c++ is provided with such a module and its significant performance improvements are validated by benchmarks. △ Less

Submitted 22 June, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

Comments: Updated with accepted version for ICCS 2020 conference proceedings. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-50371-0_51. See v1 for the original, extended preprint. 14 pages, 7 figures

Journal ref: Computational Science - ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3-5, 2020, Proceedings, Part I, 12137, 690-704

arXiv:1903.12063 [pdf, other]

Robust, fast and accurate: a 3-step method for automatic histological image registration

Authors: Johannes Lotz, Nick Weiss, Stefan Heldmann

Abstract: We present a 3-step registration pipeline for differently stained histological serial sections that consists of 1) a robust pre-alignment, 2) a parametric registration computed on coarse resolution images, and 3) an accurate nonlinear registration. In all three steps the NGF distance measure is minimized with respect to an increasingly flexible transformation. We apply the method in the ANHIR imag… ▽ More We present a 3-step registration pipeline for differently stained histological serial sections that consists of 1) a robust pre-alignment, 2) a parametric registration computed on coarse resolution images, and 3) an accurate nonlinear registration. In all three steps the NGF distance measure is minimized with respect to an increasingly flexible transformation. We apply the method in the ANHIR image registration challenge and evaluate its performance on the training data. The presented method is robust (error reduction in 99.6% of the cases), fast (runtime 4 seconds) and accurate (median relative target registration error 0.19%). △ Less

Submitted 29 March, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

arXiv:1902.09140 [pdf, other]

doi 10.1109/ICSA-C.2019.00016

Microservice Architectures for Advanced Driver Assistance Systems: A Case-Study

Authors: Jannik Lotz, Andreas Vogelsang, Ola Benderius, Christian Berger

Abstract: The technological advancements of recent years have steadily increased the complexity of vehicle-internal software systems, and the ongoing development towards autonomous driving will further aggravate this situation. This is leading to a level of complexity that is pushing the limits of existing vehicle software architectures and system designs. By changing the software structure to a service-bas… ▽ More The technological advancements of recent years have steadily increased the complexity of vehicle-internal software systems, and the ongoing development towards autonomous driving will further aggravate this situation. This is leading to a level of complexity that is pushing the limits of existing vehicle software architectures and system designs. By changing the software structure to a service-based architecture, companies in other domains successfully managed the rising complexity and created a more agile and future-oriented development process. This paper presents a case-study investigating the feasibility and possible effects of changing the software architecture for a complex driver assistance function to a microservice architecture. The complete procedure is described, starting with the description of the software-environment and the corresponding requirements, followed by the implementation, and the final testing. In addition, this paper provides a high-level evaluation of the microservice architecture for the automotive use-case. The results show that microservice architectures can reduce complexity and time-consuming process steps and makes the automotive software systems prepared for upcoming challenges as long as the principles of microservice architectures are carefully followed. △ Less

Submitted 25 February, 2019; originally announced February 2019.

arXiv:1808.05883 [pdf, other]

doi 10.1038/s41598-018-37257-4

Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard

Authors: Wouter Bulten, Péter Bándi, Jeffrey Hoven, Rob van de Loo, Johannes Lotz, Nick Weiss, Jeroen van der Laak, Bram van Ginneken, Christina Hulsbergen-van de Kaa, Geert Litjens

Abstract: Prostate cancer (PCa) is graded by pathologists by examining the architectural pattern of cancerous epithelial tissue on hematoxylin and eosin (H&E) stained slides. Given the importance of gland morphology, automatically differentiating between glandular epithelial tissue and other tissues is an important prerequisite for the development of automated methods for detecting PCa. We propose a new met… ▽ More Prostate cancer (PCa) is graded by pathologists by examining the architectural pattern of cancerous epithelial tissue on hematoxylin and eosin (H&E) stained slides. Given the importance of gland morphology, automatically differentiating between glandular epithelial tissue and other tissues is an important prerequisite for the development of automated methods for detecting PCa. We propose a new method, using deep learning, for automatically segmenting epithelial tissue in digitized prostatectomy slides. We employed immunohistochemistry (IHC) to render the ground truth less subjective and more precise compared to manual outlining on H&E slides, especially in areas with high-grade and poorly differentiated PCa. Our dataset consisted of 102 tissue blocks, including both low and high grade PCa. From each block a single new section was cut, stained with H&E, scanned, restained using P63 and CK8/18 to highlight the epithelial structure, and scanned again. The H&E slides were co-registered to the IHC slides. On a subset of the IHC slides we applied color deconvolution, corrected stain errors manually, and trained a U-Net to perform segmentation of epithelial structures. Whole-slide segmentation masks generated by the IHC U-Net were used to train a second U-Net on H&E. Our system makes precise cell-level segmentations and segments both intact glands as well as individual (tumor) epithelial cells. We achieved an F1-score of 0.895 on a hold-out test set and 0.827 on an external reference set from a different center. We envision this segmentation as being the first part of a fully automated prostate cancer detection and grading pipeline. △ Less

Submitted 8 February, 2019; v1 submitted 17 August, 2018; originally announced August 2018.

Journal ref: Nature Scientific Reports 9, 864 (2019)

Showing 1–14 of 14 results for author: Lotz, J