-
Shadow and Light: Digitally Reconstructed Radiographs for Disease Classification
Authors:
Benjamin Hou,
Qingqing Zhu,
Tejas Sudarshan Mathai,
Qiao Jin,
Zhiyong Lu,
Ronald M. Summers
Abstract:
In this paper, we introduce DRR-RATE, a large-scale synthetic chest X-ray dataset derived from the recently released CT-RATE dataset. DRR-RATE comprises 50,188 frontal Digitally Reconstructed Radiographs (DRRs) from 21,304 unique patients. Each image is paired with a corresponding radiology text report and binary labels for 18 pathology classes. Given the controllable nature of DRR generation, it facilitates the inclusion of lateral view images and images from any desired viewing position. This opens up avenues for research into novel multimodal applications involving paired CT, X-ray images from various views, text, and binary labels. We demonstrate the applicability of DRR-RATE alongside existing large-scale chest X-ray resources, notably the CheXpert dataset and CheXnet model. Experiments demonstrate that CheXnet, when trained and tested on the DRR-RATE dataset, achieves sufficient to high AUC scores for six pathologies commonly cited in the literature: Atelectasis, Cardiomegaly, Consolidation, Lung Lesion, Lung Opacity, and Pleural Effusion. Additionally, CheXnet trained on the CheXpert dataset can accurately identify several pathologies, even when operating out of distribution. This confirms that the generated DRR images effectively capture the essential pathology features from CT images. The dataset and labels are publicly accessible at https://huggingface.co/datasets/farrell236/DRR-RATE.
Submitted 5 June, 2024;
originally announced June 2024.
-
Automated classification of multi-parametric body MRI series
Authors:
Boah Kim,
Tejas Sudharshan Mathai,
Kimberly Helm,
Ronald M. Summers
Abstract:
Multi-parametric MRI (mpMRI) studies are widely available in clinical practice for the diagnosis of various diseases. As the volume of mpMRI exams increases yearly, there are concomitant inaccuracies that exist within the DICOM header fields of these exams. This precludes the use of the header information for the arrangement of the different series as part of the radiologist's hanging protocol, and clinician oversight is needed for correction. In this pilot work, we propose an automated framework to classify the type of 8 different series in mpMRI studies. We used 1,363 studies acquired by three Siemens scanners to train a DenseNet-121 model with 5-fold cross-validation. Then, we evaluated the performance of the DenseNet-121 ensemble on a held-out test set of 313 mpMRI studies. Our method achieved an average precision of 96.6%, sensitivity of 96.6%, specificity of 99.6%, and F1 score of 96.6% for the MRI series classification task. To the best of our knowledge, we are the first to develop a method to classify the series type in mpMRI studies acquired at the level of the chest, abdomen, and pelvis. Our method has the capability for robust automation of hanging protocols in modern radiology practice.
Submitted 13 May, 2024;
originally announced May 2024.
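The precision, sensitivity, specificity, and F1 figures quoted in the abstract above are standard classification metrics. As a minimal sketch, assuming a one-vs-rest, macro-averaged computation (the abstract does not state how the averages were formed), they could be derived from predicted and true series labels as follows. The function names and the toy labels are illustrative only and do not come from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, sensitivity, specificity, and F1 from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}


def macro_metrics(y_true, y_pred):
    """Average the one-vs-rest metrics over every label that occurs.

    Assumption: unweighted (macro) averaging across classes.
    """
    labels = sorted(set(y_true) | set(y_pred))
    totals = {"precision": 0.0, "sensitivity": 0.0,
              "specificity": 0.0, "f1": 0.0}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        tn = sum(t != label and p != label for t, p in zip(y_true, y_pred))
        for name, value in binary_metrics(tp, fp, fn, tn).items():
            totals[name] += value
    return {name: value / len(labels) for name, value in totals.items()}


# Toy example with three hypothetical series types.
y_true = ["T1", "T1", "T2", "T2", "DWI", "DWI"]
y_pred = ["T1", "T2", "T2", "T2", "DWI", "DWI"]
print(macro_metrics(y_true, y_pred))
```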
-
MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI
Authors:
Yan Zhuang,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Brandon Khoury,
Boah Kim,
Benjamin Hou,
Nusrat Rabbee,
Abhinav Suri,
Ronald M. Summers
Abstract:
Background: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. Methods: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at the National Institutes of Health (NIH) Clinical Center. The dataset comprises axial pre-contrast T1, arterial, venous, and delayed phases for each patient, amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed MRISegmentator-Abdomen (MRISegmentator for short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). Findings: MRISegmentator achieved an average DSC of 0.861 ± 0.170 and an NSD of 0.924 ± 0.163 on the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829 ± 0.133 and an NSD of 0.908 ± 0.067. For the Duke Liver dataset, an average DSC of 0.933 ± 0.015 and an NSD of 0.929 ± 0.021 were obtained. Interpretation: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, and disease classification, among others.
Submitted 24 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
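The Dice Similarity Coefficient used in the abstract above is defined as twice the overlap between the predicted and reference masks divided by the sum of their sizes. The following is a minimal sketch for a single structure, assuming the masks are given as flattened binary sequences. The function name is illustrative, and returning 1.0 when both masks are empty is a convention chosen here, not something the abstract specifies. The Normalized Surface Distance needs surface extraction and a tolerance and is not covered.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks.

    pred, truth: equal-length sequences of 0/1 (or False/True) voxels.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
print(round(dice_coefficient(pred, truth), 4))
```

In a multi-structure setting, this would typically be evaluated once per label and then averaged over structures and cases.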
-
How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses
Authors:
Qingqing Zhu,
Benjamin Hou,
Tejas S. Mathai,
Pritam Mukherjee,
Qiao Jin,
Xiuying Chen,
Zhizheng Wang,
Ruida Cheng,
Ronald M. Summers,
Zhiyong Lu
Abstract:
Automatically interpreting CT scans can ease the workload of radiologists. However, this is challenging mainly due to the scarcity of adequate datasets and reference standards for evaluation. This study aims to bridge this gap by introducing a novel evaluation framework, named "GPTRadScore". This framework assesses the capabilities of multi-modal LLMs, such as GPT-4 with Vision (GPT-4V), Gemini Pro Vision, LLaVA-Med, and RadFM, in generating descriptions for prospectively-identified findings. By employing a decomposition technique based on GPT-4, GPTRadScore compares these generated descriptions with gold-standard report sentences, analyzing their accuracy in terms of body part, location, and type of finding. Evaluations demonstrated a high correlation with clinician assessments and highlighted its potential over traditional metrics, such as BLEU, METEOR, and ROUGE. Furthermore, to contribute to future studies, we plan to release a benchmark dataset annotated by clinicians. Using GPTRadScore, we found that while GPT-4V and Gemini Pro Vision fare better, their performance revealed significant areas for improvement, primarily due to limitations in the dataset used for training these models. To demonstrate this potential, RadFM was fine-tuned and it resulted in significant accuracy improvements: location accuracy rose from 3.41% to 12.8%, body part accuracy from 29.12% to 53%, and type accuracy from 9.24% to 30%, thereby validating our hypothesis.
Submitted 18 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Automated Plaque Detection and Agatston Score Estimation on Non-Contrast CT Scans: A Multicenter Study
Authors:
Andrew M. Nguyen,
Jianfei Liu,
Tejas Sudharshan Mathai,
Peter C. Grayson,
Ronald M. Summers
Abstract:
Coronary artery calcification (CAC) is a strong and independent predictor of cardiovascular disease (CVD). However, manual assessment of CAC often requires radiological expertise, time, and invasive imaging techniques. The purpose of this multicenter study is to validate an automated cardiac plaque detection model using a 3D multiclass nnU-Net for gated and non-gated non-contrast chest CT volumes. CT scans were performed at three tertiary care hospitals and collected as three datasets, respectively. Heart, aorta, and lung segmentations were determined using TotalSegmentator, while plaques in the coronary arteries and heart valves were manually labeled for 801 volumes. In this work, we demonstrate how the nnU-Net semantic segmentation pipeline may be adapted to detect plaques in the coronary arteries and valves. With a linear correction, nnU-Net deep learning methods may also accurately estimate Agatston scores on chest non-contrast CT scans. Compared to manual Agatston scoring, automated Agatston scoring indicated a slope of the linear regression of 0.841 with an intercept of +16 HU (R^2 = 0.97). These results are an improvement over previous work assessing automated Agatston score computation in non-gated CT scans.
Submitted 14 February, 2024;
originally announced February 2024.
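For readers unfamiliar with the Agatston score mentioned in the abstract above, the sketch below follows the conventional definition: each calcified lesion on each axial slice contributes its area in mm^2 multiplied by a weight determined by its peak attenuation (1 for 130-199 HU, 2 for 200-299 HU, 3 for 300-399 HU, 4 for 400 HU and above), and the contributions are summed. This is not the paper's nnU-Net pipeline or its linear correction, neither of which is described in the abstract. The function names, the input format, and the 1 mm^2 minimum-area default are assumptions made for illustration.

```python
def density_weight(max_hu):
    """Conventional Agatston weighting factor from peak attenuation (HU)."""
    if max_hu >= 400:
        return 4
    if max_hu >= 300:
        return 3
    if max_hu >= 200:
        return 2
    if max_hu >= 130:
        return 1
    return 0  # below the 130 HU calcification threshold


def agatston_score(lesions, min_area_mm2=1.0):
    """Sum area * weight over (area_mm2, max_hu) pairs.

    lesions: one (area_mm2, max_hu) tuple per lesion per slice
             (hypothetical input format).
    min_area_mm2: lesions smaller than this are ignored (assumed default).
    """
    return sum(area * density_weight(max_hu)
               for area, max_hu in lesions
               if area >= min_area_mm2)


# Toy example: the third lesion is too small, the fourth is below 130 HU.
lesions = [(4.0, 150), (2.5, 320), (0.5, 500), (3.0, 120)]
print(agatston_score(lesions))
```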
-
Weakly Supervised Detection of Pheochromocytomas and Paragangliomas in CT
Authors:
David C. Oluigboa,
Bikash Santra,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Jianfei Liu,
Abhishek Jha,
Mayank Patel,
Karel Pacak,
Ronald M. Summers
Abstract:
Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors which have the potential to metastasize. For the management of patients with PPGLs, CT is the preferred modality for precise localization and estimation of their progression. However, due to the myriad variations in size, morphology, and appearance of the tumors in different anatomical regions, radiologists face the challenge of accurately detecting PPGLs. Since clinicians also need to routinely measure their size and track their changes over time across patient visits, manual demarcation of PPGLs is quite a time-consuming and cumbersome process. To reduce the manual effort spent on this task, we propose an automated method to detect PPGLs in CT studies via a proxy segmentation task. As only weak annotations for PPGLs in the form of prospectively marked 2D bounding boxes on an axial slice were available, we extended these 2D boxes into weak 3D annotations and trained a 3D full-resolution nnUNet model to directly segment PPGLs. We evaluated our approach on a dataset consisting of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. We obtained a precision of 70% and sensitivity of 64.1% with our proposed approach when tested on 53 CT studies. Our findings highlight the promising nature of detecting PPGLs via segmentation, and further the state-of-the-art in this exciting yet challenging area of rare cancer management.
Submitted 12 February, 2024;
originally announced February 2024.
-
Automated Classification of Body MRI Sequence Type Using Convolutional Neural Networks
Authors:
Kimberly Helm,
Tejas Sudharshan Mathai,
Boah Kim,
Pritam Mukherjee,
Jianfei Liu,
Ronald M. Summers
Abstract:
Multi-parametric MRI of the body is routinely acquired for the identification of abnormalities and diagnosis of diseases. However, a standard naming convention for the MRI protocols and associated sequences does not exist due to wide variations in imaging practice at institutions and myriad MRI scanners from various manufacturers being used for imaging. The intensity distributions of MRI sequences differ widely as a result, and there also exist information conflicts related to the sequence type in the DICOM headers. At present, clinician oversight is necessary to ensure that the correct sequence is being read and used for diagnosis. This poses a challenge when specific series need to be considered for building a cohort for a large clinical study or for developing AI algorithms. In order to reduce clinician oversight and ensure the validity of the DICOM headers, we propose an automated method to classify the 3D MRI sequence acquired at the levels of the chest, abdomen, and pelvis. In our pilot work, our 3D DenseNet-121 model achieved an F1 score of 99.5% at differentiating 5 common MRI sequences obtained by three Siemens scanners (Aera, Verio, Biograph mMR). To the best of our knowledge, we are the first to develop an automated method for the 3D classification of MRI sequences in the chest, abdomen, and pelvis, and our work has outperformed the previous state-of-the-art MRI series classifiers.
Submitted 12 February, 2024;
originally announced February 2024.
-
Weakly-Supervised Detection of Bone Lesions in CT
Authors:
Tao Sheng,
Tejas Sudharshan Mathai,
Alexander Shieh,
Ronald M. Summers
Abstract:
The skeletal region is one of the common sites of metastatic spread of cancer in the breast and prostate. CT is routinely used to measure the size of lesions in the bones. However, they can be difficult to spot due to the wide variations in their sizes, shapes, and appearances. Precise localization of such lesions would enable reliable tracking of interval changes (growth, shrinkage, or unchanged status). To that end, an automated technique to detect bone lesions is highly desirable. In this pilot work, we developed a pipeline to detect bone lesions (lytic, blastic, and mixed) in CT volumes via a proxy segmentation task. First, we used the bone lesions that were prospectively marked by radiologists in a few 2D slices of CT volumes and converted them into weak 3D segmentation masks. Then, we trained a 3D full-resolution nnUNet model using these weak 3D annotations to segment the lesions and thereby detected them. Our automated method detected bone lesions in CT with a precision of 96.7% and recall of 47.3% despite the use of incomplete and partial training data. To the best of our knowledge, we are the first to attempt the direct detection of bone lesions in CT via a proxy segmentation task.
Submitted 31 January, 2024;
originally announced February 2024.
-
Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports
Authors:
Qingqing Zhu,
Xiuying Chen,
Qiao Jin,
Benjamin Hou,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Xin Gao,
Ronald M Summers,
Zhiyong Lu
Abstract:
In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as Conventional Natural Language Generation (NLG) and Clinical Efficacy (CE), often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method synergizes the expertise of professional radiologists with Large Language Models (LLMs), like GPT-3.5 and GPT-4. Utilizing In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human- and AI-generated reports. This is further enhanced by a Regression model that aggregates sentence evaluation scores. Experimental results show that our "Detailed GPT-4 (5-shot)" model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our "Regressed GPT-4" model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. Moreover, the robustness of our explanations has been validated through a thorough iterative strategy. We plan to publicly release annotations from radiology experts, setting a new standard for accuracy in future assessments. This underscores the potential of our approach in enhancing the quality assessment of AI-driven medical reports.
Submitted 16 February, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Segmentation of Mediastinal Lymph Nodes in CT with Anatomical Priors
Authors:
Tejas Sudharshan Mathai,
Bohan Liu,
Ronald M. Summers
Abstract:
Purpose: Lymph nodes (LNs) in the chest have a tendency to enlarge due to various pathologies, such as lung cancer or pneumonia. Clinicians routinely measure nodal size to monitor disease progression, confirm metastatic cancer, and assess treatment response. However, variations in their shapes and appearances make it cumbersome to identify LNs, which reside outside of most organs. Methods: We propose to segment LNs in the mediastinum by leveraging the anatomical priors of 28 different structures (e.g., lung, trachea, etc.) generated by the public TotalSegmentator tool. The CT volumes from 89 patients available in the public NIH CT Lymph Node dataset were used to train three 3D nnUNet models to segment LNs. The public St. Olavs dataset containing 15 patients (out-of-training-distribution) was used to evaluate the segmentation performance. Results: For the 15 test patients, the 3D cascade nnUNet model obtained the highest Dice scores: 72.2 ± 22.3 for mediastinal LNs with short axis diameter ≥ 8 mm and 54.8 ± 23.8 for all LNs. These results represent an improvement of 10 points over a current approach that was evaluated on the same test dataset. Conclusion: To our knowledge, we are the first to harness 28 distinct anatomical priors to segment mediastinal LNs, and our work can be extended to other nodal zones in the body. The proposed method has immense potential for improved patient outcomes through the identification of enlarged nodes in initial staging CT scans.
Submitted 11 January, 2024;
originally announced January 2024.
-
Enhanced Muscle and Fat Segmentation for CT-Based Body Composition Analysis: A Comparative Study
Authors:
Benjamin Hou,
Tejas Sudharshan Mathai,
Jianfei Liu,
Christopher Parnell,
Ronald M. Summers
Abstract:
Purpose: Body composition measurements from routine abdominal CT can yield personalized risk assessments for asymptomatic and diseased patients. In particular, attenuation and volume measures of muscle and fat are associated with important clinical outcomes, such as cardiovascular events, fractures, and death. This study evaluates the reliability of an Internal tool for the segmentation of muscle and fat (subcutaneous and visceral) as compared to the well-established public TotalSegmentator tool.
Methods: We assessed the tools across 900 CT series from the publicly available SAROS dataset, focusing on muscle, subcutaneous fat, and visceral fat. The Dice score was employed to assess accuracy in subcutaneous fat and muscle segmentation. Due to the lack of ground truth segmentations for visceral fat, Cohen's Kappa was utilized to assess segmentation agreement between the tools.
Results: Our Internal tool achieved a 3% higher Dice (83.8 vs. 80.8) for subcutaneous fat and a 5% improvement (87.6 vs. 83.2) for muscle segmentation respectively. A Wilcoxon signed-rank test revealed that our results were statistically different with p<0.01. For visceral fat, the Cohen's kappa score of 0.856 indicated near-perfect agreement between the two tools. Our internal tool also showed very strong correlations for muscle volume (R^2=0.99), muscle attenuation (R^2=0.93), and subcutaneous fat volume (R^2=0.99) with a moderate correlation for subcutaneous fat attenuation (R^2=0.45).
Conclusion: Our findings indicated that our Internal tool outperformed TotalSegmentator in measuring subcutaneous fat and muscle. The high Cohen's Kappa score for visceral fat suggests a reliable level of agreement between the two tools. These results demonstrate the potential of our tool in advancing the accuracy of body composition analysis.
Submitted 12 April, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
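Cohen's kappa, which the abstract above uses to compare the two tools on visceral fat, corrects the observed agreement between two raters for the agreement expected by chance. Below is a minimal sketch of the unweighted statistic, assuming each tool's output has been flattened into a sequence of per-voxel labels. The function name and the toy labels are illustrative, and the abstract does not say at what granularity the agreement was actually computed.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("need two non-empty label sequences of equal length")
    # Observed agreement: fraction of positions where the raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters always use the same single label (assumed convention).
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example: 1 = visceral fat, 0 = background, for two hypothetical tools.
tool_a = [1, 1, 0, 0, 1, 0, 1, 1]
tool_b = [1, 0, 0, 0, 1, 0, 1, 1]
print(cohens_kappa(tool_a, tool_b))
```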
-
Semantic Image Synthesis for Abdominal CT
Authors:
Yan Zhuang,
Benjamin Hou,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Boah Kim,
Ronald M. Summers
Abstract:
As an emerging and promising type of generative model, diffusion models have been shown to outperform Generative Adversarial Networks (GANs) in multiple tasks, including image synthesis. In this work, we explore semantic image synthesis for abdominal CT using conditional diffusion models, which can be used for downstream applications such as data augmentation. We systematically evaluated the performance of three diffusion models, compared them to other state-of-the-art GAN-based approaches, and studied the different conditioning scenarios for the semantic mask. Experimental results demonstrated that diffusion models were able to synthesize abdominal CT images with better quality. Additionally, encoding the mask and the input separately is more effective than naive concatenation.
Submitted 11 December, 2023;
originally announced December 2023.
-
Automated Measurement of Pericoronary Adipose Tissue Attenuation and Volume in CT Angiography
Authors:
Andrew M. Nguyen,
Tejas Sudharshan Mathai,
Liangchen Liu,
Jianfei Liu,
Ronald M. Summers
Abstract:
Pericoronary adipose tissue (PCAT) is the deposition of fat in the vicinity of the coronary arteries. It is an indicator of coronary inflammation and associated with coronary artery disease. Non-invasive coronary CT angiography (CCTA) is presently used to obtain measures of the thickness, volume, and attenuation of fat deposition. However, prior works solely focus on measuring PCAT using semi-automated approaches at the right coronary artery (RCA) over the left coronary artery (LCA). In this pilot work, we developed a fully automated approach for the measurement of PCAT mean attenuation and volume in the region around both coronary arteries. First, we used a large subset of patients from the public ImageCAS dataset (n = 735) to train a 3D full resolution nnUNet to segment LCA and RCA. Then, we automatically measured PCAT in the surrounding arterial regions. We evaluated our method on a held-out test set of patients (n = 183) from the same dataset. A mean Dice score of 83% and PCAT attenuation of -73.81 ± 12.69 HU were calculated for the RCA, while a mean Dice score of 81% and PCAT attenuation of -77.51 ± 7.94 HU were computed for the LCA. To the best of our knowledge, we are the first to develop a fully automated method to measure PCAT attenuation and volume at both the RCA and LCA. Our work underscores how automated PCAT measurement holds promise as a biomarker for identification of inflammation and cardiac disease.
Submitted 21 November, 2023;
originally announced November 2023.
-
Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports
Authors:
Qingqing Zhu,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Yifan Peng,
Ronald M. Summers,
Zhiyong Lu
Abstract:
Despite the reduction in turn-around times in radiology reports with the use of speech recognition software, persistent communication errors can significantly impact the interpretation of the radiology report. Pre-filling a radiology report holds promise in mitigating reporting errors, and despite efforts in the literature to generate medical reports, there exists a lack of approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset. To address this gap, we propose to use longitudinal multi-modal data, i.e., previous patient visit CXR, current visit CXR, and previous visit report, to pre-fill the 'findings' section of a current patient visit report. We first gathered the longitudinal visit information for 26,625 patients from the MIMIC-CXR dataset and created a new dataset called Longitudinal-MIMIC. With this new dataset, a transformer-based model was trained to capture the information from longitudinal patient visit records containing multi-modal data (CXR images + reports) via a cross-attention-based multi-modal fusion module and a hierarchical memory-driven decoder. In contrast to previous work that only uses current visit data as input to train a model, our work exploits the longitudinal information available to pre-fill the 'findings' section of radiology reports. Experiments show that our approach outperforms several recent approaches. Code will be published at https://github.com/CelestialShine/Longitudinal-Chest-X-Ray.
Submitted 10 October, 2023; v1 submitted 14 June, 2023;
originally announced June 2023.
-
Universal Lymph Node Detection in T2 MRI using Neural Networks
Authors:
Tejas Sudharshan Mathai,
Sungwon Lee,
Thomas C. Shen,
Zhiyong Lu,
Ronald M. Summers
Abstract:
Purpose: Identification of abdominal Lymph Nodes (LN) that are suspicious for metastasis in T2 Magnetic Resonance Imaging (MRI) scans is critical for staging of lymphoproliferative diseases. Prior work on LN detection has been limited to specific anatomical regions of the body (pelvis, rectum) in single MR slices. Therefore, the development of a universal approach to detect LN in full T2 MRI volumes is highly desirable.
Methods: In this study, a Computer Aided Detection (CAD) pipeline to universally identify abdominal LN in volumetric T2 MRI using neural networks is proposed. First, we trained various neural network models for detecting LN: Faster RCNN with and without Hard Negative Example Mining (HNEM), FCOS, FoveaBox, VFNet, and Detection Transformer (DETR). Next, we show that the state-of-the-art (SOTA) VFNet model with Adaptive Training Sample Selection (ATSS) outperforms Faster RCNN with HNEM. Finally, we ensembled models that surpassed a 45% mAP threshold. We found that the VFNet model and one-stage model ensemble can be interchangeably used in the CAD pipeline.
Results: Experiments on 122 test T2 MRI volumes revealed that VFNet achieved a 51.1% mAP and 78.7% recall at 4 false positives (FP) per volume, while the one-stage model ensemble achieved a mAP of 52.3% and sensitivity of 78.7% at 4FP.
Conclusion: Our contribution is a CAD pipeline that detects LN in T2 MRI volumes, resulting in a sensitivity improvement of approximately 14 points over the current SOTA method for LN detection (sensitivity of 78.7% at 4 FP vs. 64.6% at 5 FP per volume).
Submitted 31 March, 2022;
originally announced April 2022.
-
Universal Lesion Detection in CT Scans using Neural Network Ensembles
Authors:
Tarun Mattikalli,
Tejas Sudharshan Mathai,
Ronald M. Summers
Abstract:
In clinical practice, radiologists are reliant on the lesion size when distinguishing metastatic from non-metastatic lesions. A prerequisite for lesion sizing is their detection, as it promotes the downstream assessment of tumor spread. However, lesions vary in their size and appearance in CT scans, and radiologists often miss small lesions during a busy clinical day. To overcome these challenges, we propose the use of state-of-the-art detection neural networks to flag suspicious lesions present in the NIH DeepLesion dataset for sizing. Additionally, we incorporate a bounding box fusion technique to minimize false positives (FP) and improve detection accuracy. Finally, to resemble clinical usage, we constructed an ensemble of the best detection models to localize lesions for sizing with a precision of 65.17% and sensitivity of 91.67% at 4 FP per image. Our results improve upon or maintain the performance of current state-of-the-art methods for lesion detection in challenging CT scans.
Submitted 10 November, 2021; v1 submitted 8 November, 2021;
originally announced November 2021.
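The abstract above does not specify which bounding box fusion technique was used. As a hedged illustration of the general idea, the sketch below merges overlapping detections by score-weighted averaging of their coordinates. It is a simplified stand-in, not the authors' method; `iou`, `fuse_boxes`, and the 0.5 IoU threshold are illustrative assumptions, and scores are assumed to be positive.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(boxes, scores, iou_thr=0.5):
    """Cluster boxes around high-scoring anchors and average each cluster.

    Returns a list of (fused_box, mean_score). Assumes all scores are > 0.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    clusters = []  # each cluster is a list of indices; index 0 is the anchor
    for i in order:
        for cluster in clusters:
            if iou(boxes[cluster[0]], boxes[i]) >= iou_thr:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    fused = []
    for cluster in clusters:
        weight = sum(scores[i] for i in cluster)
        # Score-weighted average of each coordinate within the cluster.
        box = tuple(
            sum(scores[i] * boxes[i][k] for i in cluster) / weight
            for k in range(4)
        )
        fused.append((box, weight / len(cluster)))
    return fused


# Toy detections (hypothetical): two overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.6, 0.8]
fused = fuse_boxes(boxes, scores)
```

In this toy case the two overlapping boxes collapse into one detection, which is how a fusion step can reduce duplicate or spurious boxes produced by an ensemble.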
-
Lymph Node Detection in T2 MRI with Transformers
Authors:
Tejas Sudharshan Mathai,
Sungwon Lee,
Daniel C. Elton,
Thomas C. Shen,
Yifan Peng,
Zhiyong Lu,
Ronald M. Summers
Abstract:
Identification of lymph nodes (LN) in T2 Magnetic Resonance Imaging (MRI) is an important step performed by radiologists during the assessment of lymphoproliferative diseases. The size of the nodes plays a crucial role in their staging, and radiologists sometimes use an additional contrast sequence such as diffusion weighted imaging (DWI) for confirmation. However, lymph nodes have diverse appearances in T2 MRI scans, making it tough to stage for metastasis. Furthermore, radiologists often miss smaller metastatic lymph nodes over the course of a busy day. To deal with these issues, we propose to use the DEtection TRansformer (DETR) network to localize suspicious metastatic lymph nodes for staging in challenging T2 MRI scans acquired by different scanners and exam protocols. False positives (FP) were reduced through a bounding box fusion technique, and a precision of 65.41% and sensitivity of 91.66% at 4 FP per image were achieved. To the best of our knowledge, our results improve upon the current state-of-the-art for lymph node detection in T2 MRI scans.
Submitted 8 November, 2021;
originally announced November 2021.
-
A Study of Domain Generalization on Ultrasound-based Multi-Class Segmentation of Arteries, Veins, Ligaments, and Nerves Using Transfer Learning
Authors:
Edward Chen,
Tejas Sudharshan Mathai,
Vinit Sarode,
Howie Choset,
John Galeotti
Abstract:
Identifying landmarks in the femoral area is crucial for ultrasound (US)-based robot-guided catheter insertion, and their presentation varies when imaged with different scanners. As such, the performance of past deep learning-based approaches is also narrowly limited to the training data distribution; this can be circumvented by fine-tuning all or part of the model, yet the effects of fine-tuning are seldom discussed. In this work, we study the US-based segmentation of multiple classes through transfer learning by fine-tuning different contiguous blocks within the model, and evaluating on a gamut of US data from different scanners and settings. We propose a simple method for predicting generalization on unseen datasets and observe statistically significant differences between the fine-tuning methods while working towards domain generalization.
Submitted 13 November, 2020;
originally announced November 2020.
-
Assessing Lesion Segmentation Bias of Neural Networks on Motion Corrupted Brain MRI
Authors:
Tejas Sudharshan Mathai,
Yi Wang,
Nathan Cross
Abstract:
Patient motion during the magnetic resonance imaging (MRI) acquisition process results in motion artifacts, which limit the ability of radiologists to provide a quantitative assessment of a condition visualized. Oftentimes, radiologists either "see through" the artifacts with reduced diagnostic confidence, or the MR scans are rejected and patients are recalled and re-scanned. Presently, there are many published approaches that focus on MRI artifact detection and correction. However, the key question of the bias exhibited by these algorithms on motion corrupted MRI images is still unanswered. In this paper, we seek to quantify the bias in terms of the impact that different levels of motion artifacts have on the performance of neural networks engaged in a lesion segmentation task. Additionally, we explore the effect of a different learning strategy, curriculum learning, on the segmentation performance. Our results suggest that a network trained using curriculum learning is effective at compensating for different levels of motion artifacts, and improved the segmentation performance by ~9%-15% (p < 0.05) when compared against a conventional shuffled learning strategy on the same motion data. Within each motion category, it either improved or maintained the Dice score. To the best of our knowledge, we are the first to quantitatively assess the segmentation bias on various levels of motion artifacts present in a brain MRI image.
Submitted 12 October, 2020;
originally announced October 2020.
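The segmentation results in the abstract above are reported as Dice scores. For readers unfamiliar with the metric, here is a minimal sketch of the standard Dice coefficient on binary masks. The function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary 2D masks.

    Masks are nested lists of 0/1 values with identical shapes.
    Assumption: two empty masks count as a perfect match (score 1.0).
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical): three foreground pixels each, two in common.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(pred_mask, true_mask)
```

Here the overlap is two pixels out of three in each mask, so the score is 2·2 / (3 + 3), or roughly 0.67.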
-
Accurate Tissue Interface Segmentation via Adversarial Pre-Segmentation of Anterior Segment OCT Images
Authors:
Jiahong Ouyang,
Tejas Sudharshan Mathai,
Kira Lathrop,
John Galeotti
Abstract:
Optical Coherence Tomography (OCT) is an imaging modality that has been widely adopted for visualizing corneal, retinal and limbal tissue structure with micron resolution. It can be used to diagnose pathological conditions of the eye, and for developing pre-operative surgical plans. In contrast to the posterior retina, imaging the anterior tissue structures, such as the limbus and cornea, results in B-scans that exhibit increased speckle noise patterns and imaging artifacts. These artifacts, such as shadowing and specularity, pose a challenge during the analysis of the acquired volumes as they substantially obfuscate the location of tissue interfaces. To deal with the artifacts and speckle noise patterns and accurately segment the shallowest tissue interface, we propose a cascaded neural network framework, which comprises a conditional Generative Adversarial Network (cGAN) and a Tissue Interface Segmentation Network (TISN). The cGAN pre-segments OCT B-scans by removing undesired specular artifacts and speckle noise patterns just above the shallowest tissue interface, and the TISN combines the original OCT image with the pre-segmentation to segment the shallowest interface. We show the applicability of the cascaded framework to corneal datasets, demonstrate that it precisely segments the shallowest corneal interface, and also show its generalization capacity to limbal datasets. We also propose a hybrid framework, wherein the cGAN pre-segmentation is passed to a traditional image analysis-based segmentation algorithm, and describe the improved segmentation performance. To the best of our knowledge, this is the first approach to remove severe specular artifacts and speckle noise patterns (prior to the shallowest interface) that affect the interpretation of anterior segment OCT datasets, thereby resulting in the accurate segmentation of the shallowest tissue interface.
Submitted 7 May, 2019;
originally announced May 2019.
-
Learning to Segment Corneal Tissue Interfaces in OCT Images
Authors:
Tejas Sudharshan Mathai,
Kira Lathrop,
John Galeotti
Abstract:
Accurate and repeatable delineation of corneal tissue interfaces is necessary for surgical planning during anterior segment interventions, such as Keratoplasty. Designing an approach to identify interfaces, which generalizes to datasets acquired from different Optical Coherence Tomographic (OCT) scanners, is paramount. In this paper, we present a Convolutional Neural Network (CNN) based framework called CorNet that can accurately segment three corneal interfaces across datasets obtained with different scan settings from different OCT scanners. Extensive validation of the approach was conducted across all imaged datasets. To the best of our knowledge, this is the first deep learning based approach to segment both anterior and posterior corneal tissue interfaces. Our errors are 2x lower than non-proprietary state-of-the-art corneal tissue interface segmentation algorithms, which include image analysis-based and deep learning approaches.
Submitted 25 January, 2019; v1 submitted 15 October, 2018;
originally announced October 2018.
-
Fast Vessel Segmentation and Tracking in Ultra High-Frequency Ultrasound Images
Authors:
Tejas Sudharshan Mathai,
Lingbo Jin,
Vijay Gorantla,
John Galeotti
Abstract:
Ultra High Frequency Ultrasound (UHFUS) enables the visualization of highly deformable small and medium vessels in the hand. Intricate vessel-based measurements, such as intimal wall thickness and vessel wall compliance, require sub-millimeter vessel tracking between B-scans. Our fast GPU-based approach combines the advantages of local phase analysis, a distance-regularized level set, and an Extended Kalman Filter (EKF), to rapidly segment and track the deforming vessel contour. We validated on 35 UHFUS sequences of vessels in the hand, and we show the transferability of the approach to 5 more diverse datasets acquired by a traditional High Frequency Ultrasound (HFUS) machine. To the best of our knowledge, this is the first algorithm capable of rapidly segmenting and tracking deformable vessel contours in 2D UHFUS images. It is also the fastest and most accurate system for 2D HFUS images.
Submitted 23 July, 2018;
originally announced July 2018.