doi 10.1148/ryai.230601

Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification

Authors: Benjamin Hou, Sung-Won Lee, Jung-Min Lee, Christopher Koh, Jing Xiao, Perry J. Pickhardt, Ronald M. Summers

Abstract: Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer. Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years +/- 11 [s.d.]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity. Results: On NIH-LC (25 patients; mean age, 59 years +/- 14 [s.d.]; 14 male) and NIH-OV (166 patients; mean age, 65 years +/- 9 [s.d.]; all female), the model achieved Dice scores of 0.855 +/- 0.061 (CI: 0.831-0.878) and 0.826 +/- 0.153 (CI: 0.764-0.887), with median volume estimation errors of 19.6% (IQR: 13.2-29.0) and 5.3% (IQR: 2.4-9.7) respectively. On UofW-LC (124 patients; mean age, 46 years +/- 12 [s.d.]; 73 female), the model had a Dice score of 0.830 +/- 0.107 (CI: 0.798-0.863) and median volume estimation error of 9.7% (IQR: 4.5-15.1). The model showed strong agreement with expert assessments, with r^2 values of 0.79, 0.98, and 0.97 across the test sets.
Conclusion: The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments.

Submitted 22 June, 2024; originally announced June 2024.
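
Illustrative note on the metrics named in the abstract above: the Dice coefficient of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below is not the authors' code. It assumes masks are flat lists of 0/1 voxel labels, and it assumes the volume estimation error is the absolute difference in segmented voxel counts relative to the reference, which the abstract does not spell out. The function names and the toy masks are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def volume_error_percent(pred, truth):
    """Absolute difference in segmented voxel counts, as a percentage of the reference count."""
    pred_voxels = sum(1 for p in pred if p)
    truth_voxels = sum(1 for t in truth if t)
    if truth_voxels == 0:
        raise ValueError("reference mask is empty")
    return 100.0 * abs(pred_voxels - truth_voxels) / truth_voxels


# Toy 8-voxel masks (made up for illustration, not study data).
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
reference = [1, 1, 0, 0, 0, 1, 1, 1]

print(dice_coefficient(predicted, reference))      # overlap 3, sizes 4 and 5
print(volume_error_percent(predicted, reference))  # 4 voxels vs 5 voxels
```

With equal voxel spacing, voxel counts are proportional to physical volume, so the percentage error is the same whether it is computed in voxels or millilitres.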

arXiv:2406.10242 [pdf, other]

Physics-Informed Critic in an Actor-Critic Reinforcement Learning for Swimming in Turbulence

Authors: Christopher Koh, Laurent Pagnier, Michael Chertkov

Abstract: Turbulent diffusion causes particles placed in proximity to separate. We investigate the required swimming efforts to maintain a particle close to its passively advected counterpart. We explore optimally balancing these efforts with the intended goal by developing and comparing a novel Physics-Informed Reinforcement Learning (PIRL) strategy with prescribed control (PC) and standard physics-agnostic Reinforcement Learning strategies. Our PIRL scheme, coined the Actor-Physicist, is an adaptation of the Actor-Critic algorithm in which the Neural Network parameterized Critic is replaced with an analytically derived physical heuristic function (the physicist). This strategy is then compared with an analytically computed optimal PC policy derived from a stochastic optimal control formulation and standard physics-agnostic Actor-Critic type algorithms.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 23 pages, 6 figures

arXiv:2402.01741 [pdf]

Development and Testing of a Novel Large Language Model-Based Clinical Decision Support Systems for Medication Safety in 12 Clinical Specialties

Authors: Jasmine Chiat Ling Ong, Liyuan Jin, Kabilan Elangovan, Gilbert Yong San Lim, Daniel Yan Zheng Lim, Gerald Gui Ren Sng, Yuhe Ke, Joshua Yi Min Tung, Ryan Jian Zhong, Christopher Ming Yao Koh, Keane Zhi Hao Lee, Xiang Chen, Jack Kian Chng, Aung Than, Ken Junyang Goh, Daniel Shu Wei Ting

Abstract: Importance: We introduce a novel Retrieval Augmented Generation (RAG)-Large Language Model (LLM) framework as a Clinical Decision Support System (CDSS) to support safe medication prescription. Objective: To evaluate the efficacy of LLM-based CDSS in correctly identifying medication errors in different patient case vignettes from diverse medical and surgical sub-disciplines, against a human expert panel derived ground truth. We compared performance under 2 different CDSS practical healthcare integration modalities: LLM-based CDSS alone (fully autonomous mode) vs junior pharmacist + LLM-based CDSS (co-pilot, assistive mode). Design, Setting, and Participants: Utilizing a RAG model with state-of-the-art medically-related LLMs (GPT-4, Gemini Pro 1.0 and Med-PaLM 2), this study used 61 prescribing error scenarios embedded into 23 complex clinical vignettes across 12 different medical and surgical specialties. A multidisciplinary expert panel assessed these cases for Drug-Related Problems (DRPs) using the PCNE classification and graded severity / potential for harm using revised NCC MERP medication error index. We compared. Results: RAG-LLM performed better compared to LLM alone. When employed in a co-pilot mode, accuracy, recall, and F1 scores were optimized, indicating effectiveness in identifying moderate to severe DRPs. The accuracy of DRP detection with RAG-LLM improved in several categories but at the expense of lower precision.
Conclusions: This study established that a RAG-LLM based CDSS significantly boosts the accuracy of medication error identification when used alongside junior pharmacists (co-pilot), with notable improvements in detecting severe DRPs. This study also illuminates the comparative performance of current state-of-the-art LLMs in RAG-based CDSS systems.

Submitted 17 February, 2024; v1 submitted 29 January, 2024; originally announced February 2024.

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.04134 [pdf]

Using a Large Language Model to generate a Design Structure Matrix

Authors: Edwin C. Y. Koh

Abstract: The Design Structure Matrix (DSM) is an established method used in dependency modelling, especially in the design of complex engineering systems. The generation of DSM is traditionally carried out through manual means and can involve interviewing experts to elicit critical system elements and the relationships between them. Such manual approaches can be time-consuming and costly. This paper presents a workflow that uses a Large Language Model (LLM) to support the generation of DSM and improve productivity. A prototype of the workflow was developed in this work and applied on a diesel engine DSM published previously. It was found that the prototype could reproduce 357 out of 462 DSM entries published (i.e. 77.3%), suggesting that the work can aid DSM generation. A no-code version of the prototype is made available online to support future research.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 16 pages, 7 Figures, 6 Tables

arXiv:2210.13399 [pdf, other]

doi 10.1145/3491102.3517595

Does Mode of Digital Contact Tracing Affect User Willingness to Share Information? A Quantitative Study

Authors: Camellia Zakaria, Pin Sym Foong, Chang Siang Lim, Pavithren V. S. Pakianathan, Gerald Huat Choon Koh, Simon Tangi Perrault

Abstract: Digital contact tracing can limit the spread of infectious diseases. Nevertheless, there remain barriers to attaining sufficient adoption. In this study, we investigate how willingness to participate in contact tracing is affected by two critical factors: the modes of data collection and the type of data collected. We conducted a scenario-based survey study among 220 respondents in the United States (U.S.) to understand their perceptions about contact tracing associated with automated and manual contact tracing methods. The findings indicate a promising use of smartphones and a combination of public health officials and medical health records as information sources. Through a quantitative analysis, we describe how different modalities and individual demographic factors may affect user compliance in providing four key pieces of information to contact tracing.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 18 pages, 11 figures, 13 tables

Journal ref: In CHI Conference on Human Factors in Computing Systems, pp. 1-18. 2022

arXiv:2210.06160 [pdf, other]

doi 10.5220/0010996200003124

RTSDF: Real-time Signed Distance Fields for Soft Shadow Approximation in Games

Authors: Yu Wei Tan, Nicholas Chua, Clarence Koh, Anand Bhojan

Abstract: Signed distance fields (SDFs) are a form of surface representation widely used in computer graphics, having applications in rendering, collision detection and modelling. In interactive media such as games, high-resolution SDFs are commonly produced offline and subsequently loaded into the application, representing rigid meshes only. This work develops a novel technique that combines jump flooding and ray tracing to generate approximate SDFs in real-time. Our approach can produce relatively accurate scene representation for rendering soft shadows while maintaining interactive frame rates. We extend our previous work with details on the design and implementation as well as visual quality and performance evaluation of the technique.

Submitted 11 October, 2022; originally announced October 2022.

ACM Class: I.3

arXiv:2210.04449 [pdf, other]

doi 10.2312/pg.20201232

RTSDF: Generating Signed Distance Fields in Real Time for Soft Shadow Rendering

Authors: Yu Wei Tan, Nicholas Chua, Clarence Koh, Anand Bhojan

Abstract: Signed Distance Fields (SDFs) for surface representation are commonly generated offline and subsequently loaded into interactive applications like games. Since they are not updated every frame, they only provide a rigid surface representation. While there are methods to generate them quickly on GPU, the efficiency of these approaches is limited at high resolutions. This paper showcases a novel technique that combines jump flooding and ray tracing to generate approximate SDFs in real-time for soft shadow approximation, achieving prominent shadow penumbras while maintaining interactive frame rates.

Submitted 10 October, 2022; originally announced October 2022.

ACM Class: I.3

Journal ref: Pacific Graphics Short Papers, Posters, and Work-in-Progress Papers (2020)

arXiv:2111.08572 [pdf, other]

Saath: Speeding up CoFlows by Exploiting the Spatial Dimension

Authors: Akshay Jajoo, Rohan Gandhi, Y. Charlie Hu, Cheng-Kok Koh

Abstract: Coflow scheduling improves data-intensive application performance by improving their networking performance. State-of-the-art Coflow schedulers in essence approximate the classic online Shortest-Job-First (SJF) scheduling, designed for a single CPU, in a distributed setting, with no coordination among how the flows of a Coflow at individual ports are scheduled, and as a result suffer two performance drawbacks: (1) The flows of a Coflow may suffer the out-of-sync problem -- they may be scheduled at different times and drift apart, negatively affecting the Coflow completion time (CCT); (2) FIFO scheduling of flows at each port bears no notion of SJF, leading to suboptimal CCT. We propose SAATH, an online Coflow scheduler that overcomes the above drawbacks by explicitly exploiting the spatial dimension of Coflows. In SAATH, the global scheduler schedules the flows of a Coflow using an all-or-none policy which mitigates the out-of-sync problem. To order the Coflows within each queue, SAATH resorts to a Least-Contention-First (LCoF) policy which we show extends the gist of SJF to the spatial dimension, complemented with starvation freedom. Our evaluation using an Azure testbed and simulations of two production cluster traces shows that compared to Aalo, SAATH reduces the CCT in median (P90) cases by 1.53x (4.5x) and 1.42x (37x), respectively.

Submitted 16 November, 2021; originally announced November 2021.

arXiv:2105.06887 [pdf]

A Frequency Domain Constraint for Synthetic and Real X-ray Image Super Resolution

Authors: Qing Ma, Jae Chul Koh, WonSook Lee

Abstract: Synthetic X-ray images are simulated X-ray images projected from CT data. High-quality synthetic X-ray images can facilitate various applications such as surgical image guidance systems and VR training simulations. However, it is difficult to produce high-quality arbitrary view synthetic X-ray images in real-time due to different CT slice thickness, high computational cost, and the complexity of algorithms. Our goal is to generate high-resolution synthetic X-ray images in real-time by upsampling low-resolution images with deep learning-based super-resolution methods. Reference-based Super Resolution (RefSR) has been well studied in recent years and has shown higher performance than traditional Single Image Super-Resolution (SISR). It can produce fine details by utilizing the reference image but still inevitably generates some artifacts and noise. In this paper, we introduce frequency domain loss as a constraint to further improve the quality of the RefSR results with fine details and without obvious artifacts. To the best of our knowledge, this is the first paper utilizing the frequency domain for the loss functions in the field of super-resolution. We achieved good results in evaluating our method on both synthetic and real X-ray image datasets.

Submitted 10 August, 2021; v1 submitted 14 May, 2021; originally announced May 2021.

arXiv:2002.12588 [pdf, other]

Regional Registration of Whole Slide Image Stacks Containing Highly Deformed Artefacts

Authors: Mahsa Paknezhad, Sheng Yang Michael Loh, Yukti Choudhury, Valerie Koh Cui Koh, Timothy Tay Kwang Yong, Hui Shan Tan, Ravindran Kanesvaran, Puay Hoon Tan, John Yuen Shyi Peng, Weimiao Yu, Yongcheng Benjamin Tan, Yong Zhen Loy, Min-Han Tan, Hwee Kuan Lee

Abstract: Motivation: High resolution 2D whole slide imaging provides rich information about the tissue structure. This information can be a lot richer if these 2D images can be stacked into a 3D tissue volume. A 3D analysis, however, requires accurate reconstruction of the tissue volume from the 2D image stack. This task is not trivial due to the distortions that each individual tissue slice experiences while cutting and mounting the tissue on the glass slide. Performing registration for the whole tissue slices may be adversely affected by the deformed tissue regions. Consequently, regional registration is found to be more effective. In this paper, we propose an accurate and robust regional registration algorithm for whole slide images which incrementally focuses registration on the area around the region of interest. Results: Using mean similarity index as the metric, the proposed algorithm (mean $\pm$ std: $0.84 \pm 0.11$) followed by a fine registration algorithm ($0.86 \pm 0.08$) outperformed the state-of-the-art linear whole tissue registration algorithm ($0.74 \pm 0.19$) and the regional version of this algorithm ($0.81 \pm 0.15$). The proposed algorithm also outperforms the state-of-the-art nonlinear registration algorithm (original: $0.82 \pm 0.12$, regional: $0.77 \pm 0.22$) for whole slide images and a recently proposed patch-based registration algorithm (patch size 256: $0.79 \pm 0.16$, patch size 512: $0.77 \pm 0.16$) for medical images.
Availability: The C++ implementation code is available online at the github repository: https://github.com/MahsaPaknezhad/WSIRegistration

Submitted 28 February, 2020; originally announced February 2020.

arXiv:2002.04455 [pdf]

HRINet: Alternative Supervision Network for High-resolution CT image Interpolation

Authors: Jiawei Li, Jae Chul Koh, Won-Sook Lee

Abstract: Image interpolation in the medical area is of high importance as most 3D biomedical volume images are sampled where the distance between consecutive slices is significantly greater than the in-plane pixel size due to radiation dose or scanning time. Image interpolation creates a number of new slices between known slices in order to obtain an isotropic volume image. The results can be used for higher-quality 3D reconstruction and visualization of human body structures. Semantic interpolation on the manifold has been proved to be very useful for smoothing image interpolation. Nevertheless, all previous methods focused on low-resolution image interpolation, and most of them work poorly on high-resolution images. We propose a novel network, High Resolution Interpolation Network (HRINet), aiming at producing high-resolution CT image interpolations. We combine the ideas of ACAI and GANs, and propose a novel alternative supervision method that applies supervised and unsupervised training alternately to raise the accuracy of human organ structures in CT while keeping high quality. We compare an MSE-based and a perceptual-based loss optimization method for high-quality interpolation, and show the tradeoff between structural correctness and sharpness. Our experiments show great improvement on $256^2$ and $512^2$ images quantitatively and qualitatively.

Submitted 7 June, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

Showing 1–12 of 12 results for author: Koh, C