arXiv:2407.06658 [pdf, other]

TriQXNet: Forecasting Dst Index from Solar Wind Data Using an Interpretable Parallel Classical-Quantum Framework with Uncertainty Quantification

Authors: Md Abrar Jahin, M. F. Mridha, Zeyar Aung, Nilanjan Dey, R. Simon Sherratt

Abstract: Geomagnetic storms, caused by solar wind energy transfer to Earth's magnetic field, can disrupt critical infrastructure like GPS, satellite communications, and power grids. The disturbance storm-time (Dst) index measures storm intensity. Despite advancements in empirical, physics-based, and machine-learning models using real-time solar wind data, accurately forecasting extreme geomagnetic events remains challenging due to noise and sensor failures. This research introduces TriQXNet, a novel hybrid classical-quantum neural network for Dst forecasting. Our model integrates classical and quantum computing, conformal prediction, and explainable AI (XAI) within a hybrid architecture. To ensure high-quality input data, we developed a comprehensive preprocessing pipeline that included feature selection, normalization, aggregation, and imputation. TriQXNet processes preprocessed solar wind data from NASA's ACE and NOAA's DSCOVR satellites, predicting the Dst index for the current hour and the next, providing vital advance notice to mitigate geomagnetic storm impacts. TriQXNet outperforms 13 state-of-the-art hybrid deep-learning models, achieving a root mean squared error of 9.27 nanoteslas (nT). Rigorous evaluation through 10-fold cross-validated paired t-tests confirmed its superior performance with 95% confidence. Conformal prediction techniques provide quantifiable uncertainty, which is essential for operational decisions, while XAI methods like ShapTime enhance interpretability. Comparative analysis shows TriQXNet's superior forecasting accuracy, setting a new level of expectations for geomagnetic storm prediction and highlighting the potential of classical-quantum hybrid models in space weather forecasting.

Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.
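
The abstract above mentions conformal prediction for uncertainty quantification without saying which variant is used. As a rough illustration only, the sketch below shows generic split conformal prediction around a point forecast. The function name `conformal_interval`, the absolute-residual score, and the toy numbers are assumptions for illustration, not the TriQXNet implementation.

```python
import math

def conformal_interval(cal_true, cal_pred, new_pred, alpha=0.05):
    """Split conformal interval around a point forecast (illustrative sketch)."""
    # Nonconformity scores: absolute residuals on a held-out calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[rank - 1]
    return (new_pred - q, new_pred + q)

# Hypothetical calibration data (values in nT, made up for the example).
cal_true = [0.0] * 9
cal_pred = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
low, high = conformal_interval(cal_true, cal_pred, new_pred=-30.0, alpha=0.5)
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha, which is the kind of quantifiable uncertainty the abstract refers to.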

arXiv:2405.15743 [pdf, other]

Sparse maximal update parameterization: A holistic approach to sparse training dynamics

Authors: Nolan Dey, Shane Bergsma, Joel Hestness

Abstract: Several challenges make it difficult for sparse neural networks to compete with dense models. First, setting a large fraction of weights to zero impairs forward and gradient signal propagation. Second, sparse studies often need to test multiple sparsity levels, while also introducing new hyperparameters (HPs), leading to prohibitive tuning costs. Indeed, the standard practice is to re-use the learning HPs originally crafted for dense models. Unfortunately, we show sparse and dense networks do not share the same optimal HPs. Without stable dynamics and effective training recipes, it is costly to test sparsity at scale, which is key to surpassing dense networks and making the business case for sparsity acceleration in hardware. A holistic approach is needed to tackle these challenges and we propose S$μ$Par as one such approach. S$μ$Par ensures activations, gradients, and weight updates all scale independently of sparsity level. Further, by reparameterizing the HPs, S$μ$Par enables the same HP values to be optimal as we vary both sparsity level and model width. HPs can be tuned on small dense networks and transferred to large sparse models, greatly reducing tuning costs. On large-scale language modeling, S$μ$Par training improves loss by up to 8.2% over the common approach of using the dense model standard parameterization.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 9 pages main text, 11 pages reference and appendix, 11 figures

arXiv:2312.13534 [pdf, other]

doi 10.1109/TMI.2024.3411989

SE(3)-Equivariant and Noise-Invariant 3D Rigid Motion Tracking in Brain MRI

Authors: Benjamin Billot, Neel Dey, Daniel Moyer, Malte Hoffmann, Esra Abaci Turk, Borjan Gagoski, Ellen Grant, Polina Golland

Abstract: Rigid motion tracking is paramount in many medical imaging applications where movements need to be detected, corrected, or accounted for. Modern strategies rely on convolutional neural networks (CNN) and pose this problem as rigid registration. Yet, CNNs do not exploit natural symmetries in this task, as they are equivariant to translations (their outputs shift with their inputs) but not to rotations. Here we propose EquiTrack, the first method that uses recent steerable SE(3)-equivariant CNNs (E-CNN) for motion tracking. While steerable E-CNNs can extract corresponding features across different poses, testing them on noisy medical images reveals that they do not have enough learning capacity to learn noise invariance. Thus, we introduce a hybrid architecture that pairs a denoiser with an E-CNN to decouple the processing of anatomically irrelevant intensity features from the extraction of equivariant spatial features. Rigid transforms are then estimated in closed-form. EquiTrack outperforms state-of-the-art learning and optimisation methods for motion tracking in adult brain MRI and fetal MRI time series. Our code is available at https://github.com/BBillot/EquiTrack.

Submitted 12 June, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: Published in IEEE Transactions on Medical Imaging

arXiv:2312.06358 [pdf, other]

Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering

Authors: Vivek Gopalakrishnan, Neel Dey, Polina Golland

Abstract: Surgical decisions are informed by aligning rapid portable 2D intraoperative images (e.g., X-rays) to a high-fidelity 3D preoperative reference scan (e.g., CT). 2D/3D image registration often fails in practice: conventional optimization methods are prohibitively slow and susceptible to local minima, while neural networks trained on small datasets fail on new patients or require impractical landmark supervision. We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data. Preoperatively, a CNN is trained to regress the pose of a randomly oriented synthetic X-ray rendered from the preoperative CT. The CNN then initializes rapid intraoperative test-time optimization that uses the differentiable X-ray renderer to refine the solution. Our work further proposes several geometrically principled methods for sampling camera poses from $\mathbf{SE}(3)$, for sparse differentiable rendering, and for driving registration in the tangent space $\mathfrak{se}(3)$ with geodesic and multiscale locality-sensitive losses. DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines. Our code is available at https://github.com/eigenvivek/DiffPose.

Submitted 27 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: CVPR 2024

arXiv:2312.05148 [pdf, other]

doi 10.59275/j.melba.2023-g3f8

Shape-aware Segmentation of the Placenta in BOLD Fetal MRI Time Series

Authors: S. Mazdak Abulnaga, Neel Dey, Sean I. Young, Eileen Pan, Katherine I. Hobgood, Clinton J. Wang, P. Ellen Grant, Esra Abaci Turk, Polina Golland

Abstract: Blood oxygen level dependent (BOLD) MRI time series with maternal hyperoxia can assess placental oxygenation and function. Measuring precise BOLD changes in the placenta requires accurate temporal placental segmentation and is confounded by fetal and maternal motion, contractions, and hyperoxia-induced intensity changes. Current BOLD placenta segmentation methods warp a manually annotated subject-specific template to the entire time series. However, as the placenta is a thin, elongated, and highly non-rigid organ subject to large deformations and obfuscated edges, existing work cannot accurately segment the placental shape, especially near boundaries. In this work, we propose a machine learning segmentation framework for placental BOLD MRI and apply it to segmenting each volume in a time series. We use a placental-boundary weighted loss formulation and perform a comprehensive evaluation across several popular segmentation objectives. Our model is trained and tested on a cohort of 91 subjects containing healthy fetuses, fetuses with fetal growth restriction, and mothers with high BMI. Biomedically, our model performs reliably in segmenting volumes in both normoxic and hyperoxic points in the BOLD time series. We further find that boundary-weighting increases placental segmentation performance by 8.3% and 6.0% Dice coefficient for the cross-entropy and signed distance transform objectives, respectively. Our code and trained model are available at https://github.com/mabulnaga/automatic-placenta-segmentation.

Submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:017. arXiv admin note: substantial text overlap with arXiv:2208.02895

Journal ref: Machine Learning for Biomedical Imaging 2 (2023)
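
The entry above reports segmentation gains in terms of the Dice coefficient. For readers unfamiliar with the metric, here is a minimal sketch of how Dice overlap is commonly computed on binary masks. The helper name `dice` and the flattened toy masks are illustrative assumptions and are not taken from the authors' repository.

```python
def dice(pred, target):
    """Dice overlap between two flattened binary masks (illustrative sketch)."""
    # Count voxels labeled foreground in both masks.
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total

# Toy example: one of two predicted foreground voxels overlaps the target.
score = dice([1, 1, 0, 0], [1, 0, 1, 0])
```

Dice ranges from 0 (no overlap) to 1 (identical masks).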

arXiv:2311.02874 [pdf, other]

doi 10.48550/arXiv.2311.02874

Dynamic Neural Fields for Learning Atlases of 4D Fetal MRI Time-series

Authors: Zeen Chi, Zhongxiao Cong, Clinton J. Wang, Yingcheng Liu, Esra Abaci Turk, P. Ellen Grant, S. Mazdak Abulnaga, Polina Golland, Neel Dey

Abstract: We present a method for fast biomedical image atlas construction using neural fields. Atlases are key to biomedical image analysis tasks, yet conventional and deep network estimation methods remain time-intensive. In this preliminary work, we frame subject-specific atlas building as learning a neural field of deformable spatiotemporal observations. We apply our method to learning subject-specific atlases and motion stabilization of dynamic BOLD MRI time-series of fetuses in utero. Our method yields high-quality atlases of fetal BOLD time-series with $\sim$5-7$\times$ faster convergence compared to existing work. While our method slightly underperforms well-tuned baselines in terms of anatomical overlap, it estimates templates significantly faster, thus enabling rapid processing and stabilization of large databases of 4D dynamic MRI acquisitions. Code is available at https://github.com/Kidrauh/neural-atlasing

Submitted 6 November, 2023; originally announced November 2023.

Comments: 6 pages, 2 figures. Accepted by Medical Imaging Meets NeurIPS 2023

arXiv:2311.00308 [pdf]

From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities

Authors: Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey

Abstract: The multimodal task of Visual Question Answering (VQA), encompassing elements of Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input. Over time, the scope of VQA has expanded from datasets focusing on an extensive collection of natural images to datasets featuring synthetic images, video, 3D environments, and various other visual inputs. The emergence of large pre-trained networks has shifted the early VQA approaches relying on feature extraction and fusion schemes to vision language pre-training (VLP) techniques. However, there is a lack of comprehensive surveys that encompass both traditional VQA architectures and contemporary VLP-based methods. Furthermore, the VLP challenges in the lens of VQA haven't been thoroughly explored, leaving room for potential open problems to emerge. Our work presents a survey in the domain of VQA that delves into the intricacies of VQA datasets and methods over the field's history, introduces a detailed taxonomy to categorize the facets of VQA, and highlights the recent trends, challenges, and scopes for improvement. We further generalize VQA to multimodal question answering, explore tasks related to VQA, and present a set of open problems for future investigation. The work aims to guide both beginners and experts by shedding light on the potential avenues of research and expanding the boundaries of the field.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.13017 [pdf, other]

Position Interpolation Improves ALiBi Extrapolation

Authors: Faisal Al-Khateeb, Nolan Dey, Daria Soboleva, Joel Hestness

Abstract: Linear position interpolation helps pre-trained models using rotary position embeddings (RoPE) to extrapolate to longer sequence lengths. We propose using linear position interpolation to extend the extrapolation range of models using Attention with Linear Biases (ALiBi). We find position interpolation significantly improves extrapolation capability on upstream language modelling and downstream summarization and retrieval tasks.

Submitted 18 October, 2023; originally announced October 2023.

Comments: 4 pages content, 1 page references, 4 figures
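
To make the idea in the entry above concrete, the sketch below builds ALiBi attention biases and applies linear position interpolation by shrinking distances by the ratio of training length to evaluation length. The function names, the `train_len` parameter, and the choice to apply the scale directly to the bias are assumptions for illustration. The abstract does not spell out the exact formulation, so this should not be read as the authors' implementation.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8k/n), k = 1..n (standard for power-of-two head counts)."""
    return [2.0 ** (-8.0 * (k + 1) / num_heads) for k in range(num_heads)]

def alibi_bias(seq_len, slope, train_len=None):
    """Bias matrix -slope * |i - j|, optionally with linear position interpolation.

    Assumption: when evaluating beyond the training length, distances are
    scaled by train_len / seq_len so they stay within the trained range.
    Causal masking is expected to be applied separately.
    """
    scale = 1.0
    if train_len is not None and seq_len > train_len:
        scale = train_len / seq_len
    return [[-slope * scale * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]

slopes = alibi_slopes(8)
plain = alibi_bias(4, slopes[0])                      # no interpolation
interpolated = alibi_bias(4, slopes[0], train_len=2)  # distances halved
```

With interpolation, the largest penalty at the longer evaluation length stays close to the largest penalty seen during training, which is the intuition behind extending the extrapolation range.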

arXiv:2310.03870 [pdf, other]

Consistency Regularization Improves Placenta Segmentation in Fetal EPI MRI Time Series

Authors: Yingcheng Liu, Neerav Karani, Neel Dey, S. Mazdak Abulnaga, Junshen Xu, P. Ellen Grant, Esra Abaci Turk, Polina Golland

Abstract: The placenta plays a crucial role in fetal development. Automated 3D placenta segmentation from fetal EPI MRI holds promise for advancing prenatal care. This paper proposes an effective semi-supervised learning method for improving placenta segmentation in fetal EPI MRI time series. We employ a consistency regularization loss that promotes consistency under spatial transformation of the same image and temporal consistency across nearby images in a time series. The experimental results show that the method improves the overall segmentation accuracy and provides better performance for outliers and hard samples. The evaluation also indicates that our method improves the temporal coherency of the prediction, which could lead to more accurate computation of temporal placental biomarkers. This work contributes to the study of the placenta and prenatal clinical decision-making. Code is available at https://github.com/firstmover/cr-seg.

Submitted 15 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.11568 [pdf, other]

BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Authors: Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers, Jacob Robert Steeves, Natalia Vassilieva, Marvin Tom, Joel Hestness

Abstract: We introduce the Bittensor Language Model, called "BTLM-3B-8K", a new state-of-the-art 3 billion parameter open-source language model. BTLM-3B-8K was trained on 627B tokens from the SlimPajama dataset with a mixture of 2,048 and 8,192 context lengths. BTLM-3B-8K outperforms all existing 3B parameter models by 2-5.5% across downstream tasks. BTLM-3B-8K is even competitive with some 7B parameter models. Additionally, BTLM-3B-8K provides excellent long context performance, outperforming MPT-7B-8K and XGen-7B-8K on tasks up to 8,192 context length. We trained the model on a cleaned and deduplicated SlimPajama dataset; aggressively tuned the $μ$P hyperparameters and schedule; used ALiBi position embeddings; and adopted the SwiGLU nonlinearity. On Hugging Face, the most popular models have 7B parameters, indicating that users prefer the quality-size ratio of 7B models. Compacting the 7B parameter model to one with 3B parameters, with little performance impact, is an important milestone. BTLM-3B-8K needs only 3GB of memory with 4-bit precision and takes 2.5x less inference compute than 7B models, helping to open up access to a powerful language model on mobile and edge devices. BTLM-3B-8K is available under an Apache 2.0 license on Hugging Face: https://huggingface.co/cerebras/btlm-3b-8k-base.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2307.08163 [pdf, other]

Boundary-weighted logit consistency improves calibration of segmentation networks

Authors: Neerav Karani, Neel Dey, Polina Golland

Abstract: Neural network prediction probabilities and accuracy are often only weakly-correlated. Inherent label ambiguity in training data for image segmentation aggravates such miscalibration. We show that logit consistency across stochastic transformations acts as a spatially varying regularizer that prevents overconfident predictions at pixels with ambiguous labels. Our boundary-weighted extension of this regularizer provides state-of-the-art calibration for prostate and heart MRI segmentation.

Submitted 16 July, 2023; originally announced July 2023.

Comments: Accepted for publication at MICCAI 2023

arXiv:2307.07044 [pdf, other]

AnyStar: Domain randomized universal star-convex 3D instance segmentation

Authors: Neel Dey, S. Mazdak Abulnaga, Benjamin Billot, Esra Abaci Turk, P. Ellen Grant, Adrian V. Dalca, Polina Golland

Abstract: Star-convex shapes arise across bio-microscopy and radiology in the form of nuclei, nodules, metastases, and other units. Existing instance segmentation networks for such structures train on densely labeled instances for each dataset, which requires substantial and often impractical manual annotation effort. Further, significant reengineering or finetuning is needed when presented with new datasets and imaging modalities due to changes in contrast, shape, orientation, resolution, and density. We present AnyStar, a domain-randomized generative model that simulates synthetic training data of blob-like objects with randomized appearance, environments, and imaging physics to train general-purpose star-convex instance segmentation networks. As a result, networks trained using our generative model do not require annotated images from unseen datasets. A single network trained on our synthesized data accurately 3D segments C. elegans and P. dumerilii nuclei in fluorescence microscopy, mouse cortical nuclei in micro-CT, zebrafish brain nuclei in EM, and placental cotyledons in human fetal MRI, all without any retraining, finetuning, transfer learning, or domain adaptation. Code is available at https://github.com/neel-dey/AnyStar.

Submitted 13 July, 2023; originally announced July 2023.

Comments: Code available at https://github.com/neel-dey/AnyStar

arXiv:2304.06103 [pdf, other]

$E(3) \times SO(3)$-Equivariant Networks for Spherical Deconvolution in Diffusion MRI

Authors: Axel Elaldi, Guido Gerig, Neel Dey

Abstract: We present Roto-Translation Equivariant Spherical Deconvolution (RT-ESD), an $E(3)\times SO(3)$ equivariant framework for sparse deconvolution of volumes where each voxel contains a spherical signal. Such 6D data naturally arises in diffusion MRI (dMRI), a medical imaging modality widely used to measure microstructure and structural connectivity. As each dMRI voxel is typically a mixture of various overlapping structures, there is a need for blind deconvolution to recover crossing anatomical structures such as white matter tracts. Existing dMRI work takes either an iterative or deep learning approach to sparse spherical deconvolution, yet it typically does not account for relationships between neighboring measurements. This work constructs equivariant deep learning layers which respect the symmetries of spatial rotations, reflections, and translations, alongside the symmetries of voxelwise spherical rotations. As a result, RT-ESD improves on previous work across several tasks including fiber recovery on the DiSCo dataset, deconvolution-derived partial volume estimation on real-world in vivo human brain dMRI, and improved downstream reconstruction of fiber tractograms on the Tractometer dataset. Our implementation is available at https://github.com/AxelElaldi/e3so3_conv

Submitted 12 April, 2023; originally announced April 2023.

Comments: Accepted to Medical Imaging with Deep Learning (MIDL) 2023. Code available at https://github.com/AxelElaldi/e3so3_conv . 19 pages with 6 figures

arXiv:2304.03208 [pdf, other]

Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster

Authors: Nolan Dey, Gurpreet Gosal, Zhiming Chen, Hemant Khachane, William Marshall, Ribhu Pathria, Marvin Tom, Joel Hestness

Abstract: We study recent research advances that improve large language models through efficient pre-training and scaling, and open datasets and tools. We combine these advances to introduce Cerebras-GPT, a family of open compute-optimal language models scaled from 111M to 13B parameters. We train Cerebras-GPT models on the Eleuther Pile dataset following DeepMind Chinchilla scaling rules for efficient pre-training (highest accuracy for a given compute budget). We characterize the predictable power-law scaling and compare Cerebras-GPT with other publicly-available models to show all Cerebras-GPT models have state-of-the-art training efficiency on both pre-training and downstream objectives. We describe our learnings including how Maximal Update Parameterization ($μ$P) can further improve large model scaling, improving accuracy and hyperparameter predictability at scale. We release our pre-trained models and code, making this paper the first open and reproducible work comparing compute-optimal model scaling to models trained on fixed dataset sizes. Cerebras-GPT models are available on HuggingFace: https://huggingface.co/cerebras.

Submitted 6 April, 2023; originally announced April 2023.

Comments: 13 pages main text, 16 pages appendix, 13 figures
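
The abstract above refers to Chinchilla scaling rules without giving numbers. As a back-of-the-envelope sketch, the code below uses two widely quoted approximations that are assumptions here rather than statements from the abstract: roughly 20 training tokens per parameter for compute-optimal training, and about 6 FLOPs per parameter per token for training cost. The function names are hypothetical.

```python
# Assumption: commonly cited rule of thumb, not taken from the abstract.
TOKENS_PER_PARAM = 20

def compute_optimal_tokens(n_params):
    """Approximate compute-optimal token budget for a model of n_params."""
    return TOKENS_PER_PARAM * n_params

def training_flops(n_params, n_tokens):
    """Approximate training cost using the common 6 * N * D estimate."""
    return 6 * n_params * n_tokens

# Smallest and largest model sizes named in the abstract.
small_tokens = compute_optimal_tokens(111_000_000)
large_tokens = compute_optimal_tokens(13_000_000_000)
small_flops = training_flops(111_000_000, small_tokens)
```

Under these assumptions, the token budget grows linearly with model size and the training compute grows quadratically, which is why sweeping a family of sizes is a natural way to characterize scaling behavior.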

arXiv:2302.10840 [pdf, other]

Valid Inference for Machine Learning Model Parameters

Authors: Neil Dey, Jonathan P. Williams

Abstract: The parameters of a machine learning model are typically learned by minimizing a loss function on a set of training data. However, this can come with the risk of overtraining; in order for the model to generalize well, it is of great importance that we are able to find the optimal parameter for the model on the entire population -- not only on the given training sample. In this paper, we construct valid confidence sets for this optimal parameter of a machine learning model, which can be generated using only the training data without any knowledge of the population. We then show that studying the distribution of this confidence set allows us to assign a notion of confidence to arbitrary regions of the parameter space, and we demonstrate that this distribution can be well-approximated using bootstrapping techniques.

Submitted 9 May, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: 35 pages, 6 figures

arXiv:2302.09244 [pdf, other]

Dual-Domain Self-Supervised Learning for Accelerated Non-Cartesian MRI Reconstruction

Authors: Bo Zhou, Jo Schlemper, Neel Dey, Seyed Sadegh Mohseni Salehi, Kevin Sheth, Chi Liu, James S. Duncan, Michal Sofka

Abstract: While enabling accelerated acquisition and improved reconstruction accuracy, current deep MRI reconstruction networks are typically supervised, require fully sampled data, and are limited to Cartesian sampling patterns. These factors limit their practical adoption as fully-sampled MRI is prohibitively time-consuming to acquire clinically. Further, non-Cartesian sampling patterns are particularly desirable as they are more amenable to acceleration and show improved motion robustness. To this end, we present a fully self-supervised approach for accelerated non-Cartesian MRI reconstruction which leverages self-supervision in both k-space and image domains. In training, the undersampled data are split into disjoint k-space domain partitions. For the k-space self-supervision, we train a network to reconstruct the input undersampled data from both the disjoint partitions and from itself. For the image-level self-supervision, we enforce appearance consistency obtained from the original undersampled data and the two partitions. Experimental results on our simulated multi-coil non-Cartesian MRI dataset demonstrate that DDSS can generate high-quality reconstruction that approaches the accuracy of the fully supervised reconstruction, outperforming previous baseline methods. Finally, DDSS is shown to scale to highly challenging real-world clinical MRI reconstruction acquired on a portable low-field (0.064 T) MRI scanner with no data available for supervised training while demonstrating improved image quality as compared to traditional reconstruction, as determined by a radiologist study.

Submitted 18 February, 2023; originally announced February 2023.

Comments: 14 pages, 10 figures, published at Medical Image Analysis (MedIA)

arXiv:2301.10365 [pdf, other]

Data Consistent Deep Rigid MRI Motion Correction

Authors: Nalini M. Singh, Neel Dey, Malte Hoffmann, Bruce Fischl, Elfar Adalsteinsson, Robert Frost, Adrian V. Dalca, Polina Golland

Abstract: Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alone. Our network produces a reconstruction as a function of two inputs: corrupted k-space data and motion parameters. We train the network using simulated, motion-corrupted k-space data generated with known motion parameters. At test-time, we estimate unknown motion parameters by minimizing a data consistency loss between the motion parameters, the network-based image reconstruction given those parameters, and the acquired measurements. Intra-slice motion correction experiments on simulated and realistic 2D fast spin echo brain MRI achieve high reconstruction fidelity while providing the benefits of explicit data consistency optimization. Our code is publicly available at https://www.github.com/nalinimsingh/neuroMoCo.

Submitted 16 November, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: Presented at MIDL 2023. 14 pages, 6 figures. Keywords: motion correction, magnetic resonance imaging, deep learning

arXiv:2212.01928 [pdf, ps, other]

Space-Time- and Frequency- Spreading for Interference Minimization in Dense IoT

Authors: Indrakshi Dey, Nicola Marchetti

Abstract: In this article, we propose a space spreading-assisted framework that leverages either time or frequency diversity or both to reduce interference and signal loss owing to channel impairments and facilitate the efficient operation of large-scale dense Internet-of-Things (IoT). Our approach employs dispersion of data-streams transmitted from individual IoT devices over indexed space-time (ST), space-frequency (SF) or space-time-frequency (STF) blocks. As a result, no two devices transmit on the same block; only one is activated while the rest of the devices in the network are silent, thereby minimizing the possibility of interference on the transmit side. On the receive side, a multiple-antenna array ameliorates performance in the presence of channel impairments while exploiting array-processing gain. As interference due to superposition of multiple data-streams is killed at its root, no extra energy is wasted in fighting interference and other impairments, thereby enabling energy-efficient transmission from multiple devices over a multiple access channel (MAC). To validate the proposed concept, we simulate the performance of the framework against dense IoT networks deployed in generalized indoor and outdoor scenarios in terms of probability of signal outage. Results demonstrate that our conceptualized framework benefits from interference-free transmission as well as enhancement in overall system performance.

Submitted 4 December, 2022; originally announced December 2022.

arXiv:2206.13434 [pdf, other]

ContraReg: Contrastive Learning of Multi-modality Unsupervised Deformable Image Registration

Authors: Neel Dey, Jo Schlemper, Seyed Sadegh Mohseni Salehi, Bo Zhou, Guido Gerig, Michal Sofka

Abstract: Establishing voxelwise semantic correspondence across distinct imaging modalities is a foundational yet formidable computer vision task. Current multi-modality registration techniques maximize hand-crafted inter-domain similarity functions, are limited in modeling nonlinear intensity-relationships and deformations, and may require significant re-engineering or underperform on new tasks, datasets, and domain pairs. This work presents ContraReg, an unsupervised contrastive representation learning approach to multi-modality deformable registration. By projecting learned multi-scale local patch features onto a jointly learned inter-domain embedding space, ContraReg obtains representations useful for non-rigid multi-modality alignment. Experimentally, ContraReg achieves accurate and robust results with smooth and invertible deformations across a series of baselines and ablations on a neonatal T1-T2 brain MRI registration task with all methods validated over a wide range of deformation regularization strengths.

Submitted 27 June, 2022; originally announced June 2022.

Comments: Accepted by MICCAI 2022. 13 pages, 6 figures, and 1 table

arXiv:2206.04281 [pdf, other]

Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis

Authors: Mengwei Ren, Neel Dey, Martin A. Styner, Kelly Botteron, Guido Gerig

Abstract: Recent self-supervised advances in medical computer vision exploit global and local anatomical self-similarity for pretraining prior to downstream tasks such as segmentation. However, current methods assume i.i.d. image acquisition, which is invalid in clinical study designs where follow-up longitudinal scans track subject-specific temporal changes. Further, existing self-supervised methods for medically-relevant image-to-image architectures exploit only spatial or temporal self-similarity and only do so via a loss applied at a single image-scale, with naive multi-scale spatiotemporal extensions collapsing to degenerate solutions. To these ends, this paper makes two contributions: (1) It presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images. It exploits the spatiotemporal self-similarity of learned multi-scale intra-subject features for pretraining and develops several feature-wise regularizations that avoid collapsed identity representations; (2) During finetuning, it proposes a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation. Benchmarked in the one-shot segmentation setting, the proposed framework outperforms both well-tuned randomly-initialized baselines and current self-supervised techniques designed for both i.i.d. and longitudinal datasets. These improvements are demonstrated across both longitudinal neurodegenerative adult MRI and developing infant brain MRI and yield both higher performance and longitudinal consistency.

Submitted 12 December, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: Accepted at NeurIPS 2022

arXiv:2201.10776 [pdf, other]

DSFormer: A Dual-domain Self-supervised Transformer for Accelerated Multi-contrast MRI Reconstruction

Authors: Bo Zhou, Neel Dey, Jo Schlemper, Seyed Sadegh Mohseni Salehi, Chi Liu, James S. Duncan, Michal Sofka

Abstract: Multi-contrast MRI (MC-MRI) captures multiple complementary imaging modalities to aid in radiological decision-making. Given the need for lowering the time cost of multiple acquisitions, current deep accelerated MRI reconstruction networks focus on exploiting the redundancy between multiple contrasts. However, existing works are largely supervised with paired data and/or prohibitively expensive fully-sampled MRI sequences. Further, reconstruction networks typically rely on convolutional architectures which are limited in their capacity to model long-range interactions and may lead to suboptimal recovery of fine anatomical detail. To these ends, we present a dual-domain self-supervised transformer (DSFormer) for accelerated MC-MRI reconstruction. DSFormer develops a deep conditional cascade transformer (DCCT) consisting of several cascaded Swin transformer reconstruction networks (SwinRN) trained under two deep conditioning strategies to enable MC-MRI information sharing. We further present a dual-domain (image and k-space) self-supervised learning strategy for DCCT to alleviate the costs of acquiring fully sampled training data. DSFormer generates high-fidelity reconstructions which experimentally outperform current fully-supervised baselines. Moreover, we find that DSFormer achieves nearly the same performance when trained either with full supervision or with our proposed dual-domain self-supervision.

Submitted 16 August, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

Comments: Accepted at WACV 2023

arXiv:2111.02592 [pdf, other]

Conformal prediction for text infilling and part-of-speech prediction

Authors: Neil Dey, Jing Ding, Jack Ferrell, Carolina Kapper, Maxwell Lovig, Emiliano Planchon, Jonathan P Williams

Abstract: Modern machine learning algorithms are capable of providing remarkably accurate point-predictions; however, questions remain about their statistical reliability. Unlike conventional machine learning methods, conformal prediction algorithms return confidence sets (i.e., set-valued predictions) that correspond to a given significance level. Moreover, these confidence sets are valid in the sense that they guarantee finite sample control over type 1 error probabilities, allowing the practitioner to choose an acceptable error rate. In our paper, we propose inductive conformal prediction (ICP) algorithms for the tasks of text infilling and part-of-speech (POS) prediction for natural language data. We construct new conformal prediction-enhanced bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory (BiLSTM) algorithms for POS tagging and a new conformal prediction-enhanced BERT algorithm for text infilling. We analyze the performance of the algorithms in simulations using the Brown Corpus, which contains over 57,000 sentences. Our results demonstrate that the ICP algorithms are able to produce valid set-valued predictions that are small enough to be applicable in real-world applications. We also provide a real data example for how our proposed set-valued predictions can improve machine generated audio transcriptions.

Submitted 3 November, 2021; originally announced November 2021.
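
Illustrative note on the entry above: the abstract describes inductive conformal prediction (ICP) returning set-valued predictions at a chosen significance level. The following is a minimal sketch of a generic split-conformal classifier, not the authors' BERT or BiLSTM pipeline. The function names, the nonconformity score (one minus the probability assigned to the true label), and the toy POS-tag probabilities are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Compute the nonconformity threshold from a held-out calibration set.

    cal_probs  : list of dicts mapping label -> predicted probability
    cal_labels : list of true labels, aligned with cal_probs
    alpha      : target error rate (significance level)
    """
    # Nonconformity score (an assumed, common choice): 1 - p(true label).
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label is kept.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Return every label whose nonconformity score does not exceed the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}


# Hypothetical calibration data: model probabilities for four tagged tokens.
cal_probs = [
    {"NOUN": 0.9, "VERB": 0.1},
    {"NOUN": 0.2, "VERB": 0.8},
    {"NOUN": 0.6, "VERB": 0.4},
    {"NOUN": 0.7, "VERB": 0.3},
]
cal_labels = ["NOUN", "VERB", "NOUN", "VERB"]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
tags = prediction_set({"NOUN": 0.7, "VERB": 0.25, "ADJ": 0.05}, threshold)
print(threshold, tags)
```

A smaller `alpha` raises the threshold and so enlarges the prediction sets; the validity guarantee mentioned in the abstract relies on the calibration and test examples being exchangeable.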

arXiv:2110.05321 [pdf, other]

Quantum solutions to possible challenges of Blockchain technology

Authors: Nivedita Dey, Mrityunjay Ghosh, Amlan Chakrabarti

Abstract: Technological advancements of Blockchain and other Distributed Ledger Techniques (DLTs) promise to provide significant advantages to applications seeking transparency, redundancy, and accountability. Actual adoption of these emerging technologies requires incorporating cost-effective, fast, QoS-enabled, secure, and scalable design. With the recent advent of quantum computing, the security of current blockchain cryptosystems can be compromised to a greater extent. Quantum algorithms like Shor's large integer factorization algorithm and Grover's unstructured database search algorithm can provide exponential and quadratic speedup, respectively, in contrast to their classical counterpart. This can put threats on both public-key cryptosystems and hash functions, which necessarily demands to migrate from classical cryptography to quantum-secure cryptography. Moreover, the computational latency of blockchain platforms causes slow transaction speed, so quantum computing principles might provide significant speedup and scalability in transaction processing and accelerating the mining process. For such purpose, this article first studies current and future classical state-of-the-art blockchain scalability and security primitives. The relevant quantum-safe blockchain cryptosystem initiatives which have been taken by Bitcoin, Ethereum, Corda, etc. are stated and compared with respect to key sizes, hash length, execution time, computational overhead, and energy efficiency. Post Quantum Cryptographic algorithms like Code-based, Lattice-based, Multivariate-based, and other schemes are not well suited for classical blockchain technology due to several disadvantages in practical implementation. Decryption latency, massive consumption of computational resources, and increased key size are few challenges that can hinder blockchain performance.

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2106.13188 [pdf, other]

Q-space Conditioned Translation Networks for Directional Synthesis of Diffusion Weighted Images from Multi-modal Structural MRI

Authors: Mengwei Ren, Heejong Kim, Neel Dey, Guido Gerig

Abstract: Current deep learning approaches for diffusion MRI modeling circumvent the need for densely-sampled diffusion-weighted images (DWIs) by directly predicting microstructural indices from sparsely-sampled DWIs. However, they implicitly make unrealistic assumptions of static $q$-space sampling during training and reconstruction. Further, such approaches can restrict downstream usage of variably sampled DWIs for usages including the estimation of microstructural indices or tractography. We propose a generative adversarial translation framework for high-quality DWI synthesis with arbitrary $q$-space sampling given commonly acquired structural images (e.g., B0, T1, T2). Our translation network linearly modulates its internal representations conditioned on continuous $q$-space information, thus removing the need for fixed sampling schemes. Moreover, this approach enables downstream estimation of high-quality microstructural maps from arbitrarily subsampled DWIs, which may be particularly important in cases with sparsely sampled DWIs. Across several recent methodologies, the proposed approach yields improved DWI synthesis accuracy and fidelity with enhanced downstream utility as quantified by the accuracy of scalar microstructure indices estimated from the synthesized images. Code is available at https://github.com/mengweiren/q-space-conditioned-dwi-synthesis.

Submitted 24 June, 2021; originally announced June 2021.

Comments: Accepted by MICCAI 2021. Project page: https://heejongkim.com/dwi-synthesis; Code: https://github.com/mengweiren/q-space-conditioned-dwi-synthesis

arXiv:2105.04349 [pdf, other]

Generative Adversarial Registration for Improved Conditional Deformable Templates

Authors: Neel Dey, Mengwei Ren, Adrian V. Dalca, Guido Gerig

Abstract: Deformable templates are essential to large-scale medical image registration, segmentation, and population analysis. Current conventional and deep network-based methods for template construction use only regularized registration objectives and often yield templates with blurry and/or anatomically implausible appearance, confounding downstream biomedical interpretation. We reformulate deformable registration and conditional template estimation as an adversarial game wherein we encourage realism in the moved templates with a generative adversarial registration framework conditioned on flexible image covariates. The resulting templates exhibit significant gain in specificity to attributes such as age and disease, better fit underlying group-wise spatiotemporal trends, and achieve improved sharpness and centrality. These improvements enable more accurate population modeling with diverse covariates for standardized downstream analyses and easier anatomical delineation for structures of interest.

Submitted 17 March, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

Comments: ICCV 2021 camera-ready. 24 pages, 15 figures. Project page: https://www.neeldey.com/deformable-templates/ Code: https://github.com/neel-dey/Atlas-GAN

Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision 2021

arXiv:2103.05617 [pdf, ps, other]

Point-supervised Segmentation of Microscopy Images and Volumes via Objectness Regularization

Authors: Shijie Li, Neel Dey, Katharina Bermond, Leon von der Emde, Christine A. Curcio, Thomas Ach, Guido Gerig

Abstract: Annotation is a major hurdle in the semantic segmentation of microscopy images and volumes due to its prerequisite expertise and effort. This work enables the training of semantic segmentation networks on images with only a single point for training per instance, an extreme case of weak supervision which drastically reduces the burden of annotation. Our approach has two key aspects: (1) we construct a graph-theoretic soft-segmentation using individual seeds to be used within a regularizer during training and (2) we use an objective function that enables learning from the constructed soft-labels. We achieve competitive results against the state-of-the-art in point-supervised semantic segmentation on challenging datasets in digital pathology. Finally, we scale our methodology to point-supervised segmentation in 3D fluorescence microscopy volumes, obviating the need for arduous manual volumetric delineation. Our code is freely available.

Submitted 18 March, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: Accepted to IEEE ISBI 2021. Code available at https://github.com/CJLee94/Point-Supervised-Segmentation

arXiv:2102.09462 [pdf, other]

Equivariant Spherical Deconvolution: Learning Sparse Orientation Distribution Functions from Spherical Data

Authors: Axel Elaldi, Neel Dey, Heejong Kim, Guido Gerig

Abstract: We present a rotation-equivariant unsupervised learning framework for the sparse deconvolution of non-negative scalar fields defined on the unit sphere. Spherical signals with multiple peaks naturally arise in Diffusion MRI (dMRI), where each voxel consists of one or more signal sources corresponding to anisotropic tissue structure such as white matter. Due to spatial and spectral partial voluming, clinically-feasible dMRI struggles to resolve crossing-fiber white matter configurations, leading to extensive development in spherical deconvolution methodology to recover underlying fiber directions. However, these methods are typically linear and struggle with small crossing-angles and partial volume fraction estimation. In this work, we improve on current methodologies by nonlinearly estimating fiber structures via unsupervised spherical convolutional networks with guaranteed equivariance to spherical rotation. Experimentally, we first validate our proposition via extensive single and multi-shell synthetic benchmarks demonstrating competitive performance against common baselines. We then show improved downstream performance on fiber tractography measures on the Tractometer benchmark dataset. Finally, we show downstream improvements in terms of tractography and partial volume estimation on a multi-shell dataset of human subjects.

Submitted 17 February, 2021; originally announced February 2021.

Comments: Accepted to Information Processing in Medical Imaging (IPMI) 2021. Code available at https://github.com/AxelElaldi/equivariant-spherical-deconvolution . First two authors contributed equally. 13 pages with 6 figures

arXiv:2102.06315 [pdf, other]

doi 10.1109/TMI.2021.3059726

Segmentation-Renormalized Deep Feature Modulation for Unpaired Image Harmonization

Authors: Mengwei Ren, Neel Dey, James Fishbaugh, Guido Gerig

Abstract: Deep networks are now ubiquitous in large-scale multi-center imaging studies. However, the direct aggregation of images across sites is contraindicated for downstream statistical and deep learning-based image analysis due to inconsistent contrast, resolution, and noise. To this end, in the absence of paired data, variations of Cycle-consistent Generative Adversarial Networks have been used to harmonize image sets between a source and target domain. Importantly, these methods are prone to instability, contrast inversion, intractable manipulation of pathology, and steganographic mappings which limit their reliable adoption in real-world medical imaging. In this work, based on an underlying assumption that morphological shape is consistent across imaging sites, we propose a segmentation-renormalized image translation framework to reduce inter-scanner heterogeneity while preserving anatomical layout. We replace the affine transformations used in the normalization layers within generative networks with trainable scale and shift parameters conditioned on jointly learned anatomical segmentation embeddings to modulate features at every level of translation. We evaluate our methodologies against recent baselines across several imaging modalities (T1w MRI, FLAIR MRI, and OCT) on datasets with and without lesions. Segmentation-renormalization for translation GANs yields superior image harmonization as quantified by Inception distances, demonstrates improved downstream utility via post-hoc segmentation accuracy, and improved robustness to translation perturbation and self-adversarial attacks.

Submitted 15 February, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

Comments: Accepted by IEEE Transactions on Medical Imaging. Code available at https://github.com/mengweiren/segmentation-renormalized-harmonization

arXiv:2011.03043 [pdf, other]

Identifying and interpreting tuning dimensions in deep networks

Authors: Nolan S. Dey, J. Eric Taylor, Bryan P. Tripp, Alexander Wong, Graham W. Taylor

Abstract: In neuroscience, a tuning dimension is a stimulus attribute that accounts for much of the activation variance of a group of neurons. These are commonly used to decipher the responses of such groups. While researchers have attempted to manually identify an analogue to these tuning dimensions in deep neural networks, we are unaware of an automatic way to discover them. This work contributes an unsupervised framework for identifying and interpreting "tuning dimensions" in deep networks. Our method correctly identifies the tuning dimensions of a synthetic Gabor filter bank and tuning dimensions of the first two layers of InceptionV1 trained on ImageNet.

Submitted 7 December, 2020; v1 submitted 5 November, 2020; originally announced November 2020.

Comments: 15 pages, 12 figures, Camera-ready for Shared Visual Representations in Human & Machine Intelligence NeurIPS Workshop 2020

ACM Class: I.2.10

arXiv:2010.08053 [pdf, other]

QDLC -- The Quantum Development Life Cycle

Authors: Nivedita Dey, Mrityunjay Ghosh, Subhra Samir Kundu, Amlan Chakrabarti

Abstract: The magnificent grandeur of quantum computing lies in the inherent nature of quantum particles to exhibit true parallelism, which can be realized by indubitably fascinating theories of quantum physics. The possibilities opened by quantum computation (QC) are nowhere analogous to any classical simulation as quantum computers can efficiently simulate the complex dynamics of strongly correlated inter-facial systems. But, unfolding mysteries and leading to revolutionary breakthroughs in quantum computing are often challenged by lack of research and development potential in developing qubits with longer coherence interval, scaling qubit count, incorporating quantum error correction to name a few. Putting the first footstep into explorative quantum research by researchers and developers is also inherently ambiguous - due to lack of definitive steps in building up a quantum enabled customized computing stack. Difference in behavioral pattern of underlying system, early-stage noisy device, implementation barriers and performance metric cause hindrance in full adoption of existing classical SDLC suites for quantum product development. This, in turn, necessitates devising systematic and cost-effective techniques for quantum software development through a Quantum Development Life Cycle (QDLC) model, specifying the distinguished features and functionalities of quantum feasibility study, quantum requirement specification, quantum system design, quantum software coding and implementation, quantum testing and quantum software quality management.

Submitted 15 October, 2020; originally announced October 2020.

Comments: 18 pages, 4 tables, 6 diagrams

arXiv:2010.07413 [pdf, other]

doi 10.1049/qtc2.12023

A Novel Quantum Algorithm for Ant Colony Optimization

Authors: Mrityunjay Ghosh, Nivedita Dey, Debdeep Mitra, Amlan Chakrabarti

Abstract: Ant colony optimization (ACO) is a commonly used meta-heuristic to solve complex combinatorial optimization problems like traveling salesman problem (TSP), vehicle routing problem (VRP), etc. However, classical ACO algorithms provide better optimal solutions but do not reduce computation time overhead to a significant extent. Algorithmic speed-up can be achieved by using parallelism offered by quantum computing. Existing quantum algorithms to solve ACO are either quantum-inspired classical algorithms or hybrid quantum-classical algorithms. Since all these algorithms need the intervention of classical computing, leveraging the true potential of quantum computing on real quantum hardware remains a challenge. This paper's main contribution is to propose a fully quantum algorithm to solve ACO, enhancing the quantum information processing toolbox in the fault-tolerant quantum computing (FTQC) era. We have solved the Single Source Single Destination (SSSD) shortest-path problem using our proposed adaptive quantum circuit for representing dynamic pheromone updating strategy in real IBMQ devices. Our quantum ACO technique can be further used as a quantum ORACLE to solve complex optimization problems in a fully quantum setup with significant speed up upon the availability of more qubits.

Submitted 4 September, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 13 pages, 13 figures

Journal ref: IET Quantum Communication 2021

arXiv:2008.08024 [pdf, ps, other]

Self-supervised Denoising via Diffeomorphic Template Estimation: Application to Optical Coherence Tomography

Authors: Guillaume Gisbert, Neel Dey, Hiroshi Ishikawa, Joel Schuman, James Fishbaugh, Guido Gerig

Abstract: Optical Coherence Tomography (OCT) is pervasive in both the research and clinical practice of Ophthalmology. However, OCT images are strongly corrupted by noise, limiting their interpretation. Current OCT denoisers leverage assumptions on noise distributions or generate targets for training deep supervised denoisers via averaging of repeat acquisitions. However, recent self-supervised advances allow the training of deep denoising networks using only repeat acquisitions without clean targets as ground truth, reducing the burden of supervised learning. Despite the clear advantages of self-supervised methods, their use is precluded as OCT shows strong structural deformations even between sequential scans of the same subject due to involuntary eye motion. Further, direct nonlinear alignment of repeats induces correlation of the noise between images. In this paper, we propose a joint diffeomorphic template estimation and denoising framework which enables the use of self-supervised denoising for motion deformed repeat acquisitions, without empirically registering their noise realizations. Strong qualitative and quantitative improvements are achieved in denoising OCT images, with generic utility in any imaging modality amenable to multiple exposures.

Submitted 18 August, 2020; originally announced August 2020.

Comments: To be published in MICCAI Ophthalmic Medical Image Analysis 2020. 11 pages, 4 figures, 1 table

arXiv:2007.06804 [pdf, other]

2D Qubit Placement of Quantum Circuits using LONGPATH

Authors: Mrityunjay Ghosh, Nivedita Dey, Debdeep Mitra, Amlan Chakrabarti

Abstract: In order to achieve speedup over conventional classical computing for finding solution of computationally hard problems, quantum computing was introduced. Quantum algorithms can be simulated in a pseudo quantum environment, but implementation involves realization of quantum circuits through physical synthesis of quantum gates. This requires decomposition of complex quantum gates into a cascade of simple one qubit and two qubit gates. The methodological framework for physical synthesis imposes a constraint regarding placement of operands (qubits) and operators. If physical qubits can be placed on a grid, where each node of the grid represents a qubit then quantum gates can only be operated on adjacent qubits, otherwise SWAP gates must be inserted to convert non-Linear Nearest Neighbor architecture to Linear Nearest Neighbor architecture. Insertion of SWAP gates should be made optimal to reduce cumulative cost of physical implementation. A schedule layout generation is required for placement and routing apriori to actual implementation. In this paper, two algorithms are proposed to optimize the number of SWAP gates in any arbitrary quantum circuit. The first algorithm is intended to start with generation of an interaction graph followed by finding the longest path starting from the node with maximum degree. The second algorithm optimizes the number of SWAP gates between any pair of non-neighbouring qubits. Our proposed approach has a significant reduction in number of SWAP gates in 1D and 2D NTC architecture.

Submitted 14 July, 2020; originally announced July 2020.

Comments: Advanced Computing and Systems for Security, SpringerLink, Volume 10

arXiv:2005.01683 [pdf, other]

Group Equivariant Generative Adversarial Networks

Authors: Neel Dey, Antong Chen, Soheil Ghafurian

Abstract: Recent improvements in generative adversarial visual synthesis incorporate real and fake image transformation in a self-supervised setting, leading to increased stability and perceptual fidelity. However, these approaches typically involve image augmentations via additional regularizers in the GAN objective and thus spend valuable network capacity towards approximating transformation equivariance instead of their desired task. In this work, we explicitly incorporate inductive symmetry priors into the network architectures via group-equivariant convolutional networks. Group-convolutions have higher expressive power with fewer samples and lead to better gradient feedback between generator and discriminator. We show that group-equivariance integrates seamlessly with recent techniques for GAN training across regularizers, architectures, and loss functions. We demonstrate the utility of our methods for conditional synthesis by improving generation in the limited data regime across symmetric imaging datasets and even find benefits for natural images with preferred orientation.

Submitted 30 March, 2021; v1 submitted 4 May, 2020; originally announced May 2020.

Comments: Accepted by the International Conference on Learning Representations (ICLR) 2021

arXiv:2004.03431 [pdf]

Harmony-Search and Otsu based System for Coronavirus Disease (COVID-19) Detection using Lung CT Scan Images

Authors: V. Rajinikanth, Nilanjan Dey, Alex Noel Joseph Raj, Aboul Ella Hassanien, K. C. Santosh, N. Sri Madhava Raja

Abstract: Pneumonia is one of the foremost lung diseases and untreated pneumonia will lead to serious threats for all age groups. The proposed work aims to extract and evaluate the Coronavirus disease (COVID-19) caused pneumonia infection in lung using CT scans. We propose an image-assisted system to extract COVID-19 infected sections from lung CT scans (coronal view). It includes following steps: (i) Threshold filter to extract the lung region by eliminating possible artifacts; (ii) Image enhancement using Harmony-Search-Optimization and Otsu thresholding; (iii) Image segmentation to extract infected region(s); and (iv) Region-of-interest (ROI) extraction (features) from binary image to compute level of severity. The features that are extracted from ROI are then employed to identify the pixel ratio between the lung and infection sections to identify infection level of severity. The primary objective of the tool is to assist the pulmonologist not only to detect but also to help plan treatment process. As a consequence, for mass screening processing, it will help prevent diagnostic burden.

Submitted 6 April, 2020; originally announced April 2020.

Comments: 13 pages

arXiv:2003.09868 [pdf]

Composite Monte Carlo Decision Making under High Uncertainty of Novel Coronavirus Epidemic Using Hybridized Deep Learning and Fuzzy Rule Induction

Authors: Simon James Fong, Gloria Li, Nilanjan Dey, Ruben Gonzalez Crespo, Enrique Herrera-Viedma

Abstract: In the advent of the novel coronavirus epidemic since December 2019, governments and authorities have been struggling to make critical decisions under high uncertainty at their best efforts. Composite Monte-Carlo (CMC) simulation is a forecasting method which extrapolates available data which are broken down from multiple correlated/casual micro-data sources into many possible future outcomes by drawing random samples from some probability distributions. For instance, the overall trend and propagation of the infested cases in China are influenced by the temporal-spatial data of the nearby cities around the Wuhan city (where the virus is originated from), in terms of the population density, travel mobility, medical resources such as hospital beds and the timeliness of quarantine control in each city etc. Hence a CMC is reliable only up to the closeness of the underlying statistical distribution of a CMC, that is supposed to represent the behaviour of the future events, and the correctness of the composite data relationships. In this paper, a case study of using CMC that is enhanced by deep learning network and fuzzy rule induction for gaining better stochastic insights about the epidemic development is experimented. Instead of applying simplistic and uniform assumptions for a MC which is a common practice, a deep learning-based CMC is used in conjunction of fuzzy rule induction techniques. As a result, decision makers are benefited from a better fitted MC outputs complemented by min-max rules that foretell about the extreme ranges of future possibilities with respect to the epidemic.

Submitted 22 March, 2020; originally announced March 2020.

Comments: 19 pages

arXiv:1907.00625 [pdf, other]

On-chip learning in a conventional silicon MOSFET based Analog Hardware Neural Network

Authors: Nilabjo Dey, Janak Sharda, Utkarsh Saxena, Divya Kaushik, Utkarsh Singh, Debanjan Bhowmik

Abstract: On-chip learning in a crossbar array based analog hardware Neural Network (NN) has been shown to have major advantages in terms of speed and energy compared to training NN on a traditional computer. However analog hardware NN proposals and implementations thus far have mostly involved Non Volatile Memory (NVM) devices like Resistive Random Access Memory (RRAM), Phase Change Memory (PCM), spintronic devices or floating gate transistors as synapses. Fabricating systems based on RRAM, PCM or spintronic devices need in-house laboratory facilities and cannot be done through merchant foundries, unlike conventional silicon based CMOS chips. Floating gate transistors need large voltage pulses for weight update, making on-chip learning in such systems energy inefficient. This paper proposes and implements through SPICE simulations on-chip learning in analog hardware NN using only conventional silicon based MOSFETs (without any floating gate) as synapses since they are easy to fabricate. We first model the synaptic characteristic of our single transistor synapse using SPICE circuit simulator and benchmark it against experimentally obtained current-voltage characteristics of a transistor. Next we design a Fully Connected Neural Network (FCNN) crossbar array using such transistor synapses. We also design analog peripheral circuits for neuron and synaptic weight update calculation, needed for on-chip learning, again using conventional transistors. Simulating the entire system on SPICE simulator, we obtain high training and test accuracy on the standard Fisher's Iris dataset, widely used in machine learning. We also compare the speed and energy performance of our transistor based implementation of analog hardware NN with some previous implementations of NN with NVM devices and show comparable performance with respect to on-chip learning.

Submitted 1 July, 2019; originally announced July 2019.

Comments: 18 pages, 10 figures, 1 table (shorter version submitted to conference for review)

arXiv:1307.0277 [pdf]

Multilevel Threshold Based Gray Scale Image Segmentation using Cuckoo Search

Authors: Sourav Samantaa, Nilanjan Dey, Poulami Das, Suvojit Acharjee, Sheli Sinha Chaudhuri

Abstract: Image Segmentation is a technique of partitioning the original image into some distinct classes. Many possible solutions may be available for segmenting an image into a certain number of classes, each one having different quality of segmentation. In our proposed method, multilevel thresholding technique has been used for image segmentation. A new approach of Cuckoo Search (CS) is used for selection of optimal threshold value. In other words, the algorithm is used to achieve the best solution from the initial random threshold values or solutions and to evaluate the quality of a solution correlation function is used. Finally, MSE and PSNR are measured to understand the segmentation quality.

Submitted 1 July, 2013; originally announced July 2013.

Comments: 8 pages, 7 figures, ICECIT2012, Anatapur, India. arXiv admin note: text overlap with arXiv:1003.1594, arXiv:1005.2908 by other authors
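
The abstract above names MSE and PSNR as its quality measures without defining them. A minimal sketch of the standard definitions follows; the function names, the nested-list image representation, and the peak value of 255 (8-bit grayscale) are assumptions for illustration, not details taken from the paper.

```python
import math


def mse(original, processed):
    """Mean squared error between two equally sized 2-D images (nested lists)."""
    flat_a = [p for row in original for p in row]
    flat_b = [p for row in processed for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)


def psnr(original, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    err = mse(original, processed)
    if err == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / err)
```

A lower MSE and a higher PSNR both indicate that the processed image stays closer to the original.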

arXiv:1304.2310 [pdf]

Embedding of Blink Frequency in Electrooculography Signal using Difference Expansion based Reversible Watermarking Technique

Authors: Nilanjan Dey, Prasenjit Maji, Poulami Das, Shouvik Biswas, Achintya Das, Sheli Sinha Chaudhuri

Abstract: In the past few years, like other fields, rapid expansion of digitization and globalization has influenced the medical field as well. For progress of diagnostic results most of the reputed hospitals and diagnostic centres all over the world have started exchanging medical information. In this proposed method, the calculated diagnostic parametric values of the original Electrooculography (EOG) signal are embedded as a watermark by using Difference Expansion (DE) algorithm based reversible watermarking technique. The extracted watermark provides the required parametric values at the recipient end without any post computation of the recovered EOG signal. By computing the parametric values from the recovered signal, the integrity of the extracted watermark can be validated. The time domain features of EOG signal are calculated for the generation of watermark. In the current work, various features are studied and two major features related to blink frequency are used to generate the watermark. The high Signal to Noise Ratio (SNR) and the Bit Error Rate (BER) claim the robustness of the proposed method.

Submitted 9 March, 2013; originally announced April 2013.

Comments: 6 Pages, 3 Figures, 4 Tables

Journal ref: Scientific Bulletin of the Politehnica University of Timisoara - Transactions on Electronics and Communications p-ISSN 1583-3380, vol. 57(71), no. 2, 2012
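
The entry above relies on Difference Expansion (DE) for reversible embedding. As a rough illustration of why DE is reversible, here is a sketch of the classic integer-pair DE transform; it is not claimed to be the authors' exact variant. It assumes integer-valued samples, uses hypothetical function names, and omits the overflow (expandability) checks a real embedder needs.

```python
def de_embed(x, y, bit):
    """Hide one bit in an integer sample pair by expanding their difference."""
    avg = (x + y) // 2          # integer average, preserved by the transform
    diff = x - y
    expanded = 2 * diff + bit   # the bit becomes the LSB of the new difference
    return avg + (expanded + 1) // 2, avg - expanded // 2


def de_extract(x_marked, y_marked):
    """Recover the hidden bit and the original pair exactly."""
    avg = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    bit = expanded % 2
    diff = expanded // 2        # floor division undoes the expansion
    return avg + (diff + 1) // 2, avg - diff // 2, bit
```

Because the integer average is unchanged and the original difference is recovered by a floor division, the receiver obtains both the payload bit and the untouched samples.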

arXiv:1303.5972 [pdf]

Odd-Even Embedding Scheme Based Modified Reversible Watermarking Technique using Blueprint

Authors: Arijit Kumar Pal, Poulami Das, Nilanjan Dey

Abstract: Digital watermarking is a technique of information adding or information hiding in order to identify the owner of the data in multimedia content. It seems that a signal or digital image can permanently embed over another digital data providing a good way to protect intellectual property from illegal replication. The cover data that is transmitted through the internet hides the watermark in a computer aided assertion method such that it becomes undetectable. Finally it stands as a hindrance over many operations without harming the embedded host document. Unfortunately, many owners of the digital materials such as images, text, audio and video are reluctant to the spreading of their documents on the web or other networked environment, because the ease of duplicating digital materials facilitates copyright violation. Digital media distribution occurs through various channels. The cover data may or may not hold any relation with the watermark information. In the last two decades, a considerable amount of research has been done on the digital watermarking of multimedia files such as audio, video, images and text. Different type of watermarking algorithms has been proposed by the researchers to achieve high level of security and authenticity. In our proposed method, a modified reversible watermarking technique is introduced, which employs a blueprint generation of original image based on odd-even embedding methodology to yield large data hiding capacity, security as well as high watermarked quality. The experimental results demonstrate that, no matter how much secret data is embedded, the watermarked quality is about 51dB in this proposed scheme.

Submitted 24 March, 2013; originally announced March 2013.

Comments: 10 Pages, Figure 2, Table 1, FOSET, Academic Meet, Kolkata, India, 22-23 March 2013

arXiv:1303.2211 [pdf]

Medical Information Embedding in Compressed Watermarked Intravascular Ultrasound Video

Authors: Nilanjan Dey, Suvojit Acharjee, Debalina Biswas, Achintya Das, Sheli Sinha Chaudhuri

Abstract: In medical field, intravascular ultrasound (IVUS) is a tomographic imaging modality, which can identify the boundaries of different layers of blood vessels. IVUS can detect myocardial infarction (heart attack) that remains ignored and unattended when only angioplasty is done. During the past decade, it became easier for some individuals or groups to copy and transmits digital information without the permission of the owner. For increasing authentication and security of copyrights, digital watermarking, an information hiding technique, was introduced. Achieving watermarking technique with lesser amount of distortion in biomedical data is a challenging task. Watermark can be embedded into an image or in a video. As video data is a huge amount of information, therefore a large storage area is needed which is not feasible. In this case motion vector based video compression is done to reduce size. In this present paper, an Electronic Patient Record (EPR) is embedded as watermark within an IVUS video and then motion vector is calculated. This proposed method proves robustness as the extracted watermark has good PSNR value and less MSE.

Submitted 9 March, 2013; originally announced March 2013.

Comments: 7 pages, 15 figures, 2 tables

Journal ref: Scientific Bulletin of the Politehnica University of Timisoara - Transactions on Electronics and Communications p-ISSN 1583-3380, vol. 57(71), no. 2, 2012

arXiv:1209.2903 [pdf]

A Novel Approach of Harris Corner Detection of Noisy Images using Adaptive Wavelet Thresholding Technique

Authors: Nilanjan Dey, Pradipti Nandi, Nilanjana Barman

Abstract: In this paper we propose a method of corner detection for obtaining features which is required to track and recognize objects within a noisy image. Corner detection of noisy images is a challenging task in image processing. Natural images often get corrupted by noise during acquisition and transmission. Though Corner detection of these noisy images does not provide desired results, hence de-noising is required. Adaptive wavelet thresholding approach is applied for the same.

Submitted 13 September, 2012; originally announced September 2012.

Comments: 5 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:1209.1558

Journal ref: International Journal of Computer Science & Technology(IJCST) Vol. 2, ISSUE 4, OCT. - DEC. 2011

arXiv:1209.1563 [pdf]

Wavelet Based QRS Complex Detection of ECG Signal

Authors: Sayantan Mukhopadhyay, Shouvik Biswas, Anamitra Bardhan Roy, Nilanjan Dey

Abstract: The Electrocardiogram (ECG) is a sensitive diagnostic tool that is used to detect various cardiovascular diseases by measuring and recording the electrical activity of the heart in exquisite detail. A wide range of heart condition is determined by thorough examination of the features of the ECG report. Automatic extraction of time plane features is important for identification of vital cardiac diseases. This paper presents a multi-resolution wavelet transform based system for detection 'P', 'Q', 'R', 'S', 'T' peaks complex from original ECG signal. 'R-R' time lapse is an important minutia of the ECG signal that corresponds to the heartbeat of the concerned person. Abrupt increase in height of the 'R' wave or changes in the measurement of the 'R-R' denote various anomalies of human heart. Similarly 'P-P', 'Q-Q', 'S-S', 'T-T' also corresponds to different anomalies of heart and their peak amplitude also envisages other cardiac diseases. In this proposed method the 'PQRST' peaks are marked and stored over the entire signal and the time interval between two consecutive 'R' peaks and other peaks interval are measured to detect anomalies in behavior of heart, if any. The peaks are achieved by the composition of Daubechies sub bands wavelet of original ECG signal. The accuracy of the 'PQRST' complex detection and interval measurement is achieved up to 100% with high exactitude by processing and thresholding the original ECG signal.

Submitted 7 September, 2012; originally announced September 2012.

Comments: 5 pages, 8 figures, ISSN: 2248-9622

Journal ref: Journal of Engineering Research and Applications (IJERA) Vol. 2, Issue 3, 2012, pp.2361-2365

arXiv:1209.1558 [pdf]

A Comparative Study between Moravec and Harris Corner Detection of Noisy Images Using Adaptive Wavelet Thresholding Technique

Authors: Nilanjan Dey, Pradipti Nandi, Nilanjana Barman, Debolina Das, Subhabrata Chakraborty

Abstract: In this paper a comparative study between Moravec and Harris Corner Detection has been done for obtaining features required to track and recognize objects within a noisy image. Corner detection of noisy images is a challenging task in image processing. Natural images often get corrupted by noise during acquisition and transmission. As Corner detection of these noisy images does not provide desired results, hence de-noising is required. Adaptive wavelet thresholding approach is applied for the same.

Submitted 7 September, 2012; originally announced September 2012.

Comments: 8 pages, 13 figures

Journal ref: International Journal of Engineering Research and Applications (IJERA) Vol. 2, Issue 1, Jan-Feb 2012, pp.599-606

arXiv:1209.1224 [pdf]

Wavelet Based Normal and Abnormal Heart Sound Identification using Spectrogram Analysis

Authors: Nilanjan Dey, Achintya Das, Sheli Sinha Chaudhuri

Abstract: The present work proposes a computer-aided normal and abnormal heart sound identification based on Discrete Wavelet Transform (DWT), it being useful for tele-diagnosis of heart diseases. Due to the presence of Cumulative Frequency components in the spectrogram, DWT is applied on the spectrogram up to n level to extract the features from the individual approximation components. One dimensional feature vector is obtained by evaluating the Row Mean of the approximation components of these spectrograms. For this present approach, the set of spectrograms has been considered as the database, rather than raw sound samples. Minimum Euclidean distance is computed between feature vector of the test sample and the feature vectors of the stored samples to identify the heart sound. By applying this algorithm, almost 82% of accuracy was achieved.

Submitted 6 September, 2012; originally announced September 2012.

Comments: 7 pages, 13 figures

Journal ref: International Journal of Computer Science & Engineering Technology (IJCSET), Vol. 3 No. 6 June 2012, ISSN : 2229-3345
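
The pipeline described in the entry above (multi-level DWT approximation of a spectrogram, row means as features, nearest stored sample by Euclidean distance) can be sketched roughly as below. The abstract does not name the wavelet, so the Haar approximation band here is an assumption, as are the function names, the nested-list spectrogram format, and the default of two levels.

```python
import math


def haar_approx(image):
    """One level of the 2-D Haar approximation (LL) band: scaled 2x2 block sums."""
    rows = len(image) // 2 * 2
    cols = len(image[0]) // 2 * 2
    return [
        [(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


def row_mean_features(spectrogram, levels=2):
    """Concatenate the row means of the approximation band at each level."""
    features, approx = [], spectrogram
    for _ in range(levels):
        approx = haar_approx(approx)
        features.extend(sum(row) / len(row) for row in approx)
    return features


def classify(test_spectrogram, database, levels=2):
    """Return the label of the stored feature vector nearest to the test sample.

    `database` is a list of (label, feature_vector) pairs.
    """
    test = row_mean_features(test_spectrogram, levels)
    return min(database, key=lambda entry: math.dist(test, entry[1]))[0]
```

Storing feature vectors rather than raw audio keeps the comparison cheap: each query costs one distance computation per stored sample.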

arXiv:1209.1181 [pdf]

FCM Based Blood Vessel Segmentation Method for Retinal Images

Authors: Nilanjan Dey, Anamitra Bardhan Roy, Moumita Pal, Achintya Das

Abstract: Segmentation of blood vessels in retinal images provides early diagnosis of diseases like glaucoma, diabetic retinopathy and macular degeneration. Among these diseases occurrence of Glaucoma is most frequent and has serious ocular consequences that can even lead to blindness, if it is not detected early. The clinical criteria for the diagnosis of glaucoma include intraocular pressure measurement, optic nerve head evaluation, retinal nerve fiber layer and visual field defects. This form of blood vessel segmentation helps in early detection for ophthalmic diseases, and potentially reduces the risk of blindness. The low-contrast images at the retina owing to narrow blood vessels of the retina are difficult to extract. These low contrast images are, however useful in revealing certain systemic diseases. Motivated by the goals of improving detection of such vessels, this present work proposes an algorithm for segmentation of blood vessels and compares the results between expert ophthalmologist hand-drawn ground-truths and segmented image (i.e. the output of the present work). Sensitivity, specificity, positive predictive value (PPV), positive likelihood ratio (PLR) and accuracy are used to evaluate overall performance. It is found that this work segments blood vessels successfully with sensitivity, specificity, PPV, PLR and accuracy of 99.62%, 54.66%, 95.08%, 219.72 and 95.03%, respectively.

Submitted 6 September, 2012; originally announced September 2012.

Comments: 5 pages, 3 figures

Journal ref: International Journal of Computer Science and Network (IJCSN), Volume 1, Issue 3, June 2012, ISSN 2277-5420
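
The five evaluation measures listed in the entry above follow from a pixel-level confusion matrix. The sketch below uses the standard textbook definitions; the function names and the choice of which pixel class counts as "positive" are assumptions, since the abstract does not state them. Under the standard definition PLR = sensitivity / (1 - specificity), the quoted sensitivity and specificity give a ratio of about 2.197, which suggests the reported 219.72 is that ratio expressed as a percentage.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """PLR from sensitivity and specificity given as fractions in [0, 1]."""
    return sensitivity / (1.0 - specificity)


def segmentation_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative pixel counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "plr": positive_likelihood_ratio(sensitivity, specificity),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

For example, `positive_likelihood_ratio(0.9962, 0.5466)` evaluates to roughly 2.197.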

arXiv:1209.0054 [pdf]

A Novel Session Based Dual Steganographic Technique Using DWT and Spread Spectrum

Authors: Tanmay Bhattacharya, Nilanjan Dey, S. R. Bhadra Chaudhuri

Abstract: This paper proposed a DWT based Steganographic technique. Cover image is decomposed into four sub bands using DWT. Two secret images are embedded within the HL and HH sub bands respectively. During embedding secret images are dispersed within each band using a pseudo random sequence and a Session key. Secret images are extracted using the session key and the size of the images. In this approach the stego image generated is of acceptable level of imperceptibility and distortion compared to the cover image and the overall security is high.

Submitted 1 September, 2012; originally announced September 2012.

Comments: 5 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:1208.0803

Journal ref: International Journal of Modern Engineering Research (IJMER), Vol. 1, Issue 1, Dec 2011, pp. 157-161, ISSN: 2249-6645

arXiv:1209.0053 [pdf]

A Session Based Blind Watermarking Technique within the NROI of Retinal Fundus Images for Authentication Using DWT, Spread Spectrum and Harris Corner Detection

Authors: Nilanjan Dey, Moumita Pal, Achintya Das

Abstract: Digital Retinal Fundus Images helps to detect various ophthalmic diseases by detecting morphological changes in optical cup, optical disc and macula. Present work proposes a method for the authentication of medical images based on Discrete Wavelet Transformation (DWT) and Spread Spectrum. Proper selection of the Non Region of Interest (NROI) for watermarking is crucial, as the area under concern has to be the least required portion conveying any medical information. Proposed method discusses both the selection of least impact area and the blind watermarking technique. Watermark is embedded within the High-High (HH) sub band. During embedding, watermarked image is dispersed within the band using a pseudo random sequence and a Session key. Watermarked image is extracted using the session key and the size of the image. In this approach the generated watermarked image having an acceptable level of imperceptibility and distortion is compared to the Original retinal image based on Peak Signal to Noise Ratio (PSNR) and correlation value.

Submitted 1 September, 2012; originally announced September 2012.

Comments: 9 pages, 10 figures

Journal ref: International Journal of Modern Engineering Research (IJMER), Vol. 2, Issue 3, May-June 2012, pp. 749-757, ISSN: 2249-6645

arXiv:1208.0950 [pdf]

doi 10.5120/4604-6808

A Session based Multiple Image Hiding Technique using DWT and DCT

Authors: Tanmay Bhattacharya, Nilanjan Dey, S. R. Bhadra Chaudhuri

Abstract: This work proposes Steganographic technique for hiding multiple images in a color image based on DWT and DCT. The cover image is decomposed into three separate color planes namely R, G and B. Individual planes are decomposed into subbands using DWT. DCT is applied in HH component of each plane. Secret images are dispersed among the selected DCT coefficients using a pseudo random sequence and a Session key. Secret images are extracted using the session key and the size of the images from the planer decomposed stego image. In this approach the stego image generated is of acceptable level of imperceptibility and distortion compared to the cover image and the overall security is high.

Submitted 4 August, 2012; originally announced August 2012.

Comments: 4 pages, 16 figures, "Published with International Journal of Computer Applications (IJCA)"

Journal ref: Tanmay Bhattacharya, Nilanjan Dey, Bhadra S R Chaudhuri. Article: A Session Based Multiple Image Hiding Technique using DWT & DCT. International Journal of Computer Applications 38(5):18-21, January 2012. Published by Foundation of Computer Science

arXiv:1208.0803 [pdf]

doi 10.5120/4487-6316

A Novel Approach of Color Image Hiding using RGB Color planes and DWT

Authors: Nilanjan Dey, Anamitra Bardhan Roy, Sayantan Dey

Abstract: This work proposes a wavelet-based steganographic technique for color images. The true color cover image and the true color secret image are both decomposed into three separate color planes, namely R, G and B. Each plane of the images is decomposed into four sub bands using DWT. Each color plane of the secret image is hidden by an alpha blending technique in the corresponding sub bands of the respective color planes of the original image. During embedding, the secret image is dispersed within the original image depending upon the alpha value. Extraction of the secret image varies according to the alpha value. In this approach the stego image generated has an acceptable level of imperceptibility and distortion compared to the cover image, and the overall security is high.

Submitted 3 August, 2012; originally announced August 2012.

Comments: 6 pages, 14 figures, Published with International Journal of Computer Applications (IJCA)

Journal ref: International Journal of Computer Applications 36(5):19-24, December 2011
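The per-plane, per-sub-band alpha blending described in this abstract can be sketched as follows. The abstract does not give the blending formula, so this sketch assumes the common form stego = cover + alpha * secret in each sub band, with extraction that needs the original cover (non-blind). The Haar wavelet, the function names, and the synthetic 4x4 planes are illustrative assumptions, not the authors' code.

```python
def haar_dwt2(img):
    """Single-level 2-D Haar DWT; returns (LL, LH, HL, HH) half-size bands."""
    bands = ([], [], [], [])
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)
            rows[1].append((a + b - d - e) / 2)
            rows[2].append((a - b + d - e) / 2)
            rows[3].append((a - b - d + e) / 2)
        for band, row in zip(bands, rows):
            band.append(row)
    return bands

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2."""
    n_r, n_c = len(LL), len(LL[0])
    img = [[0.0] * (2 * n_c) for _ in range(2 * n_r)]
    for r in range(n_r):
        for c in range(n_c):
            ll, lh, hl, hh = LL[r][c], LH[r][c], HL[r][c], HH[r][c]
            img[2 * r][2 * c] = (ll + lh + hl + hh) / 2
            img[2 * r][2 * c + 1] = (ll + lh - hl - hh) / 2
            img[2 * r + 1][2 * c] = (ll - lh + hl - hh) / 2
            img[2 * r + 1][2 * c + 1] = (ll - lh - hl + hh) / 2
    return img

def combine(bands_x, bands_y, f):
    """Apply f(x, y) coefficient-wise across matching sub-bands."""
    return [[[f(x, y) for x, y in zip(rx, ry)] for rx, ry in zip(bx, by)]
            for bx, by in zip(bands_x, bands_y)]

def blend_plane(cover, secret, alpha):
    """Assumed rule: stego_band = cover_band + alpha * secret_band."""
    mixed = combine(haar_dwt2(cover), haar_dwt2(secret),
                    lambda cv, sv: cv + alpha * sv)
    return haar_idwt2(*mixed)

def unblend_plane(stego, cover, alpha):
    """Non-blind extraction: secret_band = (stego_band - cover_band) / alpha."""
    diff = combine(haar_dwt2(stego), haar_dwt2(cover),
                   lambda tv, cv: (tv - cv) / alpha)
    return haar_idwt2(*diff)

def make_plane(offset, step):
    """Synthetic 4x4 colour plane, for illustration only."""
    return [[(offset + step * (4 * r + c)) % 256 for c in range(4)]
            for r in range(4)]

alpha = 0.1
max_err = 0.0
for k, channel in enumerate("RGB"):          # blend each colour plane separately
    cover = make_plane(40 * k, 5)
    secret = make_plane(200 - 30 * k, 11)
    stego = blend_plane(cover, secret, alpha)
    recovered = unblend_plane(stego, cover, alpha)
    for row_rec, row_sec in zip(recovered, secret):
        for v_rec, v_sec in zip(row_rec, row_sec):
            max_err = max(max_err, abs(v_rec - v_sec))
```

Because the DWT is linear, using one alpha for all four sub bands is equivalent to blending the planes directly in the pixel domain; working in the wavelet domain only changes the result when different sub bands receive different alpha values. A smaller alpha makes the stego image closer to the cover but amplifies rounding error on extraction.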

Showing 1–50 of 50 results for author: Dey, N