Showing 1–50 of 78 results for author: Raman, S

  1. arXiv:2407.09294  [pdf, other]

    cs.CV

    SS-SfP:Neural Inverse Rendering for Self Supervised Shape from (Mixed) Polarization

    Authors: Ashish Tiwari, Shanmuganathan Raman

    Abstract: We present a novel inverse rendering-based framework to estimate the 3D shape (per-pixel surface normals and depth) of objects and scenes from single-view polarization images, the problem popularly known as Shape from Polarization (SfP). The existing physics-based and learning-based methods for SfP perform under certain restrictions, i.e., (a) purely diffuse or purely specular reflections, which a…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Published in Pacific Graphics 2023

  2. arXiv:2405.13832  [pdf, other]

    cs.CR cs.AI cs.LG

    Federated Learning in Healthcare: Model Misconducts, Security, Challenges, Applications, and Future Research Directions -- A Systematic Review

    Authors: Md Shahin Ali, Md Manjurul Ahsan, Lamia Tasnim, Sadia Afrin, Koushik Biswas, Md Maruf Hossain, Md Mahfuz Ahmed, Ronok Hashan, Md Khairul Islam, Shivakumar Raman

    Abstract: Data privacy has become a major concern in healthcare due to the increasing digitization of medical records and data-driven medical research. Protecting sensitive patient information from breaches and unauthorized access is critical, as such incidents can have severe legal and ethical complications. Federated Learning (FL) addresses this concern by enabling multiple healthcare institutions to coll…

    Submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2405.10925  [pdf]

    stat.ME cs.AI cs.LG

    High-dimensional multiple imputation (HDMI) for partially observed confounders including natural language processing-derived auxiliary covariates

    Authors: Janick Weberpals, Pamela A. Shaw, Kueiyu Joshua Lin, Richard Wyss, Joseph M Plasek, Li Zhou, Kerry Ngan, Thomas DeRamus, Sudha R. Raman, Bradley G. Hammill, Hana Lee, Sengwee Toh, John G. Connolly, Kimberly J. Dandreo, Fang Tian, Wei Liu, Jie Li, José J. Hernández-Muñoz, Sebastian Schneeweiss, Rishi J. Desai

    Abstract: Multiple imputation (MI) models can be improved by including auxiliary covariates (AC), but their performance in high-dimensional data is not well understood. We aimed to develop and compare high-dimensional MI (HDMI) approaches using structured and natural language processing (NLP)-derived AC in studies with partially observed confounders. We conducted a plasmode simulation study using data from…

    Submitted 17 May, 2024; originally announced May 2024.

  4. arXiv:2312.06317  [pdf, other]

    cs.GR

    Flow Symmetrization for Parameterized Constrained Diffeomorphisms

    Authors: Aalok Gangopadhyay, Dwip Dalal, Progyan Das, Shanmuganathan Raman

    Abstract: Diffeomorphisms play a crucial role while searching for shapes with fixed topological properties, allowing for smooth deformation of template shapes. Several approaches use diffeomorphism for shape search. However, these approaches employ only unconstrained diffeomorphisms. In this work, we develop Flow Symmetrization - a method to represent a parametric family of constrained diffeomorphisms that…

    Submitted 11 December, 2023; originally announced December 2023.

  5. arXiv:2311.11988  [pdf, other]

    cs.CV cs.AI

    Categorizing the Visual Environment and Analyzing the Visual Attention of Dogs

    Authors: Shreyas Sundara Raman, Madeline H. Pelgrim, Daphna Buchsbaum, Thomas Serre

    Abstract: Dogs have a unique evolutionary relationship with humans and serve many important roles e.g. search and rescue, blind assistance, emotional support. However, few datasets exist to categorize visual features and objects available to dogs, as well as how dogs direct their visual attention within their environment. We collect and study a dataset with over 11,698 gazes to categorize the objects availa…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 13 pages, 11 figures, 1 table, WACV CV4Smalls Workshop

  6. arXiv:2311.04942  [pdf, other]

    eess.IV cs.CV

    CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation

    Authors: Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Xiaoxi Du, Kaifeng Pang, Qi Miao, Steven S. Raman, Demetri Terzopoulos, Kyunghyun Sung

    Abstract: A large portion of volumetric medical data, especially magnetic resonance imaging (MRI) data, is anisotropic, as the through-plane resolution is typically much lower than the in-plane resolution. Both 3D and purely 2D deep learning-based segmentation methods are deficient in dealing with such volumetric data since the performance of 3D methods suffers when confronting anisotropic data, and 2D meth…

    Submitted 26 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  7. arXiv:2310.16532  [pdf, other]

    cs.CV

    Learning Robust Deep Visual Representations from EEG Brain Recordings

    Authors: Prajwal Singh, Dwip Dalal, Gautam Vashishtha, Krishna Miyapuram, Shanmuganathan Raman

    Abstract: Decoding the human brain has been a hallmark of neuroscientists and Artificial Intelligence researchers alike. Reconstruction of visual images from brain Electroencephalography (EEG) signals has garnered a lot of interest due to its applications in brain-computer interfacing. This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep r…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted in WACV 2024

  8. arXiv:2310.08645  [pdf, other]

    cs.CV cs.LG

    Defect Analysis of 3D Printed Cylinder Object Using Transfer Learning Approaches

    Authors: Md Manjurul Ahsan, Shivakumar Raman, Zahed Siddique

    Abstract: Additive manufacturing (AM) is gaining attention across various industries like healthcare, aerospace, and automotive. However, identifying defects early in the AM process can reduce production costs and improve productivity - a key challenge. This study explored the effectiveness of machine learning (ML) approaches, specifically transfer learning (TL) models, for defect detection in 3D-printed cy…

    Submitted 12 October, 2023; originally announced October 2023.

  9. arXiv:2309.09919  [pdf, other]

    cs.RO cs.AI cs.FL

    Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents

    Authors: Ziyi Yang, Shreyas S. Raman, Ankit Shah, Stefanie Tellex

    Abstract: Recent advancements in large language models (LLMs) have enabled a new research domain, LLM agents, for solving robotics and planning tasks by leveraging the world knowledge and general reasoning abilities of LLMs obtained during pretraining. However, while considerable effort has been made to teach the robot the "dos," the "don'ts" received relatively less attention. We argue that, for any practi…

    Submitted 28 November, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

  10. arXiv:2308.13488  [pdf, other]

    eess.IV cs.AI cs.CV physics.med-ph

    Temporal Uncertainty Localization to Enable Human-in-the-loop Analysis of Dynamic Contrast-enhanced Cardiac MRI Datasets

    Authors: Dilek M. Yalcinkaya, Khalid Youssef, Bobak Heydari, Orlando Simonetti, Rohan Dharmakumar, Subha Raman, Behzad Sharif

    Abstract: Dynamic contrast-enhanced (DCE) cardiac magnetic resonance imaging (CMRI) is a widely used modality for diagnosing myocardial blood flow (perfusion) abnormalities. During a typical free-breathing DCE-CMRI scan, close to 300 time-resolved images of myocardial perfusion are acquired at various contrast "wash in/out" phases. Manual segmentation of myocardial contours in each time-frame of a DCE image…

    Submitted 13 November, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted for publication in MICCAI 2023

  11. arXiv:2307.08652  [pdf, other]

    cs.GR

    Search Me Knot, Render Me Knot: Embedding Search and Differentiable Rendering of Knots in 3D

    Authors: Aalok Gangopadhyay, Paras Gupta, Tarun Sharma, Prajwal Singh, Shanmuganathan Raman

    Abstract: We introduce the problem of knot-based inverse perceptual art. Given multiple target images and their corresponding viewing configurations, the objective is to find a 3D knot-based tubular structure whose appearance resembles the target images when viewed from the specified viewing configurations. To solve this problem, we first design a differentiable rendering algorithm for rendering tubular kno…

    Submitted 19 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

  12. arXiv:2307.02814  [pdf, other]

    cs.CV eess.IV

    Single Image LDR to HDR Conversion using Conditional Diffusion

    Authors: Dwip Dalal, Gautam Vashishtha, Prajwal Singh, Shanmuganathan Raman

    Abstract: Digital imaging aims to replicate realistic scenes, but Low Dynamic Range (LDR) cameras cannot represent the wide dynamic range of real scenes, resulting in under-/overexposed images. This paper presents a deep learning-based approach for recovering intricate details from shadows and highlights while reconstructing High Dynamic Range (HDR) images. We formulate the problem as an image-to-image (I2I…

    Submitted 6 July, 2023; originally announced July 2023.

    Journal ref: IEEE International Conference on Image Processing 2023

  13. arXiv:2306.13452  [pdf, other]

    cs.CV cs.GR

    A Graph Neural Network Approach for Temporal Mesh Blending and Correspondence

    Authors: Aalok Gangopadhyay, Abhinav Narayan Harish, Prajwal Singh, Shanmuganathan Raman

    Abstract: We have proposed a self-supervised deep learning framework for solving the mesh blending problem in scenarios where the meshes are not in correspondence. To solve this problem, we have developed Red-Blue MPNN, a novel graph neural network that processes an augmented graph to estimate the correspondence. We have designed a novel conditional refinement scheme to find the exact correspondence when ce…

    Submitted 23 June, 2023; originally announced June 2023.

  14. arXiv:2305.20077  [pdf, other]

    cs.LG cs.DC cs.SE

    Managed Geo-Distributed Feature Store: Architecture and System Design

    Authors: Anya Li, Bhala Ranganathan, Feng Pan, Mickey Zhang, Qianjun Xu, Runhan Li, Sethu Raman, Shail Paragbhai Shah, Vivienne Tang

    Abstract: Companies are using machine learning to solve real-world problems and are developing hundreds to thousands of features in the process. They are building feature engineering pipelines as part of MLOps life cycle to transform data from various data sources and materialize the same for future consumption. Without feature stores, different teams across various business groups would maintain the above…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: All the authors are from the AzureML Feature Store product group and are listed in alphabetical order. Bhala Ranganathan: System architect and tech lead of AzureML Feature Store. Feng Pan, Qianjun Xu: Engineering managers. Sethu Raman: Product Manager of AzureML Feature Store who structured and organized the product vision and specifications

  15. arXiv:2305.09777  [pdf, other]

    cs.LG

    BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions

    Authors: Md Manjurul Ahsan, Shivakumar Raman, Zahed Siddique

    Abstract: Class imbalanced problems (CIP) are one of the potential challenges in developing unbiased Machine Learning (ML) models for predictions. CIP occurs when data samples are not equally distributed between the two or multiple classes. Borderline-Synthetic Minority Oversampling Techniques (SMOTE) is one of the approaches that has been used to balance the imbalance data by oversampling the minor (limite…

    Submitted 16 May, 2023; originally announced May 2023.

  16. CERTainty: Detecting DNS Manipulation at Scale using TLS Certificates

    Authors: Elisa Tsai, Deepak Kumar, Ram Sundara Raman, Gavin Li, Yael Eiger, Roya Ensafi

    Abstract: DNS manipulation is an increasingly common technique used by censors and other network adversaries to prevent users from accessing restricted Internet resources and hijack their connections. Prior work in detecting DNS manipulation relies largely on comparing DNS resolutions with trusted control results to identify inconsistencies. However, the emergence of CDNs and other cloud providers practicin…

    Submitted 14 May, 2023; originally announced May 2023.

    Comments: To Appear in: Privacy Enhancing Technologies Symposium (PETS), July 2023

  17. arXiv:2305.04401  [pdf, other]

    eess.IV cs.CV

    Few Shot Learning for Medical Imaging: A Comparative Analysis of Methodologies and Formal Mathematical Framework

    Authors: Jannatul Nayem, Sayed Sahriar Hasan, Noshin Amina, Bristy Das, Md Shahin Ali, Md Manjurul Ahsan, Shivakumar Raman

    Abstract: Deep learning becomes an elevated context regarding disposing of many machine learning tasks and has shown a breakthrough upliftment to extract features from unstructured data. Though this flourishing context is developing in the medical image processing sector, scarcity of problem-dependent training data has become a larger issue in the way of easy application of deep learning in the medical sect…

    Submitted 31 May, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

    Comments: Accepted for a Springer book chapter for a book titled "Data-driven approaches to Medical Imaging"

  18. arXiv:2304.10582  [pdf, other]

    eess.IV cs.CV

    Invariant Scattering Transform for Medical Imaging

    Authors: Md Manjurul Ahsan, Shivakumar Raman, Zahed Siddique

    Abstract: Over the years, the Invariant Scattering Transform (IST) technique has become popular for medical image analysis, including using wavelet transform computation using Convolutional Neural Networks (CNN) to capture patterns' scale and orientation in the input signal. IST aims to be invariant to transformations that are common in medical images, such as translation, rotation, scaling, and deformation…

    Submitted 31 May, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted for Springer book chapter for a book "Data-driven approaches to Medical Imaging"

  19. arXiv:2302.10121  [pdf, other]

    cs.HC q-bio.NC

    EEG2IMAGE: Image Reconstruction from EEG Brain Signals

    Authors: Prajwal Singh, Pankaj Pandey, Krishna Miyapuram, Shanmuganathan Raman

    Abstract: Reconstructing images using brain signals of imagined visuals may provide an augmented vision to the disabled, leading to the advancement of Brain-Computer Interface (BCI) technology. The recent progress in deep learning has boosted the study area of synthesizing images from brain signals using Generative Adversarial Networks (GAN). In this work, we have proposed a framework for synthesizing the i…

    Submitted 18 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted in ICASSP 2023

  20. arXiv:2212.03733  [pdf, other]

    cs.LG cs.AI

    Tiered Reward Functions: Specifying and Fast Learning of Desired Behavior

    Authors: Zhiyuan Zhou, Shreyas Sundara Raman, Henry Sowerby, Michael L. Littman

    Abstract: Reinforcement-learning agents seek to maximize a reward signal through environmental interactions. As humans, our job in the learning process is to design reward functions to express desired behavior and enable the agent to learn such behavior swiftly. In this work, we consider the reward-design problem in tasks formulated as reaching desirable states and avoiding undesirable states. To start, we…

    Submitted 15 February, 2024; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: For code, see https://github.com/zhouzypaul/tiered-reward

  21. arXiv:2211.11040  [pdf, other]

    cs.CV

    PointResNet: Residual Network for 3D Point Cloud Segmentation and Classification

    Authors: Aadesh Desai, Saagar Parikh, Seema Kumari, Shanmuganathan Raman

    Abstract: Point cloud segmentation and classification are some of the primary tasks in 3D computer vision with applications ranging from augmented reality to robotics. However, processing point clouds using deep learning-based algorithms is quite challenging due to the irregular point formats. Voxelization or 3D grid-based representation are different ways of applying deep neural networks to this problem. I…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: Paper Under Review at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

  22. arXiv:2211.09935  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    CAPE: Corrective Actions from Precondition Errors using Large Language Models

    Authors: Shreyas Sundara Raman, Vanya Cohen, Ifrah Idrees, Eric Rosen, Ray Mooney, Stefanie Tellex, David Paulius

    Abstract: Extracting commonsense knowledge from a large language model (LLM) offers a path to designing intelligent robots. Existing approaches that leverage LLMs for planning are unable to recover when an action fails and often resort to retrying failed actions, without resolving the error's underlying cause. We propose a novel approach (CAPE) that attempts to propose corrective actions to resolve precondi…

    Submitted 9 March, 2024; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: 17 pages, 6 figures, accepted at ICRA 2024

    MSC Class: 68T20; 68T50 ACM Class: I.2.7; I.2.8; I.2.2; I.2.4

  23. arXiv:2211.00565  [pdf, other]

    cs.LG

    Revisiting Heterophily in Graph Convolution Networks by Learning Representations Across Topological and Feature Spaces

    Authors: Ashish Tiwari, Sresth Tosniwal, Shanmuganathan Raman

    Abstract: Graph convolution networks (GCNs) have been enormously successful in learning representations over several graph-based machine learning tasks. Specific to learning rich node representations, most of the methods have solely relied on the homophily assumption and have shown limited performance on the heterophilous graphs. While several methods have been developed with new architectures to address he…

    Submitted 2 November, 2022; v1 submitted 1 November, 2022; originally announced November 2022.

    Comments: Under Review Project Page: https://sites.google.com/iitgn.ac.in/hetgcn/home

  24. arXiv:2210.04429  [pdf, other]

    eess.IV cs.CV

    DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction

    Authors: Zeeshan Khan, Parth Shettiwar, Mukul Khanna, Shanmuganathan Raman

    Abstract: Due to hardware constraints, standard off-the-shelf digital cameras suffers from low dynamic range (LDR) and low frame per second (FPS) outputs. Previous works in high dynamic range (HDR) video reconstruction uses sequence of alternating exposure LDR frames as input, and align the neighbouring frames using optical flow based networks. However, these methods often result in motion artifacts in chal…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: ICPR 2022

  25. arXiv:2207.02025  [pdf, other]

    cs.CV

    DeepPS2: Revisiting Photometric Stereo Using Two Differently Illuminated Images

    Authors: Ashish Tiwari, Shanmuganathan Raman

    Abstract: Photometric stereo, a problem of recovering 3D surface normals using images of an object captured under different lightings, has been of great interest and importance in computer vision research. Despite the success of existing traditional and deep learning-based methods, it is still challenging due to: (i) the requirement of three or more differently illuminated images, (ii) the inability to mode…

    Submitted 30 August, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: Accepted in ECCV 2022 Project Page: https://sites.google.com/iitgn.ac.in/deepps2/home

  26. arXiv:2206.05617  [pdf, other]

    cs.CV cs.LG q-bio.TO

    Federated Learning with Research Prototypes for Multi-Center MRI-based Detection of Prostate Cancer with Diverse Histopathology

    Authors: Abhejit Rajagopal, Ekaterina Redekop, Anil Kemisetti, Rushi Kulkarni, Steven Raman, Kirti Magudia, Corey W. Arnold, Peder E. Z. Larson

    Abstract: Early prostate cancer detection and staging from MRI are extremely challenging tasks for both radiologists and deep learning algorithms, but the potential to learn from large and diverse datasets remains a promising avenue to increase their generalization capability both within- and across clinics. To enable this for prototype-stage algorithms, where the majority of existing research remains, in t…

    Submitted 11 June, 2022; originally announced June 2022.

    Comments: under review

  27. arXiv:2203.15163  [pdf, other]

    eess.IV cs.CV

    CAT-Net: A Cross-Slice Attention Transformer Model for Prostate Zonal Segmentation in MRI

    Authors: Alex Ling Yu Hung, Haoxin Zheng, Qi Miao, Steven S. Raman, Demetri Terzopoulos, Kyunghyun Sung

    Abstract: Prostate cancer is the second leading cause of cancer death among men in the United States. The diagnosis of prostate MRI often relies on the accurate prostate zonal segmentation. However, state-of-the-art automatic segmentation methods often fail to produce well-contained volumetric segmentation of the prostate zones since certain slices of prostate MRI, such as base and apex slices, are harder t…

    Submitted 16 June, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

  28. arXiv:2111.15438  [pdf, other]

    cs.CV eess.IV

    FMD-cGAN: Fast Motion Deblurring using Conditional Generative Adversarial Networks

    Authors: Jatin Kumar, Indra Deep Mastan, Shanmuganathan Raman

    Abstract: In this paper, we present a Fast Motion Deblurring-Conditional Generative Adversarial Network (FMD-cGAN) that helps in blind motion deblurring of a single image. FMD-cGAN delivers impressive structural similarity and visual appearance after deblurring an image. Like other deep neural network architectures, GANs also suffer from large model size (parameters) and computations. It is not easy to depl…

    Submitted 9 December, 2021; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: International Conference on Computer Vision and Image Processing 2021

    ACM Class: I.4.3; I.4.4

  29. arXiv:2110.11795  [pdf, other]

    eess.IV cs.CV

    HDRVideo-GAN: Deep Generative HDR Video Reconstruction

    Authors: Mrinal Anand, Nidhin Harilal, Chandan Kumar, Shanmuganathan Raman

    Abstract: High dynamic range (HDR) videos provide a more visually realistic experience than the standard low dynamic range (LDR) videos. Despite having significant progress in HDR imaging, it is still a challenging task to capture high-quality HDR video with a conventional off-the-shelf camera. Existing approaches rely entirely on using dense optical flow between the neighboring LDR sequences to reconstruct…

    Submitted 3 November, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: In Proceedings of 12th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP-21)

  30. arXiv:2110.03170  [pdf, other]

    cs.CV

    TreeGCN-ED: Encoding Point Cloud using a Tree-Structured Graph Network

    Authors: Prajwal Singh, Kaustubh Sadekar, Shanmuganathan Raman

    Abstract: Point cloud is one of the widely used techniques for representing and storing 3D geometric data. In the past several methods have been proposed for processing point clouds. Methods such as PointNet and FoldingNet have shown promising results for tasks like 3D shape classification and segmentation. This work proposes a tree-structured autoencoder framework to generate robust embeddings of point clo…

    Submitted 28 September, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

  31. arXiv:2110.01660  [pdf, other]

    cs.CV eess.IV

    HDR-cGAN: Single LDR to HDR Image Translation using Conditional GAN

    Authors: Prarabdh Raipurkar, Rohil Pal, Shanmuganathan Raman

    Abstract: The prime goal of digital imaging techniques is to reproduce the realistic appearance of a scene. Low Dynamic Range (LDR) cameras are incapable of representing the wide dynamic range of the real-world scene. The captured images turn out to be either too dark (underexposed) or too bright (overexposed). Specifically, saturation in overexposed regions makes the task of reconstructing a High Dynamic R…

    Submitted 15 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: Accepted in ICVGIP 2021

  32. arXiv:2107.14539  [pdf, other]

    cs.CV

    Shadow Art Revisited: A Differentiable Rendering Based Approach

    Authors: Kaustubh Sadekar, Ashish Tiwari, Shanmuganathan Raman

    Abstract: While recent learning based methods have been observed to be superior for several vision-related applications, their potential in generating artistic effects has not been explored much. One such interesting application is Shadow Art - a unique form of sculptural art where 2D shadows cast by a 3D sculpture produce artistic effects. In this work, we revisit shadow art using differentiable rendering…

    Submitted 30 July, 2021; originally announced July 2021.

    Comments: Accepted in WACV 2022

  33. arXiv:2107.12859  [pdf, other]

    cs.CV

    RGL-NET: A Recurrent Graph Learning framework for Progressive Part Assembly

    Authors: Abhinav Narayan Harish, Rajendra Nagar, Shanmuganathan Raman

    Abstract: Autonomous assembly of objects is an essential task in robotics and 3D computer vision. It has been studied extensively in robotics as a problem of motion planning, actuator control and obstacle avoidance. However, the task of developing a generalized framework for assembly robust to structural variants remains relatively unexplored. In this work, we tackle this problem using a recurrent graph lea…

    Submitted 30 July, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Accepted at the Winter Conference on Applications of Computer Vision (WACV 2022)

  34. arXiv:2106.15305  [pdf, other]

    cs.CV cs.LG

    Joint Learning of Portrait Intrinsic Decomposition and Relighting

    Authors: Mona Zehni, Shaona Ghosh, Krishna Sridhar, Sethu Raman

    Abstract: Inverse rendering is the problem of decomposing an image into its intrinsic components, i.e. albedo, normal and lighting. To solve this ill-posed problem from single image, state-of-the-art methods in shape from shading mostly resort to supervised training on all the components on either synthetic or real datasets. Here, we propose a new self-supervised training paradigm that 1) reduces the need f…

    Submitted 22 June, 2021; originally announced June 2021.

  35. arXiv:2106.09358  [pdf, other]

    cs.CV

    ShuffleBlock: Shuffle to Regularize Deep Convolutional Neural Networks

    Authors: Sudhakar Kumawat, Gagan Kanojia, Shanmuganathan Raman

    Abstract: Deep neural networks have enormous representational power which leads them to overfit on most datasets. Thus, regularizing them is important in order to reduce overfitting and enhance their generalization capabilities. Recently, channel shuffle operation has been introduced for mixing channels in group convolutions in resource efficient networks in order to reduce memory and computations. This pap…

    Submitted 17 June, 2021; originally announced June 2021.

  36. arXiv:2101.11674  [pdf, other]

    cs.CV

    LS-HDIB: A Large Scale Handwritten Document Image Binarization Dataset

    Authors: Kaustubh Sadekar, Ashish Tiwari, Prajwal Singh, Shanmuganathan Raman

    Abstract: Handwritten document image binarization is challenging due to high variability in the written content and complex background attributes such as page style, paper quality, stains, shadow gradients, and non-uniform illumination. While the traditional thresholding methods do not effectively generalize on such challenging real-world scenarios, deep learning-based methods have performed relatively well…

    Submitted 3 November, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

  37. arXiv:2101.06217  [pdf, other]

    cs.CV cs.LG

    APEX-Net: Automatic Plot Extractor Network

    Authors: Aalok Gangopadhyay, Prajwal Singh, Shanmuganathan Raman

    Abstract: Automatic extraction of raw data from 2D line plot images is a problem of great importance having many real-world applications. Several algorithms have been proposed for solving this problem. However, these algorithms involve a significant amount of human intervention. To minimize this intervention, we propose APEX-Net, a deep learning based framework with novel loss functions for solving the plot…

    Submitted 11 February, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

    Comments: https://sites.google.com/view/apexnetpaper/

  38. arXiv:2012.06498  [pdf, other]

    cs.CV

    DeepObjStyle: Deep Object-based Photo Style Transfer

    Authors: Indra Deep Mastan, Shanmuganathan Raman

    Abstract: One of the major challenges of style transfer is the appropriate image features supervision between the output image and the input (style and content) images. An efficient strategy would be to define an object map between the objects of the style and the content images. However, such a mapping is not well established when there are semantic objects of different types and numbers in the style and t…

    Submitted 11 December, 2020; originally announced December 2020.

  39. arXiv:2012.06469  [pdf, other]

    cs.CV

    DILIE: Deep Internal Learning for Image Enhancement

    Authors: Indra Deep Mastan, Shanmuganathan Raman

    Abstract: We consider the generic deep image enhancement problem where an input image is transformed into a perceptually better-looking image. Recent methods for image enhancement consider the problem by performing style transfer and image restoration. The methods mostly fall into two categories: training data-based and training data-independent (deep internal learning methods). We perform image enhancement…

    Submitted 11 December, 2020; originally announced December 2020.

  40. arXiv:2011.03712  [pdf, other]

    cs.CV

    DeepCFL: Deep Contextual Features Learning from a Single Image

    Authors: Indra Deep Mastan, Shanmuganathan Raman

    Abstract: Recently, there is a vast interest in developing image feature learning methods that are independent of the training data, such as deep image prior, InGAN, SinGAN, and DCIL. These methods are unsupervised and are used to perform low-level vision tasks such as image restoration, image editing, and image synthesis. In this work, we proposed a new training data-independent framework, called Deep Cont…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: IEEE Winter Conference on Applications of Computer Vision (WACV 2021), Waikoloa, US, Jan. 5-9, 2021

  41. arXiv:2011.03705  [pdf, other]

    cs.CV

    Blind Motion Deblurring through SinGAN Architecture

    Authors: Harshil Jain, Rohit Patil, Indra Deep Mastan, Shanmuganathan Raman

    Abstract: Blind motion deblurring involves reconstructing a sharp image from an observation that is blurry. It is a problem that is ill-posed and lies in the categories of image restoration problems. The training data-based methods for image deblurring mostly involve training models that take a lot of time. These models are data-hungry i.e., they require a lot of training data to generate satisfactory resul…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: Deep Internal Learning: Training with no prior examples. ECCV'2020 Workshop

  42. arXiv:2010.11649  [pdf, other]

    cs.CV

    Learning to Sort Image Sequences via Accumulated Temporal Differences

    Authors: Gagan Kanojia, Shanmuganathan Raman

    Abstract: Consider a set of n images of a scene with dynamic objects captured with a static or a handheld camera. Let the temporal order in which these images are captured be unknown. There can be n! possibilities for the temporal order in which these images could have been captured. In this work, we tackle the problem of temporally sequencing the unordered set of images of a dynamic scene captured with a h…

    Submitted 22 October, 2020; originally announced October 2020.

  43. Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition

    Authors: Sudhakar Kumawat, Manisha Verma, Yuta Nakashima, Shanmuganathan Raman

    Abstract: Conventional 3D convolutional neural networks (CNNs) are computationally expensive, memory intensive, prone to overfitting, and most importantly, there is a need to improve their feature learning capabilities. To address these issues, we propose spatio-temporal short term Fourier transform (STFT) blocks, a new class of convolutional blocks that can serve as an alternative to the 3D convolutional l…

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: Extended version of our CVPR 2019 work

  44. arXiv:2006.08003  [pdf, other]

    eess.IV cs.CV cs.LG

    CompressNet: Generative Compression at Extremely Low Bitrates

    Authors: Suraj Kiran Raman, Aditya Ramesh, Vijayakrishna Naganoor, Shubham Dash, Giridharan Kumaravelu, Honglak Lee

    Abstract: Compressing images at extremely low bitrates (< 0.1 bpp) has always been a challenging task since the quality of reconstruction significantly reduces due to the strong imposed constraint on the number of bits allocated for the compressed data. With the increasing need to transfer large amounts of images with limited bandwidth, compressing images to very low sizes is a crucial task. However, the ex…

    Submitted 14 June, 2020; originally announced June 2020.

  45. arXiv:2004.10362  [pdf, other]

    cs.CV

    Yoga-82: A New Dataset for Fine-grained Classification of Human Poses

    Authors: Manisha Verma, Sudhakar Kumawat, Yuta Nakashima, Shanmuganathan Raman

    Abstract: Human pose estimation is a well-known problem in computer vision to locate joint positions. Existing datasets for the learning of poses are observed to be not challenging enough in terms of pose diversity, object occlusion, and viewpoints. This makes the pose annotation process relatively simple and restricts the application of the models that have been trained on them. To handle more variety in h…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: Accepted CVPR Workshops 2020

  46. arXiv:2002.03165  [pdf, other]

    cs.GR cs.CV cs.LG eess.IV

    Deep No-reference Tone Mapped Image Quality Assessment

    Authors: Chandra Sekhar Ravuri, Rajesh Sureddi, Sathya Veera Reddy Dendi, Shanmuganathan Raman, Sumohana S. Channappayya

    Abstract: The process of rendering high dynamic range (HDR) images to be viewed on conventional displays is called tone mapping. However, tone mapping introduces distortions in the final image which may lead to visual displeasure. To quantify these distortions, we introduce a novel no-reference quality assessment technique for these tone mapped images. This technique is composed of two stages. In the first…

    Submitted 8 February, 2020; originally announced February 2020.

    Comments: 5 pages, 5 figures, 2 tables

  47. arXiv:2001.09912  [pdf, other]

    cs.CV eess.IV

    Depthwise-STFT based separable Convolutional Neural Networks

    Authors: Sudhakar Kumawat, Shanmuganathan Raman

    Abstract: In this paper, we propose a new convolutional layer called Depthwise-STFT Separable layer that can serve as an alternative to the standard depthwise separable convolutional layer. The construction of the proposed layer is inspired by the fact that the Fourier coefficients can accurately represent important features such as edges in an image. It utilizes the Fourier coefficients computed (channelwi…

    Submitted 27 January, 2020; originally announced January 2020.

    Comments: Accepted at ICASSP 2020

  48. arXiv:1912.11463  [pdf, other]

    cs.CV eess.IV

    FHDR: HDR Image Reconstruction from a Single LDR Image using Feedback Network

    Authors: Zeeshan Khan, Mukul Khanna, Shanmuganathan Raman

    Abstract: High dynamic range (HDR) image generation from a single exposure low dynamic range (LDR) image has been made possible due to the recent advances in Deep Learning. Various feed-forward Convolutional Neural Networks (CNNs) have been proposed for learning LDR to HDR representations. To better utilize the power of CNNs, we exploit the idea of feedback, where the initial low level features are guided b…

    Submitted 24 December, 2019; originally announced December 2019.

    Comments: 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP)

  49. arXiv:1912.05591  [pdf, other]

    cs.CV

    Simultaneous Detection and Removal of Dynamic Objects in Multi-view Images

    Authors: Gagan Kanojia, Shanmuganathan Raman

    Abstract: Consider a set of images of a scene consisting of moving objects captured using a hand-held camera. In this work, we propose an algorithm which takes this set of multi-view images as input, detects the dynamic objects present in the scene, and replaces them with the static regions which are being occluded by them. The proposed algorithm scans the reference image in the row-major order at the pixel…

    Submitted 11 December, 2019; originally announced December 2019.

    Comments: Accepted at WACV 2020

  50. arXiv:1912.04229  [pdf, other]

    eess.IV cs.CV

    DCIL: Deep Contextual Internal Learning for Image Restoration and Image Retargeting

    Authors: Indra Deep Mastan, Shanmuganathan Raman

    Abstract: Recently, there is a vast interest in developing methods which are independent of the training samples such as deep image prior, zero-shot learning, and internal learning. The methods above are based on the common goal of maximizing image features learning from a single image despite inherent technical diversity. In this work, we bridge the gap between the various unsupervised approaches above and…

    Submitted 9 December, 2019; originally announced December 2019.