Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation

Authors: Sara Pohland, Alvin Tan, Prabal Dutta, Claire Tomlin

Abstract: Reinforcement learning (RL) methods for social robot navigation show great success navigating robots through large crowds of people, but the performance of these learning-based methods tends to degrade in particularly challenging or unfamiliar situations due to the models' dependency on representative training data. To ensure human safety and comfort, it is critical that these algorithms handle uncommon cases appropriately, but the low frequency and wide diversity of such situations present a significant challenge for these data-driven methods. To overcome this challenge, we propose modifications to the learning process that encourage these RL policies to maintain additional caution in unfamiliar situations. Specifically, we improve the Socially Attentive Reinforcement Learning (SARL) policy by (1) modifying the training process to systematically introduce deviations into a pedestrian model, (2) updating the value network to estimate and utilize pedestrian-unpredictability features, and (3) implementing a reward function to learn an effective response to pedestrian unpredictability. Compared to the original SARL policy, our modified policy maintains similar navigation times and path lengths, while reducing the number of collisions by 82% and reducing the proportion of time spent in the pedestrians' personal space by up to 19 percentage points for the most difficult cases. We also describe how to apply these modifications to other RL policies and demonstrate that some key high-level behaviors of our approach transfer to a physical robot.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2406.18700 [pdf, other]

On Fourier analysis of sparse Boolean functions over certain Abelian groups

Authors: Sourav Chakraborty, Swarnalipa Datta, Pranjal Dutta, Arijit Ghosh, Swagato Sanyal

Abstract: Given an Abelian group G, a Boolean-valued function f: G -> {-1,+1} is said to be s-sparse if it has at most s-many non-zero Fourier coefficients over the domain G. In a seminal paper, Gopalan et al. proved "Granularity" for Fourier coefficients of Boolean-valued functions over Z_2^n, a result that has found many diverse applications in theoretical computer science and combinatorics. They also studied structural results for Boolean functions over Z_2^n which are approximately Fourier-sparse. In this work, we obtain structural results for approximately Fourier-sparse Boolean-valued functions over Abelian groups G of the form G := Z_{p_1}^{n_1} \times ... \times Z_{p_t}^{n_t}, for distinct primes p_i. We also obtain a lower bound of the form 1/(m^{2}s)^ceiling(phi(m)/2) on the absolute value of the smallest non-zero Fourier coefficient of an s-sparse function, where m=p_1 ... p_t, and phi(m)=(p_1-1) ... (p_t-1). We carefully apply probabilistic techniques from Gopalan et al. to obtain our structural results, and use some non-trivial results from algebraic number theory to get the lower bound. We construct a family of at most s-sparse Boolean functions over Z_p^n, where p > 2, for arbitrarily large enough s, where the minimum non-zero Fourier coefficient is 1/omega(n). The "Granularity" result of Gopalan et al. implies that the absolute values of non-zero Fourier coefficients of any s-sparse Boolean-valued function over Z_2^n are 1/O(s). So, our result shows that one cannot expect such a lower bound for general Abelian groups.
Using our new structural results on the Fourier coefficients of sparse functions, we design an efficient testing algorithm for Fourier-sparse Boolean functions that requires poly((ms)^phi(m),1/epsilon)-many queries. Further, we prove an Omega(sqrt{s}) lower bound on the query complexity of any adaptive sparsity testing algorithm.
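
As a small illustration of the s-sparse definition in the abstract above, the following sketch computes the Fourier coefficients of a Boolean-valued function by brute force and counts the non-zero ones. It covers only the familiar Z_2^n case, not the general Abelian groups studied in the paper. The helper names `fourier_coefficients` and `sparsity` are hypothetical and chosen for this example only.

```python
from fractions import Fraction
from itertools import product


def fourier_coefficients(f, n):
    """Brute-force Fourier transform over Z_2^n (illustrative only).

    f maps a tuple of n bits to -1 or +1. The coefficient at s is the
    average over all x of f(x) * (-1)^(<s, x> mod 2).
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for s in points:
        total = 0
        for x in points:
            parity = sum(si * xi for si, xi in zip(s, x)) % 2
            total += f(x) * (-1 if parity else 1)
        coeffs[s] = Fraction(total, 2 ** n)  # exact arithmetic, no rounding
    return coeffs


def sparsity(coeffs):
    """Number of non-zero Fourier coefficients."""
    return sum(1 for c in coeffs.values() if c != 0)


def parity_fn(x):
    # The parity character itself: exactly one non-zero coefficient.
    return -1 if (x[0] + x[1]) % 2 else 1


def and_fn(x):
    # -1 only when both bits are set: all four coefficients are non-zero.
    return -1 if (x[0] == 1 and x[1] == 1) else 1


parity_coeffs = fourier_coefficients(parity_fn, 2)
and_coeffs = fourier_coefficients(and_fn, 2)
```

Here the parity function is 1-sparse, while the AND-like function is 4-sparse with every coefficient of absolute value 1/2. The brute-force loop takes time exponential in n, so it is suitable only for tiny examples.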

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.16993 [pdf, other]

Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation?

Authors: Pallabi Dutta, Soham Bose, Swalpa Kumar Roy, Sushmita Mitra

Abstract: The development of efficient medical image segmentation has evolved from initial dependence on Convolutional Neural Networks (CNNs) to the present investigation of hybrid models that combine CNNs with Vision Transformers. Furthermore, there is an increasing focus on creating architectures that are both high-performing in medical image segmentation tasks and computationally efficient enough to be deployed on systems with limited resources. Although transformers have several advantages, such as capturing global dependencies in the input data, they face challenges such as high computational and memory complexity. This paper investigates the integration of CNNs and Vision Extended Long Short-Term Memory (Vision-xLSTM) models by introducing a novel approach called UVixLSTM. The Vision-xLSTM blocks capture temporal and global relationships within the patches extracted from the CNN feature maps. The convolutional feature reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. Our primary objective is to propose that Vision-xLSTM forms a reliable backbone for medical image segmentation tasks, offering excellent segmentation performance and reduced computational complexity. UVixLSTM exhibits superior performance compared to state-of-the-art networks on the publicly available Synapse dataset. Code is available at: https://github.com/duttapallabi2907/UVixLSTM

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2405.13449 [pdf, other]

Input Guided Multiple Deconstruction Single Reconstruction neural network models for Matrix Factorization

Authors: Prasun Dutta, Rajat K. De

Abstract: Referring back to the original text in the course of hierarchical learning is a common human trait that ensures the right direction of learning. The models developed in this paper, based on the concept of Non-negative Matrix Factorization (NMF), are inspired by this idea. They aim to deal with high-dimensional data by discovering its low rank approximation by determining a unique pair of factor matrices. The model named Input Guided Multiple Deconstruction Single Reconstruction neural network for Non-negative Matrix Factorization (IG-MDSR-NMF) ensures the non-negativity constraints of both factors. In contrast, Input Guided Multiple Deconstruction Single Reconstruction neural network for Relaxed Non-negative Matrix Factorization (IG-MDSR-RNMF) introduces a novel idea of factorization with only the basis matrix adhering to the non-negativity criteria. This relaxed version helps the model to learn a more enriched low dimensional embedding of the original data matrix. The competency of preserving the local structure of data in its low rank embedding produced by both the models has been appropriately verified. The superiority of low dimensional embedding over that of the original data, justifying the need for dimension reduction, has been established. The primacy of both the models has also been validated by comparing their performances separately with that of nine other established dimension reduction algorithms on five popular datasets.
Moreover, the computational complexity of the models and a convergence analysis have also been presented, testifying to the supremacy of the models.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 50 pages, 25 figures

arXiv:2401.07631 [pdf, other]

Fixed-parameter debordering of Waring rank

Authors: Pranjal Dutta, Fulvio Gesmundo, Christian Ikenmeyer, Gorav Jindal, Vladimir Lysikov

Abstract: Border complexity measures are defined via limits (or topological closures), so that any function which can be approximated arbitrarily closely by low complexity functions itself has low border complexity. Debordering is the task of proving an upper bound on some non-border complexity measure in terms of a border complexity measure, thus getting rid of limits. Debordering is at the heart of understanding the difference between Valiant's determinant vs permanent conjecture, and Mulmuley and Sohoni's variation which uses border determinantal complexity. The debordering of matrix multiplication tensors by Bini played a pivotal role in the development of efficient matrix multiplication algorithms. Consequently, debordering finds applications in both establishing computational complexity lower bounds and facilitating algorithm design. Currently, very few debordering results are known. In this work, we study the question of debordering the border Waring rank of polynomials. Waring and border Waring rank are very well studied measures in the context of invariant theory, algebraic geometry, and matrix multiplication algorithms. For the first time, we obtain a Waring rank upper bound that is exponential in the border Waring rank and only linear in the degree. All previously known results were exponential in the degree. For polynomials with constant border Waring rank, our results imply an upper bound on the Waring rank linear in degree, which previously was only known for polynomials with border Waring rank at most 5.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 22 pages; accepted at STACS 2024; this is an edited part of the preprint arXiv:2211.07055

MSC Class: 68Q99 ACM Class: F.1.3

arXiv:2401.01587 [pdf, other]

doi 10.1007/978-3-031-48879-5

Real-Time Human Fall Detection using a Lightweight Pose Estimation Technique

Authors: Ekram Alam, Abu Sufian, Paramartha Dutta, Marco Leo

Abstract: The elderly population is increasing rapidly around the world. There are not enough caretakers for them. Use of AI-based in-home medical care systems is gaining momentum due to this. Human fall detection is one of the most important tasks of medical care systems for elderly people. Human fall is a common problem among elderly people. Detection of a fall and providing medical help as early as possible is very important to reduce any further complexity. The chances of death and other medical complications can be reduced by detecting and providing medical help as early as possible after the fall. There are many state-of-the-art fall detection techniques available these days, but the majority of them need very high computing power. In this paper, we propose a lightweight and fast human fall detection system using pose estimation. We used 'Movenet' for human joint key-point extraction. Our proposed method can work in real time on any low-computing device with any basic camera. All computation can be processed locally, so there is no problem of privacy of the subject. We used two datasets, 'GMDCSA' and 'URFD', for the experiment. We obtained sensitivity values of 0.9375 and 0.9167 for the datasets 'GMDCSA' and 'URFD', respectively. The source code and the dataset GMDCSA of our work are available online to access.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2311.17019 [pdf, other]

Homogeneous Algebraic Complexity Theory and Algebraic Formulas

Authors: Pranjal Dutta, Fulvio Gesmundo, Christian Ikenmeyer, Gorav Jindal, Vladimir Lysikov

Abstract: We study algebraic complexity classes and their complete polynomials under \emph{homogeneous linear} projections, not just under the usual affine linear projections that were originally introduced by Valiant in 1979. These reductions are weaker yet more natural from a geometric complexity theory (GCT) standpoint, because the corresponding orbit closure formulations do not require the padding of polynomials. We give the \emph{first} complete polynomials for VF, the class of sequences of polynomials that admit small algebraic formulas, under homogeneous linear projections: The sum of the entries of the non-commutative elementary symmetric polynomial in 3 by 3 matrices of homogeneous linear forms. Even simpler variants of the elementary symmetric polynomial are hard for the topological closure of a large subclass of VF: the sum of the entries of the non-commutative elementary symmetric polynomial in 2 by 2 matrices of homogeneous linear forms, and homogeneous variants of the continuant polynomial (Bringmann, Ikenmeyer, Zuiddam, JACM '18). This requires a careful study of circuits with arity-3 product gates.

Submitted 28 November, 2023; originally announced November 2023.

Comments: This is edited part of preprint arXiv:2211.07055

MSC Class: 68Qxx ACM Class: F.1.3

arXiv:2309.17009 [pdf, other]

Deep Representation Learning for Prediction of Temporal Event Sets in the Continuous Time Domain

Authors: Parag Dutta, Kawin Mayilvaghanan, Pratyaksha Sinha, Ambedkar Dukkipati

Abstract: Temporal Point Processes (TPP) play an important role in predicting or forecasting events. Although these problems have been studied extensively, predicting multiple simultaneously occurring events can be challenging. For instance, more often than not, a patient gets admitted to a hospital with multiple conditions at a time. Similarly, people buy more than one stock, and multiple news stories break at the same time. Moreover, these events do not occur at discrete time intervals, and forecasting event sets in the continuous time domain remains an open problem. Naive approaches for extending the existing TPP models for solving this problem lead to dealing with an exponentially large number of events or ignoring set dependencies among events. In this work, we propose a scalable and efficient approach based on TPPs to solve this problem. Our proposed approach incorporates contextual event embeddings, temporal information, and domain features to model the temporal event sets. We demonstrate the effectiveness of our approach through extensive experiments on multiple datasets, showing that our model outperforms existing methods in terms of prediction metrics and computational efficiency. To the best of our knowledge, this is the first work that solves the problem of predicting event set intensities in the continuous time domain by using TPPs.

Submitted 29 September, 2023; originally announced September 2023.

Comments: Accepted in ACML 2023 - Conference Track (Long Paper)

arXiv:2309.07449 [pdf]

Rate-Induced Transitions in Networked Complex Adaptive Systems: Exploring Dynamics and Management Implications Across Ecological, Social, and Socioecological Systems

Authors: Vítor V. Vasconcelos, Flávia M. D. Marquitti, Theresa Ong, Lisa C. McManus, Marcus Aguiar, Amanda B. Campos, Partha S. Dutta, Kristen Jovanelly, Victoria Junquera, Jude Kong, Elisabeth H. Krueger, Simon A. Levin, Wenying Liao, Mingzhen Lu, Dhruv Mittal, Mercedes Pascual, Flávio L. Pinheiro, Juan Rocha, Fernando P. Santos, Peter Sloot, Chenyang Su, Benton Taylor, Eden Tekwa, Sjoerd Terpstra, et al. (5 additional authors not shown)

Abstract: Complex adaptive systems (CASs), from ecosystems to economies, are open systems and inherently dependent on external conditions. While a system can transition from one state to another based on the magnitude of change in external conditions, the rate of change -- irrespective of magnitude -- may also lead to system state changes due to a phenomenon known as a rate-induced transition (RIT). This study presents a novel framework that captures RITs in CASs through a local model and a network extension where each node contributes to the structural adaptability of others. Our findings reveal how RITs occur at a critical environmental change rate, with lower-degree nodes tipping first due to fewer connections and reduced adaptive capacity. High-degree nodes tip later as their adaptability sources (lower-degree nodes) collapse. This pattern persists across various network structures. Our study calls for an extended perspective when managing CASs, emphasizing the need to focus not only on thresholds of external conditions but also the rate at which those conditions change, particularly in the context of the collapse of surrounding systems that contribute to the focal system's resilience. Our analytical method opens a path to designing management policies that mitigate RIT impacts and enhance resilience in ecological, social, and socioecological systems. These policies could include controlling environmental change rates, fostering system adaptability, implementing adaptive management strategies, and building capacity and knowledge exchange.
Our study contributes to the understanding of RIT dynamics and informs effective management strategies for complex adaptive systems in the face of rapid environmental change.

Submitted 14 September, 2023; originally announced September 2023.

Comments: 25 pages, 4 figures, 1 box, supplementary information

MSC Class: 37G; 37N; 91B; 91C; 91D; 91E; 92D; 92D25; 92D40; 92F; 93A; 93A14; 93A16 ACM Class: I.6.3; I.6.m; J.3; J.4; J.m; K.4.2

arXiv:2306.15006 [pdf, other]

DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome

Authors: Zhihan Zhou, Yanrong Ji, Weijian Li, Pratik Dutta, Ramana Davuluri, Han Liu

Abstract: Decoding the linguistic intricacies of the genome is a crucial problem in biology, and pre-trained foundational models such as DNABERT and Nucleotide Transformer have made significant strides in this area. Existing works have largely hinged on k-mer, fixed-length permutations of A, T, C, and G, as the token of the genome language due to its simplicity. However, we argue that the computation and sample inefficiencies introduced by k-mer tokenization are primary obstacles in developing large genome foundational models. We provide conceptual and empirical insights into genome tokenization, building on which we propose to replace k-mer tokenization with Byte Pair Encoding (BPE), a statistics-based data compression algorithm that constructs tokens by iteratively merging the most frequent co-occurring genome segment in the corpus. We demonstrate that BPE not only overcomes the limitations of k-mer tokenization but also benefits from the computational efficiency of non-overlapping tokenization. Based on these insights, we introduce DNABERT-2, a refined genome foundation model that adapts an efficient tokenizer and employs multiple strategies to overcome input length constraints, reduce time and memory expenditure, and enhance model capability. Furthermore, we identify the absence of a comprehensive and standardized benchmark for genome understanding as another significant impediment to fair comparative analysis.
In response, we propose the Genome Understanding Evaluation (GUE), a comprehensive multi-species genome classification dataset that amalgamates $36$ distinct datasets across $9$ tasks, with input lengths ranging from $70$ to $10000$. Through comprehensive experiments on the GUE benchmark, we demonstrate that DNABERT-2 achieves comparable performance to the state-of-the-art model with $21 \times$ fewer parameters and approximately $92 \times$ less GPU time in pre-training.
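
The iterative merging that the abstract above attributes to Byte Pair Encoding can be sketched on toy DNA strings as follows. This is a minimal textbook-style illustration, not the DNABERT-2 tokenizer: the function names `learn_bpe` and `merge_pair`, the toy sequences, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe(sequences, num_merges):
    """Learn up to `num_merges` merge rules from a list of DNA strings."""
    # Start from single-nucleotide tokens.
    corpus = [list(seq) for seq in sequences]
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs across the whole corpus.
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        # Pick the most frequent pair; ties are broken lexicographically
        # (an arbitrary choice made here for determinism).
        best_pair, _ = max(pair_counts.items(), key=lambda kv: (kv[1], kv[0]))
        merges.append(best_pair)
        corpus = [merge_pair(tokens, best_pair) for tokens in corpus]
    return merges, corpus


merges, tokenized = learn_bpe(["ATATAT", "ATGC"], num_merges=1)
```

With a single merge, the pair ("A", "T") is the most frequent, so "ATATAT" becomes three "AT" tokens. Unlike overlapping k-mer windows, the resulting tokens do not overlap, and concatenating them recovers the original sequence.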

Submitted 18 March, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted by ICLR 2024

arXiv:2304.11325 [pdf, ps, other]

doi 10.4230/LIPIcs.CCC.2021.11

Deterministic identity testing paradigms for bounded top-fanin depth-4 circuits

Authors: Pranjal Dutta, Prateek Dwivedi, Nitin Saxena

Abstract: Polynomial Identity Testing (PIT) is a fundamental computational problem. The famous depth-$4$ reduction result by Agrawal and Vinay (FOCS 2008) has made PIT for depth-$4$ circuits an enticing pursuit. A restricted depth-$4$ circuit computing an $n$-variate degree-$d$ polynomial of the form $\sum_{i = 1}^{k} \prod_{j} g_{ij}$, where $\deg g_{ij} \leq δ$, is called a $Σ^{[k]}ΠΣΠ^{[δ]}$ circuit. On further restricting $g_{ij}$ to be a sum of univariates, we obtain $Σ^{[k]}ΠΣ\wedge$ circuits. The largely open special cases of $Σ^{[k]}ΠΣΠ^{[δ]}$ for constant $k$ and $δ$, and $Σ^{[k]}ΠΣ\wedge$, have been a source of many great ideas in the last two decades. For example, depth-$3$ ideas of Dvir and Shpilka (STOC 2005), Kayal and Saxena (CCC 2006), and Saxena and Seshadhri (FOCS 2010 and STOC 2011). Further, depth-$4$ ideas of Beecken, Mittmann and Saxena (ICALP 2011), Saha, Saxena and Saptharishi (Comput.Compl. 2013), Forbes (FOCS 2015), and Kumar and Saraf (CCC 2016). Additionally, geometric Sylvester-Gallai ideas of Kayal and Saraf (FOCS 2009), Shpilka (STOC 2019), and Peleg and Shpilka (CCC 2020, STOC 2021). Very recently, a subexponential-time blackbox PIT algorithm for constant-depth circuits was obtained via the lower bound breakthrough of Limaye, Srinivasan, Tavenas (FOCS 2021). We solve two of the basic underlying open problems in this work. We give the first polynomial-time PIT for $Σ^{[k]}ΠΣ\wedge$. We also give the first quasipolynomial time blackbox PIT for both $Σ^{[k]}ΠΣ\wedge$ and $Σ^{[k]}ΠΣΠ^{[δ]}$.
A key technical ingredient in all the three algorithms is how the logarithmic derivative, and its power-series, modify the top $Π$-gate to $\wedge$.

Submitted 22 April, 2023; originally announced April 2023.

Comments: A preliminary version appeared in 36th Computational Complexity Conference (CCC), 2021

ACM Class: F.2.1

arXiv:2301.06961 [pdf, other]

Composite Deep Network with Feature Weighting for Improved Delineation of COVID Infection in Lung CT

Authors: Pallabi Dutta, Sushmita Mitra

Abstract: An early effective screening and grading of COVID-19 has become imperative towards optimizing the limited available resources of the medical facilities. An automated segmentation of the infected volumes in lung CT is expected to significantly aid in the diagnosis and care of patients. However, an accurate demarcation of lesions remains problematic due to their irregular structure and location(s) within the lung. A novel deep learning architecture, Composite Deep network with Feature Weighting (CDNetFW), is proposed for efficient delineation of infected regions from lung CT images. Initially, a coarser segmentation is performed directly at shallower levels, thereby facilitating discovery of robust and discriminatory characteristics in the hidden layers. The novel feature weighting module helps prioritise relevant feature maps to be probed, along with those regions containing crucial information within these maps. This is followed by estimating the severity of the disease. The deep network CDNetFW has been shown to outperform several state-of-the-art architectures in the COVID-19 lesion segmentation task, as measured by experimental results on CT slices from publicly available datasets, especially when it comes to defining structures involving complex geometries.

Submitted 17 February, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

arXiv:2211.07357 [pdf, other]

Controlling Commercial Cooling Systems Using Reinforcement Learning

Authors: Jerry Luo, Cosmin Paduraru, Octavian Voicu, Yuri Chervonyi, Scott Munns, Jerry Li, Crystal Qian, Praneet Dutta, Jared Quincy Davis, Ningjia Wu, Xingwei Yang, Chu-Ming Chang, Ted Li, Rob Rose, Mingyan Fan, Hootan Nakhost, Tinglin Liu, Brian Kirkman, Frank Altamura, Lee Cline, Patrick Tonker, Joel Gouker, Dave Uden, Warren Buddy Bryan, Jason Law, et al. (11 additional authors not shown)

Abstract: This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.

Submitted 14 December, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: 27 pages, 11 figures

arXiv:2211.07055 [pdf, ps, other]

De-bordering and Geometric Complexity Theory for Waring rank and related models

Authors: Pranjal Dutta, Fulvio Gesmundo, Christian Ikenmeyer, Gorav Jindal, Vladimir Lysikov

Abstract: De-bordering is the task of proving that a border complexity measure is bounded from below, by a non-border complexity measure. This task is at the heart of understanding the difference between Valiant's determinant vs permanent conjecture, and Mulmuley and Sohoni's Geometric Complexity Theory (GCT) approach to settle the P \neq NP conjecture. Currently, very few de-bordering results are known. In this work, we study the question of de-bordering the border Waring rank of polynomials. Waring and border Waring rank are very well studied measures, in the context of invariant theory, algebraic geometry and matrix multiplication algorithms. For the first time, we obtain a Waring rank upper bound that is exponential in the border Waring rank and only *linear* in the degree. All previous results were known to be exponential in the degree. According to Kumar's recent surprising result (ToCT'20), a small border Waring rank implies that the polynomial can be approximated as a sum of a constant and a small product of linear polynomials. We prove the converse of Kumar's result, and in this way we de-border Kumar's complexity, and obtain a new formulation of border Waring rank, up to a factor of the degree. We phrase this new formulation as the orbit closure problem of the product-plus-power polynomial, and we successfully de-border this orbit closure. We fully implement the GCT approach against the power sum, and we generalize the ideas of Ikenmeyer-Kandasamy (STOC'20) to this new orbit closure.
In this way, we obtain new multiplicity obstructions that are constructed from just the symmetries of the points and representation theoretic branching rules, rather than explicit multilinear computations. Furthermore, we realize that the generalization of our converse of Kumar's theorem to square matrices gives a homogeneous formulation of Ben-Or and Cleve (SICOMP'92). This results ...

Submitted 13 April, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

MSC Class: 68W30; 14-XX; 05E10 ACM Class: F.1.3

arXiv:2210.15722 [pdf, other]

PatchRot: A Self-Supervised Technique for Training Vision Transformers

Authors: Sachin Chhabra, Prabal Bijoy Dutta, Hemanth Venkateswara, Baoxin Li

Abstract: Vision transformers require a huge amount of labeled data to outperform convolutional neural networks. However, labeling a huge dataset is a very expensive process. Self-supervised learning techniques alleviate this problem by learning features similar to supervised learning in an unsupervised way. In this paper, we propose a self-supervised technique PatchRot that is crafted for vision transformers. PatchRot rotates images and image patches and trains the network to predict the rotation angles. The network learns to extract both global and local features from an image. Our extensive experiments on different datasets showcase PatchRot training learns rich features which outperform supervised learning and compared baseline.

Submitted 27 October, 2022; originally announced October 2022.

Comments: NeurIPS Workshop on Vision Transformers: Theory and Applications (VTTA)
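
The PatchRot abstract above describes rotating images and image patches and training a network to predict the rotation angles. The following is only a minimal sketch of how such patch-level training targets could be generated, not the authors' implementation. It assumes square images stored as lists of lists, a side length divisible by the patch size, and rotations restricted to multiples of 90 degrees; the helper names `rot90` and `rotate_patches` are hypothetical.

```python
import random


def rot90(grid, k):
    """Rotate a square list-of-lists counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid


def rotate_patches(image, patch, rng):
    """Rotate every patch independently and return (rotated image, labels).

    labels[i][j] is the number of quarter turns applied to patch (i, j);
    a network would be trained to predict these labels (assumption).
    """
    n = len(image)
    out = [row[:] for row in image]
    labels = []
    for top in range(0, n, patch):
        label_row = []
        for left in range(0, n, patch):
            k = rng.randrange(4)  # 0, 1, 2 or 3 quarter turns
            block = [row[left:left + patch] for row in image[top:top + patch]]
            for offset, row in enumerate(rot90(block, k)):
                out[top + offset][left:left + patch] = row
            label_row.append(k)
        labels.append(label_row)
    return out, labels


# Toy 4x4 "image" split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
rotated, labels = rotate_patches(image, 2, random.Random(0))
```

Rotating the whole image before calling `rotate_patches` would give an image-level label in the same way, which is one plausible reading of how global and local targets could be combined.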

arXiv:2210.15571 [pdf, other]

Full-scale Deeply Supervised Attention Network for Segmenting COVID-19 Lesions

Authors: Pallabi Dutta, Sushmita Mitra

Abstract: Automated delineation of COVID-19 lesions from lung CT scans aids the diagnosis and prognosis for patients. The asymmetric shapes and positioning of the infected regions make the task extremely difficult. Capturing information at multiple scales will assist in deciphering features, at global and local levels, to encompass lesions of variable size and texture. We introduce the Full-scale Deeply Supervised Attention Network (FuDSA-Net), for efficient segmentation of corona-infected lung areas in CT images. The model considers activation responses from all levels of the encoding path, encompassing multi-scalar features acquired at different levels of the network. This helps segment target regions (lesions) of varying shape, size and contrast. Incorporation of the entire gamut of multi-scalar characteristics into the novel attention mechanism helps prioritize the selection of activation responses and locations containing useful information. Determining robust and discriminatory features along the decoder path is facilitated with deep supervision. Connections in the decoder arm are remodeled to handle the issue of vanishing gradient. As observed from the experimental results, FuDSA-Net surpasses other state-of-the-art architectures; especially, when it comes to characterizing complicated geometries of the lesions.

Submitted 27 October, 2022; originally announced October 2022.

arXiv:2209.08112 [pdf, other]

Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning

Authors: William Wong, Praneet Dutta, Octavian Voicu, Yuri Chervonyi, Cosmin Paduraru, Jerry Luo

Abstract: Reinforcement learning (RL) techniques have been developed to optimize industrial cooling systems, offering substantial energy savings compared to traditional heuristic policies. A major challenge in industrial control involves learning behaviors that are feasible in the real world due to machinery constraints. For example, certain actions can only be executed every few hours while other actions can be taken more frequently. Without extensive reward engineering and experimentation, an RL agent may not learn realistic operation of machinery. To address this, we use hierarchical reinforcement learning with multiple agents that control subsets of actions according to their operation time scales. Our hierarchical approach achieves energy savings over existing baselines while maintaining constraints such as operating chillers within safe bounds in a simulated HVAC control environment.

Submitted 16 September, 2022; originally announced September 2022.

Comments: 11 pages, 5 figures

arXiv:2207.13131 [pdf, other]

Semi-analytical Industrial Cooling System Model for Reinforcement Learning

Authors: Yuri Chervonyi, Praneet Dutta, Piotr Trochim, Octavian Voicu, Cosmin Paduraru, Crystal Qian, Emre Karagozler, Jared Quincy Davis, Richard Chippendale, Gautam Bajaj, Sims Witherspoon, Jerry Luo

Abstract: We present a hybrid industrial cooling system model that embeds analytical solutions within a multi-physics simulation. This model is designed for reinforcement learning (RL) applications and balances simplicity with simulation fidelity and interpretability. The model's fidelity is evaluated against real world data from a large scale cooling system. This is followed by a case study illustrating how the model can be used for RL research. For this, we develop an industrial task suite that allows specifying different problem settings and levels of complexity, and use it to evaluate the performance of different RL algorithms.

Submitted 26 July, 2022; originally announced July 2022.

Comments: 27 pages, 13 figures

arXiv:2207.10952 [pdf, other]

Vision-based Human Fall Detection Systems using Deep Learning: A Review

Authors: Ekram Alam, Abu Sufian, Paramartha Dutta, Marco Leo

Abstract: Human fall is one of the very critical health issues, especially for elders and disabled people living alone. The number of elder populations is increasing steadily worldwide. Therefore, human fall detection is becoming an effective technique for assistive living for those people. For assistive living, deep learning and computer vision have been used largely. In this review article, we discuss deep learning (DL)-based state-of-the-art non-intrusive (vision-based) fall detection techniques. We also present a survey on fall detection benchmark datasets. For a clear understanding, we briefly discuss different metrics which are used to evaluate the performance of the fall detection systems. This article also gives a future direction on vision-based human fall detection techniques.

Submitted 22 July, 2022; originally announced July 2022.

arXiv:2207.07697 [pdf, other]

POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging

Authors: Shishir G. Patil, Paras Jain, Prabal Dutta, Ion Stoica, Joseph E. Gonzalez

Abstract: Fine-tuning models on edge devices like mobile phones would enable privacy-preserving personalization over sensitive data. However, edge training has historically been limited to relatively small models with simple architectures because training is both memory and energy intensive. We present POET, an algorithm to enable training large neural networks on memory-scarce battery-operated edge devices. POET jointly optimizes the integrated search spaces of rematerialization and paging, two algorithms to reduce the memory consumption of backpropagation. Given a memory budget and a run-time constraint, we formulate a mixed-integer linear program (MILP) for energy-optimal training. Our approach enables training significantly larger models on embedded devices while reducing energy consumption and without modifying the mathematical correctness of backpropagation. We demonstrate that it is possible to fine-tune both ResNet-18 and BERT within the memory constraints of a Cortex-M class embedded device while outperforming current edge training methods in energy efficiency. POET is an open-source project available at https://github.com/ShishirPatil/poet

Submitted 15 July, 2022; originally announced July 2022.

Comments: Proceedings of the 39th International Conference on Machine Learning 2022 (ICML 2022)

arXiv:2207.03996 [pdf, other]

doi 10.1109/ICREST51555.2021.9331155

Optimization of Temperature and Relative Humidity in an Automatic Egg Incubator Using Mamdani Interference System

Authors: Pramit Dutta, Nafisa Anjum

Abstract: Temperature and humidity are two of the rudimentary factors that must be controlled during egg incubation. Improper temperature and humidity levels during the incubation period often result in unwanted conditions. This paper proposes the design of an efficient Mamdani fuzzy interference system instead of the widely used Takagi-Sugeno system in this field for controlling the temperature and humidity levels of an egg incubator. Though the optimum incubation temperature and humidity levels used here are that of chicken egg, the proposed methodology is applicable to other avian species as well. The input functions have been used here as per estimated values for safe hatching using Mamdani whereas defuzzification method, COA, has been applied for output. From the model output, a stabilized heat from temperature level and fan speed to control the humidity level of an egg incubator can be obtained. This maximizes the hatching rate of healthy chicks under any conditions in the field.

Submitted 17 June, 2022; originally announced July 2022.

Comments: 5 pages, 13 figures, 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), DHAKA, Bangladesh

Journal ref: 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), DHAKA, Bangladesh, 2021, pp. 12-16

arXiv:2206.08565 [pdf, other]

doi 10.1109/IMCOM53663.2022.9721789

Identifying Counterfeit Products using Blockchain Technology in Supply Chain System

Authors: Nafisa Anjum, Pramit Dutta

Abstract: With the advent of globalization and the evergrowing rate of technology, the volume of production as well as ease of procuring counterfeit goods has become unprecedented. Be it food, drug or luxury items, all kinds of industrial manufacturers and distributors are now seeking greater transparency in supply chain operations with a view to deter counterfeiting. This paper introduces a decentralized Blockchain based application system (DApp) with a view to identifying counterfeit products in the supply chain system. With the rapid rise of Blockchain technology, it has become known that data recorded within Blockchain is immutable and secure. Hence, the proposed project here uses this concept to handle the transfer of ownership of products. A consumer can verify the product distribution and ownership information scanning a Quick Response (QR) code generated by the DApp for each product linked to the Blockchain.

Submitted 17 June, 2022; originally announced June 2022.

Comments: 5 pages, 4 figures, 16th International Conference on Ubiquitous Information Management and Communication (IMCOM)

Journal ref: 16th International Conference on Ubiquitous Information Management and Communication (IMCOM), 2022, pp. 1-5

arXiv:2206.08557 [pdf, other]

doi 10.1109/ICREST51555.2021.9331029

COVID-19 Detection using Transfer Learning with Convolutional Neural Network

Authors: Pramit Dutta, Tanny Roy, Nafisa Anjum

Abstract: The Novel Coronavirus disease 2019 (COVID-19) is a fatal infectious disease, first recognized in December 2019 in Wuhan, Hubei, China, and has gone on an epidemic situation. Under these circumstances, it became more important to detect COVID-19 in infected people. Nowadays, the testing kits are gradually lessening in number compared to the number of infected population. Under recent prevailing conditions, the diagnosis of lung disease by analyzing chest CT (Computed Tomography) images has become an important tool for both diagnosis and prophecy of COVID-19 patients. In this study, a Transfer learning strategy (CNN) for detecting COVID-19 infection from CT images has been proposed. In the proposed model, a multilayer Convolutional neural network (CNN) with Transfer learning model Inception V3 has been designed. Similar to CNN, it uses convolution and pooling to extract features, but this transfer learning model contains weights of dataset Imagenet. Thus it can detect features very effectively which gives it an upper hand for achieving better accuracy.

Submitted 17 June, 2022; originally announced June 2022.

Comments: 4 pages, 4 figures, 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), DHAKA, Bangladesh

Journal ref: 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), DHAKA, Bangladesh, 2021, pp. 429-432

arXiv:2206.08543 [pdf]

Multi-Classification of Brain Tumor Images Using Transfer Learning Based Deep Neural Network

Authors: Pramit Dutta, Khaleda Akhter Sathi, Md. Saiful Islam

Abstract: In recent advancement towards computer based diagnostics system, the classification of brain tumor images is a challenging task. This paper mainly focuses on elevating the classification accuracy of brain tumor images with transfer learning based deep neural network. The classification approach is started with the image augmentation operation including rotation, zoom, horizontal flip, width shift, height shift, and shear to increase the diversity in image datasets. Then the general features of the input brain tumor images are extracted based on a pre-trained transfer learning method comprised of Inception-v3. Finally, the deep neural network with 4 customized layers is employed for classifying the brain tumors in most frequent brain tumor types as meningioma, glioma, and pituitary. The proposed model acquires an effective performance with an overall accuracy of 96.25% which is much improved than some existing multi-classification methods. Whereas, the fine-tuning of hyper-parameters and inclusion of customized DNN with the Inception-v3 model results in an improvement of the classification accuracy.

Submitted 17 June, 2022; originally announced June 2022.

Comments: 7 pages, 4 figures, 2 tables, International Virtual Conference on ARTIFICIAL INTELLIGENCE FOR SMART COMMUNITY, Malaysia

Journal ref: Conference proceedings © 2023 International Conference on Artificial Intelligence for Smart Community

arXiv:2205.15667 [pdf, other]

ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation

Authors: Pramit Dutta, Ganesh Sistu, Senthil Yogamani, Edgar Galván, John McDonald

Abstract: Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird Eye View (BEV) map, providing a panoptic representation, is a commonly used approach that provides a simplified 2D representation of the vehicle surroundings with accurate semantic level segmentation for many downstream tasks. Current state-of-the-art approaches to generate BEV-maps employ a Convolutional Neural Network (CNN) backbone to create feature-maps which are passed through a spatial transformer to project the derived features onto the BEV coordinate frame. In this paper, we evaluate the use of vision transformers (ViT) as a backbone architecture to generate BEV maps. Our network architecture, ViT-BEVSeg, employs standard vision transformers to generate a multi-scale representation of the input image. The resulting representation is then provided as an input to a spatial transformer decoder module which outputs segmentation maps in the BEV grid. We evaluate our approach on the nuScenes dataset demonstrating a considerable improvement in the performance relative to state-of-the-art approaches.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted for 2022 IEEE World Congress on Computational Intelligence (Track: IJCNN)

arXiv:2205.12494 [pdf, other]

A Multi-domain Magneto Tunnel Junction for Racetrack Nanowire Strips

Authors: Prayash Dutta, Albert Lee, Kang L. Wang, Alex K. Jones, Sanjukta Bhanja

Abstract: Domain-wall memory (DWM) has SRAM class access performance, low energy, high endurance, high density, and CMOS compatibility. Recently, shift reliability and processing-using-memory (PuM) proposals developed a need to count the number of parallel or anti-parallel domains in a portion of the DWM nanowire. In this paper we propose a multi-domain magneto-tunnel junction (MTJ) that can detect different resistance levels as a function of the number of parallel or anti-parallel domains. Using detailed micromagnetic simulation with LLG, we demonstrate the multi-domain MTJ, study the benefit of its macro-size on resilience to process variation and present a macro-model for scaling the size of the multi-domain MTJ. Our results indicate scalability to seven-domains while maintaining a 16.3mV sense margin.

Submitted 25 May, 2022; originally announced May 2022.

Comments: This paper is under review for possible publication by the IEEE

arXiv:2204.06389 [pdf, other]

CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection

Authors: Souvic Chakraborty, Parag Dutta, Sumegh Roychowdhury, Animesh Mukherjee

Abstract: The last decade has witnessed a surge in the interaction of people through social networking platforms. While there are several positive aspects of these social platforms, the proliferation has led them to become the breeding ground for cyber-bullying and hate speech. Recent advances in NLP have often been used to mitigate the spread of such hateful content. Since the task of hate speech detection is usually applicable in the context of social networks, we introduce CRUSH, a framework for hate speech detection using user-anchored self-supervision and contextual regularization. Our proposed approach secures ~ 1-12% improvement in test set metrics over best performing previous approaches on two types of tasks and multiple popular English social media datasets.

Submitted 4 May, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: Accepted in NAACL HLT 2022 (Long Paper)

ACM Class: I.2.7; J.4

arXiv:2112.11020 [pdf, ps, other]

Efficient reductions and algorithms for variants of Subset Sum

Authors: Pranjal Dutta, Mahesh Sreekumar Rajasree

Abstract: Given $(a_1, \dots, a_n, t) \in \mathbb{Z}_{\geq 0}^{n + 1}$, the Subset Sum problem ($\mathsf{SSUM}$) is to decide whether there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. There is a close variant of the $\mathsf{SSUM}$, called $\mathsf{Subset~Product}$. Given positive integers $a_1, ..., a_n$ and a target integer $t$, the $\mathsf{Subset~Product}$ problem asks to determine whether there exists a subset $S \subseteq [n]$ such that $\prod_{i \in S} a_i=t$. There is a pseudopolynomial time dynamic programming algorithm, due to Bellman (1957) which solves the $\mathsf{SSUM}$ and $\mathsf{Subset~Product}$ in $O(nt)$ time and $O(t)$ space. In the first part, we present {\em search} algorithms for variants of the Subset Sum problem. Our algorithms are parameterized by $k$, which is a given upper bound on the number of realisable sets (i.e.,~number of solutions, summing exactly $t$). We show that $\mathsf{SSUM}$ with a unique solution is already NP-hard, under randomized reduction. This makes the regime of parametrized algorithms, in terms of $k$, very interesting. Subsequently, we present an $\tilde{O}(k\cdot (n+t))$ time deterministic algorithm, which finds the hamming weight of all the realisable sets for a subset sum instance. We also give a poly$(knt)$-time and $O(\log(knt))$-space deterministic algorithm that finds all the realisable sets for a subset sum instance. In the latter part, we present a simple and elegant randomized $\tilde{O}(n + t)$ time algorithm for $\mathsf{Subset~Product}$. Moreover, we also present a poly$(nt)$ time and $O(\log^2 (nt))$ space deterministic algorithm for the same. We study these problems in the unbounded setting as well. Our algorithms use multivariate FFT, power series and number-theoretic techniques, introduced by Jin and Wu (SOSA'19) and Kane (2010).

Submitted 1 June, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

Comments: A part of this work has been published in the proceedings of CALDAM 2022. We have improved running-time of some algorithms from the previous version of the draft

MSC Class: 68W20; 68W30; 68W40 ACM Class: F.2.1; F.2.2
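
For orientation, the pseudopolynomial dynamic program that the abstract above attributes to Bellman (1957) can be sketched as follows. This is the textbook $O(nt)$-time, $O(t)$-space decision procedure for Subset Sum, not one of the paper's new algorithms; the function name `subset_sum` is chosen here for illustration.

```python
def subset_sum(a, t):
    """Decide whether some subset of the non-negative integers in `a` sums to `t`.

    reachable[s] is True when some subset of the items seen so far sums to s.
    Uses O(len(a) * t) time and O(t) space.
    """
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset sums to 0
    for x in a:
        # Iterate downwards so each item is used at most once.
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True, e.g. 4 + 5
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

The downward scan over `s` is what keeps the table one-dimensional: scanning upwards would allow the same item to be reused, which corresponds to the unbounded variant instead.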

arXiv:2111.04020 [pdf, other]

Biologically Inspired Oscillating Activation Functions Can Bridge the Performance Gap between Biological and Artificial Neurons

Authors: Matthew Mithra Noel, Shubham Bharadwaj, Venkataraman Muthiah-Nakarajan, Praneet Dutta, Geraldine Bessie Amali

Abstract: The recent discovery of special human neocortical pyramidal neurons that can individually learn the XOR function highlights the significant performance gap between biological and artificial neurons. The output of these pyramidal neurons first increases to a maximum with input and then decreases. Artificial neurons with similar characteristics can be designed with oscillating activation functions. Oscillating activation functions have multiple zeros allowing single neurons to have multiple hyper-planes in their decision boundary. This enables even single neurons to learn the XOR function. This paper proposes four new oscillating activation functions inspired by human pyramidal neurons that can also individually learn the XOR function. Oscillating activation functions are non-saturating for all inputs unlike popular activation functions, leading to improved gradient flow and faster convergence. Using oscillating activation functions instead of popular monotonic or non-monotonic single-zero activation functions enables neural networks to train faster and solve classification problems with fewer layers. An extensive comparison of 23 activation functions on CIFAR 10, CIFAR 100, and Imagenette benchmarks is presented and the oscillating activation functions proposed in this paper are shown to outperform all known popular activation functions.

Submitted 10 May, 2023; v1 submitted 7 November, 2021; originally announced November 2021.

Comments: 29 pages, 9 figures

arXiv:2108.12943 [pdf, other]

Growing Cosine Unit: A Novel Oscillatory Activation Function That Can Speedup Training and Reduce Parameters in Convolutional Neural Networks

Authors: Mathew Mithra Noel, Arunkumar L, Advait Trivedi, Praneet Dutta

Abstract: Convolutional neural networks have been successful in solving many socially important and economically significant problems. This ability to learn complex high-dimensional functions hierarchically can be attributed to the use of nonlinear activation functions. A key discovery that made training deep networks feasible was the adoption of the Rectified Linear Unit (ReLU) activation function to alleviate the vanishing gradient problem caused by using saturating activation functions. Since then, many improved variants of the ReLU activation have been proposed. However, a majority of activation functions used today are non-oscillatory and monotonically increasing due to their biological plausibility. This paper demonstrates that oscillatory activation functions can improve gradient flow and reduce network size. Two theorems on limits of non-oscillatory activation functions are presented. A new oscillatory activation function called Growing Cosine Unit (GCU) defined as $C(z) = z\cos z$ that outperforms Sigmoids, Swish, Mish and ReLU on a variety of architectures and benchmarks is presented. The GCU activation has multiple zeros enabling single GCU neurons to have multiple hyperplanes in the decision boundary. This allows single GCU neurons to learn the XOR function without feature engineering. Experimental results indicate that replacing the activation function in the convolution layers with the GCU activation function significantly improves performance on CIFAR-10, CIFAR-100 and Imagenette.

Submitted 12 January, 2023; v1 submitted 29 August, 2021; originally announced August 2021.

Comments: 20 Pages

ACM Class: I.5
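
To make the XOR claim in the abstract above concrete, here is a small sketch of the activation $C(z) = z\cos z$ and a single neuron that reproduces XOR. The weights and bias are hand-picked for illustration rather than learned, and thresholding the output at zero is an assumption of this sketch, not a detail taken from the paper.

```python
import math


def gcu(z: float) -> float:
    """Growing Cosine Unit: C(z) = z * cos(z)."""
    return z * math.cos(z)


def gcu_neuron(x1: int, x2: int, w1: float = 2.0, w2: float = 2.0, b: float = -1.0) -> int:
    """Single neuron with GCU activation, thresholded at zero.

    With the hand-picked values w1 = w2 = 2 and b = -1 the pre-activation
    is -1, 1, 1, 3 for inputs (0,0), (0,1), (1,0), (1,1). GCU is negative
    at -1 and 3 and positive at 1, so the output matches XOR.
    """
    return 1 if gcu(w1 * x1 + w2 * x2 + b) > 0 else 0


xor_table = {(x1, x2): gcu_neuron(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
print(xor_table)  # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

The example works because the zeros of $z\cos z$ at $z = 0$ and $z = \pi/2$ split the pre-activation axis into three regions with alternating sign, which a monotonic activation cannot do with a single threshold.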

arXiv:2108.01202 [pdf, other]

PIRM: Processing In Racetrack Memories

Authors: Sebastien Ollivier, Stephen Longofono, Prayash Dutta, Jingtong Hu, Sanjukta Bhanja, Alex K. Jones

Abstract: The growth in data needs of modern applications has created significant challenges for modern systems leading to a "memory wall." Spintronic Domain Wall Memory (DWM), related to Spin-Transfer Torque Memory (STT-MRAM), provides near-SRAM read/write performance, energy savings and nonvolatility, potential for extremely high storage density, and does not have significant endurance limitations. However, DWM's benefits cannot address data access latency and throughput limitations of memory bus bandwidth. We propose PIRM, a DWM-based in-memory computing solution that leverages the properties of DWM nanowires and allows them to serve as polymorphic gates. While normally DWM is accessed by applying spin polarized currents orthogonal to the nanowire at access points to read individual bits, transverse access along the DWM nanowire allows the differentiation of the aggregate resistance of multiple bits in the nanowire, akin to a multilevel cell. PIRM leverages this transverse reading to directly provide bulk-bitwise logic of multiple adjacent operands in the nanowire, simultaneously. Based on this in-memory logic, PIRM provides a technique to conduct multi-operand addition and two operand multiplication using transverse access. PIRM provides a 1.6x speedup compared to the leading DRAM PIM technique for query applications that leverage bulk bitwise operations. Compared to the leading PIM technique for DWM, PIRM improves performance by 6.9x, 2.3x and energy by 5.5x, 3.4x for 8-bit addition and multiplication, respectively. For arithmetic heavy benchmarks, PIRM reduces access latency by 2.1x, while decreasing energy consumption by 25.2x for a reasonable 10% area overhead versus non-PIM DWM.

Submitted 1 August, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

Comments: This paper is accepted to the IEEE/ACM Symposium on Microarchitecture, October 2022 under the title "CORUSCANT: Fast Efficient Processing-in-Racetrack Memories"

arXiv:2107.00596 [pdf, other]

Multimodal Graph-based Transformer Framework for Biomedical Relation Extraction

Authors: Sriram Pingali, Shweta Yadav, Pratik Dutta, Sriparna Saha

Abstract: The recent advancement of pre-trained Transformer models has propelled the development of effective text mining models across various biomedical tasks. However, these models are primarily learned on the textual data and often lack the domain knowledge of the entities to capture the context beyond the sentence. In this study, we introduced a novel framework that enables the model to learn multi-omics biological information about entities (proteins) with the help of additional multi-modal cues like molecular structure. Towards this, rather than developing modality-specific architectures, we devise a generalized and optimized graph based multi-modal learning mechanism that utilizes the GraphBERT model to encode the textual and molecular structure information and exploit the underlying features of various modalities to enable end-to-end learning. We evaluated our proposed method on the Protein-Protein Interaction task from the biomedical corpus, where our proposed generalized approach is observed to benefit from the additional domain-specific modality.

Submitted 1 July, 2021; originally announced July 2021.

Comments: To appear in Findings of ACL 2021

arXiv:2103.06490

Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation

Authors: Rishi Hazra, Parag Dutta, Shubham Gupta, Mohammed Abdul Qaathir, Ambedkar Dukkipati

Abstract: While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate further such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by an absolute percentage reduction of $\approx\mathbf{3-25\%}$ on multiple NLP tasks while achieving the same performance with no additional computation overhead.

Submitted 3 April, 2021; v1 submitted 11 March, 2021; originally announced March 2021.

Comments: Two of the authors had published similar manuscripts on arXiv, so this one is withdrawn. All further updates will be reflected at arXiv:1911.00234

arXiv:2102.11690 [pdf, ps, other]

Inferring temporal dynamics from cross-sectional data using Langevin dynamics

Authors: Pritha Dutta, Rick Quax, Loes Crielaard, Peter M. A. Sloot

Abstract: Cross-sectional studies are widely prevalent since they are more feasible to conduct compared to longitudinal studies. However, cross-sectional data lack the temporal information required to study the evolution of the underlying processes. Nevertheless, this is essential to develop predictive computational models which is the first step towards causal modelling. We propose a method for inferring computational models from cross-sectional data using Langevin dynamics. This method can be applied to any system that can be described as effectively following a free energy landscape, such as protein folding, stem cell differentiation and reprogramming, and social systems involving human interaction and social norms. A crucial assumption in our method is that the data-points are gathered from a system in (local) equilibrium. The result is a set of stochastic differential equations which capture the temporal dynamics, by assuming that groups of data-points are subject to the same free energy landscape and amount of noise. Our method is a 'baseline' method which initiates the development of computational models which can be iteratively enhanced through the inclusion of expert knowledge. We validate the proposed method against two population-based longitudinal datasets and observe significant predictive power in comparison with random choice algorithms. We also show how the predictive power of our 'baseline' model can be enhanced by incorporating domain expert knowledge. Our method addresses an important obstacle for model development in fields dominated by cross-sectional datasets.

Submitted 23 February, 2021; originally announced February 2021.

Comments: 17 pages, 3 figures, "The code for the proposed method is written in Mathematica programming language and is available at https://github.com/Pritha17/langevin-crosssectional"

arXiv:2011.09192 [pdf, other]

Game Plan: What AI can do for Football, and What Football can do for AI

Authors: Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Perolat, Bart De Vylder, et al. (11 additional authors not shown)

Abstract: The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players' and coordinated teams' behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual).

Submitted 18 November, 2020; originally announced November 2020.

arXiv:2011.08456 [pdf, ps, other]

Collusion-Resistant Identity-based Proxy Re-Encryption: Lattice-based Constructions in Standard Model

Authors: Priyanka Dutta, Willy Susilo, Dung Hoang Duong, Partha Sarathi Roy

Abstract: The concept of proxy re-encryption (PRE) dates back to the work of Blaze, Bleumer, and Strauss in 1998. PRE offers delegation of decryption rights, i.e., it securely enables the re-encryption of ciphertexts from one key to another, without relying on trusted parties. PRE allows a semi-trusted third party termed as a "proxy" to securely divert encrypted files of user A (delegator) to user B (delegatee) without revealing any information about the underlying files to the proxy. To eliminate the necessity of having a costly certificate verification process, Green and Ateniese introduced an identity-based PRE (IB-PRE). The potential applicability of IB-PRE has spurred a long line of intensive research since its first instantiation. Unfortunately, till today, there is no collusion-resistant unidirectional IB-PRE secure in the standard model, which can withstand quantum attack. In this paper, we present the first concrete constructions of collusion-resistant unidirectional IB-PRE, for both selective and adaptive identity, which are secure in the standard model based on the hardness of the learning with errors problem.

Submitted 16 November, 2020; originally announced November 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2005.06741

arXiv:2005.13164 [pdf, other]

CoVista: A Unified View on Privacy Sensitive Mobile Contact Tracing Effort

Authors: David Culler, Prabal Dutta, Gabe Fierro, Joseph E. Gonzalez, Nathan Pemberton, Johann Schleier-Smith, K. Shankari, Alvin Wan, Thomas Zachariah

Abstract: Governments around the world have become increasingly frustrated with tech giants dictating public health policy. The software created by Apple and Google enables individuals to track their own potential exposure through collated exposure notifications. However, the same software prohibits location tracking, denying key information needed by public health officials for robust contact tracing. This information is needed to treat and isolate COVID-19 positive people, identify transmission hotspots, and protect against continued spread of infection. In this article, we present two simple ideas: the lighthouse and the covid-commons that address the needs of public health authorities while preserving the privacy-sensitive goals of the Apple and Google exposure notification protocols.

Submitted 27 May, 2020; originally announced May 2020.

arXiv:2005.06741 [pdf, ps, other]

Lattice-based Unidirectional IBPRE Secure in Standard Model

Authors: Priyanka Dutta, Willy Susilo, Dung Hoang Duong, Joonsang Baek, Partha Sarathi Roy

Abstract: Proxy re-encryption (PRE) securely enables the re-encryption of ciphertexts from one key to another, without relying on trusted parties, i.e., it offers delegation of decryption rights. PRE allows a semi-trusted third party termed as a "proxy" to securely divert encrypted files of user A (delegator) to user B (delegatee) without revealing any information about the underlying files to the proxy. To eliminate the necessity of having a costly certificate verification process, Green and Ateniese introduced an identity-based PRE (IB-PRE). The potential applicability of IB-PRE has led to intensive research since its first instantiation. Unfortunately, till today, there is no unidirectional IB-PRE secure in the standard model, which can withstand quantum attack. In this paper, we provide, for the first time, a concrete construction of unidirectional IB-PRE which is secure in the standard model based on the hardness of the learning with errors problem. Our technique is to use the novel trapdoor delegation technique of Micciancio and Peikert. The way we use the trapdoor delegation technique may prove useful for functionalities other than proxy re-encryption as well.

Submitted 14 May, 2020; originally announced May 2020.

arXiv:2001.04074 [pdf, other]

doi 10.1016/j.knosys.2020.106062

Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey

Authors: Farhana Sultana, Abu Sufian, Paramartha Dutta

Abstract: From autonomous car driving to medical diagnosis, the task of image segmentation is required everywhere. Segmentation of an image is one of the indispensable tasks in computer vision. This task is comparatively more complicated than other vision tasks as it needs low-level spatial information. Basically, image segmentation can be of two types: semantic segmentation and instance segmentation. The combined version of these two basic tasks is known as panoptic segmentation. In the recent era, the success of deep convolutional neural networks (CNN) has influenced the field of segmentation greatly and given us various successful models to date. In this survey, we are going to take a glance at the evolution of both semantic and instance segmentation work based on CNN. We have also specified comparative architectural details of some state-of-the-art models and discussed their training details to present a lucid understanding of hyper-parameter tuning of those models. We have also drawn a comparison among the performance of those models on different datasets. Lastly, we have given a glimpse of some state-of-the-art panoptic segmentation models.

Submitted 29 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 38 pages, 29 figures, 8 tables

Journal ref: Knowledge-Based Systems, volume 201-202, pages 106062, 2020

arXiv:1911.06932 [pdf, other]

3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement

Authors: Praneet Dutta, Bruce Power, Adam Halpert, Carlos Ezequiel, Aravind Subramanian, Chanchal Chatterjee, Sindhu Hari, Kenton Prindle, Vishal Vaddina, Andrew Leach, Raj Domala, Laura Bandura, Massimo Mascaro

Abstract: We propose GAN-based image enhancement models for frequency enhancement of 2D and 3D seismic images. Seismic imagery is used to understand and characterize the Earth's subsurface for energy exploration. Because these images often suffer from resolution limitations and noise contamination, our proposed method performs large-scale seismic volume frequency enhancement and denoising. The enhanced images reduce uncertainty and improve decisions about issues, such as optimal well placement, that often rely on low signal-to-noise ratio (SNR) seismic volumes. We explored the impact of adding lithology class information to the models, resulting in improved performance on PSNR and SSIM metrics over a baseline model with no conditional information.

Submitted 15 November, 2019; originally announced November 2019.

Comments: To be presented at NeurIPS 2019, Second Workshop on Machine Learning and the Physical Sciences, Vancouver, Canada

arXiv:1911.00234 [pdf, other]

Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation

Authors: Rishi Hazra, Parag Dutta, Shubham Gupta, Mohammed Abdul Qaathir, Ambedkar Dukkipati

Abstract: While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by $\approx \mathbf{3-25\%}$ on an absolute scale on multiple NLP tasks while achieving the same performance with virtually no additional computation overhead.

Submitted 6 April, 2021; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Accepted in NAACL-HLT 2021

arXiv:1910.11916 [pdf, other]

Minus HELLO: HELLO Devoid Protocols for Energy Preservation in Mobile Ad Hoc Networks

Authors: Anuradha Banerjee, Abu Sufian, Paramartha Dutta, M M Hafizur Rahman

Abstract: In mobile ad-hoc networks, nodes have to transmit HELLO or Route Maintenance messages at regular intervals, and all nodes residing within the sender's radio range reply with an acknowledgment message reporting their node identifier, current location, and radio-range. Regularly transmitting these messages consumes a significant amount of battery power in nodes, especially when the set of down-link neighbors does not change over time and the radio-range of the sender node is large. The present article focuses on this aspect and tries to eliminate HELLO messages in existing state-of-the-art protocols. Also, it shortens radio-ranges of nodes whenever possible. Simulation results show that the average lifetime of nodes greatly increases in the proposed Minus HELLO devoid routing protocols, along with a great increase in network throughput. Also, the required number of route re-discoveries reduces.

Submitted 8 September, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

Comments: 33 pages, 20 figures, pre-print of under review manuscript

arXiv:1909.03212 [pdf, other]

AutoML for Contextual Bandits

Authors: Praneet Dutta, Joe Cheuk, Jonathan S Kim, Massimo Mascaro

Abstract: Contextual Bandits is one of the widely popular techniques used in applications such as personalization, recommendation systems, mobile health, causal marketing, etc. As a dynamic approach, it can be more efficient than standard A/B testing in minimizing regret. We propose an end to end automated meta-learning pipeline to approximate the optimal Q function for contextual bandits problems. We see that our model is able to perform much better than random exploration, being more regret efficient and able to converge with a limited number of samples, while remaining very general and easy to use due to the meta-learning approach. We used a linearly annealed e-greedy exploration policy to define the exploration vs exploitation schedule. We tested the system on a synthetic environment to characterize it fully and we evaluated it on some open source datasets to benchmark against prior work. We see that our model outperforms or performs comparably to other models while requiring neither tuning nor feature engineering.

Submitted 1 February, 2022; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: Presented (peer-reviewed) at the REVEAL Workshop at the ACM RecSys Conference Copenhagen'19 [https://sites.google.com/view/reveal2019/proceedings]

arXiv:1906.00760 [pdf]

doi 10.17485/ijst/2016/v9i43/104383

Fuzzy Route Switching For Energy Preservation (FEP) in Ad Hoc Networks

Authors: A. Banerjee, P. Dutta, A. Sufian

Abstract: Nodes in ad hoc networks have limited battery power. Hence they require an energy-efficient technique to improve average network performance. Maintaining energy efficiency in ad hoc networks is really challenging because the highest energy efficiency is achieved if all the nodes are always switched off, and energy efficiency will be minimum if all the nodes are fully operational, i.e., always turned on. Energy preservation requires redirection of data packets through some other routes having good performance. This improves the data packet delivery ratio and the number of alive nodes, decreasing the cost of messages.

Submitted 26 May, 2019; originally announced June 2019.

Comments: 11 pages, 12 figures, 4 tables

Journal ref: Indian Journal of Science and Technology, Vol 9(43), November 2016

arXiv:1906.00759 [pdf]

doi 10.17485/ijst/2016/v9i43/104384

Fuzzy-Controlled Scheduling of Route-Request Packets (FSRR) in Mobile Ad Hoc Networks

Authors: A. Sufian, A. Banerjee, P. Dutta

Abstract: In ad hoc networks, the scheduling of route-request packets should be different from that of message packets, because during transmission of message packets the location of the destination is known whereas in route discovery this is not known in most of the cases. The router has to depend upon the last known location, if any, of the destination to determine the center and radius of the circle that embeds all possible current positions of the destination. Route-request packets generated from the source are directed towards this circle, i.e., directional route discovery can be applied. Otherwise, when no earlier location of the destination is known, the route request has to be broadcast in the whole network, consuming significantly more time than directional route discovery. The present article proposes fuzzy controlled scheduling of route-request packets in particular that greatly reduces the average delay in route discovery in ad hoc networks.

Submitted 26 May, 2019; originally announced June 2019.

Comments: 6 pages, 4 figures

Journal ref: Indian Journal of Science and Technology, Vol 9(43), November 2016

arXiv:1905.11644 [pdf]

Cheat-Proof Communication through Cluster Head (C3H) in Mobile Ad Hoc Network

Authors: A. Sufian, A. Banerjee, P. Dutta

Abstract: The mobile ad hoc network (MANET) is a wireless network based on a group of mobile nodes without any centralised infrastructure. In civilian data communication, all nodes cannot be homogeneous-type and not do a specific data communication. Therefore, node co-operation and cheat-proof are essential characteristics for successfully running MANETs in civilian data communication. Denial of service and malicious behaviour of the node are the main concerns in securing successful communication in MANETs. This scheme proposed a generic solution to preventing malicious behaviour of the node by the cluster head through the single hop node clustering strategy.

Submitted 28 May, 2019; originally announced May 2019.

Comments: 14 pages, 8 figures, 1 table

Journal ref: Pertanika J. Sci. & Technol. 26 (3): 1513 - 1526 (2018)

arXiv:1905.11627 [pdf]

Data Load Balancing In Mobile Ad Hoc Network Using Fuzzy Logic (DBMF)

Authors: A. Sufian, F. Sultana, P. Dutta

Abstract: The volume and movement of data are rapidly increasing in every type of data communications and networking, and ad hoc networks are not spared from these challenges. Traditional multipath routing protocols in Mobile Ad-hoc Networks (MANETs) did not focus on data load distribution and balancing as much as required. In this scheme, we have proposed data load distribution and balancing through multiple paths simultaneously. We have considered three important parameters of an ad hoc network: mobility of a node, the energy of a node, and the packet drop rate at a node. This scheme combines these three metrics using fuzzy logic to get the decisive parameter. We have shown improvement of this scheme over similar kinds of protocols in the NS-2 network simulator.

Submitted 28 May, 2019; originally announced May 2019.

Comments: 8 pages, 3 Figures, National Conference on Recent Advances in Computer Science and IT (NCRACIT),BGSB University, Rajouri, J&K

Journal ref: Proceeding of National Conference on Recent Advances in Computer Science and IT (NCRACIT), BGSB University, Rajouri, J&K, Published online: http://ijsrcseit.com/CSEIT411813, 2018

arXiv:1905.03288 [pdf, other]

doi 10.1109/ICRCICN.2018.8718718

Advancements in Image Classification using Convolutional Neural Network

Authors: Farhana Sultana, A. Sufian, Paramartha Dutta

Abstract: Convolutional Neural Network (CNN) is the state-of-the-art for the image classification task. Here we have briefly discussed different components of CNN. In this paper, we have explained different CNN architectures for image classification. Through this paper, we have shown advancements in CNN from LeNet-5 to the latest SENet model. We have discussed the model description and training details of each model. We have also drawn a comparison among those models.

Submitted 8 May, 2019; originally announced May 2019.

Comments: 9 pages, 15 figures, 3 Tables. Submitted to 2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks(ICRCICN 2018)

Journal ref: 2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN)

arXiv:1905.01614 [pdf, other]

doi 10.1007/978-981-15-4288-6_1

A Review of Object Detection Models based on Convolutional Neural Network

Authors: F. Sultana, A. Sufian, P. Dutta

Abstract: Convolutional Neural Network (CNN) has become the state-of-the-art for object detection in image task. In this chapter, we have explained different state-of-the-art CNN based object detection models. We have made this review with categorization those detection models according to two different approaches: two-stage approach and one-stage approach. Through this chapter, it has shown advancements in object detection models from R-CNN to latest RefineDet. It has also discussed the model description and training details of each model. Here, we have also drawn a comparison among those models.

Submitted 1 October, 2019; v1 submitted 5 May, 2019; originally announced May 2019.

Comments: 17 pages, 11 figures, 1 table

Journal ref: Intelligent Computing: Image Processing Based Applications. Advances in Intelligent Systems and Computing, vol 1157, pages 1-16, 2020

arXiv:1802.07805 [pdf, other]

doi 10.1109/IPSN.2018.00047

The Signpost Platform for City-Scale Sensing

Authors: Joshua Adkins, Branden Ghena, Neal Jackson, Pat Pannuto, Samuel Rohrer, Bradford Campbell, Prabal Dutta

Abstract: City-scale sensing holds the promise of enabling a deeper understanding of our urban environments. However, a city-scale deployment requires physical installation, power management, and communications---all challenging tasks standing between a good idea and a realized one. This indicates the need for a platform that enables easy deployment and experimentation for applications operating at city scale. To address these challenges, we present Signpost, a modular, energy-harvesting platform for city-scale sensing. Signpost simplifies deployment by eliminating the need for connection to wired infrastructure and instead harvesting energy from an integrated solar panel. The platform furnishes the key resources necessary to support multiple, pluggable sensor modules while providing fair, safe, and reliable sharing in the face of dynamic energy constraints. We deploy Signpost with several sensor modules, showing the viability of an energy-harvesting, multi-tenant, sensing system, and evaluate its ability to support sensing applications. We believe Signpost reduces the difficulty inherent in city-scale deployments, enables new experimentation, and provides improved insights into urban health.

Submitted 21 February, 2018; originally announced February 2018.

Comments: Published in the proceedings of the 17th ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN'18)

Showing 1–50 of 68 results for author: Dutta, P