-
Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation
Authors:
Sara Pohland,
Alvin Tan,
Prabal Dutta,
Claire Tomlin
Abstract:
Reinforcement learning (RL) methods for social robot navigation show great success navigating robots through large crowds of people, but the performance of these learning-based methods tends to degrade in particularly challenging or unfamiliar situations due to the models' dependency on representative training data. To ensure human safety and comfort, it is critical that these algorithms handle uncommon cases appropriately, but the low frequency and wide diversity of such situations present a significant challenge for these data-driven methods. To overcome this challenge, we propose modifications to the learning process that encourage these RL policies to maintain additional caution in unfamiliar situations. Specifically, we improve the Socially Attentive Reinforcement Learning (SARL) policy by (1) modifying the training process to systematically introduce deviations into a pedestrian model, (2) updating the value network to estimate and utilize pedestrian-unpredictability features, and (3) implementing a reward function to learn an effective response to pedestrian unpredictability. Compared to the original SARL policy, our modified policy maintains similar navigation times and path lengths, while reducing the number of collisions by 82% and reducing the proportion of time spent in the pedestrians' personal space by up to 19 percentage points for the most difficult cases. We also describe how to apply these modifications to other RL policies and demonstrate that some key high-level behaviors of our approach transfer to a physical robot.
Submitted 8 July, 2024;
originally announced July 2024.
-
On Fourier analysis of sparse Boolean functions over certain Abelian groups
Authors:
Sourav Chakraborty,
Swarnalipa Datta,
Pranjal Dutta,
Arijit Ghosh,
Swagato Sanyal
Abstract:
Given an Abelian group G, a Boolean-valued function f: G -> {-1,+1} is said to be s-sparse if it has at most s non-zero Fourier coefficients over the domain G. In a seminal paper, Gopalan et al. proved "Granularity" for Fourier coefficients of Boolean-valued functions over Z_2^n, a result that has found many diverse applications in theoretical computer science and combinatorics. They also studied structural results for Boolean functions over Z_2^n which are approximately Fourier-sparse. In this work, we obtain structural results for approximately Fourier-sparse Boolean-valued functions over Abelian groups G of the form G := Z_{p_1}^{n_1} \times ... \times Z_{p_t}^{n_t}, for distinct primes p_i. We also obtain a lower bound of the form 1/(m^{2}s)^ceiling(phi(m)/2) on the absolute value of the smallest non-zero Fourier coefficient of an s-sparse function, where m=p_1 ... p_t and phi(m)=(p_1-1) ... (p_t-1). We carefully apply probabilistic techniques from Gopalan et al. to obtain our structural results, and use some non-trivial results from algebraic number theory to get the lower bound.
We construct a family of at most s-sparse Boolean functions over Z_p^n, where p > 2, for arbitrarily large enough s, where the minimum non-zero Fourier coefficient is 1/omega(n). The "Granularity" result of Gopalan et al. implies that the absolute values of non-zero Fourier coefficients of any s-sparse Boolean valued function over Z_2^n are 1/O(s). So, our result shows that one cannot expect such a lower bound for general Abelian groups.
Using our new structural results on the Fourier coefficients of sparse functions, we design an efficient testing algorithm for Fourier-sparse Boolean functions that requires poly((ms)^phi(m),1/epsilon)-many queries. Further, we prove an Omega(sqrt{s}) lower bound on the query complexity of any adaptive sparsity testing algorithm.
Submitted 26 June, 2024;
originally announced June 2024.
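To make the notion of Fourier sparsity in the abstract above concrete, the following is a minimal sketch for the simplest group, Z_2^n. It is an illustration only, not the paper's algorithm: the function names (`fourier_coefficients`, `sparsity`, `and_sign`) and the example function are assumptions chosen here, and the brute-force sum is exponential in n.

```python
from fractions import Fraction
from itertools import product


def fourier_coefficients(f, n):
    """Brute-force Fourier transform of f: {0,1}^n -> {-1,+1} over Z_2^n.

    Returns a dict mapping each alpha in {0,1}^n to
    f_hat(alpha) = 2^{-n} * sum_x f(x) * (-1)^{<alpha, x>}.
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for alpha in points:
        total = 0
        for x in points:
            # Character chi_alpha(x) = (-1)^{<alpha, x> mod 2}.
            parity = sum(a * b for a, b in zip(alpha, x)) % 2
            total += f(x) * (-1 if parity else 1)
        coeffs[alpha] = Fraction(total, len(points))
    return coeffs


def sparsity(coeffs):
    """Number of non-zero Fourier coefficients."""
    return sum(1 for c in coeffs.values() if c != 0)


def and_sign(x):
    """Illustrative function: -1 when x[0] AND x[1] holds, +1 otherwise."""
    return -1 if (x[0] and x[1]) else 1


coeffs = fourier_coefficients(and_sign, 3)
nonzero = {alpha: c for alpha, c in coeffs.items() if c != 0}
print(sparsity(coeffs), nonzero)
```

In this example the function is 4-sparse and every non-zero coefficient has absolute value 1/2, which is consistent with the 1/O(s) behaviour over Z_2^n that the abstract attributes to the granularity result.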
-
Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation?
Authors:
Pallabi Dutta,
Soham Bose,
Swalpa Kumar Roy,
Sushmita Mitra
Abstract:
The development of efficient medical image segmentation has evolved from initial dependence on Convolutional Neural Networks (CNNs) to the present investigation of hybrid models that combine CNNs with Vision Transformers. Furthermore, there is an increasing focus on creating architectures that are both high-performing in medical image segmentation tasks and computationally efficient enough to be deployed on systems with limited resources. Although transformers have several advantages, such as capturing global dependencies in the input data, they face challenges such as high computational and memory complexity. This paper investigates the integration of CNNs and Vision Extended Long Short-Term Memory (Vision-xLSTM) models by introducing a novel approach called UVixLSTM. The Vision-xLSTM blocks capture temporal and global relationships within the patches extracted from the CNN feature maps. The convolutional feature reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. Our primary objective is to propose that Vision-xLSTM forms a reliable backbone for medical image segmentation tasks, offering excellent segmentation performance and reduced computational complexity. UVixLSTM exhibits superior performance compared to state-of-the-art networks on the publicly available Synapse dataset. Code is available at: https://github.com/duttapallabi2907/UVixLSTM
Submitted 24 June, 2024;
originally announced June 2024.
-
Input Guided Multiple Deconstruction Single Reconstruction neural network models for Matrix Factorization
Authors:
Prasun Dutta,
Rajat K. De
Abstract:
Referring back to the original text in the course of hierarchical learning is a common human trait that ensures the right direction of learning. The models developed in this paper, based on the concept of Non-negative Matrix Factorization (NMF), are inspired by this idea. They aim to deal with high-dimensional data by discovering its low-rank approximation through determining a unique pair of factor matrices. The model named Input Guided Multiple Deconstruction Single Reconstruction neural network for Non-negative Matrix Factorization (IG-MDSR-NMF) ensures the non-negativity constraints of both factors, whereas Input Guided Multiple Deconstruction Single Reconstruction neural network for Relaxed Non-negative Matrix Factorization (IG-MDSR-RNMF) introduces a novel idea of factorization with only the basis matrix adhering to the non-negativity criteria. This relaxed version helps the model learn a more enriched low-dimensional embedding of the original data matrix. The competency of both models in preserving the local structure of data in its low-rank embedding has been verified. The superiority of the low-dimensional embedding over the original data, justifying the need for dimension reduction, has been established. The primacy of both models has also been validated by comparing their performance separately with that of nine other established dimension reduction algorithms on five popular datasets. Moreover, the computational complexity of the models and a convergence analysis have also been presented, testifying to the supremacy of the models.
Submitted 22 May, 2024;
originally announced May 2024.
-
Fixed-parameter debordering of Waring rank
Authors:
Pranjal Dutta,
Fulvio Gesmundo,
Christian Ikenmeyer,
Gorav Jindal,
Vladimir Lysikov
Abstract:
Border complexity measures are defined via limits (or topological closures), so that any function which can be approximated arbitrarily closely by low complexity functions itself has low border complexity. Debordering is the task of proving an upper bound on some non-border complexity measure in terms of a border complexity measure, thus getting rid of limits.
Debordering is at the heart of understanding the difference between Valiant's determinant vs permanent conjecture, and Mulmuley and Sohoni's variation which uses border determinantal complexity. The debordering of matrix multiplication tensors by Bini played a pivotal role in the development of efficient matrix multiplication algorithms. Consequently, debordering finds applications in both establishing computational complexity lower bounds and facilitating algorithm design. Currently, very few debordering results are known.
In this work, we study the question of debordering the border Waring rank of polynomials. Waring and border Waring rank are very well studied measures in the context of invariant theory, algebraic geometry, and matrix multiplication algorithms. For the first time, we obtain a Waring rank upper bound that is exponential in the border Waring rank and only linear in the degree. All previous known results were exponential in the degree. For polynomials with constant border Waring rank, our results imply an upper bound on the Waring rank linear in degree, which previously was only known for polynomials with border Waring rank at most 5.
Submitted 15 January, 2024;
originally announced January 2024.
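As a reading aid for the abstract above, the two measures can be written out together with a classical example of a gap between them. This is a standard textbook illustration over the complex numbers for degree d >= 2, not a result of the paper; the notation WR and the underlined border variant are assumptions chosen here.

```latex
% Waring rank: the fewest d-th powers of linear forms that sum to f.
\mathrm{WR}(f) \;=\; \min\Bigl\{\, r \;:\; f = \sum_{i=1}^{r} \ell_i^{\,d},\ \ \ell_i \text{ linear forms} \Bigr\}

% Border Waring rank: the fewest powers needed to approximate f arbitrarily well.
\underline{\mathrm{WR}}(f) \;=\; \min\Bigl\{\, r \;:\; f = \lim_{\varepsilon \to 0} f_\varepsilon,\ \ \mathrm{WR}(f_\varepsilon) \le r \Bigr\}

% Classical example: x^{d-1} y is a limit of sums of two d-th powers,
% since (x + \varepsilon y)^d - x^d = d\,\varepsilon\, x^{d-1} y + O(\varepsilon^2).
x^{d-1} y \;=\; \lim_{\varepsilon \to 0} \frac{(x + \varepsilon y)^{d} - x^{d}}{d\,\varepsilon}
\qquad\Longrightarrow\qquad
\underline{\mathrm{WR}}(x^{d-1} y) \le 2, \quad\text{while}\quad \mathrm{WR}(x^{d-1} y) = d .
```

Debordering, in the sense of the abstract, asks how large the Waring rank can be as a function of the border Waring rank and the degree.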
-
Real-Time Human Fall Detection using a Lightweight Pose Estimation Technique
Authors:
Ekram Alam,
Abu Sufian,
Paramartha Dutta,
Marco Leo
Abstract:
The elderly population is increasing rapidly around the world, and there are not enough caretakers for them. As a result, the use of AI-based in-home medical care systems is gaining momentum. Human fall detection is one of the most important tasks of a medical care system for aged people, since falls are a common problem among the elderly. Detecting a fall and providing medical help as early as possible is very important to reduce further complications, including the chance of death. There are many state-of-the-art fall detection techniques available these days, but the majority of them need very high computing power. In this paper, we propose a lightweight and fast human fall detection system using pose estimation. We use `Movenet' for extracting human joint key-points. Our proposed method can work in real time on any low-computing device with any basic camera. All computation can be processed locally, so the privacy of the subject is not compromised. We used two datasets, `GMDCSA' and `URFD', for the experiments. We obtained sensitivity values of 0.9375 and 0.9167 for the datasets `GMDCSA' and `URFD', respectively. The source code and the GMDCSA dataset are available online.
Submitted 3 January, 2024;
originally announced January 2024.
-
Homogeneous Algebraic Complexity Theory and Algebraic Formulas
Authors:
Pranjal Dutta,
Fulvio Gesmundo,
Christian Ikenmeyer,
Gorav Jindal,
Vladimir Lysikov
Abstract:
We study algebraic complexity classes and their complete polynomials under \emph{homogeneous linear} projections, not just under the usual affine linear projections that were originally introduced by Valiant in 1979. These reductions are weaker yet more natural from a geometric complexity theory (GCT) standpoint, because the corresponding orbit closure formulations do not require the padding of polynomials. We give the \emph{first} complete polynomials for VF, the class of sequences of polynomials that admit small algebraic formulas, under homogeneous linear projections: The sum of the entries of the non-commutative elementary symmetric polynomial in 3 by 3 matrices of homogeneous linear forms.
Even simpler variants of the elementary symmetric polynomial are hard for the topological closure of a large subclass of VF: the sum of the entries of the non-commutative elementary symmetric polynomial in 2 by 2 matrices of homogeneous linear forms, and homogeneous variants of the continuant polynomial (Bringmann, Ikenmeyer, Zuiddam, JACM '18). This requires a careful study of circuits with arity-3 product gates.
Submitted 28 November, 2023;
originally announced November 2023.
-
Deep Representation Learning for Prediction of Temporal Event Sets in the Continuous Time Domain
Authors:
Parag Dutta,
Kawin Mayilvaghanan,
Pratyaksha Sinha,
Ambedkar Dukkipati
Abstract:
Temporal Point Processes (TPPs) play an important role in predicting or forecasting events. Although these problems have been studied extensively, predicting multiple simultaneously occurring events can be challenging. For instance, more often than not, a patient is admitted to a hospital with multiple conditions at a time. Similarly, people buy more than one stock, and multiple news stories break at the same time. Moreover, these events do not occur at discrete time intervals, and forecasting event sets in the continuous time domain remains an open problem. Naive approaches for extending the existing TPP models to solve this problem lead to dealing with an exponentially large number of events or ignoring set dependencies among events. In this work, we propose a scalable and efficient approach based on TPPs to solve this problem. Our proposed approach incorporates contextual event embeddings, temporal information, and domain features to model the temporal event sets. We demonstrate the effectiveness of our approach through extensive experiments on multiple datasets, showing that our model outperforms existing methods in terms of prediction metrics and computational efficiency. To the best of our knowledge, this is the first work that solves the problem of predicting event set intensities in the continuous time domain by using TPPs.
Submitted 29 September, 2023;
originally announced September 2023.
-
Rate-Induced Transitions in Networked Complex Adaptive Systems: Exploring Dynamics and Management Implications Across Ecological, Social, and Socioecological Systems
Authors:
Vítor V. Vasconcelos,
Flávia M. D. Marquitti,
Theresa Ong,
Lisa C. McManus,
Marcus Aguiar,
Amanda B. Campos,
Partha S. Dutta,
Kristen Jovanelly,
Victoria Junquera,
Jude Kong,
Elisabeth H. Krueger,
Simon A. Levin,
Wenying Liao,
Mingzhen Lu,
Dhruv Mittal,
Mercedes Pascual,
Flávio L. Pinheiro,
Juan Rocha,
Fernando P. Santos,
Peter Sloot,
Chenyang Su,
Benton Taylor,
Eden Tekwa,
Sjoerd Terpstra
, et al. (5 additional authors not shown)
Abstract:
Complex adaptive systems (CASs), from ecosystems to economies, are open systems and inherently dependent on external conditions. While a system can transition from one state to another based on the magnitude of change in external conditions, the rate of change -- irrespective of magnitude -- may also lead to system state changes due to a phenomenon known as a rate-induced transition (RIT). This study presents a novel framework that captures RITs in CASs through a local model and a network extension where each node contributes to the structural adaptability of others. Our findings reveal how RITs occur at a critical environmental change rate, with lower-degree nodes tipping first due to fewer connections and reduced adaptive capacity. High-degree nodes tip later as their adaptability sources (lower-degree nodes) collapse. This pattern persists across various network structures. Our study calls for an extended perspective when managing CASs, emphasizing the need to focus not only on thresholds of external conditions but also the rate at which those conditions change, particularly in the context of the collapse of surrounding systems that contribute to the focal system's resilience. Our analytical method opens a path to designing management policies that mitigate RIT impacts and enhance resilience in ecological, social, and socioecological systems. These policies could include controlling environmental change rates, fostering system adaptability, implementing adaptive management strategies, and building capacity and knowledge exchange. Our study contributes to the understanding of RIT dynamics and informs effective management strategies for complex adaptive systems in the face of rapid environmental change.
Submitted 14 September, 2023;
originally announced September 2023.
-
DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome
Authors:
Zhihan Zhou,
Yanrong Ji,
Weijian Li,
Pratik Dutta,
Ramana Davuluri,
Han Liu
Abstract:
Decoding the linguistic intricacies of the genome is a crucial problem in biology, and pre-trained foundational models such as DNABERT and Nucleotide Transformer have made significant strides in this area. Existing works have largely hinged on k-mer, fixed-length permutations of A, T, C, and G, as the token of the genome language due to its simplicity. However, we argue that the computation and sample inefficiencies introduced by k-mer tokenization are primary obstacles in developing large genome foundational models. We provide conceptual and empirical insights into genome tokenization, building on which we propose to replace k-mer tokenization with Byte Pair Encoding (BPE), a statistics-based data compression algorithm that constructs tokens by iteratively merging the most frequent co-occurring genome segment in the corpus. We demonstrate that BPE not only overcomes the limitations of k-mer tokenization but also benefits from the computational efficiency of non-overlapping tokenization. Based on these insights, we introduce DNABERT-2, a refined genome foundation model that adapts an efficient tokenizer and employs multiple strategies to overcome input length constraints, reduce time and memory expenditure, and enhance model capability. Furthermore, we identify the absence of a comprehensive and standardized benchmark for genome understanding as another significant impediment to fair comparative analysis. In response, we propose the Genome Understanding Evaluation (GUE), a comprehensive multi-species genome classification dataset that amalgamates $36$ distinct datasets across $9$ tasks, with input lengths ranging from $70$ to $10000$. Through comprehensive experiments on the GUE benchmark, we demonstrate that DNABERT-2 achieves comparable performance to the state-of-the-art model with $21 \times$ fewer parameters and approximately $92 \times$ less GPU time in pre-training.
Submitted 18 March, 2024; v1 submitted 26 June, 2023;
originally announced June 2023.
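The contrast the abstract above draws between k-mer tokenization and Byte Pair Encoding can be sketched on a toy DNA string. This is a minimal illustration of generic BPE merging, not DNABERT-2's actual tokenizer: the helper names (`kmer_tokens`, `merge_pair`, `learn_bpe`), the toy corpus, and the tie-breaking by first occurrence are all assumptions made here.

```python
from collections import Counter


def kmer_tokens(seq, k):
    """Overlapping k-mer tokenization: one token per start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def merge_pair(tokens, pair):
    """Replace each left-to-right, non-overlapping occurrence of `pair`."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges, starting from single nucleotides.

    Returns the ordered list of merged pairs and the tokenized corpus.
    """
    tokenized = [list(seq) for seq in corpus]
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for tokens in tokenized:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break  # every sequence is a single token already
        # Ties are broken by first occurrence (an arbitrary choice here).
        best_pair = pair_counts.most_common(1)[0][0]
        merges.append(best_pair)
        tokenized = [merge_pair(tokens, best_pair) for tokens in tokenized]
    return merges, tokenized


print(kmer_tokens("ATATAT", 3))  # overlapping 3-mers
print(learn_bpe(["ATATAT"], 2))  # merges and non-overlapping BPE tokens
```

On this toy input the 3-mer tokenizer emits four overlapping tokens, while two BPE merges cover the same string with two non-overlapping tokens, which is the kind of saving in sequence length the abstract refers to.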
-
Deterministic identity testing paradigms for bounded top-fanin depth-4 circuits
Authors:
Pranjal Dutta,
Prateek Dwivedi,
Nitin Saxena
Abstract:
Polynomial Identity Testing (PIT) is a fundamental computational problem. The famous depth-$4$ reduction result by Agrawal and Vinay (FOCS 2008) has made PIT for depth-$4$ circuits an enticing pursuit. A restricted depth-$4$ circuit computing an $n$-variate degree-$d$ polynomial of the form $\sum_{i = 1}^{k} \prod_{j} g_{ij}$, where $\deg g_{ij} \leq δ$, is called a $Σ^{[k]}ΠΣΠ^{[δ]}$ circuit. On further restricting $g_{ij}$ to be a sum of univariates we obtain $Σ^{[k]}ΠΣ\wedge$ circuits. The largely open special cases of $Σ^{[k]}ΠΣΠ^{[δ]}$ for constant $k$ and $δ$, and $Σ^{[k]}ΠΣ\wedge$, have been a source of many great ideas in the last two decades. Examples include the depth-$3$ ideas of Dvir and Shpilka (STOC 2005), Kayal and Saxena (CCC 2006), and Saxena and Seshadhri (FOCS 2010 and STOC 2011); the depth-$4$ ideas of Beecken, Mittmann and Saxena (ICALP 2011), Saha, Saxena and Saptharishi (Comput.Compl. 2013), Forbes (FOCS 2015), and Kumar and Saraf (CCC 2016); and the geometric Sylvester-Gallai ideas of Kayal and Saraf (FOCS 2009), Shpilka (STOC 2019), and Peleg and Shpilka (CCC 2020, STOC 2021). Very recently, a subexponential-time blackbox PIT algorithm for constant-depth circuits was obtained via the lower bound breakthrough of Limaye, Srinivasan, Tavenas (FOCS 2021). We solve two of the basic underlying open problems in this work.
We give the first polynomial-time PIT for $Σ^{[k]}ΠΣ\wedge$. We also give the first quasipolynomial time blackbox PIT for both $Σ^{[k]}ΠΣ\wedge$ and $Σ^{[k]}ΠΣΠ^{[δ]}$. A key technical ingredient in all the three algorithms is how the logarithmic derivative, and its power-series, modify the top $Π$-gate to $\wedge$.
Submitted 22 April, 2023;
originally announced April 2023.
-
Composite Deep Network with Feature Weighting for Improved Delineation of COVID Infection in Lung CT
Authors:
Pallabi Dutta,
Sushmita Mitra
Abstract:
Early and effective screening and grading of COVID-19 has become imperative for optimizing the limited available resources of medical facilities. An automated segmentation of the infected volumes in lung CT is expected to significantly aid in the diagnosis and care of patients. However, an accurate demarcation of lesions remains problematic due to their irregular structure and location(s) within the lung. A novel deep learning architecture, Composite Deep network with Feature Weighting (CDNetFW), is proposed for efficient delineation of infected regions from lung CT images. Initially, a coarser segmentation is performed directly at shallower levels, thereby facilitating discovery of robust and discriminatory characteristics in the hidden layers. The novel feature weighting module helps prioritise relevant feature maps to be probed, along with those regions containing crucial information within these maps. This is followed by estimating the severity of the disease. The deep network CDNetFW has been shown to outperform several state-of-the-art architectures in the COVID-19 lesion segmentation task, as measured by experimental results on CT slices from publicly available datasets, especially when it comes to defining structures involving complex geometries.
Submitted 17 February, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Controlling Commercial Cooling Systems Using Reinforcement Learning
Authors:
Jerry Luo,
Cosmin Paduraru,
Octavian Voicu,
Yuri Chervonyi,
Scott Munns,
Jerry Li,
Crystal Qian,
Praneet Dutta,
Jared Quincy Davis,
Ningjia Wu,
Xingwei Yang,
Chu-Ming Chang,
Ted Li,
Rob Rose,
Mingyan Fan,
Hootan Nakhost,
Tinglin Liu,
Brian Kirkman,
Frank Altamura,
Lee Cline,
Patrick Tonker,
Joel Gouker,
Dave Uden,
Warren Buddy Bryan,
Jason Law
, et al. (11 additional authors not shown)
Abstract:
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
Submitted 14 December, 2022; v1 submitted 11 November, 2022;
originally announced November 2022.
-
De-bordering and Geometric Complexity Theory for Waring rank and related models
Authors:
Pranjal Dutta,
Fulvio Gesmundo,
Christian Ikenmeyer,
Gorav Jindal,
Vladimir Lysikov
Abstract:
De-bordering is the task of proving that a border complexity measure is bounded from below, by a non-border complexity measure. This task is at the heart of understanding the difference between Valiant's determinant vs permanent conjecture, and Mulmuley and Sohoni's Geometric Complexity Theory (GCT) approach to settle the P \neq NP conjecture. Currently, very few de-bordering results are known.
In this work, we study the question of de-bordering the border Waring rank of polynomials. Waring and border Waring rank are very well studied measures, in the context of invariant theory, algebraic geometry and matrix multiplication algorithms. For the first time, we obtain a Waring rank upper bound that is exponential in the border Waring rank and only *linear* in the degree. All previous results were known to be exponential in the degree.
According to Kumar's recent surprising result (ToCT'20), a small border Waring rank implies that the polynomial can be approximated as a sum of a constant and a small product of linear polynomials. We prove the converse of Kumar's result, and in this way we de-border Kumar's complexity, and obtain a new formulation of border Waring rank, up to a factor of the degree. We phrase this new formulation as the orbit closure problem of the product-plus-power polynomial, and we successfully de-border this orbit closure. We fully implement the GCT approach against the power sum, and we generalize the ideas of Ikenmeyer-Kandasamy (STOC'20) to this new orbit closure. In this way, we obtain new multiplicity obstructions that are constructed from just the symmetries of the points and representation theoretic branching rules, rather than explicit multilinear computations.
Furthermore, we realize that the generalization of our converse of Kumar's theorem to square matrices gives a homogeneous formulation of Ben-Or and Cleve (SICOMP'92). This results ...
Submitted 13 April, 2023; v1 submitted 13 November, 2022;
originally announced November 2022.
-
PatchRot: A Self-Supervised Technique for Training Vision Transformers
Authors:
Sachin Chhabra,
Prabal Bijoy Dutta,
Hemanth Venkateswara,
Baoxin Li
Abstract:
Vision transformers require a huge amount of labeled data to outperform convolutional neural networks. However, labeling a huge dataset is a very expensive process. Self-supervised learning techniques alleviate this problem by learning features similar to supervised learning in an unsupervised way. In this paper, we propose a self-supervised technique, PatchRot, that is crafted for vision transformers. PatchRot rotates images and image patches and trains the network to predict the rotation angles. The network learns to extract both global and local features from an image. Our extensive experiments on different datasets show that PatchRot training learns rich features that outperform supervised learning and the compared baselines.
Submitted 27 October, 2022;
originally announced October 2022.
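The rotation pretext task described in the PatchRot abstract can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the abstract does not specify the patch size or the set of angles, so the sketch assumes square patches, rotations in multiples of 90 degrees, and image sides divisible by the patch size. The helper names `rot90` and `rotate_patches` are hypothetical.

```python
import random


def rot90(patch, k):
    """Rotate a square 2D list counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counterclockwise turn.
        patch = [list(row) for row in zip(*patch)][::-1]
    return patch


def rotate_patches(image, patch_size, rng):
    """Rotate every patch independently and return (new_image, labels).

    Assumption (not from the abstract): each label is k in {0, 1, 2, 3},
    meaning the patch was rotated by k * 90 degrees. A network would be
    trained to predict these labels from the rotated image.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    labels = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            k = rng.randrange(4)
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            for i, row in enumerate(rot90(patch, k)):
                out[top + i][left:left + patch_size] = row
            labels.append(k)
    return out, labels


# Toy 4x4 "image" split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
rotated, labels = rotate_patches(image, 2, random.Random(0))
print(labels)
```

Rotating the whole image works the same way by treating the image as a single patch.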
-
Full-scale Deeply Supervised Attention Network for Segmenting COVID-19 Lesions
Authors:
Pallabi Dutta,
Sushmita Mitra
Abstract:
Automated delineation of COVID-19 lesions from lung CT scans aids the diagnosis and prognosis for patients. The asymmetric shapes and positioning of the infected regions make the task extremely difficult. Capturing information at multiple scales will assist in deciphering features, at global and local levels, to encompass lesions of variable size and texture. We introduce the Full-scale Deeply Supervised Attention Network (FuDSA-Net), for efficient segmentation of corona-infected lung areas in CT images. The model considers activation responses from all levels of the encoding path, encompassing multi-scalar features acquired at different levels of the network. This helps segment target regions (lesions) of varying shape, size and contrast. Incorporation of the entire gamut of multi-scalar characteristics into the novel attention mechanism helps prioritize the selection of activation responses and locations containing useful information. Determining robust and discriminatory features along the decoder path is facilitated with deep supervision. Connections in the decoder arm are remodeled to handle the issue of vanishing gradient. As observed from the experimental results, FuDSA-Net surpasses other state-of-the-art architectures; especially, when it comes to characterizing complicated geometries of the lesions.
Submitted 27 October, 2022;
originally announced October 2022.
-
Optimizing Industrial HVAC Systems with Hierarchical Reinforcement Learning
Authors:
William Wong,
Praneet Dutta,
Octavian Voicu,
Yuri Chervonyi,
Cosmin Paduraru,
Jerry Luo
Abstract:
Reinforcement learning (RL) techniques have been developed to optimize industrial cooling systems, offering substantial energy savings compared to traditional heuristic policies. A major challenge in industrial control involves learning behaviors that are feasible in the real world due to machinery constraints. For example, certain actions can only be executed every few hours while other actions can be taken more frequently. Without extensive reward engineering and experimentation, an RL agent may not learn realistic operation of machinery. To address this, we use hierarchical reinforcement learning with multiple agents that control subsets of actions according to their operation time scales. Our hierarchical approach achieves energy savings over existing baselines while maintaining constraints such as operating chillers within safe bounds in a simulated HVAC control environment.
Submitted 16 September, 2022;
originally announced September 2022.
-
Semi-analytical Industrial Cooling System Model for Reinforcement Learning
Authors:
Yuri Chervonyi,
Praneet Dutta,
Piotr Trochim,
Octavian Voicu,
Cosmin Paduraru,
Crystal Qian,
Emre Karagozler,
Jared Quincy Davis,
Richard Chippendale,
Gautam Bajaj,
Sims Witherspoon,
Jerry Luo
Abstract:
We present a hybrid industrial cooling system model that embeds analytical solutions within a multi-physics simulation. This model is designed for reinforcement learning (RL) applications and balances simplicity with simulation fidelity and interpretability. The model's fidelity is evaluated against real world data from a large scale cooling system. This is followed by a case study illustrating how the model can be used for RL research. For this, we develop an industrial task suite that allows specifying different problem settings and levels of complexity, and use it to evaluate the performance of different RL algorithms.
Submitted 26 July, 2022;
originally announced July 2022.
-
Vision-based Human Fall Detection Systems using Deep Learning: A Review
Authors:
Ekram Alam,
Abu Sufian,
Paramartha Dutta,
Marco Leo
Abstract:
Human fall is one of the very critical health issues, especially for elders and disabled people living alone. The number of elder populations is increasing steadily worldwide. Therefore, human fall detection is becoming an effective technique for assistive living for those people. For assistive living, deep learning and computer vision have been used largely. In this review article, we discuss deep learning (DL)-based state-of-the-art non-intrusive (vision-based) fall detection techniques. We also present a survey on fall detection benchmark datasets. For a clear understanding, we briefly discuss different metrics which are used to evaluate the performance of the fall detection systems. This article also gives a future direction on vision-based human fall detection techniques.
Submitted 22 July, 2022;
originally announced July 2022.
-
POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging
Authors:
Shishir G. Patil,
Paras Jain,
Prabal Dutta,
Ion Stoica,
Joseph E. Gonzalez
Abstract:
Fine-tuning models on edge devices like mobile phones would enable privacy-preserving personalization over sensitive data. However, edge training has historically been limited to relatively small models with simple architectures because training is both memory and energy intensive. We present POET, an algorithm to enable training large neural networks on memory-scarce battery-operated edge devices. POET jointly optimizes the integrated search spaces of rematerialization and paging, two algorithms to reduce the memory consumption of backpropagation. Given a memory budget and a run-time constraint, we formulate a mixed-integer linear program (MILP) for energy-optimal training. Our approach enables training significantly larger models on embedded devices while reducing energy consumption and without modifying the mathematical correctness of backpropagation. We demonstrate that it is possible to fine-tune both ResNet-18 and BERT within the memory constraints of a Cortex-M class embedded device while outperforming current edge training methods in energy efficiency. POET is an open-source project available at https://github.com/ShishirPatil/poet
Submitted 15 July, 2022;
originally announced July 2022.
-
Optimization of Temperature and Relative Humidity in an Automatic Egg Incubator Using Mamdani Interference System
Authors:
Pramit Dutta,
Nafisa Anjum
Abstract:
Temperature and humidity are two of the rudimentary factors that must be controlled during egg incubation. Improper temperature and humidity levels during the incubation period often result in unwanted conditions. This paper proposes the design of an efficient Mamdani fuzzy interference system instead of the widely used Takagi-Sugeno system in this field for controlling the temperature and humidity levels of an egg incubator. Though the optimum incubation temperature and humidity levels used here are those of chicken eggs, the proposed methodology is applicable to other avian species as well. The input functions have been used here as per estimated values for safe hatching using Mamdani, whereas the defuzzification method, COA, has been applied for output. From the model output, a stabilized heat from temperature level and fan speed to control the humidity level of an egg incubator can be obtained. This maximizes the hatching rate of healthy chicks under any conditions in the field.
Submitted 17 June, 2022;
originally announced July 2022.
-
Identifying Counterfeit Products using Blockchain Technology in Supply Chain System
Authors:
Nafisa Anjum,
Pramit Dutta
Abstract:
With the advent of globalization and the ever-growing rate of technology, the volume of production as well as ease of procuring counterfeit goods has become unprecedented. Be it food, drug or luxury items, all kinds of industrial manufacturers and distributors are now seeking greater transparency in supply chain operations with a view to deterring counterfeiting. This paper introduces a decentralized Blockchain based application system (DApp) with a view to identifying counterfeit products in the supply chain system. With the rapid rise of Blockchain technology, it has become known that data recorded within Blockchain is immutable and secure. Hence, the proposed project here uses this concept to handle the transfer of ownership of products. A consumer can verify the product distribution and ownership information by scanning a Quick Response (QR) code generated by the DApp for each product linked to the Blockchain.
Submitted 17 June, 2022;
originally announced June 2022.
-
COVID-19 Detection using Transfer Learning with Convolutional Neural Network
Authors:
Pramit Dutta,
Tanny Roy,
Nafisa Anjum
Abstract:
The Novel Coronavirus disease 2019 (COVID-19) is a fatal infectious disease, first recognized in December 2019 in Wuhan, Hubei, China, that has since become an epidemic. Under these circumstances, it became more important to detect COVID-19 in infected people. Nowadays, the testing kits are gradually lessening in number compared to the size of the infected population. Under recent prevailing conditions, the diagnosis of lung disease by analyzing chest CT (Computed Tomography) images has become an important tool for both diagnosis and prognosis of COVID-19 patients. In this study, a Transfer learning strategy (CNN) for detecting COVID-19 infection from CT images has been proposed. In the proposed model, a multilayer Convolutional neural network (CNN) with Transfer learning model Inception V3 has been designed. Similar to CNN, it uses convolution and pooling to extract features, but this transfer learning model contains weights of dataset Imagenet. Thus it can detect features very effectively, which gives it an upper hand for achieving better accuracy.
Submitted 17 June, 2022;
originally announced June 2022.
-
Multi-Classification of Brain Tumor Images Using Transfer Learning Based Deep Neural Network
Authors:
Pramit Dutta,
Khaleda Akhter Sathi,
Md. Saiful Islam
Abstract:
In recent advancement towards computer based diagnostics system, the classification of brain tumor images is a challenging task. This paper mainly focuses on elevating the classification accuracy of brain tumor images with transfer learning based deep neural network. The classification approach starts with the image augmentation operation including rotation, zoom, horizontal flip, width shift, height shift, and shear to increase the diversity in image datasets. Then the general features of the input brain tumor images are extracted based on a pre-trained transfer learning method comprised of Inception-v3. Finally, the deep neural network with 4 customized layers is employed for classifying the brain tumors into the most frequent brain tumor types: meningioma, glioma, and pituitary. The proposed model achieves an overall accuracy of 96.25%, which is a marked improvement over some existing multi-classification methods. The fine-tuning of hyper-parameters and inclusion of customized DNN with the Inception-v3 model results in an improvement of the classification accuracy.
Submitted 17 June, 2022;
originally announced June 2022.
-
ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation
Authors:
Pramit Dutta,
Ganesh Sistu,
Senthil Yogamani,
Edgar Galván,
John McDonald
Abstract:
Generating a detailed near-field perceptual model of the environment is an important and challenging problem in both self-driving vehicles and autonomous mobile robotics. A Bird Eye View (BEV) map, providing a panoptic representation, is a commonly used approach that provides a simplified 2D representation of the vehicle surroundings with accurate semantic level segmentation for many downstream tasks. Current state-of-the art approaches to generate BEV-maps employ a Convolutional Neural Network (CNN) backbone to create feature-maps which are passed through a spatial transformer to project the derived features onto the BEV coordinate frame. In this paper, we evaluate the use of vision transformers (ViT) as a backbone architecture to generate BEV maps. Our network architecture, ViT-BEVSeg, employs standard vision transformers to generate a multi-scale representation of the input image. The resulting representation is then provided as an input to a spatial transformer decoder module which outputs segmentation maps in the BEV grid. We evaluate our approach on the nuScenes dataset demonstrating a considerable improvement in the performance relative to state-of-the-art approaches.
Submitted 31 May, 2022;
originally announced May 2022.
-
A Multi-domain Magneto Tunnel Junction for Racetrack Nanowire Strips
Authors:
Prayash Dutta,
Albert Lee,
Kang L. Wang,
Alex K. Jones,
Sanjukta Bhanja
Abstract:
Domain-wall memory (DWM) has SRAM class access performance, low energy, high endurance, high density, and CMOS compatibility. Recently, shift reliability and processing-using-memory (PuM) proposals developed a need to count the number of parallel or anti-parallel domains in a portion of the DWM nanowire. In this paper we propose a multi-domain magneto-tunnel junction (MTJ) that can detect different resistance levels as a function of the number of parallel or anti-parallel domains. Using detailed micromagnetic simulation with LLG, we demonstrate the multi-domain MTJ, study the benefit of its macro-size on resilience to process variation and present a macro-model for scaling the size of the multi-domain MTJ. Our results indicate scalability to seven-domains while maintaining a 16.3mV sense margin.
Submitted 25 May, 2022;
originally announced May 2022.
-
CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection
Authors:
Souvic Chakraborty,
Parag Dutta,
Sumegh Roychowdhury,
Animesh Mukherjee
Abstract:
The last decade has witnessed a surge in the interaction of people through social networking platforms. While there are several positive aspects of these social platforms, the proliferation has led them to become the breeding ground for cyber-bullying and hate speech. Recent advances in NLP have often been used to mitigate the spread of such hateful content. Since the task of hate speech detection is usually applicable in the context of social networks, we introduce CRUSH, a framework for hate speech detection using user-anchored self-supervision and contextual regularization. Our proposed approach secures ~1-12% improvement in test set metrics over the best performing previous approaches on two types of tasks and multiple popular English social media datasets.
Submitted 4 May, 2022; v1 submitted 13 April, 2022;
originally announced April 2022.
-
Efficient reductions and algorithms for variants of Subset Sum
Authors:
Pranjal Dutta,
Mahesh Sreekumar Rajasree
Abstract:
Given $(a_1, \dots, a_n, t) \in \mathbb{Z}_{\geq 0}^{n + 1}$, the Subset Sum problem ($\mathsf{SSUM}$) is to decide whether there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. There is a close variant of the $\mathsf{SSUM}$, called $\mathsf{Subset~Product}$. Given positive integers $a_1, ..., a_n$ and a target integer $t$, the $\mathsf{Subset~Product}$ problem asks to determine whether there exists a subset $S \subseteq [n]$ such that $\prod_{i \in S} a_i=t$. There is a pseudopolynomial time dynamic programming algorithm, due to Bellman (1957) which solves the $\mathsf{SSUM}$ and $\mathsf{Subset~Product}$ in $O(nt)$ time and $O(t)$ space.
In the first part, we present {\em search} algorithms for variants of the Subset Sum problem. Our algorithms are parameterized by $k$, which is a given upper bound on the number of realisable sets (i.e.,~number of solutions, summing exactly $t$). We show that $\mathsf{SSUM}$ with a unique solution is already NP-hard, under randomized reduction. This makes the regime of parametrized algorithms, in terms of $k$, very interesting.
Subsequently, we present an $\tilde{O}(k\cdot (n+t))$ time deterministic algorithm, which finds the hamming weight of all the realisable sets for a subset sum instance. We also give a poly$(knt)$-time and $O(\log(knt))$-space deterministic algorithm that finds all the realisable sets for a subset sum instance.
In the latter part, we present a simple and elegant randomized $\tilde{O}(n + t)$ time algorithm for $\mathsf{Subset~Product}$. Moreover, we also present a poly$(nt)$ time and $O(\log^2 (nt))$ space deterministic algorithm for the same. We study these problems in the unbounded setting as well. Our algorithms use multivariate FFT, power series and number-theoretic techniques, introduced by Jin and Wu (SOSA'19) and Kane (2010).
Submitted 1 June, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
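For reference, the pseudopolynomial baseline that the Subset Sum abstract attributes to Bellman can be sketched as follows. This is only the classic $O(nt)$ time, $O(t)$ space decision procedure for $\mathsf{SSUM}$, written as an illustration; it is not one of the paper's new algorithms, and the function name `subset_sum` is a hypothetical choice.

```python
def subset_sum(a, t):
    """Decide whether some subset of the non-negative integers in `a` sums to t.

    Classic dynamic programme: reachable[s] is True when some subset of the
    items processed so far sums to exactly s. Uses O(n * t) time and O(t) space.
    """
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset sums to 0
    for x in a:
        # Iterate downwards so each item is used at most once.
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # no subset reaches 30
```

The downward inner loop is what keeps the space at $O(t)$: a single array suffices because entries below `s` still reflect the state before the current item was considered.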
-
Biologically Inspired Oscillating Activation Functions Can Bridge the Performance Gap between Biological and Artificial Neurons
Authors:
Matthew Mithra Noel,
Shubham Bharadwaj,
Venkataraman Muthiah-Nakarajan,
Praneet Dutta,
Geraldine Bessie Amali
Abstract:
The recent discovery of special human neocortical pyramidal neurons that can individually learn the XOR function highlights the significant performance gap between biological and artificial neurons. The output of these pyramidal neurons first increases to a maximum with input and then decreases. Artificial neurons with similar characteristics can be designed with oscillating activation functions. Oscillating activation functions have multiple zeros allowing single neurons to have multiple hyper-planes in their decision boundary. This enables even single neurons to learn the XOR function. This paper proposes four new oscillating activation functions inspired by human pyramidal neurons that can also individually learn the XOR function. Oscillating activation functions are non-saturating for all inputs unlike popular activation functions, leading to improved gradient flow and faster convergence. Using oscillating activation functions instead of popular monotonic or non-monotonic single-zero activation functions enables neural networks to train faster and solve classification problems with fewer layers. An extensive comparison of 23 activation functions on CIFAR 10, CIFAR 100, and Imagenette benchmarks is presented and the oscillating activation functions proposed in this paper are shown to outperform all known popular activation functions.
Submitted 10 May, 2023; v1 submitted 7 November, 2021;
originally announced November 2021.
-
Growing Cosine Unit: A Novel Oscillatory Activation Function That Can Speedup Training and Reduce Parameters in Convolutional Neural Networks
Authors:
Mathew Mithra Noel,
Arunkumar L,
Advait Trivedi,
Praneet Dutta
Abstract:
Convolutional neural networks have been successful in solving many socially important and economically significant problems. This ability to learn complex high-dimensional functions hierarchically can be attributed to the use of nonlinear activation functions. A key discovery that made training deep networks feasible was the adoption of the Rectified Linear Unit (ReLU) activation function to alleviate the vanishing gradient problem caused by using saturating activation functions. Since then, many improved variants of the ReLU activation have been proposed. However, a majority of activation functions used today are non-oscillatory and monotonically increasing due to their biological plausibility. This paper demonstrates that oscillatory activation functions can improve gradient flow and reduce network size. Two theorems on limits of non-oscillatory activation functions are presented. A new oscillatory activation function called Growing Cosine Unit (GCU) defined as $C(z) = z\cos z$ that outperforms Sigmoids, Swish, Mish and ReLU on a variety of architectures and benchmarks is presented. The GCU activation has multiple zeros enabling single GCU neurons to have multiple hyperplanes in the decision boundary. This allows single GCU neurons to learn the XOR function without feature engineering. Experimental results indicate that replacing the activation function in the convolution layers with the GCU activation function significantly improves performance on CIFAR-10, CIFAR-100 and Imagenette.
Submitted 12 January, 2023; v1 submitted 29 August, 2021;
originally announced August 2021.
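The claim in the GCU abstract that a single neuron with $C(z) = z\cos z$ can represent XOR is easy to check numerically. The sketch below uses hand-picked weights rather than learned ones, so it shows only that such a neuron exists, not how training finds it. The weights `W1`, `W2`, the bias `B`, and the rule "class 1 when the activation is positive" are assumptions made for illustration.

```python
import math


def gcu(z):
    """Growing Cosine Unit: C(z) = z * cos(z)."""
    return z * math.cos(z)


# Hand-picked (hypothetical) parameters: the pre-activations for the four
# XOR inputs are -1, 1, 1 and 3, which fall on alternating sides of the
# zeros of C(z) at 0 and pi/2.
W1, W2, B = 2.0, 2.0, -1.0


def predict(x1, x2):
    """Single GCU neuron; outputs 1 when the activation is positive."""
    z = W1 * x1 + W2 * x2 + B
    return 1 if gcu(z) > 0 else 0


for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), predict(x1, x2))
```

A monotonic activation cannot do this with one neuron, because a single threshold on $z$ yields only one separating hyperplane; the two zeros of $C(z)$ used here give two.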
-
PIRM: Processing In Racetrack Memories
Authors:
Sebastien Ollivier,
Stephen Longofono,
Prayash Dutta,
Jingtong Hu,
Sanjukta Bhanja,
Alex K. Jones
Abstract:
The growth in data needs of modern applications has created significant challenges for modern systems, leading to a "memory wall." Spintronic Domain Wall Memory (DWM), related to Spin-Transfer Torque Memory (STT-MRAM), provides near-SRAM read/write performance, energy savings and nonvolatility, potential for extremely high storage density, and does not have significant endurance limitations. However, DWM's benefits cannot address data access latency and throughput limitations of memory bus bandwidth. We propose PIRM, a DWM-based in-memory computing solution that leverages the properties of DWM nanowires and allows them to serve as polymorphic gates. While normally DWM is accessed by applying spin polarized currents orthogonal to the nanowire at access points to read individual bits, transverse access along the DWM nanowire allows the differentiation of the aggregate resistance of multiple bits in the nanowire, akin to a multilevel cell. PIRM leverages this transverse reading to directly provide bulk-bitwise logic of multiple adjacent operands in the nanowire, simultaneously. Based on this in-memory logic, PIRM provides a technique to conduct multi-operand addition and two operand multiplication using transverse access. PIRM provides a 1.6x speedup compared to the leading DRAM PIM technique for query applications that leverage bulk bitwise operations. Compared to the leading PIM technique for DWM, PIRM improves performance by 6.9x, 2.3x and energy by 5.5x, 3.4x for 8-bit addition and multiplication, respectively. For arithmetic heavy benchmarks, PIRM reduces access latency by 2.1x, while decreasing energy consumption by 25.2x for a reasonable 10% area overhead versus non-PIM DWM.
Submitted 1 August, 2022; v1 submitted 2 August, 2021;
originally announced August 2021.
-
Multimodal Graph-based Transformer Framework for Biomedical Relation Extraction
Authors:
Sriram Pingali,
Shweta Yadav,
Pratik Dutta,
Sriparna Saha
Abstract:
The recent advancement of pre-trained Transformer models has propelled the development of effective text mining models across various biomedical tasks. However, these models are primarily learned on the textual data and often lack the domain knowledge of the entities to capture the context beyond the sentence. In this study, we introduced a novel framework that enables the model to learn multi-omics biological information about entities (proteins) with the help of additional multi-modal cues like molecular structure. Towards this, rather than developing modality-specific architectures, we devise a generalized and optimized graph based multi-modal learning mechanism that utilizes the GraphBERT model to encode the textual and molecular structure information and exploit the underlying features of various modalities to enable end-to-end learning. We evaluated our proposed method on the Protein-Protein Interaction task from the biomedical corpus, where our proposed generalized approach is observed to benefit from the additional domain-specific modality.
Submitted 1 July, 2021;
originally announced July 2021.
-
Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation
Authors:
Rishi Hazra,
Parag Dutta,
Shubham Gupta,
Mohammed Abdul Qaathir,
Ambedkar Dukkipati
Abstract:
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate further such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by an absolute percentage reduction of $\approx\mathbf{3-25\%}$ on multiple NLP tasks while achieving the same performance with no additional computation overhead.
Submitted 3 April, 2021; v1 submitted 11 March, 2021;
originally announced March 2021.
-
Inferring temporal dynamics from cross-sectional data using Langevin dynamics
Authors:
Pritha Dutta,
Rick Quax,
Loes Crielaard,
Peter M. A. Sloot
Abstract:
Cross-sectional studies are widely prevalent since they are more feasible to conduct compared to longitudinal studies. However, cross-sectional data lack the temporal information required to study the evolution of the underlying processes. Nevertheless, this is essential to develop predictive computational models which is the first step towards causal modelling. We propose a method for inferring computational models from cross-sectional data using Langevin dynamics. This method can be applied to any system that can be described as effectively following a free energy landscape, such as protein folding, stem cell differentiation and reprogramming, and social systems involving human interaction and social norms. A crucial assumption in our method is that the data-points are gathered from a system in (local) equilibrium. The result is a set of stochastic differential equations which capture the temporal dynamics, by assuming that groups of data-points are subject to the same free energy landscape and amount of noise. Our method is a 'baseline' method which initiates the development of computational models which can be iteratively enhanced through the inclusion of expert knowledge. We validate the proposed method against two population-based longitudinal datasets and observe significant predictive power in comparison with random choice algorithms. We also show how the predictive power of our 'baseline' model can be enhanced by incorporating domain expert knowledge. Our method addresses an important obstacle for model development in fields dominated by cross-sectional datasets.
Submitted 23 February, 2021;
originally announced February 2021.
-
Game Plan: What AI can do for Football, and What Football can do for AI
Authors:
Karl Tuyls,
Shayegan Omidshafiei,
Paul Muller,
Zhe Wang,
Jerome Connor,
Daniel Hennes,
Ian Graham,
William Spearman,
Tim Waskett,
Dafydd Steele,
Pauline Luc,
Adria Recasens,
Alexandre Galashov,
Gregory Thornton,
Romuald Elie,
Pablo Sprechmann,
Pol Moreno,
Kris Cao,
Marta Garnelo,
Praneet Dutta,
Michal Valko,
Nicolas Heess,
Alex Bridgland,
Julien Perolat,
Bart De Vylder
, et al. (11 additional authors not shown)
Abstract:
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players' and coordinated teams' behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual).
Submitted 18 November, 2020;
originally announced November 2020.
-
Collusion-Resistant Identity-based Proxy Re-Encryption: Lattice-based Constructions in Standard Model
Authors:
Priyanka Dutta,
Willy Susilo,
Dung Hoang Duong,
Partha Sarathi Roy
Abstract:
The concept of proxy re-encryption (PRE) dates back to the work of Blaze, Bleumer, and Strauss in 1998. PRE offers delegation of decryption rights, i.e., it securely enables the re-encryption of ciphertexts from one key to another, without relying on trusted parties. PRE allows a semi-trusted third party termed a "proxy" to securely divert encrypted files of user A (delegator) to user B (delegatee) without revealing any information about the underlying files to the proxy. To eliminate the necessity of having a costly certificate verification process, Green and Ateniese introduced an identity-based PRE (IB-PRE). The potential applicability of IB-PRE has spurred a long line of intensive research since its first instantiation. Unfortunately, to date, there is no collusion-resistant unidirectional IB-PRE secure in the standard model that can withstand quantum attacks. In this paper, we present the first concrete constructions of collusion-resistant unidirectional IB-PRE, for both selective and adaptive identity, which are secure in the standard model based on the hardness of the learning with errors problem.
Submitted 16 November, 2020;
originally announced November 2020.
-
CoVista: A Unified View on Privacy Sensitive Mobile Contact Tracing Effort
Authors:
David Culler,
Prabal Dutta,
Gabe Fierro,
Joseph E. Gonzalez,
Nathan Pemberton,
Johann Schleier-Smith,
K. Shankari,
Alvin Wan,
Thomas Zachariah
Abstract:
Governments around the world have become increasingly frustrated with tech giants dictating public health policy. The software created by Apple and Google enables individuals to track their own potential exposure through collated exposure notifications. However, the same software prohibits location tracking, denying key information needed by public health officials for robust contact tracing. This information is needed to treat and isolate COVID-19 positive people, identify transmission hotspots, and protect against continued spread of infection. In this article, we present two simple ideas, the lighthouse and the covid-commons, that address the needs of public health authorities while preserving the privacy-sensitive goals of the Apple and Google exposure notification protocols.
Submitted 27 May, 2020;
originally announced May 2020.
-
Lattice-based Unidirectional IBPRE Secure in Standard Model
Authors:
Priyanka Dutta,
Willy Susilo,
Dung Hoang Duong,
Joonsang Baek,
Partha Sarathi Roy
Abstract:
Proxy re-encryption (PRE) securely enables the re-encryption of ciphertexts from one key to another, without relying on trusted parties, i.e., it offers delegation of decryption rights. PRE allows a semi-trusted third party termed a "proxy" to securely divert encrypted files of user A (delegator) to user B (delegatee) without revealing any information about the underlying files to the proxy. To eliminate the necessity of having a costly certificate verification process, Green and Ateniese introduced an identity-based PRE (IB-PRE). The potential applicability of IB-PRE has led to intensive research since its first instantiation. Unfortunately, to date, there is no unidirectional IB-PRE secure in the standard model that can withstand quantum attacks. In this paper, we provide, for the first time, a concrete construction of unidirectional IB-PRE which is secure in the standard model based on the hardness of the learning with errors problem. Our technique is to use the novel trapdoor delegation technique of Micciancio and Peikert. The way we use the trapdoor delegation technique may prove useful for functionalities other than proxy re-encryption as well.
Submitted 14 May, 2020;
originally announced May 2020.
-
Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey
Authors:
Farhana Sultana,
Abu Sufian,
Paramartha Dutta
Abstract:
From autonomous driving to medical diagnosis, image segmentation is required everywhere. Segmentation of an image is one of the indispensable tasks in computer vision. This task is comparatively more complicated than other vision tasks as it needs low-level spatial information. Basically, image segmentation can be of two types: semantic segmentation and instance segmentation. The combined version of these two basic tasks is known as panoptic segmentation. In the recent era, the success of deep convolutional neural networks (CNN) has influenced the field of segmentation greatly and given us various successful models to date. In this survey, we take a glance at the evolution of both semantic and instance segmentation work based on CNN. We have also specified comparative architectural details of some state-of-the-art models and discussed their training details to present a lucid understanding of hyper-parameter tuning of those models. We have also drawn a comparison among the performance of those models on different datasets. Lastly, we have given a glimpse of some state-of-the-art panoptic segmentation models.
Submitted 29 May, 2020; v1 submitted 13 January, 2020;
originally announced January 2020.
-
3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement
Authors:
Praneet Dutta,
Bruce Power,
Adam Halpert,
Carlos Ezequiel,
Aravind Subramanian,
Chanchal Chatterjee,
Sindhu Hari,
Kenton Prindle,
Vishal Vaddina,
Andrew Leach,
Raj Domala,
Laura Bandura,
Massimo Mascaro
Abstract:
We propose GAN-based image enhancement models for frequency enhancement of 2D and 3D seismic images. Seismic imagery is used to understand and characterize the Earth's subsurface for energy exploration. Because these images often suffer from resolution limitations and noise contamination, our proposed method performs large-scale seismic volume frequency enhancement and denoising. The enhanced images reduce uncertainty and improve decisions about issues, such as optimal well placement, that often rely on low signal-to-noise ratio (SNR) seismic volumes. We explored the impact of adding lithology class information to the models, resulting in improved performance on PSNR and SSIM metrics over a baseline model with no conditional information.
Submitted 15 November, 2019;
originally announced November 2019.
-
Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation
Authors:
Rishi Hazra,
Parag Dutta,
Shubham Gupta,
Mohammed Abdul Qaathir,
Ambedkar Dukkipati
Abstract:
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active$\mathbf{^2}$ Learning (A$\mathbf{^2}$L), actively adapts to the deep learning model being trained to eliminate such redundant examples chosen by an AL strategy. We show that A$\mathbf{^2}$L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is further able to reduce the data requirements of state-of-the-art AL strategies by $\approx \mathbf{3-25\%}$ on an absolute scale on multiple NLP tasks while achieving the same performance with virtually no additional computation overhead.
Submitted 6 April, 2021; v1 submitted 1 November, 2019;
originally announced November 2019.
-
Minus HELLO: HELLO Devoid Protocols for Energy Preservation in Mobile Ad Hoc Networks
Authors:
Anuradha Banerjee,
Abu Sufian,
Paramartha Dutta,
M M Hafizur Rahman
Abstract:
In mobile ad-hoc networks, nodes have to transmit HELLO or Route Maintenance messages at regular intervals, and all nodes residing within the sender's radio range reply with an acknowledgment message reporting their node identifier, current location, and radio range. Regularly transmitting these messages consumes a significant amount of battery power in nodes, especially when the set of down-link neighbors does not change over time and the radio range of the sender node is large. The present article focuses on this aspect and tries to eliminate HELLO messages in existing state-of-the-art protocols. It also shortens the radio ranges of nodes whenever possible. Simulation results show that the average lifetime of nodes greatly increases in the proposed Minus HELLO devoid routing protocols, along with a great increase in network throughput. The required number of route re-discoveries also decreases.
Submitted 8 September, 2020; v1 submitted 25 October, 2019;
originally announced October 2019.
-
AutoML for Contextual Bandits
Authors:
Praneet Dutta,
Joe Cheuk,
Jonathan S Kim,
Massimo Mascaro
Abstract:
Contextual Bandits is one of the most widely used techniques in applications such as personalization, recommendation systems, mobile health, causal marketing, etc. As a dynamic approach, it can be more efficient than standard A/B testing in minimizing regret. We propose an end-to-end automated meta-learning pipeline to approximate the optimal Q function for contextual bandits problems. We see that our model is able to perform much better than random exploration, being more regret efficient and able to converge with a limited number of samples, while remaining very general and easy to use due to the meta-learning approach. We used a linearly annealed ε-greedy exploration policy to define the exploration vs. exploitation schedule. We tested the system on a synthetic environment to characterize it fully and we evaluated it on some open source datasets to benchmark against prior work. We see that our model outperforms or performs comparably to other models while requiring neither tuning nor feature engineering.
Submitted 1 February, 2022; v1 submitted 7 September, 2019;
originally announced September 2019.
-
Fuzzy Route Switching For Energy Preservation (FEP) in Ad Hoc Networks
Authors:
A. Banerjee,
P. Dutta,
A. Sufian
Abstract:
Nodes in ad hoc networks have limited battery power. Hence they require an energy-efficient technique to improve average network performance. Maintaining energy efficiency in ad hoc networks is really challenging because the highest energy efficiency is achieved if all the nodes are always switched off, and energy efficiency is at its minimum if all the nodes are fully operational, i.e., always turned on. Energy preservation requires redirection of data packets through some other routes having good performance. This improves the data packet delivery ratio and the number of alive nodes, decreasing the cost of messages.
Submitted 26 May, 2019;
originally announced June 2019.
-
Fuzzy-Controlled Scheduling of Route-Request Packets (FSRR) in Mobile Ad Hoc Networks
Authors:
A. Sufian,
A. Banerjee,
P. Dutta
Abstract:
In ad hoc networks, the scheduling of route-request packets should be different from that of message packets, because during transmission of message packets the location of the destination is known, whereas in route discovery it is not known in most cases. The router has to depend upon the last known location, if any, of the destination to determine the center and radius of the circle that embeds all possible current positions of the destination. Route-request packets generated from the source are directed towards this circle, i.e., directional route discovery can be applied. Otherwise, when no earlier location of the destination is known, the route request has to be broadcast in the whole network, consuming significantly more time than directional route discovery. The present article proposes fuzzy-controlled scheduling of route-request packets in particular, which greatly reduces the average delay in route discovery in ad hoc networks.
Submitted 26 May, 2019;
originally announced June 2019.
-
Cheat-Proof Communication through Cluster Head (C3H) in Mobile Ad Hoc Network
Authors:
A. Sufian,
A. Banerjee,
P. Dutta
Abstract:
The mobile ad hoc network (MANET) is a wireless network based on a group of mobile nodes without any centralised infrastructure. In civilian data communication, the nodes cannot all be of a homogeneous type, and they do not all carry out one specific data communication. Therefore, node co-operation and cheat-proofing are essential characteristics for successfully running MANETs in civilian data communication. Denial of service and malicious behaviour of nodes are the main concerns in securing successful communication in MANETs. This scheme proposes a generic solution for preventing malicious behaviour of nodes by the cluster head through a single-hop node clustering strategy.
Submitted 28 May, 2019;
originally announced May 2019.
-
Data Load Balancing In Mobile Ad Hoc Network Using Fuzzy Logic (DBMF)
Authors:
A. Sufian,
F. Sultana,
P. Dutta
Abstract:
The volume and movement of data are rapidly increasing in every type of data communication and networking, and ad hoc networks are not spared from these challenges. Traditional multipath routing protocols in Mobile Ad-hoc Networks (MANETs) did not focus on data load distribution and balancing as much as required. In this scheme, we have proposed data load distribution and balancing through multiple paths simultaneously. We have considered three important parameters of an ad hoc network: the mobility of a node, the energy of a node, and the packet drop rate at a node. This scheme combines these three metrics using fuzzy logic to get the decisive parameter. We have shown the improvement of this scheme over similar protocols in the NS-2 network simulator.
Submitted 28 May, 2019;
originally announced May 2019.
-
Advancements in Image Classification using Convolutional Neural Network
Authors:
Farhana Sultana,
A. Sufian,
Paramartha Dutta
Abstract:
The Convolutional Neural Network (CNN) is the state of the art for the image classification task. Here we have briefly discussed different components of CNN. In this paper, we have explained different CNN architectures for image classification. Through this paper, we have shown advancements in CNN from LeNet-5 to the latest SENet model. We have discussed the model description and training details of each model. We have also drawn a comparison among those models.
Submitted 8 May, 2019;
originally announced May 2019.
-
A Review of Object Detection Models based on Convolutional Neural Network
Authors:
F. Sultana,
A. Sufian,
P. Dutta
Abstract:
The Convolutional Neural Network (CNN) has become the state of the art for the task of object detection in images. In this chapter, we have explained different state-of-the-art CNN-based object detection models. We have organized this review by categorizing those detection models according to two different approaches: the two-stage approach and the one-stage approach. Through this chapter, we have shown advancements in object detection models from R-CNN to the latest RefineDet. We have also discussed the model description and training details of each model. Here, we have also drawn a comparison among those models.
Submitted 1 October, 2019; v1 submitted 5 May, 2019;
originally announced May 2019.
-
The Signpost Platform for City-Scale Sensing
Authors:
Joshua Adkins,
Branden Ghena,
Neal Jackson,
Pat Pannuto,
Samuel Rohrer,
Bradford Campbell,
Prabal Dutta
Abstract:
City-scale sensing holds the promise of enabling a deeper understanding of our urban environments. However, a city-scale deployment requires physical installation, power management, and communications---all challenging tasks standing between a good idea and a realized one. This indicates the need for a platform that enables easy deployment and experimentation for applications operating at city scale. To address these challenges, we present Signpost, a modular, energy-harvesting platform for city-scale sensing. Signpost simplifies deployment by eliminating the need for connection to wired infrastructure and instead harvesting energy from an integrated solar panel. The platform furnishes the key resources necessary to support multiple, pluggable sensor modules while providing fair, safe, and reliable sharing in the face of dynamic energy constraints. We deploy Signpost with several sensor modules, showing the viability of an energy-harvesting, multi-tenant, sensing system, and evaluate its ability to support sensing applications. We believe Signpost reduces the difficulty inherent in city-scale deployments, enables new experimentation, and provides improved insights into urban health.
Submitted 21 February, 2018;
originally announced February 2018.