Uncertainty-Aware PPG-2-ECG for Enhanced Cardiovascular Diagnosis using Diffusion Models

Authors: Omer Belhasin, Idan Kligvasser, George Leifman, Regev Cohen, Erin Rainaldi, Li-Fang Cheng, Nishant Verma, Paul Varghese, Ehud Rivlin, Michael Elad

Abstract: Analyzing the cardiovascular system condition via Electrocardiography (ECG) is a common and highly effective approach, and it has been practiced and perfected over many decades. ECG sensing is non-invasive and relatively easy to acquire, and yet it is still cumbersome for holter monitoring tests that may span over hours and even days. A possible alternative in this context is Photoplethysmography (PPG): An optically-based signal that measures blood volume fluctuations, as typically sensed by conventional "wearable devices". While PPG presents clear advantages in acquisition, convenience, and cost-effectiveness, ECG provides more comprehensive information, allowing for a more precise detection of heart conditions. This implies that a conversion from PPG to ECG, as recently discussed in the literature, inherently involves an unavoidable level of uncertainty. In this paper we introduce a novel methodology for addressing the PPG-2-ECG conversion, and offer an enhanced classification of cardiovascular conditions using the given PPG, all while taking into account the uncertainties arising from the conversion process. We provide a mathematical justification for our proposed computational approach, and present empirical studies demonstrating its superior performance compared to state-of-the-art baseline methods.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2404.03052 [pdf, ps, other]

doi 10.1109/ICMLA58977.2023.00230

GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification

Authors: Ali Pesaranghader, Nikhil Verma, Manasa Bharadwaj

Abstract: Harmful and offensive communication or content is detrimental to social bonding and the mental state of users on social media platforms. Text detoxification is a crucial task in natural language processing (NLP), where the goal is removing profanity and toxicity from text while preserving its content. Supervised and unsupervised learning are common approaches for designing text detoxification solutions. However, these methods necessitate fine-tuning, leading to computational overhead. In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo. We utilize zero-shot and few-shot prompting techniques for detoxifying input sentences. To generate few-shot prompts, we propose two methods: word-matching example selection (WMES) and context-matching example selection (CMES). We additionally take into account ensemble in-context learning (EICL) where the ensemble is shaped by base prompts from zero-shot and all few-shot settings. We use ParaDetox and APPDIA as benchmark detoxification datasets. Our experimental results show that the zero-shot solution achieves promising performance, while our best few-shot setting outperforms the state-of-the-art models on ParaDetox and shows comparable results on APPDIA. Our EICL solutions obtain the greatest performance, adding at least 10% improvement, against both datasets.

Submitted 3 April, 2024; originally announced April 2024.

Comments: 7 pages, 8 tables. Published in: 2023 International Conference on Machine Learning and Applications (ICMLA)

arXiv:2403.00986 [pdf, other]

Merging Text Transformer Models from Different Initializations

Authors: Neha Verma, Maha Elbayad

Abstract: Recent work on one-shot permutation-based model merging has shown impressive low- or zero-barrier mode connectivity between models from completely different initializations. However, this line of work has not yet extended to the Transformer architecture, despite its dominant popularity in the language domain. Therefore, in this work, we investigate the extent to which separate Transformer minima learn similar features, and propose a model merging technique to investigate the relationship between these minima in the loss landscape. The specifics of the architecture, like its residual connections, multi-headed attention, and discrete, sequential input, require specific interventions in order to compute model permutations that remain within the same functional equivalence class. In merging these models with our method, we consistently find lower loss barriers between minima compared to model averaging for several models trained on a masked-language modeling task or fine-tuned on a language understanding benchmark. Our results show that the minima of these models are less sharp and isolated than previously understood, and provide a basis for future work on merging separately trained Transformer models.

Submitted 7 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.09957 [pdf, other]

On Designing Features for Condition Monitoring of Rotating Machines

Authors: Seetaram Maurya, Nishchal K. Verma

Abstract: Various methods for designing input features have been proposed for fault recognition in rotating machines using one-dimensional raw sensor data. The available methods are complex, rely on empirical approaches, and may differ depending on the condition monitoring data used. Therefore, this article proposes a novel algorithm to design input features that unifies the feature extraction process for different time-series sensor data. This new insight for designing/extracting input features is obtained through the lens of histogram theory. The proposed algorithm extracts discriminative input features, which are suitable for a simple classifier to deep neural network-based classifiers. The designed input features are given as input to the classifier with end-to-end training in a single framework for machine conditions recognition. The proposed scheme has been validated through three real-time datasets: a) acoustic dataset, b) CWRU vibration dataset, and c) IMS vibration dataset. The real-time results and comparative study show the effectiveness of the proposed scheme for the prediction of the machine's health states.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.02871 [pdf, other]

Code-Based Single-Server Private Information Retrieval: Circumventing the Sub-Query Attack

Authors: Neehar Verma, Camilla Hollanti

Abstract: Private information retrieval from a single server is considered, utilizing random linear codes. Presented is a modified version of the first code-based single-server computational PIR scheme proposed by Holzbaur, Hollanti, and Wachter-Zeh in [Holzbaur et al., "Computational Code-Based Single-Server Private Information Retrieval", 2020 IEEE ISIT]. The original scheme was broken in [Bordage et al., "On the privacy of a code-based single-server computational PIR scheme", Cryptogr. Comm., 2021] by an attack arising from highly probable rank differences in sub-matrices of the user's query. Here, this attack is now circumvented by ensuring that the sub-matrices have negligible rank difference. Furthermore, the rank difference cannot be attributed to the desired file index, thereby ensuring the privacy of the scheme. In the case of retrieving multiple files, the rate of the modified scheme is largely unaffected and at par with the original scheme.

Submitted 5 February, 2024; originally announced February 2024.

Comments: The scheme proposed in this work is a modified version of the scheme in arXiv:2001.07049 (IEEE ISIT 2020) and provides a mend against the attack discovered in arXiv:2004.00509 (Cryptography and Communications, 2021)

arXiv:2307.05616 [pdf, other]

Image Reconstruction using Enhanced Vision Transformer

Authors: Nikhil Verma, Deepkamal Kaur, Lydia Chau

Abstract: Removing noise from images is a challenging and fundamental problem in the field of computer vision. Images captured by modern cameras are inevitably degraded by noise which limits the accuracy of any quantitative measurements on those images. In this project, we propose a novel image reconstruction framework which can be used for tasks such as image denoising, deblurring or inpainting. The model proposed in this project is based on Vision Transformer (ViT) that takes 2D images as input and outputs embeddings which can be used for reconstructing denoised images. We incorporate four additional optimization techniques in the framework to improve the model reconstruction capability, namely Locality Sensitive Attention (LSA), Shifted Patch Tokenization (SPT), Rotary Position Embeddings (RoPE) and adversarial loss function inspired from Generative Adversarial Networks (GANs). LSA, SPT and RoPE enable the transformer to learn from the dataset more efficiently, while the adversarial loss function enhances the resolution of the reconstructed images. Based on our experiments, the proposed architecture outperforms the benchmark U-Net model by more than 3.5% structural similarity (SSIM) for the reconstruction tasks of image denoising and inpainting. The proposed enhancements further show an improvement of ~5% SSIM over the benchmark for both tasks.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2307.04978 [pdf, other]

Diffusion idea exploration for art generation

Authors: Nikhil Verma

Abstract: Cross-Modal learning tasks have picked up pace in recent times. With plethora of applications in diverse areas, generation of novel content using multiple modalities of data has remained a challenging problem. To address the same, various generative modelling techniques have been proposed for specific tasks. Novel and creative image generation is one important aspect for industrial application which could help as an arm for novel content generation. Techniques proposed previously used Generative Adversarial Network (GAN), autoregressive models and Variational Autoencoders (VAE) for accomplishing similar tasks. These approaches are limited in their capability to produce images guided by either text instructions or rough sketch images decreasing the overall performance of image generator. We used state of the art diffusion models to generate creative art by primarily leveraging text with additional support of rough sketches. Diffusion starts with a pattern of random dots and slowly converts that pattern into a design image using the guiding information fed into the model. Diffusion models have recently outperformed other generative models in image generation tasks using cross modal data as guiding information. The initial experiments for this task of novel image generation demonstrated promising qualitative results.

Submitted 10 July, 2023; originally announced July 2023.

Comments: Report Submitted for degree completion of Master of Science in Applied Computing at University of Toronto

arXiv:2306.08221 [pdf, other]

Contrastive Loss is All You Need to Recover Analogies as Parallel Lines

Authors: Narutatsu Ri, Fei-Tzin Lee, Nakul Verma

Abstract: While static word embedding models are known to represent linguistic analogies as parallel lines in high-dimensional space, the underlying mechanism as to why they result in such geometric structures remains obscure. We find that an elementary contrastive-style method employed over distributional information performs competitively with popular word embedding models on analogy recovery tasks, while achieving dramatic speedups in training time. Further, we demonstrate that a contrastive loss is sufficient to create these parallel structures in word embeddings, and establish a precise relationship between the co-occurrence statistics and the geometric structure of the resulting word embeddings.

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2306.01594 [pdf, other]

A Novel Vision Transformer with Residual in Self-attention for Biomedical Image Classification

Authors: Arun K. Sharma, Nishchal K. Verma

Abstract: Biomedical image classification requires capturing of bio-informatics based on specific feature distribution. In most of such applications, there are mainly challenges due to limited availability of samples for diseased cases and imbalanced nature of dataset. This article presents the novel framework of multi-head self-attention for vision transformer (ViT) which makes capable of capturing the specific image features for classification and analysis. The proposed method uses the concept of residual connection for accumulating the best attention output in each block of multi-head attention. The proposed framework has been evaluated on two small datasets: (i) blood cell classification dataset and (ii) brain tumor detection using brain MRI images. The results show the significant improvement over traditional ViT and other convolution based state-of-the-art classification models.

Submitted 5 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

arXiv:2305.14280 [pdf, other]

Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer

Authors: Elizabeth Salesky, Neha Verma, Philipp Koehn, Matt Post

Abstract: We introduce and demonstrate how to effectively train multilingual machine translation models with pixel representations. We experiment with two different data settings with a variety of language and script coverage, demonstrating improved performance compared to subword embeddings. We explore various properties of pixel representations such as parameter sharing within and across scripts to better understand where they lead to positive transfer. We observe that these properties not only enable seamless cross-lingual transfer to unseen scripts, but make pixel representations more data-efficient than alternatives such as vocabulary expansion. We hope this work contributes to more extensible multilingual models for all languages and scripts.

Submitted 24 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: EMNLP 2023

arXiv:2305.14230 [pdf, other]

Exploring Representational Disparities Between Multilingual and Bilingual Translation Models

Authors: Neha Verma, Kenton Murray, Kevin Duh

Abstract: Multilingual machine translation has proven immensely useful for both parameter efficiency and overall performance across many language pairs via complete multilingual parameter sharing. However, some language pairs in multilingual models can see worse performance than in bilingual models, especially in the one-to-many translation setting. Motivated by their empirical differences, we examine the geometric differences in representations from bilingual models versus those from one-to-many multilingual models. Specifically, we compute the isotropy of these representations using intrinsic dimensionality and IsoScore, in order to measure how the representations utilize the dimensions in their underlying vector space. Using the same evaluation data in both models, we find that for a given language pair, its multilingual model decoder representations are consistently less isotropic and occupy fewer dimensions than comparable bilingual model decoder representations. Additionally, we show that much of the anisotropy in multilingual decoder representations can be attributed to modeling language-specific information, therefore limiting remaining representational capacity.

Submitted 26 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: LREC-COLING 2024

arXiv:2305.07552 [pdf, other]

Dish detection in food platters: A framework for automated diet logging and nutrition management

Authors: Mansi Goel, Shashank Dargar, Shounak Ghatak, Nidhi Verma, Pratik Chauhan, Anushka Gupta, Nikhila Vishnumolakala, Hareesh Amuru, Ekta Gambhir, Ronak Chhajed, Meenal Jain, Astha Jain, Samiksha Garg, Nitesh Narwade, Nikhilesh Verhwani, Abhuday Tiwari, Kirti Vashishtha, Ganesh Bagler

Abstract: Diet is central to the epidemic of lifestyle disorders. Accurate and effortless diet logging is one of the significant bottlenecks for effective diet management and calorie restriction. Dish detection from food platters is a challenging problem due to a visually complex food layout. We present an end-to-end computational framework for diet management, from data compilation, annotation, and state-of-the-art model identification to its mobile app implementation. As a case study, we implement the framework in the context of Indian food platters known for their complex presentation that poses a challenge for the automated detection of dishes. Starting with the 61 most popular Indian dishes, we identify the state-of-the-art model through a comparative analysis of deep-learning-based object detection architectures. Rooted in a meticulous compilation of 68,005 platter images with 134,814 manual dish annotations, we first compare ten architectures for multi-label classification to identify ResNet152 (mAP=84.51%) as the best model. YOLOv8x (mAP=87.70%) emerged as the best model architecture for dish detection among the eight deep-learning models implemented after a thorough performance evaluation. By comparing with the state-of-the-art model for the IndianFood10 dataset, we demonstrate the superior object detection performance of YOLOv8x for this subset and establish Resnet152 as the best architecture for multi-label classification. The models thus trained on richly annotated data can be extended to include dishes from across global cuisines. The proposed framework is demonstrated through a proof-of-concept mobile application with diverse applications for diet logging, food recommendation systems, nutritional interventions, and mitigation of lifestyle disorders.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 11 pages, 5 figures, 5 tables. Submitted to the 8th International Conference on Computer Vision & Image Processing (CVIP-2023)

ACM Class: I.4.9; I.5.4; J.3

arXiv:2303.01676 [pdf, other]

eViper: A Scalable Platform for Untethered Modular Soft Robots

Authors: Hsin Cheng, Zhiwu Zheng, Prakhar Kumar, Wali Afridi, Ben Kim, Sigurd Wagner, Naveen Verma, James C. Sturm, Minjie Chen

Abstract: Soft robots present unique capabilities, but have been limited by the lack of scalable technologies for construction and the complexity of algorithms for efficient control and motion, which depend on soft-body dynamics, high-dimensional actuation patterns, and external/on-board forces. This paper presents scalable methods and platforms to study the impact of weight distribution and actuation patterns on fully untethered modular soft robots. An extendable Vibrating Intelligent Piezo-Electric Robot (eViper), together with an open-source Simulation Framework for Electroactive Robotic Sheet (SFERS) implemented in PyBullet, was developed as a platform to study the sophisticated weight-locomotion interaction. By integrating the power electronics, sensors, actuators, and batteries on-board, the eViper platform enables rapid design iteration and evaluation of different weight distribution and control strategies for the actuator arrays, supporting both physics-based modeling and data-driven modeling via on-board automatic data-acquisition capabilities. We show that SFERS can provide useful guidelines for optimizing the weight distribution and actuation patterns of the eViper to achieve the maximum speed or minimum cost-of-transportation (COT).

Submitted 14 November, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: 8 pages, 21 figures, accepted by IROS 2023

arXiv:2210.05098 [pdf, other]

IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces

Authors: Kelly Marchisio, Neha Verma, Kevin Duh, Philipp Koehn

Abstract: The ability to extract high-quality translation dictionaries from monolingual word embedding spaces depends critically on the geometric similarity of the spaces -- their degree of "isomorphism." We address the root-cause of faulty cross-lingual mapping: that word embedding training resulted in the underlying spaces being non-isomorphic. We incorporate global measures of isomorphism directly into the Skip-gram loss function, successfully increasing the relative isomorphism of trained word embedding spaces and improving their ability to be mapped to a shared cross-lingual space. The result is improved bilingual lexicon induction in general data conditions, under domain mismatch, and with training algorithm dissimilarities. We release IsoVec at https://github.com/kellymarchisio/isovec.

Submitted 4 July, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: Updated EMNLP2022 Camera Ready (citation correction, removed references to dimensionality reduction [was not used here].)

arXiv:2209.04528 [pdf, other]

Improving Model Training via Self-learned Label Representations

Authors: Xiao Yu, Nakul Verma

Abstract: Modern neural network architectures have shown remarkable success in several large-scale classification and prediction tasks. Part of the success of these architectures is their flexibility to transform the data from the raw input representations (e.g. pixels for vision tasks, or text for natural language processing tasks) to one-hot output encoding. While much of the work has focused on studying how the input gets transformed to the one-hot encoding, very little work has examined the effectiveness of these one-hot labels. In this work, we demonstrate that more sophisticated label representations are better for classification than the usual one-hot encoding. We propose Learning with Adaptive Labels (LwAL) algorithm, which simultaneously learns the label representation while training for the classification task. These learned labels can significantly cut down on the training time (usually by more than 50%) while often achieving better test accuracies. Our algorithm introduces negligible additional parameters and has a minimal computational overhead. Along with improved training times, our learned labels are semantically meaningful and can reveal hierarchical relationships that may be present in the data.

Submitted 9 September, 2022; originally announced September 2022.

arXiv:2207.00658 [pdf, other]

doi 10.1109/ICRA48891.2023.10160886

Wirelessly-Controlled Untethered Piezoelectric Planar Soft Robot Capable of Bidirectional Crawling and Rotation

Authors: Zhiwu Zheng, Hsin Cheng, Prakhar Kumar, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. Sturm

Abstract: Electrostatic actuators provide a promising approach to creating soft robotic sheets, due to their flexible form factor, modular integration, and fast response speed. However, their control requires kilo-Volt signals and understanding of complex dynamics resulting from force interactions by on-board and environmental effects. In this work, we demonstrate an untethered planar five-actuator piezoelectric robot powered by batteries and on-board high-voltage circuitry, and controlled through a wireless link. The scalable fabrication approach is based on bonding different functional layers on top of each other (steel foil substrate, actuators, flexible electronics). The robot exhibits a range of controllable motions, including bidirectional crawling (up to ~0.6 cm/s), turning, and in-place rotation (at ~1 degree/s). High-speed videos and control experiments show that the richness of the motion results from the interaction of an asymmetric mass distribution in the robot and the associated dependence of the dynamics on the driving frequency of the piezoelectrics. The robot's speed can reach 6 cm/s with specific payload distribution.

Submitted 19 January, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

Comments: Accepted to the 2023 IEEE International Conference on Robotics and Automation (ICRA)

Journal ref: 2023 IEEE International Conference on Robotics and Automation (ICRA), 641-647

arXiv:2203.15198 [pdf, other]

doi 10.1109/RoboSoft54090.2022.9762147

Model-Based Control of Planar Piezoelectric Inchworm Soft Robot for Crawling in Constrained Environments

Authors: Zhiwu Zheng, Prakhar Kumar, Yenan Chen, Hsin Cheng, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. Sturm

Abstract: Soft robots have drawn significant attention recently for their ability to achieve rich shapes when interacting with complex environments. However, their elasticity and flexibility compared to rigid robots also pose significant challenges for precise and robust shape control in real-time. Motivated by their potential to operate in highly-constrained environments, as in search-and-rescue operations, this work addresses these challenges of soft robots by developing a model-based full-shape controller, validated and demonstrated by experiments. A five-actuator planar soft robot was constructed with planar piezoelectric layers bonded to a steel foil substrate, enabling inchworm-like motion. The controller uses a soft-body continuous model for shape planning and control, given target shapes and/or environmental constraints, such as crawling under overhead barriers or "roof" safety lines. An approach to background model calibrations is developed to address deviations of actual robot shape due to material parameter variations and drift. Full experimental shape control and optimal movement under a roof safety line are demonstrated, where the robot maximizes its speed within the overhead constraint. The mean-squared error between the measured and target shapes improves from ~0.05 cm² without calibration to ~0.01 cm² with calibration. Simulation-based validation is also performed with various different roof shapes.

Submitted 28 March, 2022; originally announced March 2022.

Comments: Accepted to the 2022 IEEE 5th International Conference on Soft Robotics (RoboSoft). Project website: https://piezorobotcontroller.github.io/ Summary video: https://youtu.be/Md-Uo-pUaIs

Journal ref: 2022 IEEE 5th International Conference on Soft Robotics (RoboSoft), 693-698

arXiv:2202.13521 [pdf, other]

doi 10.1109/ICRA46639.2022.9811927

Scalable Simulation and Demonstration of Jumping Piezoelectric 2-D Soft Robots

Authors: Zhiwu Zheng, Prakhar Kumar, Yenan Chen, Hsin Cheng, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. Sturm

Abstract: Soft robots have drawn great interest due to their ability to take on a rich range of shapes and motions, compared to traditional rigid robots. However, the motions, and underlying statics and dynamics, pose significant challenges to forming well-generalized and robust models necessary for robot design and control. In this work, we demonstrate a five-actuator soft robot capable of complex motions and develop a scalable simulation framework that reliably predicts robot motions. The simulation framework is validated by comparing its predictions to experimental results, based on a robot constructed from piezoelectric layers bonded to a steel-foil substrate. The simulation framework exploits the physics engine PyBullet, and employs discrete rigid-link elements connected by motors to model the actuators. We perform static and AC analyses to validate a single-unit actuator cantilever setup and observe close agreement between simulation and experiments in both cases. The analyses are extended to the five-actuator robot, where simulations accurately predict the static and AC robot motions, including shapes for applied DC voltage inputs, nearly-static "inchworm" motion, and jumping (in the vertical direction alone as well as in combined vertical and horizontal directions). These motions exhibit complex non-linear behavior, with forward robot motion reaching ~1 cm/s. Our open-source code can be found at: https://github.com/zhiwuz/sfers.

Submitted 27 February, 2022; originally announced February 2022.

Comments: Accepted to the International Conference on Robotics and Automation (ICRA) 2022. Video: https://youtu.be/nHcH3V7rCrk

Journal ref: 2022 International Conference on Robotics and Automation (ICRA), 2022, pp. 5199-5204

arXiv:2112.15594 [pdf, other]

doi 10.1073/pnas.2123433119

A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level

Authors: Iddo Drori, Sarah Zhang, Reece Shuttleworth, Leonard Tang, Albert Lu, Elizabeth Ke, Kevin Liu, Linda Chen, Sunny Tran, Newman Cheng, Roman Wang, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer, Nakul Verma, Eugene Wu, Gilbert Strang

Abstract: We demonstrate that a neural network pre-trained on text and fine-tuned on code solves mathematics course problems, explains solutions, and generates new questions at a human level. We automatically synthesize programs using few-shot learning and OpenAI's Codex transformer and execute them to solve course problems at 81% automatic accuracy. We curate a new dataset of questions from MIT's largest mathematics courses (Single Variable and Multivariable Calculus, Differential Equations, Introduction to Probability and Statistics, Linear Algebra, and Mathematics for Computer Science) and Columbia University's Computational Linear Algebra. We solve questions from a MATH dataset (on Prealgebra, Algebra, Counting and Probability, Intermediate Algebra, Number Theory, and Precalculus), the latest benchmark of advanced mathematics problems designed to assess mathematical reasoning. We randomly sample questions and generate solutions with multiple modalities, including numbers, equations, and plots. The latest GPT-3 language model pre-trained on text automatically solves only 18.8% of these university questions using zero-shot learning and 30.8% using few-shot learning and the most recent chain of thought prompting. In contrast, program synthesis with few-shot learning using Codex fine-tuned on code generates programs that automatically solve 81% of these questions. Our approach improves the previous state-of-the-art automatic solution accuracy on the benchmark topics from 8.8% to 81.1%.
We perform a survey to evaluate the quality and difficulty of generated questions. This work is the first to automatically solve university-level mathematics course questions at a human level and the first work to explain and generate university-level mathematics course questions at scale, a milestone for higher education.

Submitted 30 May, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

Comments: 181 pages, 8 figures, 280 tables

arXiv:2112.11024 [pdf, other]

Reputation-based PoS for the Restriction of Illicit Activities on Blockchain: Algorand Usecase

Authors: Mayank Pandey, Rachit Agarwal, Sandeep Kumar Shukla, Nishchal Kumar Verma

Abstract: In cryptocurrency-based permissionless blockchain networks, the decentralized structure enables any user to join and operate across different regions. Criminal entities exploit this by using cryptocurrency transactions on the blockchain to facilitate activities such as money laundering, gambling, and ransomware attacks. In recent times, different machine learning-based techniques can detect such criminal elements based on blockchain transaction data. However, there is no provision within the blockchain to deal with such elements. We propose a reputation-based methodology for responding to users detected carrying out the aforementioned illicit activities. We select the Algorand blockchain to implement our methodology by incorporating it within the consensus protocol. The theoretical results obtained prove the restriction and exclusion of criminal elements through block proposal rejection and attenuation of the voting power as a validator for such entities. Further, we analyze the efficacy of our method and show that it puts no additional strain on the communication resources.

Submitted 25 August, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2112.08933 [pdf, other]

Responsive parallelized architecture for deploying deep learning models in production environments

Authors: Nikhil Verma, Krishna Prasad

Abstract: Recruiters can easily shortlist candidates for jobs by viewing their curriculum vitae (CV) document. The unstructured CV document holds the candidate's portfolio and named entities listing details. The main aim of this study is to design and propose a web-oriented, highly responsive, computational pipeline that systematically predicts CV entities using hierarchically-refined label attention networks. Deep learning models specialized for named entity recognition were trained on a large dataset to predict relevant fields. The article suggests an optimal strategy to use a number of deep learning models in parallel and predict in real time. We demonstrate selection of a lightweight micro web framework using the Analytical Hierarchy Processing algorithm and focus on an approach useful for deploying large deep learning model-based pipelines in production-ready environments using microservices. The deployed models and proposed architecture helped in parsing a normal CV in less than 700 milliseconds for a sequential flow of requests.

Submitted 10 July, 2023; v1 submitted 14 December, 2021; originally announced December 2021.

Comments: 20 Pages

arXiv:2111.13993 [pdf, other]

An analysis of document graph construction methods for AMR summarization

Authors: Fei-Tzin Lee, Chris Kedzie, Nakul Verma, Kathleen McKeown

Abstract: Abstract Meaning Representation (AMR) is a graph-based semantic representation for sentences, composed of collections of concepts linked by semantic relations. AMR-based approaches have found success in a variety of applications, but a challenge to using it in tasks that require document-level context is that it only represents individual sentences. Prior work in AMR-based summarization has automatically merged the individual sentence graphs into a document graph, but the method of merging and its effects on summary content selection have not been independently evaluated. In this paper, we present a novel dataset consisting of human-annotated alignments between the nodes of paired documents and summaries which may be used to evaluate (1) merge strategies; and (2) the performance of content selection methods over nodes of a merged or unmerged AMR graph. We apply these two forms of evaluation to prior work as well as a new method for node merging and show that our new method has significantly better performance than prior work.

Submitted 27 November, 2021; originally announced November 2021.

arXiv:2111.08267 [pdf, other]

Solving Probability and Statistics Problems by Program Synthesis

Authors: Leonard Tang, Elizabeth Ke, Nikhil Singh, Nakul Verma, Iddo Drori

Abstract: We solve university level probability and statistics questions by program synthesis using OpenAI's Codex, a Transformer trained on text and fine-tuned on code. We transform course problems from MIT's 18.05 Introduction to Probability and Statistics and Harvard's STAT110 Probability into programming tasks. We then execute the generated code to get a solution. Since these course questions are grounded in probability, we often aim to have Codex generate probabilistic programs that simulate a large number of probabilistic dependencies to compute its solution. Our approach requires prompt engineering to transform the question from its original form to an explicit, tractable form that results in a correct program and solution. To estimate the amount of work needed to translate an original question into its tractable form, we measure the similarity between original and transformed questions. Our work is the first to introduce a new dataset of university-level probability and statistics problems and solve these problems in a scalable fashion using the program synthesis capabilities of large language models.

Submitted 16 November, 2021; originally announced November 2021.

Comments: 33 pages, 4 figures

arXiv:2111.08171 [pdf, other]

Solving Linear Algebra by Program Synthesis

Authors: Iddo Drori, Nakul Verma

Abstract: We solve MIT's Linear Algebra 18.06 course and Columbia University's Computational Linear Algebra COMS3251 courses with perfect accuracy by interactive program synthesis. This surprisingly strong result is achieved by turning the course questions into programming tasks and then running the programs to produce the correct answers. We use OpenAI Codex with zero-shot learning, without providing any examples in the prompts, to synthesize code from questions. We quantify the difference between the original question text and the transformed question text that yields a correct answer. Since none of the COMS3251 questions are available online, the model is not overfitting. We go beyond just generating code for questions with numerical answers by interactively generating code that also results in visually pleasing plots as output. Finally, we automatically generate new questions given a few sample questions which may be used as new course content. This work is a significant step forward in solving quantitative math problems and opens the door for solving many university level STEM courses by machine.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 32 pages, 3 figures

arXiv:2111.06885 [pdf, other]

Guided Sampling-based Evolutionary Deep Neural Network for Intelligent Fault Diagnosis

Authors: Arun K. Sharma, Nishchal K. Verma

Abstract: The diagnostic performance of most of the deep learning models is greatly affected by the selection of model architecture and hyperparameters. Manual selection of model architecture is not feasible as training and evaluating the different architectures of deep learning models is a time-consuming process. Therefore, we have proposed a novel framework of evolutionary deep neural network which uses policy gradient to guide the evolution of DNN architecture towards maximum diagnostic accuracy. We have formulated a policy gradient-based controller which generates an action to sample the new model architecture at every generation such that the optimality is obtained quickly. The fitness of the best model obtained is used as a reward to update the policy parameters. Also, the best model obtained is transferred to the next generation for quick model evaluation in the NSGA-II evolutionary framework. Thus, the algorithm gets the benefits of fast non-dominated sorting as well as quick model evaluation. The effectiveness of the proposed framework has been validated on three datasets: the Air Compressor dataset, Case Western Reserve University dataset, and Paderborn university dataset.

Submitted 23 February, 2022; v1 submitted 12 November, 2021; originally announced November 2021.

arXiv:2111.00944 [pdf, other]

doi 10.1109/TRO.2024.3353035

Piezoelectric Soft Robot Inchworm Motion by Tuning Ground Friction through Robot Shape: Quasi-Static Modeling and Experimental Validation

Authors: Zhiwu Zheng, Prakhar Kumar, Yenan Chen, Hsin Cheng, Sigurd Wagner, Minjie Chen, Naveen Verma, James C. Sturm

Abstract: Electrically-driven soft robots based on piezoelectric actuators may enable compact form factors and maneuverability in complex environments. In most prior work, piezoelectric actuators are used to control a single degree of freedom. In this work, the coordinated activation of five independent piezoelectric actuators, attached to a common metal foil, is used to implement inchworm-inspired crawling motion in a robot that is less than 0.5 mm thick. The motion is based on the control of its friction to the ground through the robot's shape, in which one end of the robot (depending on its shape) is anchored to the ground by static friction, while the rest of its body expands or contracts. A complete analytical model of the robot shape, which includes gravity, is developed to quantify the robot shape, friction, and displacement. After validation of the model by experiments, the robot's five actuators are collectively sequenced for inchworm-like forward and backward motion.

Submitted 14 December, 2023; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: Accepted to IEEE Transactions on Robotics

Journal ref: IEEE Transactions on Robotics. 2024 Jan 11

arXiv:2109.13479 [pdf, other]

Knowledge Transfer based Evolutionary Deep Neural Network for Intelligent Fault Diagnosis

Authors: Arun K. Sharma, Nishchal K. Verma

Abstract: The performance of a deep neural network (DNN) for fault diagnosis is very much dependent on the network architecture. Also, the diagnostic performance is reduced if the model trained on a laboratory case machine is used on a test dataset from an industrial machine running under variable operating conditions. Thus, there are two challenges for the intelligent fault diagnosis of industrial machines: (i) selection of suitable DNN architecture and (ii) domain adaptation for the change in operating conditions. Therefore, we propose an evolutionary Net2Net transformation (EvoN2N) that finds the best suitable DNN architecture for the given dataset. Non-dominated sorting genetic algorithm II has been used to optimize the depth and width of the DNN architecture. Also, we have introduced a hybrid crossover technique for optimization of the depth and width of the deep neural network encoded in a chromosome. We have formulated a knowledge transfer-based fitness evaluation scheme for faster evolution. The proposed framework can obtain the best model for intelligent fault diagnosis without the need for a time-consuming search process. We have used the Case Western Reserve University dataset, Paderborn university dataset, and gearbox fault detection dataset to demonstrate the effectiveness of the proposed framework for the selection of the best suitable architecture capable of excellent diagnostic performance, with classification accuracy almost up to 100%.

Submitted 10 February, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

arXiv:2107.02975 [pdf, other]

Neural Natural Language Processing for Unstructured Data in Electronic Health Records: a Review

Authors: Irene Li, Jessica Pan, Jeremy Goldwasser, Neha Verma, Wai Pan Wong, Muhammed Yavuz Nuzumlalı, Benjamin Rosand, Yixin Li, Matthew Zhang, David Chang, R. Andrew Taylor, Harlan M. Krumholz, Dragomir Radev

Abstract: Electronic health records (EHRs), digital collections of patient healthcare events and observations, are ubiquitous in medicine and critical to healthcare delivery, operations, and research. Despite this central role, EHRs are notoriously difficult to process automatically. Well over half of the information stored within EHRs is in the form of unstructured text (e.g. provider notes, operation reports) and remains largely untapped for secondary use. Recently, however, newer neural network and deep learning approaches to Natural Language Processing (NLP) have made considerable advances, outperforming traditional statistical and rule-based systems on a variety of tasks. In this survey paper, we summarize current neural NLP methods for EHR applications. We focus on a broad scope of tasks, namely, classification and prediction, word embeddings, extraction, generation, and other topics such as question answering, phenotyping, knowledge graphs, medical dialogue, multilinguality, interpretability, etc.

Submitted 6 July, 2021; originally announced July 2021.

Comments: 33 pages, 11 figures

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2106.11182 [pdf, other]

On fine-tuning of Autoencoders for Fuzzy rule classifiers

Authors: Rahul Kumar Sevakula, Nishchal Kumar Verma, Hisao Ishibuchi

Abstract: Recent discoveries in Deep Neural Networks are allowing researchers to tackle some very complex problems such as image classification and audio classification, with improved theoretical and empirical justifications. This paper presents a novel scheme to incorporate the use of autoencoders in Fuzzy rule classifiers (FRC). Autoencoders when stacked can learn the complex non-linear relationships amongst data, and the proposed framework built towards FRC can allow users to input expert knowledge to the system. This paper further introduces four novel fine-tuning strategies for autoencoders to improve the FRC's classification and rule reduction performance. The proposed framework has been tested across five real-world benchmark datasets. Elaborate comparisons with over 15 previous studies, and across 10-fold cross validation performance, suggest that the proposed methods are capable of building FRCs which can provide state of the art accuracies.

Submitted 21 June, 2021; originally announced June 2021.

arXiv:2104.00369 [pdf, other]

FeTaQA: Free-form Table Question Answering

Authors: Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev

Abstract: Existing table question answering datasets contain abundant factual questions that primarily evaluate the query and schema comprehension capability of a system, but they fail to include questions that require complex reasoning and integration of information due to the constraint of the associated short-form answers. To address these issues and to demonstrate the full challenge of table question answering, we introduce FeTaQA, a new dataset with 10K Wikipedia-based {table, question, free-form answer, supporting table cells} pairs. FeTaQA yields a more challenging table question answering setting because it requires generating free-form text answers after retrieval, inference, and integration of multiple discontinuous facts from a structured knowledge source. Unlike datasets of generative QA over text in which answers are prevalent with copies of short text spans from the source, answers in our dataset are human-generated explanations involving entities and their high-level relations. We provide two benchmark methods for the proposed task: a pipeline method based on semantic-parsing-based QA systems and an end-to-end method based on large pretrained text generation models, and show that FeTaQA poses a challenge for both methods.

Submitted 1 April, 2021; originally announced April 2021.

arXiv:2103.12459 [pdf, other]

Dual Mesh Convolutional Networks for Human Shape Correspondence

Authors: Nitika Verma, Adnane Boukhayma, Jakob Verbeek, Edmond Boyer

Abstract: Convolutional networks have been extremely successful for regular data structures such as 2D images and 3D voxel grids. The transposition to meshes is, however, not straightforward due to their irregular structure. We explore how the dual, face-based representation of triangular meshes can be leveraged as a data structure for graph convolutional networks. In the dual mesh, each node (face) has a fixed number of neighbors, which makes the networks less susceptible to overfitting on the mesh topology, and also allows the use of input features that are naturally defined over faces, such as surface normals and face areas. We evaluate the dual approach on the shape correspondence task on the Faust human shape dataset and variants of it with different mesh topologies. Our experiments show that results of graph convolutional networks improve when defined over the dual rather than primal mesh. Moreover, our models that explicitly leverage the neighborhood regularity of dual meshes allow improving results further while being more robust to changes in the mesh topology.

Submitted 16 October, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

arXiv:2103.12326 [pdf]

Security of Healthcare Data Using Blockchains: A Survey

Authors: Mayank Pandey, Rachit Agarwal, Sandeep K. Shukla, Nishchal K. Verma

Abstract: The advancement in the healthcare sector is entering into a new era in the form of Health 4.0. The integration of innovative technologies like Cyber-Physical Systems (CPS), Big Data, Cloud Computing, Machine Learning, and Blockchain with Healthcare services has led to improved performance and efficiency through data-based learning and interconnection of systems. On the other hand, it has also increased complexities and has brought its own share of vulnerabilities due to the heavy influx, sharing, and storage of healthcare data. The protection of the same from cyber-attacks along with privacy preservation through authenticated access is one of the significant challenges for the healthcare sector. For this purpose, the use of blockchain-based networks can lead to a considerable reduction in the vulnerabilities of the healthcare systems and secure their data. This chapter explores blockchain's role in strengthening healthcare data security by answering the questions related to what data use, when we need, why we need, who needs, and how state-of-the-art techniques use blockchains to secure healthcare data. As a case study, we also explore and analyze the state-of-the-art implementations for blockchain in healthcare data security for the COVID-19 pandemic. In order to provide a path to future research directions, we identify and discuss the technical limitations and regulatory challenges associated with blockchain-based healthcare data security implementation.

Submitted 23 March, 2021; originally announced March 2021.

Comments: Submitted as a book chapter

arXiv:2103.08889 [pdf, other]

doi 10.1109/TAI.2021.3123935

Quick Learning Mechanism with Cross-Domain Adaptation for Intelligent Fault Diagnosis

Authors: Arun K. Sharma, Nishchal K. Verma

Abstract: The fault diagnostic model trained for a laboratory case machine fails to perform well on the industrial machines running under variable operating conditions. For every new operating condition of such machines, a new diagnostic model has to be trained, which is a time-consuming and uneconomical process. Therefore, we propose a quick learning mechanism that can transform the existing diagnostic model into a new model suitable for industrial machines operating in different conditions. The proposed method uses the Net2Net transformation followed by fine-tuning to cancel/minimize the maximum mean discrepancy between the new data and the previous one. The fine-tuning of the model requires very few labelled target samples and very few iterations of training. Therefore, the proposed method is capable of learning the new target data pattern quickly. The effectiveness of the proposed fault diagnosis method has been demonstrated on the Case Western Reserve University dataset, Intelligent Maintenance Systems bearing dataset, and Paderborn university dataset under the wide variations of the operating conditions. It has been validated that the diagnostic model trained on artificially damaged fault datasets can be used to quickly train another model for a real damage dataset.

Submitted 6 September, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

Comments: 9 pages, 6 figures, transaction

Report number: TAI-2021-Mar-A-00107

Journal ref: IEEE Transactions on Artificial Intelligence, Nov, 2021

arXiv:2012.02394 [pdf, ps, other]

Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics

Authors: Bo Cowgill, Fabrizio Dell'Acqua, Samuel Deng, Daniel Hsu, Nakul Verma, Augustin Chaintreau

Abstract: Why do biased predictions arise? What interventions can prevent them? We evaluate 8.2 million algorithmic predictions of math performance from $\approx$400 AI engineers, each of whom developed an algorithm under a randomly assigned experimental condition. Our treatment arms modified programmers' incentives, training data, awareness, and/or technical knowledge of AI ethics. We then assess out-of-sample predictions from their algorithms using randomized audit manipulations of algorithm inputs and ground-truth math performance for 20K subjects. We find that biased predictions are mostly caused by biased training data. However, one-third of the benefit of better training data comes through a novel economic mechanism: Engineers exert greater effort and are more responsive to incentives when given better training data. We also assess how performance varies with programmers' demographic characteristics, and their performance on a psychological test of implicit bias (IAT) concerning gender and careers. We find no evidence that female, minority and low-IAT engineers exhibit lower bias or discrimination in their code. However, we do find that prediction errors are correlated within demographic groups, which creates performance improvements through cross-demographic averaging. Finally, we quantify the benefits and tradeoffs of practical managerial or policy interventions such as technical advice, simple reminders, and improved incentives for decreasing algorithmic bias.

Submitted 3 December, 2020; originally announced December 2020.

Comments: Part of the Navigating the Broader Impacts of AI Research Workshop at NeurIPS 2020

arXiv:2010.13986 [pdf, other]

doi 10.1145/3446382.3448650

REITS: Reflective Surface for Intelligent Transportation Systems

Authors: Zhuqi Li, Can Wu, Sigurd Wagner, James C. Sturm, Naveen Verma, Kyle Jamieson

Abstract: Autonomous vehicles are predicted to dominate the transportation industry in the foreseeable future. Safety is one of the major challenges to the early deployment of self-driving systems. To ensure safety, self-driving vehicles must sense and detect humans, other vehicles, and road infrastructure accurately, robustly, and timely. However, existing sensing techniques used by self-driving vehicles may not be absolutely reliable. In this paper, we design REITS, a system to improve the reliability of RF-based sensing modules for autonomous vehicles. We conduct theoretical analysis on possible failures of existing RF-based sensing systems. Based on the analysis, REITS adopts a multi-antenna design, which enables constructive blind beamforming to return an enhanced radar signal in the incident direction. REITS can also let the existing radar system sense identification information by switching between constructive beamforming state and destructive beamforming state. Preliminary results show that REITS improves the detection distance of a self-driving car radar by a factor of 3.63.

Submitted 2 February, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

arXiv:2009.00822 [pdf, ps, other]

A Bayesian Approach with Type-2 Student-t Membership Function for T-S Model Identification

Authors: Vikas Singh, Homanga Bharadhwaj, Nishchal K Verma

Abstract: Clustering techniques have proved highly successful for Takagi-Sugeno (T-S) fuzzy model identification. In particular, fuzzy c-regression clustering based on type-2 fuzzy sets has shown remarkable results on non-sparse data, but its performance degrades on sparse data. In this paper, an innovative architecture for the fuzzy c-regression model is presented and a novel student-t distribution based membership function is designed for sparse data modelling. To avoid overfitting, we have adopted a Bayesian approach for incorporating a Gaussian prior on the regression coefficients. Additional novelty of our approach lies in type-reduction, where the final output is computed using the Karnik Mendel algorithm and the consequent parameters of the model are optimized using the Stochastic Gradient Descent method. In detailed experimentation, the results show that the proposed approach outperforms various state-of-the-art methods on standard datasets.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2008.04114 [pdf, other]

Improved Adaptive Type-2 Fuzzy Filter with Exclusively Two Fuzzy Membership Function for Filtering Salt and Pepper Noise

Authors: Vikas Singh, Pooja Agrawal, Teena Sharma, Nishchal K. Verma

Abstract: Image denoising is one of the preliminary steps in image processing methods in which the presence of noise can deteriorate the image quality. To overcome this limitation, in this paper an improved two-stage fuzzy filter is proposed for filtering salt and pepper noise from the images. In the first stage, the pixels in the image are categorized as good or noisy based on adaptive thresholding using type-2 fuzzy logic with exclusively two different membership functions in the filter window. In the second stage, the noisy pixels are denoised using modified ordinary fuzzy logic in the respective filter window. The proposed filter is validated on standard images with various noise levels. The proposed filter removes the noise and preserves useful image characteristics, i.e., edges and corners, at higher noise levels. The performance of the proposed filter is compared with various state-of-the-art methods in terms of peak signal-to-noise ratio and computation time. To show the effectiveness of the filter, statistical tests, i.e., the Friedman test and the Bonferroni-Dunn (BD) test, are also carried out, which clearly ascertain that the proposed filter outperforms various filtering approaches.

Submitted 10 August, 2020; originally announced August 2020.

arXiv:2007.14298 [pdf]

Enhanced Quantum Key Distribution using Hybrid Channels and Natural Random Numbers

Authors: Hemant Rana, Nitin Verma

Abstract: Since the introduction of quantum computation by Richard Feynman in 1982, quantum computation has shown exemplary results in various applications of computer science including unstructured database search, factorization, and molecular simulations, to name a few. Some of the recent developments include quantum machine learning, quantum neural networks, quantum walks on graphs, fault tolerant scalable quantum computers using error correction codes, etc. One of the crucial modern applications of quantum information is quantum cryptography and secure key distribution over quantum channels, which have several advantages over classical channels, especially detection of eavesdropping. Based on such properties of quantum systems and quantum channels, in this paper we propose three secure key distribution protocols based on a blend of classical and quantum channels. The proposed protocols also exploit the property of quantum computers to generate natural random numbers that can be easily transmitted using a single qubit over a quantum channel and can be used for distributing keys to the involved parties in a communication network.

Submitted 28 July, 2020; originally announced July 2020.

Comments: 3 figures; 3 tables

arXiv:2007.13737 [pdf, other]

BIDEAL: A Toolbox for Bicluster Analysis -- Generation, Visualization and Validation

Authors: Nishchal K. Verma, T. Sharma, S. Dixit, P. Agrawal, S. Sengupta, V. Singh

Abstract: This paper introduces a novel toolbox named BIDEAL for the generation of biclusters, their analysis, visualization, and validation. The objective is to enable researchers to use forefront biclustering algorithms embedded on a single platform. A single toolbox comprising various biclustering algorithms plays a vital role in extracting meaningful patterns from the data for detecting diseases, biomarkers, gene-drug association, etc. BIDEAL consists of seventeen biclustering algorithms, three bicluster visualization techniques, and six validation indices. The toolbox can analyze several types of data, including biological data, through a graphical user interface. It also provides data preprocessing techniques, i.e., binarization, discretization, normalization, and elimination of null and missing values. The effectiveness of the developed toolbox has been presented through testing and validations on Saccharomyces cerevisiae cell cycle, Leukemia cancer, Mammary tissue profile, and Ligand screen in B-cells datasets. The biclusters of these datasets have been generated using BIDEAL and evaluated in terms of coherency, differential co-expression ranking, and similarity measure. The visualization of generated biclusters has also been provided through a heat map and gene plot.

Submitted 26 July, 2020; originally announced July 2020.

arXiv:2007.02871 [pdf, other]

DART: Open-Domain Structured Data Record to Text Generation

Authors: Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

Abstract: We present DART, an open domain structured DAta Record to Text generation dataset with over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures. To this end, we propose a procedure of extracting semantic triples from tables that encodes their structures by exploiting the semantic dependencies among table headers and the table title. Our dataset construction framework effectively merged heterogeneous sources from open domain semantic parsing and dialogue-act-based meaning representation tasks by utilizing techniques such as: tree ontology annotation, question-answer pair to declarative sentence conversion, and predicate unification, all with minimum post-editing. We present systematic evaluation on DART as well as new state-of-the-art results on WebNLG 2017 to show that DART (1) poses new challenges to existing data-to-text datasets and (2) facilitates out-of-domain generalization. Our data and code can be found at https://github.com/Yale-LILY/dart.

Submitted 12 April, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: NAACL 2021

arXiv:2005.02434 [pdf]

Nanotechnology-inspired Information Processing Systems of the Future

Authors: Randy Bryant, Mark Hill, Tom Kazior, Daniel Lee, Jie Liu, Klara Nahrstedt, Vijay Narayanan, Jan Rabaey, Hava Siegelmann, Naresh Shanbhag, Naveen Verma, H. -S. Philip Wong

Abstract: Nanoscale semiconductor technology has been a key enabler of the computing revolution. It has done so via advances in new materials and manufacturing processes that resulted in the size of the basic building block of computing systems - the logic switch and memory devices - being reduced into the nanoscale regime. Nanotechnology has provided increased computing functionality per unit volume, energy, and cost. In order for computing systems to continue to deliver substantial benefits for the foreseeable future to society at large, it is critical that the very notion of computing be examined in the light of nanoscale realities. In particular, one needs to ask what it means to compute when the very building block - the logic switch - no longer exhibits the level of determinism required by the von Neumann architecture. There needs to be a sustained and heavy investment in a nation-wide Vertically Integrated Semiconductor Ecosystem (VISE). VISE is a program in which research and development is conducted seamlessly across the entire compute stack - from applications, systems and algorithms, architectures, circuits and nanodevices, and materials. A nation-wide VISE provides clear strategic advantages in ensuring the US's global superiority in semiconductors. First, a VISE provides the highest quality seed-corn for nurturing transformative ideas that are critically needed today in order for nanotechnology-inspired computing to flourish. It does so by dramatically opening up new areas of semiconductor research that are inspired and driven by new application needs. Second, a VISE creates a very high barrier to entry from foreign competitors because it is extremely hard to establish, and even harder to duplicate.

Submitted 5 May, 2020; originally announced May 2020.

Comments: A Computing Community Consortium (CCC) workshop report, 18 pages

Report number: ccc2016report_3

arXiv:1912.11235 [pdf, ps, other]

Intelligent Condition Based Monitoring Techniques for Bearing Fault Diagnosis

Authors: Vikas Singh, Nishchal K. Verma

Abstract: In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation. The generated data have a large number of redundant features, which degrade the performance of the machine learning models. To overcome this, we have utilized the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with deep learning and deep transfer learning frameworks to improve the fault diagnostics performance in terms of accuracy and computational complexity. The mRMR reduces the redundant information from data and increases the deep learning performance, whereas transfer learning reduces the dependency on a large amount of data for training the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, have been explored and validated on CWRU and IMS rolling element bearings datasets. The analysis shows that the proposed frameworks can obtain better diagnostic accuracy compared to existing methods and can handle the data with a large number of features more quickly.

Submitted 25 August, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

arXiv:1912.11209 [pdf, ps, other]

An Entropy-based Variable Feature Weighted Fuzzy k-Means Algorithm for High Dimensional Data

Authors: Vikas Singh, Nishchal K. Verma

Abstract: This paper presents a new fuzzy k-means algorithm for the clustering of high dimensional data in various subspaces. In the case of high dimensional data, some features might be irrelevant, and the relevant ones may have different significance in the clustering. For a better clustering, it is crucial to incorporate the contribution of these features in the clustering process. To combine these features, in this paper, we have proposed a new fuzzy k-means clustering algorithm in which the objective function of the fuzzy k-means is modified using two different entropy terms. The first entropy term helps to minimize the within-cluster dispersion and maximize the negative entropy to determine clusters to contribute to the association of data points. The second entropy term helps to control the weight of the features because different features have different contributing weights in the clustering process for obtaining the better partition of the data. The efficacy of the proposed method is presented in terms of various clustering measures on multiple datasets and compared with various state-of-the-art methods.

Submitted 23 December, 2019; originally announced December 2019.

arXiv:1910.14134 [pdf, other]

Meta-Learning to Cluster

Authors: Yibo Jiang, Nakul Verma

Abstract: Clustering is one of the most fundamental and wide-spread techniques in exploratory data analysis. Yet, the basic approach to clustering has not really changed: a practitioner hand-picks a task-specific clustering loss to optimize and fit the given data to reveal the underlying cluster structure. Some types of losses---such as k-means, or its non-linear version: kernelized k-means (centroid based), and DBSCAN (density based)---are popular choices due to their good empirical performance on a range of applications. Although every so often the clustering output using these standard losses fails to reveal the underlying structure, and the practitioner has to custom-design their own variation. In this work we take an intrinsically different approach to clustering: rather than fitting a dataset to a specific clustering loss, we train a recurrent model that learns how to cluster. The model uses as training pairs examples of datasets (as input) and its corresponding cluster identities (as output). By providing multiple types of training datasets as inputs, our model has the ability to generalize well on unseen datasets (new clustering tasks). Our experiments reveal that by training on simple synthetically generated datasets or on existing real datasets, we can achieve better clustering performance on unseen real-world datasets when compared with standard benchmark clustering techniques. Our meta clustering model works well even for small datasets where the usual deep learning models tend to perform worse.

Submitted 30 October, 2019; originally announced October 2019.

arXiv:1910.07368 [pdf, other]

Model-Agnostic Meta-Learning using Runge-Kutta Methods

Authors: Daniel Jiwoong Im, Yibo Jiang, Nakul Verma

Abstract: Meta-learning has emerged as an important framework for learning new tasks from just a few examples. The success of any meta-learning model depends on (i) its fast adaptation to new tasks, as well as (ii) having a shared representation across similar tasks. Here we extend the model-agnostic meta-learning (MAML) framework introduced by Finn et al. (2017) to achieve improved performance by analyzing the temporal dynamics of the optimization procedure via the Runge-Kutta method. This method enables us to gain fine-grained control over the optimization and helps us achieve both the adaptation and representation goals across tasks. By leveraging this refined control, we demonstrate that there are multiple principled ways to update MAML and show that the classic MAML optimization is simply a special case of second-order Runge-Kutta method that mainly focuses on fast-adaptation. Experiments on benchmark classification, regression and reinforcement learning tasks show that this refined control helps attain improved results.

Submitted 17 October, 2019; v1 submitted 16 October, 2019; originally announced October 2019.

arXiv:1909.03759 [pdf, other]

Neural Conversational QA: Learning to Reason v.s. Exploiting Patterns

Authors: Nikhil Verma, Abhishek Sharma, Dhiraj Madan, Danish Contractor, Harshit Kumar, Sachindra Joshi

Abstract: Neural Conversational QA tasks like ShARC require systems to answer questions based on the contents of a given passage. On studying recent state-of-the-art models on the ShARCQA task, we found indications that the models learn spurious clues/patterns in the dataset. Furthermore, we show that a heuristic-based program designed to exploit these patterns can have performance comparable to that of the neural models. In this paper we share our findings about four types of patterns found in the ShARC corpus and describe how neural models exploit them. Motivated by the aforementioned findings, we create and share a modified dataset that has fewer spurious patterns, consequently allowing models to learn better.

Submitted 9 October, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

Comments: Accepted at EMNLP 2020. NOTE: An older version of this paper presented a model called 'UrcaNet'. Please view the v1 version of this paper on arxiv for details on that model. This version does not contain UrcaNet

arXiv:1906.03492 [pdf, other]

Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations

Authors: Rui Zhang, Caitlin Westerfield, Sungrok Shim, Garrett Bingham, Alexander Fabbri, Neha Verma, William Hu, Dragomir Radev

Abstract: In this paper, we propose to boost low-resource cross-lingual document retrieval performance with deep bilingual query-document representations. We match queries and documents in both source and target languages with four components, each of which is implemented as a term interaction-based deep neural network with cross-lingual word embeddings as input. By including query likelihood scores as extra features, our model effectively learns to rerank the retrieved documents by using a small number of relevance labels for low-resource language pairs. Due to the shared cross-lingual word embedding space, the model can also be directly applied to another language pair without any training label. Experimental results on the MATERIAL dataset show that our model outperforms the competitive translation-based baselines on English-Swahili, English-Tagalog, and English-Somali cross-lingual information retrieval tasks.

Submitted 8 June, 2019; originally announced June 2019.

Comments: ACL 2019, short paper

arXiv:1902.01738 [pdf, other]

Metric Learning on Manifolds

Authors: Max Aalto, Nakul Verma

Abstract: Recent literature has shown that symbolic data, such as text and graphs, is often better represented by points on a curved manifold, rather than in Euclidean space. However, geometrical operations on manifolds are generally more complicated than in Euclidean space, and thus many techniques for processing and analysis taken for granted in Euclidean space are difficult on manifolds. A priori, it is not obvious how we may generalize such methods to manifolds. We consider specifically the problem of distance metric learning, and present a framework that solves it on a large class of manifolds, such that similar data are located in closer proximity with respect to the manifold distance function. In particular, we extend the existing metric learning algorithms, and derive the corresponding sample complexity rates for the case of manifolds. Additionally, we demonstrate an improvement of performance in $k$-means clustering and $k$-nearest neighbor classification on real-world complex networks using our methods.

Submitted 5 February, 2019; originally announced February 2019.

arXiv:1901.10837 [pdf, other]

Noise-tolerant fair classification

Authors: Alexandre Louis Lamy, Ziyuan Zhong, Aditya Krishna Menon, Nakul Verma

Abstract: Fairness-aware learning involves designing algorithms that do not discriminate with respect to some sensitive feature (e.g., race or gender). Existing work on the problem operates under the assumption that the sensitive feature available in one's training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of fear of potential discrimination. This poses the question of whether one can still learn fair classifiers given noisy sensitive features. In this paper, we answer the question in the affirmative: we show that if one measures fairness using the mean-difference score, and sensitive features are subject to noise from the mutually contaminated learning model, then owing to a simple identity we only need to change the desired fairness-tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators from the label noise literature. We finally show that our procedure is empirically effective on two case-studies involving sensitive feature censoring.

Submitted 9 January, 2020; v1 submitted 30 January, 2019; originally announced January 2019.

arXiv:1811.08321 [pdf, other]

Stability Based Filter Pruning for Accelerating Deep CNNs

Authors: Pravendra Singh, Vinay Sameer Raja Kadi, Nikhil Verma, Vinay P. Namboodiri

Abstract: Convolutional neural networks (CNNs) have achieved impressive performance on a wide variety of tasks (classification, detection, etc.) across multiple domains at the cost of high computational and memory requirements. Thus, leveraging CNNs for real-time applications necessitates model compression approaches that not only reduce the total number of parameters but reduce the overall computation as well. In this work, we present a stability-based approach for filter-level pruning of CNNs. We evaluate our proposed approach on different architectures (LeNet, VGG-16, ResNet, and Faster RCNN) and datasets and demonstrate its generalizability through extensive experiments. Moreover, our compressed models can be used at run-time without requiring any special libraries or hardware. Our model compression method reduces the number of FLOPS by an impressive factor of 6.03X and GPU memory footprint by more than 17X, significantly outperforming other state-of-the-art filter pruning methods.

Submitted 20 November, 2018; originally announced November 2018.

Comments: IEEE Winter Conference on Applications of Computer Vision (WACV), 2019

Showing 1–50 of 59 results for author: Verma, N