Enhancing Explainable AI: A Hybrid Approach Combining GradCAM and LRP for CNN Interpretability

Authors: Vaibhav Dhore, Achintya Bhat, Viraj Nerlekar, Kashyap Chavhan, Aniket Umare

Abstract: We present a new technique that explains the output of a CNN-based model using a combination of GradCAM and LRP methods. Both of these methods produce visual explanations by highlighting input regions that are important for predictions. In the new method, the explanation produced by GradCAM is first processed to remove noise. The processed output is then multiplied elementwise with the output of LRP. Finally, a Gaussian blur is applied to the product. We compared the proposed method with GradCAM and LRP on the metrics of Faithfulness, Robustness, Complexity, Localisation and Randomisation. It was observed that this method performs better on Complexity than both GradCAM and LRP and is better than at least one of them in the other metrics.

Submitted 20 May, 2024; originally announced May 2024.

ACM Class: I.4.0; I.5.2
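
The three steps named in the abstract above (denoise the GradCAM map, multiply elementwise with the LRP map, apply a Gaussian blur) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how noise is removed, so the normalise-and-threshold step, the `threshold` and `sigma` defaults, and all function names here are assumptions. Both maps are assumed to be 2D arrays of the same shape.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalised 1D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur with edge padding; output has the input's shape."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def hybrid_explanation(gradcam, lrp, threshold=0.2, sigma=1.0):
    """Combine a GradCAM map and an LRP map of equal shape."""
    # ASSUMPTION: "noise removal" is modelled as min-max normalisation
    # followed by zeroing values below a threshold.
    g = gradcam - gradcam.min()
    if g.max() > 0:
        g = g / g.max()
    g = np.where(g >= threshold, g, 0.0)
    product = g * lrp                      # elementwise multiplication
    return gaussian_blur(product, sigma)   # final smoothing step
```

In practice the GradCAM map would first be upsampled to the LRP map's resolution; that step is omitted here.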

arXiv:2405.02431 [pdf, other]

Delphi: Efficient Asynchronous Approximate Agreement for Distributed Oracles

Authors: Akhil Bandarupalli, Adithya Bhat, Saurabh Bagchi, Aniket Kate, Chen-Da Liu-Zhang, Michael K. Reiter

Abstract: Agreement protocols are crucial in various emerging applications, spanning from distributed (blockchain) oracles to fault-tolerant cyber-physical systems. In scenarios where sensor/oracle nodes measure a common source, maintaining output within the convex range of correct inputs, known as convex validity, is imperative. Present asynchronous convex agreement protocols employ either randomization, incurring substantial computation overhead, or approximate agreement techniques, leading to high $\mathcal{\tilde{O}}(n^3)$ communication for an $n$-node system. This paper introduces Delphi, a deterministic protocol with $\mathcal{\tilde{O}}(n^2)$ communication and minimal computation overhead. Delphi assumes that honest inputs are bounded, except with negligible probability, and integrates agreement primitives from the literature with a novel weighted averaging technique. Experimental results highlight Delphi's superior performance, showcasing a significantly lower latency compared to state-of-the-art protocols. Specifically, for an $n=160$-node system, Delphi achieves an 8x and 3x improvement in latency within CPS and AWS environments, respectively.

Submitted 7 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

Comments: 14 pages, 8 figures, Accepted to DSN 2024

arXiv:2403.14117 [pdf, other]

doi 10.1145/3613904.3642697

A Design Space for Intelligent and Interactive Writing Assistants

Authors: Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L. C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman, et al. (11 additional authors not shown)

Abstract: In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: Published as a conference paper at CHI 2024

arXiv:2402.18206 [pdf, other]

Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Authors: Rishubh Parihar, Abhijnya Bhat, Abhipsa Basu, Saswat Mallick, Jogendra Nath Kundu, R. Venkatesh Babu

Abstract: Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capability. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in the training datasets. This is especially concerning in the context of faces, where the DM prefers one demographic subgroup vs. others (e.g., female vs. male). In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which enforces the generated images to follow the prescribed attribute distribution. To realize this, we build on the key insight that the latent features of the denoising UNet hold rich demographic semantics, and the same can be leveraged to guide debiased generation. We train an Attribute Distribution Predictor (ADP), a small MLP that maps the latent features to the distribution of attributes. ADP is trained with pseudo labels generated from existing attribute classifiers. The proposed Distribution Guidance with ADP enables us to do fair generation. Our method reduces bias across single/multiple attributes and outperforms the baseline by a significant margin for unconditional and text-conditional diffusion models. Further, we present a downstream task of training a fair attribute classifier by rebalancing the training set with our generated data.

Submitted 29 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: CVPR 2024. Project page: https://ab-34.github.io/balancing_act/

arXiv:2402.01645 [pdf]

Recent Innovations in Footwear Sensors: Role of Smart Footwear in Healthcare -- A Survey

Authors: Pradyumna G. R., Roopa B. Hegde, Bommegowda K. B., Anil Kumar Bhat, Ganesh R. Naik, Amit N. Pujari

Abstract: Smart shoes have ushered in a new era of personalised health monitoring and assistive technology. The shoe leverages technologies such as Bluetooth for data collection and wireless transmission and incorporates features such as GPS tracking, obstacle detection, and fitness tracking. This article provides an overview of the current state of smart shoe technology, highlighting the integration of advanced sensors for health monitoring, energy harvesting, assistive features for the visually impaired, and deep learning for data analysis. The study discusses the potential of smart footwear in medical applications, particularly for patients with diabetes, and the ongoing research in this field. Current footwear challenges are also discussed, including complex construction, poor fit, comfort, and high cost.

Submitted 6 February, 2024; v1 submitted 3 January, 2024; originally announced February 2024.

arXiv:2308.02780 [pdf, other]

doi 10.1145/3610088

SUMMIT: Scaffolding OSS Issue Discussion Through Summarization

Authors: Saskia Gilmer, Avinash Bhat, Shuvam Shah, Kevin Cherry, Jinghui Cheng, Jin L. C. Guo

Abstract: For Open Source Software (OSS) projects, discussions in Issue Tracking Systems (ITS) serve as a crucial collaboration mechanism for diverse stakeholders. However, these discussions can become lengthy and entangled, making it hard to find relevant information and make further contributions. In this work, we study the use of summarization to aid users in collaboratively making sense of OSS issue discussion threads. We reveal a complex picture of how summarization is used by issue users in practice as a strategy to help develop and manage their discussions. Grounded on the different objectives served by the summaries and the outcome of our formative study with OSS stakeholders, we identified a set of guidelines to inform the design of collaborative summarization tools for OSS issue discussions. We then developed SUMMIT, a tool that allows issue users to collectively construct summaries of different types of information discussed, as well as a set of comments representing continuous conversations within the thread. To alleviate the manual effort involved, SUMMIT uses techniques that automatically detect information types and summarize texts to facilitate the generation of these summaries. A lab user study indicates that, as the users of SUMMIT, OSS stakeholders adopted different strategies to acquire information on issue threads. Furthermore, different features of SUMMIT effectively lowered the perceived difficulty of locating information from issue threads and enabled the users to prioritize their effort. Overall, our findings demonstrated the potential of SUMMIT, and the corresponding design guidelines, in supporting users to acquire information from lengthy discussions in ITSs. Our work sheds light on key design considerations and features when exploring crowd-based and machine-learning-enabled instruments for asynchronous collaboration on complex tasks such as OSS development.

Submitted 4 August, 2023; originally announced August 2023.

Comments: To be published in CSCW2023

arXiv:2306.08701 [pdf, other]

Transpiling RTL Pseudo-code of the POWER Instruction Set Architecture to C for Real-time Performance Analysis on Cavatools Simulator

Authors: Kinar S, Prashanth K V, Adithya Hegde, Aditya Subrahmanya Bhat, Narender M

Abstract: This paper presents a transpiler framework for converting RTL pseudo code of the POWER Instruction Set Architecture (ISA) to C code, enabling its execution on the Cavatools simulator. The transpiler consists of a lexer and parser, which parse the RTL pseudo code and generate corresponding C code representations. The lexer tokenizes the input code, while the parser applies grammar rules to build an abstract syntax tree (AST). The transpiler ensures compatibility with the Cavatools simulator by generating C code that adheres to its requirements. The resulting C code can be executed on the Cavatools simulator, allowing developers to analyze the instruction-level performance of the Power ISA in real time. The proposed framework facilitates the seamless integration of RTL pseudo code into the Cavatools ecosystem, enabling comprehensive performance analysis and optimization of Power ISA-based code.

Submitted 14 June, 2023; originally announced June 2023.

ACM Class: B.5.2

arXiv:2306.08243 [pdf, other]

MMASD: A Multimodal Dataset for Autism Intervention Analysis

Authors: Jicheng Li, Vuthea Chheang, Pinar Kullu, Eli Brignac, Zhang Guo, Kenneth E. Barner, Anjana Bhat, Roghayeh Leila Barmaki

Abstract: Autism spectrum disorder (ASD) is a developmental disorder characterized by significant social communication impairments and difficulties perceiving and presenting communication cues. Machine learning techniques have been broadly adopted to facilitate autism studies and assessments. However, computational models are primarily concentrated on specific analysis and validated on private datasets in the autism community, which limits comparisons across models due to privacy-preserving data sharing complications. This work presents a novel privacy-preserving open-source dataset, MMASD, as a MultiModal ASD benchmark dataset, collected from play therapy interventions of children with autism. MMASD includes data from 32 children with ASD, and 1,315 data samples segmented from over 100 hours of intervention recordings. To promote public access, each data sample consists of four privacy-preserving modalities of data, some of which are derived from original videos: (1) optical flow, (2) 2D skeleton, (3) 3D skeleton, and (4) clinician ASD evaluation scores of children, e.g., ADOS scores. MMASD aims to assist researchers and therapists in understanding children's cognitive status, monitoring their progress during therapy, and customizing the treatment plan accordingly. It can also inspire downstream tasks such as action quality assessment and interpersonal synchrony estimation. The MMASD dataset can be easily accessed at https://github.com/Li-Jicheng/MMASD-A-Multimodal-Dataset-for-Autism-Intervention-Analysis.

Submitted 1 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: 8 pages, 2 figures

arXiv:2304.09761 [pdf, other]

An innovative Deep Learning Based Approach for Accurate Agricultural Crop Price Prediction

Authors: Mayank Ratan Bhardwaj, Jaydeep Pawar, Abhijnya Bhat, Deepanshu, Inavamsi Enaganti, Kartik Sagar, Y. Narahari

Abstract: Accurate prediction of agricultural crop prices is a crucial input for decision-making by various stakeholders in agriculture: farmers, consumers, retailers, wholesalers, and the Government. These decisions have significant implications including, most importantly, the economic well-being of the farmers. In this paper, our objective is to accurately predict crop prices using historical price information, climate conditions, soil type, location, and other key determinants of crop prices. This is a technically challenging problem, which has been attempted before. In this paper, we propose an innovative deep learning based approach to achieve increased accuracy in price prediction. The proposed approach uses graph neural networks (GNNs) in conjunction with a standard convolutional neural network (CNN) model to exploit geospatial dependencies in prices. Our approach works well with noisy legacy data and produces a performance that is at least 20% better than the results available in the literature. We are able to predict prices up to 30 days ahead. We choose two vegetables, potato (stable price behavior) and tomato (volatile price behavior), and work with noisy public data available from Indian agricultural markets.

Submitted 15 April, 2023; originally announced April 2023.

Comments: 9 pages, 3 figures, 3 tables

arXiv:2304.04998 [pdf, other]

EESMR: Energy Efficient BFT-SMR for the masses

Authors: Adithya Bhat, Akhil Bandarupalli, Manish Nagaraj, Saurabh Bagchi, Aniket Kate, Michael K. Reiter

Abstract: Modern Byzantine Fault-Tolerant State Machine Replication (BFT-SMR) solutions focus on reducing communication complexity, improving throughput, or lowering latency. This work explores the energy efficiency of BFT-SMR protocols. First, we propose a novel SMR protocol that optimizes for the steady state, i.e., when the leader is correct. This is done by reducing the number of required signatures per consensus unit and the communication complexity by order of the number of nodes n compared to the state-of-the-art BFT-SMR solutions. Concretely, we employ the idea that a quorum (collection) of signatures on a proposed value is avoidable during the failure-free runs. Second, we model and analyze the energy efficiency of protocols and argue why the steady-state needs to be optimized. Third, we present an application in the cyber-physical system (CPS) setting, where we consider a partially connected system by optionally leveraging wireless multicasts among neighbors. We analytically determine the parameter ranges for when our proposed protocol offers better energy efficiency than communicating with a baseline protocol utilizing an external trusted node. We present a hypergraph-based network model and generalize previous fault tolerance results to the model. Finally, we demonstrate our approach's practicality by analyzing our protocol's energy efficiency through experiments on a CPS test bed. In particular, we observe as high as 64% energy savings when compared to the state-of-the-art SMR solution for n=10 settings using BLE.

Submitted 14 October, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Appearing in Middleware 2023

arXiv:2304.02822 [pdf, other]

Approach Intelligent Writing Assistants Usability with Seven Stages of Action

Authors: Avinash Bhat, Disha Shrivastava, Jin L. C. Guo

Abstract: Despite the potential of Large Language Models (LLMs) as writing assistants, they are plagued by issues like coherence and fluency of the model output, trustworthiness, ownership of the generated content, and predictability of model performance, thereby limiting their usability. In this position paper, we propose to adopt Norman's seven stages of action as a framework to approach the interaction design of intelligent writing assistants. We illustrate the framework's applicability to writing tasks by providing an example of software tutorial authoring. The paper also discusses the framework as a tool to synthesize research on the interaction design of LLM-based tools and presents examples of tools that support the stages of action. Finally, we briefly outline the potential of a framework for human-LLM interaction research.

Submitted 5 April, 2023; originally announced April 2023.

Comments: The Second Workshop on Intelligent and Interactive Writing Assistants co-located with The ACM CHI Conference on Human Factors in Computing Systems (CHI 2023)

arXiv:2302.11107 [pdf, other]

Non-Uniform Interpolation in Integrated Gradients for Low-Latency Explainable-AI

Authors: Ashwin Bhat, Arijit Raychowdhury

Abstract: There has been a surge in Explainable-AI (XAI) methods that provide insights into the workings of Deep Neural Network (DNN) models. Integrated Gradients (IG) is a popular XAI algorithm that attributes relevance scores to input features commensurate with their contribution to the model's output. However, it requires multiple forward & backward passes through the model. Thus, compared to a single forward-pass inference, there is a significant computational overhead to generate the explanation which hinders real-time XAI. This work addresses the aforementioned issue by accelerating IG with a hardware-aware algorithm optimization. We propose a novel non-uniform interpolation scheme to compute the IG attribution scores which replaces the baseline uniform interpolation. Our algorithm significantly reduces the total interpolation steps required without adversely impacting convergence. Experiments on the ImageNet dataset using a pre-trained InceptionV3 model demonstrate 2.6-3.6$\times$ performance speedup on GPU systems for iso-convergence. This includes the minimal 0.2-3.2% latency overhead introduced by the pre-processing stage of computing the non-uniform interpolation step-sizes.

Submitted 21 February, 2023; originally announced February 2023.
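
For readers unfamiliar with the interpolation step discussed in the abstract above, Integrated Gradients attributes to feature $i$ the quantity $(x_i - x'_i)\int_0^1 \partial F(x' + \alpha(x - x'))/\partial x_i \, d\alpha$, where $x'$ is a baseline input, and the integral is approximated by a sum over interpolation points. The sketch below is a generic midpoint-rule approximation that accepts any partition of $[0, 1]$, uniform or not. It does not reproduce the paper's step-size selection scheme, which the abstract does not describe; the function name, the `edges` parameter, and the toy model are assumptions for illustration.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, edges):
    """Approximate IG with a midpoint rule over an arbitrary partition of [0, 1].

    grad_fn  : callable returning dF/dx at a given input (hypothetical model hook)
    edges    : increasing partition points starting at 0.0 and ending at 1.0
    """
    edges = np.asarray(edges, dtype=float)
    mids = 0.5 * (edges[:-1] + edges[1:])   # one gradient evaluation per interval
    widths = np.diff(edges)                 # interval widths act as weights
    total = np.zeros_like(x, dtype=float)
    for alpha, w in zip(mids, widths):
        total += w * grad_fn(baseline + alpha * (x - baseline))
    return (x - baseline) * total

# Toy model F(x) = sum(x**2), whose gradient is 2 * x.
grad_fn = lambda z: 2.0 * z
x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)

uniform = integrated_gradients(grad_fn, x, baseline, np.linspace(0.0, 1.0, 51))
coarse = integrated_gradients(grad_fn, x, baseline, [0.0, 0.1, 0.3, 1.0])
```

Each interval costs one gradient evaluation, so a partition with fewer intervals means fewer forward and backward passes. For this toy model the gradient is linear along the path, so both partitions recover the exact attributions; a real network generally needs more points where the gradient changes quickly.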

arXiv:2302.08293 [pdf, other]

Social Visual Behavior Analytics for Autism Therapy of Children Based on Automated Mutual Gaze Detection

Authors: Zhang Guo, Vuthea Chheang, Jicheng Li, Kenneth E. Barner, Anjana Bhat, Roghayeh Barmaki

Abstract: Social visual behavior, as a type of non-verbal communication, plays a central role in studying social cognitive processes in interactive and complex settings of autism therapy interventions. However, for social visual behavior analytics in children with autism, it is challenging to collect gaze data manually and evaluate them because it costs a lot of time and effort for human coders. In this paper, we introduce a social visual behavior analytics approach by quantifying the mutual gaze performance of children receiving play-based autism interventions using an automated mutual gaze detection framework. Our analysis is based on a video dataset that captures and records social interactions between children with autism and their therapy trainers (N=28 observations, 84 video clips, 21 Hrs duration). The effectiveness of our framework was evaluated by comparing the mutual gaze ratio derived from the mutual gaze detection framework with the human-coded ratio values. We analyzed the mutual gaze frequency and duration across different therapy settings, activities, and sessions. We created mutual gaze-related measures for social visual behavior score prediction using multiple machine learning-based regression models. The results show that our method provides mutual gaze measures that reliably represent (or even replace) the human coders' hand-coded social gaze measures and effectively evaluates and predicts ASD children's social visual performance during the intervention. Our findings have implications for social interaction analysis in small-group behavior assessments in numerous co-located settings in (special) education and in the workplace.

Submitted 16 February, 2023; originally announced February 2023.

Comments: Accepted to IEEE/ACM international conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 2023

arXiv:2302.03724 [pdf, other]

Assigning Optimal Integer Harmonic Periods to Hard Real Time Tasks

Authors: Anand Bhat, Ragunathan Rajkumar

Abstract: Selecting period values for tasks is a very important step in the design process of a real-time system, especially due to the significance of its impact on system schedulability. It is well known that, under RMS, the utilization bound for a harmonic task set is 100%. Also, polynomial-time algorithms have been developed for response-time analysis of harmonic task sets. In practice, the largest acceptable value for the period of a task is determined by the performance and safety requirements of the application. In this paper, we address the problem of assigning harmonic periods to a task set such that every task gets assigned an integer period less than or equal to its application-specified upper bound and the task utilization of every task is less than 1. We focus on integer solutions given the discrete nature of time in real-time computer systems. We first express this problem of assigning harmonic periods to a task set as a discrete piecewise optimization problem. We then present the 'Discrete Piecewise Harmonic Search' (DPHS) algorithm that outputs an optimal harmonic task assignment. We then define conditions for a metric to be rational for harmonization. We show that commonly used metrics, such as the total percentage error (TPE), total system utilization (TSU), first order error (FOE), and maximum percentage error (MPE), are rational. We next prove that the DPHS algorithm finds the optimal feasible assignment, if one exists, for these rational metrics. We apply the DPHS algorithm to harmonize task sets used in real-world applications to highlight its benefits. We compare the performance of the DPHS algorithm against a brute-force search and find that the DPHS searches up to 94% fewer task sets than the brute-force search that obtains the optimal solution.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 10 pages

ACM Class: C.4
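
The schedulability fact quoted in the abstract above (under RMS, a harmonic task set has a 100% utilization bound) can be checked with a few lines of code. The sketch below only tests whether a given set of integer periods is harmonic and whether its total utilization stays within the bound; it is not the DPHS algorithm, whose details the abstract does not give. The function names and the example task set are assumptions for illustration.

```python
from fractions import Fraction

def is_harmonic(periods):
    """True if every period divides every larger period.

    Checking adjacent pairs of the sorted periods is enough,
    because divisibility is transitive.
    """
    p = sorted(periods)
    return all(b % a == 0 for a, b in zip(p, p[1:]))

def total_utilization(wcets, periods):
    """Exact sum of C_i / T_i using rational arithmetic."""
    return sum(Fraction(c, t) for c, t in zip(wcets, periods))

def rms_schedulable_if_harmonic(wcets, periods):
    """Apply the 100% bound, which holds only for harmonic period sets."""
    return is_harmonic(periods) and total_utilization(wcets, periods) <= 1

# Hypothetical task set: execution times and integer periods in the same time unit.
wcets = [1, 1, 2]
periods = [2, 4, 8]
ok = rms_schedulable_if_harmonic(wcets, periods)
```

Rational arithmetic avoids floating-point rounding when the total utilization sits exactly at the bound, as it does in this example.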

arXiv:2302.00560 [pdf, other]

doi 10.1145/3544548.3581196

Co-Writing with Opinionated Language Models Affects Users' Views

Authors: Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman

Abstract: If large language models like GPT-3 preferably produce a particular point of view, they may influence people's opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write - and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants' writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.

Submitted 1 February, 2023; originally announced February 2023.

Journal ref: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 23-28, 2023, Hamburg, Germany. ACM, New York, NY, USA

arXiv:2301.07315 [pdf, other]

Face Recognition in the age of CLIP & Billion image datasets

Authors: Aaditya Bhat, Shrey Jain

Abstract: CLIP (Contrastive Language-Image Pre-training) models developed by OpenAI have achieved outstanding results on various image recognition and retrieval tasks, displaying strong zero-shot performance. This means that they are able to perform effectively on tasks for which they have not been explicitly trained. Inspired by the success of OpenAI CLIP, a new publicly available dataset called LAION-5B was collected which resulted in the development of open ViT-H/14, ViT-G/14 models that outperform the OpenAI L/14 model. The LAION-5B dataset also released an approximate nearest neighbor index, with a web interface for search & subset creation. In this paper, we evaluate the performance of various CLIP models as zero-shot face recognizers. Our findings show that CLIP models perform well on face recognition tasks, but increasing the size of the CLIP model does not necessarily lead to improved accuracy. Additionally, we investigate the robustness of CLIP models against data poisoning attacks by testing their performance on poisoned data. Through this analysis, we aim to understand the potential consequences and misuse of search engines built using CLIP models, which could potentially function as unintentional face recognition engines.

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2211.03742 [pdf, other]

Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations

Authors: Ashwani Bhat, Ashutosh Modi

Abstract: Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most of the previous work has done causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use the RECCON dataset, which is annotated with cause spans at the utterance level. In particular, we propose MuTEC, an end-to-end Multi-Task learning framework for extracting emotions, emotion cause, and entailment in conversations. This is in contrast to existing baseline models that use ground truth emotions to extract the cause. MuTEC performs better than the baselines for most of the data folds provided in the dataset.

Submitted 7 November, 2022; originally announced November 2022.

Comments: 19 Pages, Accepted at Workshop on Transfer Learning for Natural Language Processing, NeurIPS 2022

arXiv:2210.10922 [pdf, other]

Gradient Backpropagation based Feature Attribution to Enable Explainable-AI on the Edge

Authors: Ashwin Bhat, Adou Sangbone Assoa, Arijit Raychowdhury

Abstract: There has been a recent surge in the field of Explainable AI (XAI) which tackles the problem of providing insights into the behavior of black-box machine learning models. Within this field, \textit{feature attribution} encompasses methods which assign relevance scores to input features and visualize them as a heatmap. Designing flexible accelerators for multiple such algorithms is challenging since the hardware mapping of these algorithms has not been studied yet. In this work, we first analyze the dataflow of gradient backpropagation based feature attribution algorithms to determine the resource overhead required over inference. The gradient computation is optimized to minimize the memory overhead. Second, we develop a High-Level Synthesis (HLS) based configurable FPGA design that is targeted for edge devices and supports three feature attribution algorithms. Tile based computation is employed to maximally use on-chip resources while adhering to the resource constraints. Representative CNNs are trained on CIFAR-10 dataset and implemented on multiple Xilinx FPGAs using 16-bit fixed-point precision demonstrating flexibility of our library. Finally, through efficient reuse of allocated hardware resources, our design methodology demonstrates a pathway to repurpose inference accelerators to support feature attribution with minimal overhead, thereby enabling real-time XAI on the edge.

Submitted 19 October, 2022; originally announced October 2022.

Comments: To appear in 30th IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC 2022)

arXiv:2208.01161 [pdf, other]

Pose Uncertainty Aware Movement Synchrony Estimation via Spatial-Temporal Graph Transformer

Authors: Jicheng Li, Anjana Bhat, Roghayeh Barmaki

Abstract: Movement synchrony reflects the coordination of body movements between interacting dyads. The estimation of movement synchrony has been automated by powerful deep learning models such as transformer networks. However, instead of designing a specialized network for movement synchrony estimation, previous transformer-based works broadly adopted architectures from other tasks such as human activity recognition. Therefore, this paper proposed a skeleton-based graph transformer for movement synchrony estimation. The proposed model applied ST-GCN, a spatial-temporal graph convolutional neural network for skeleton feature extraction, followed by a spatial transformer for spatial feature generation. The spatial transformer is guided by a uniquely designed joint position embedding shared between the same joints of interacting individuals. Besides, we incorporated a temporal similarity matrix in temporal attention computation, considering the intrinsic periodicity of body movements. In addition, the confidence score associated with each joint reflects the uncertainty of a pose, while previous works on movement synchrony estimation have not sufficiently emphasized this point. Since transformer networks demand a significant amount of data to train, we constructed a dataset for movement synchrony estimation using Human3.6M, a benchmark dataset for human activity recognition, and pretrained our model on it using contrastive learning. We further applied knowledge distillation to alleviate information loss introduced by pose detector failure in a privacy-preserving way.
We compared our method with representative approaches on PT13, a dataset collected from autism therapy interventions. Our method achieved an overall accuracy of 88.98% and surpassed its counterparts by a wide margin while maintaining data privacy.

Submitted 1 August, 2022; originally announced August 2022.

Comments: Accepted by 24th ACM International Conference on Multimodal Interaction (ICMI'22). 17 pages, 2 figures

arXiv:2208.01100 [pdf, other]

Dyadic Movement Synchrony Estimation Under Privacy-preserving Conditions

Authors: Jicheng Li, Anjana Bhat, Roghayeh Barmaki

Abstract: Movement synchrony refers to the dynamic temporal connection between the motions of interacting people. The applications of movement synchrony are wide and broad. For example, as a measure of coordination between teammates, synchrony scores are often reported in sports. The autism community also identifies movement synchrony as a key indicator of children's social and developmental achievements. In general, raw video recordings are often used for movement synchrony estimation, with the drawback that they may reveal people's identities. Furthermore, such privacy concerns also hinder data sharing, one major roadblock to a fair comparison between different approaches in autism research. To address the issue, this paper proposes an ensemble method for movement synchrony estimation, one of the first deep-learning-based methods for automatic movement synchrony assessment under privacy-preserving conditions. Our method relies entirely on publicly shareable, identity-agnostic secondary data, such as skeleton data and optical flow. We validate our method on two datasets: (1) PT13 dataset collected from autism therapy interventions and (2) TASD-2 dataset collected from synchronized diving competitions. In this context, our method outperforms its counterpart approaches, both deep neural networks and alternatives.

Submitted 1 August, 2022; originally announced August 2022.

Comments: IEEE ICPR 2022. 8 pages, 3 figures

arXiv:2208.00636 [pdf, other]

doi 10.1145/3581641.3584060

Interacting with next-phrase suggestions: How suggestion systems aid and influence the cognitive processes of writing

Authors: Advait Bhat, Saaket Agashe, Niharika Mohile, Parth Oberoi, Ravi Jangir, Anirudha Joshi

Abstract: Writing with next-phrase suggestions powered by large language models is becoming more pervasive by the day. However, research to understand writers' interaction and decision-making processes while engaging with such systems is still emerging. We conducted a qualitative study to shed light on writers' cognitive processes while writing with next-phrase suggestion systems. To do so, we recruited 14 amateur writers to write two reviews each, one without suggestions and one with suggestions. Additionally, we also positively and negatively biased the suggestion system to get a diverse range of instances where writers' opinions and the bias in the language model align or misalign to varying degrees. We found that writers interact with next-phrase suggestions in various complex ways: Writers abstracted and extracted multiple parts of the suggestions and incorporated them within their writing, even when they disagreed with the suggestion as a whole; along with evaluating the suggestions on various criteria. The suggestion system also had various effects on the writing process, such as altering the writer's usual writing plans, leading to higher levels of distraction etc. Based on our qualitative analysis using the cognitive process model of writing by Hayes as a lens, we propose a theoretical model of 'writer-suggestion interaction' for writing with GPT-2 (and causal language models in general) for a movie review writing task, followed by directions for future research and design.

Submitted 24 January, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

arXiv:2205.02455 [pdf, other]

COGMEN: COntextualized GNN based Multimodal Emotion recognitioN

Authors: Abhinav Joshi, Ashwani Bhat, Ayush Jain, Atin Vikram Singh, Ashutosh Modi

Abstract: Emotions are an inherent part of human interactions, and consequently, it is imperative to develop AI systems that understand and recognize human emotions. During a conversation involving various people, a person's emotions are influenced by the other speaker's utterances and their own emotional state over the utterances. In this paper, we propose the COntextualized Graph Neural Network based Multimodal Emotion recognitioN (COGMEN) system, which leverages local information (i.e., inter/intra dependency between speakers) and global information (context). The proposed model uses a Graph Neural Network (GNN) based architecture to model the complex dependencies (local and global information) in a conversation. Our model gives state-of-the-art (SOTA) results on IEMOCAP and MOSEI datasets, and detailed ablation experiments show the importance of modeling information at both levels.

Submitted 5 May, 2022; originally announced May 2022.

Comments: 17 pages (9 main + 8 appendix). Accepted at NAACL 2022

arXiv:2204.06425 [pdf, other]

doi 10.1145/3544548.3581518

Aspirations and Practice of Model Documentation: Moving the Needle with Nudging and Traceability

Authors: Avinash Bhat, Austin Coursey, Grace Hu, Sixian Li, Nadia Nahar, Shurui Zhou, Christian Kästner, Jin L. C. Guo

Abstract: The documentation practice for machine-learned (ML) models often falls short of established practices for traditional software, which impedes model accountability and inadvertently abets inappropriate use or misuse of models. Recently, model cards, a proposal for model documentation, have attracted notable attention, but their impact on the actual practice is unclear. In this work, we systematically study the model documentation in the field and investigate how to encourage more responsible and accountable documentation practice. Our analysis of publicly available model cards reveals a substantial gap between the proposal and the practice. We then design a tool named DocML aiming to (1) nudge the data scientists to comply with the model cards proposal during the model development, especially the sections related to ethics, and (2) assess and manage the documentation quality. A lab study reveals the benefit of our tool towards long-term documentation quality and accountability.

Submitted 8 February, 2023; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: To be published in proceedings of CHI 2023

arXiv:2202.01966 [pdf, other]

Predictive Closed-Loop Service Automation in O-RAN based Network Slicing

Authors: Joseph Thaliath, Solmaz Niknam, Sukhdeep Singh, Rahul Banerji, Navrati Saxena, Harpreet S. Dhillon, Jeffrey H. Reed, Ali Kashif Bashir, Avinash Bhat, Abhishek Roy

Abstract: Network slicing provides customized and agile network deployment for managing different service types for various verticals under the same infrastructure. To cater to the dynamic service requirements of these verticals and meet the required quality-of-service (QoS) mentioned in the service-level agreement (SLA), network slices need to be isolated through dedicated elements and resources. Additionally, allocated resources to these slices need to be continuously monitored and intelligently managed. This enables immediate detection and correction of any SLA violation to support automated service assurance in a closed-loop fashion. By reducing human intervention, intelligent and closed-loop resource management reduces the cost of offering flexible services. Resource management in a network shared among verticals (potentially administered by different providers) would be further facilitated through open and standardized interfaces. Open radio access network (O-RAN) is perhaps the most promising RAN architecture that inherits all the aforementioned features, namely intelligence, open and standard interfaces, and closed control loop. Inspired by this, in this article we provide a closed-loop and intelligent resource provisioning scheme for O-RAN slicing to prevent SLA violations. In order to maintain realism, a real-world dataset of a large operator is used to train a learning solution for optimizing resource utilization in the proposed closed-loop service automation process.
Moreover, the deployment architecture and the corresponding flow that are cognizant of the O-RAN requirements are also discussed.

Submitted 3 February, 2022; originally announced February 2022.

Comments: 7 pages, 3 figures, 1 table

arXiv:2110.13967 [pdf, other]

Evaluating Serverless Architecture for Big Data Enterprise Applications

Authors: Aimer Bhat, Madhumonti Roy, Heeki Park

Abstract: In this paper, we investigate serverless computing for performing large scale data processing with cloudnative primitives.

Submitted 26 October, 2021; originally announced October 2021.

Comments: 8 pages

Journal ref: BDCAT 2021

arXiv:2108.00578 [pdf, other]

Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

Authors: Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar

Abstract: Neural models command state-of-the-art performance across NLP tasks, including ones involving "reasoning". Models claiming to reason about the evidence presented to them should attend to the correct parts of the input avoiding spurious patterns therein, be self-consistent in their predictions across inputs, and be immune to biases derived from their pre-training in a nuanced, context-sensitive fashion. Do the prevalent *BERT-family of models do so? In this paper, we study this question using the problem of reasoning on tabular data. Tabular inputs are especially well-suited for the study -- they admit systematic probes targeting the properties listed above. Our experiments demonstrate that a RoBERTa-based model, representative of the current state-of-the-art, fails at reasoning on the following counts: it (a) ignores relevant parts of the evidence, (b) is over-sensitive to annotation artifacts, and (c) relies on the knowledge encoded in the pre-trained language model rather than the evidence presented in its tabular inputs. Finally, through inoculation experiments, we show that fine-tuning the model on perturbed data does not help it overcome the above challenges.

Submitted 5 March, 2022; v1 submitted 1 August, 2021; originally announced August 2021.

Comments: 20 pages, 17 figures, 11 tables, TACL 2022, pre-MIT Press publication version

arXiv:2107.14140 [pdf]

Methodology and Analysis of Smart Contracts in Blockchain-Based International Trade Application

Authors: Asif Bhat, Rizal Mohd Nor, Md Amiruzzaman, Md. Rajibul Islam

Abstract: Blockchain is used in a variety of applications where trustworthy computing is required. Trade finance is one of these areas that would benefit immensely from a decentralized way of doing transactions. This paper presents the preliminary assessment of Accepire-BT, a software platform developed for the practice of collaborative Trade Finance. The proposed solution is enforced by smart contracts using Solidity, the underlying programming language for the Ethereum blockchain. We evaluated the performance in the Rinkeby test network by using Remix and MetaMask. The results of the preliminary trial show that smart contracts take less than one minute per cycle. Also, we present a discussion about costs for using the public Ethereum Rinkeby network.

Submitted 27 April, 2022; v1 submitted 14 July, 2021; originally announced July 2021.

Comments: 10 pages, 2 figures, and 1 table. Preprint: submitted to Information and Communication Technology Journals

arXiv:2106.15325 [pdf, other]

SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images

Authors: Abdul Mueed Hafiz, Rouf Ul Alam Bhat, Shabir Ahmad Parah, M. Hassaballah

Abstract: 3D model generation from single 2D RGB images is a challenging and actively researched computer vision task. Various techniques using conventional network architectures have been proposed for the same. However, the body of research work is limited and there are various issues like using inefficient 3D representation formats, weak 3D model generation backbones, inability to generate dense point clouds, dependence on post-processing for generation of dense point clouds, and dependence on silhouettes in RGB images. In this paper, a novel 2D RGB image to point cloud conversion technique is proposed, which improves the state of the art in the field due to its efficient, robust and simple model by using the concept of parallelization in network architecture. It not only uses the efficient and rich 3D representation of point clouds, but also uses a novel and robust point cloud generation backbone in order to address the prevalent issues. This involves using a single-encoder multiple-decoder deep network architecture wherein each decoder generates certain fixed viewpoints. This is followed by fusing all the viewpoints to generate a dense point cloud. Various experiments are conducted on the technique and its performance is compared with those of other state of the art techniques and impressive gains in performance are demonstrated. Code is available at https://github.com/mueedhafiz1982/

Submitted 17 June, 2021; originally announced June 2021.

arXiv:2106.09199 [pdf, other]

A Two-stage Multi-modal Affect Analysis Framework for Children with Autism Spectrum Disorder

Authors: Jicheng Li, Anjana Bhat, Roghayeh Barmaki

Abstract: Autism spectrum disorder (ASD) is a developmental disorder that influences the communication and social behavior of a person in a way that those in the spectrum have difficulty in perceiving other people's facial expressions, as well as presenting and communicating emotions and affect via their own faces and bodies. Some efforts have been made to predict and improve the affect states of children with ASD in play therapy, a common method to improve children's social skills via play and games. However, many previous works only used pre-trained models on benchmark emotion datasets and failed to consider the distinction in emotion between typically developing children and children with autism. In this paper, we present an open-source two-stage multi-modal approach leveraging acoustic and visual cues to predict three main affect states of children with ASD (positive, negative, and neutral) in real-world play therapy scenarios, and achieved an overall accuracy of 72.40%. This work presents a novel way to combine human expertise and machine intelligence for ASD affect recognition by proposing a two-stage schema.

Submitted 16 June, 2021; originally announced June 2021.

Comments: 8 pages including reference; 8 figures

Journal ref: The AAAI-21 Workshop On Affective Content Analysis; 2021

arXiv:2106.07550 [pdf, other]

Attention mechanisms and deep learning for machine vision: A survey of the state of the art

Authors: Abdul Mueed Hafiz, Shabir Ahmad Parah, Rouf Ul Alam Bhat

Abstract: With the advent of state of the art nature-inspired pure attention based models, i.e. transformers, and their success in natural language processing (NLP), their extension to machine vision (MV) tasks was inevitable and much felt. Subsequently, vision transformers (ViTs) were introduced which are giving quite a challenge to the established deep learning based machine vision techniques. However, pure attention based models/architectures like transformers require huge data, large training times and large computational resources. Some recent works suggest that combinations of these two varied fields can prove to build systems which have the advantages of both these fields. Accordingly, this state of the art survey paper is introduced which hopefully will help readers get useful information about this interesting and potential research area. A gentle introduction to attention mechanisms is given, followed by a discussion of the popular attention based deep architectures. Subsequently, the major categories of the intersection of attention mechanisms and deep learning for machine vision (MV) are discussed. Afterwards, the major algorithms, issues and trends within the scope of the paper are discussed.

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2105.11241 [pdf]

Generation of COVID-19 Chest CT Scan Images using Generative Adversarial Networks

Authors: Prerak Mann, Sahaj Jain, Saurabh Mittal, Aruna Bhat

Abstract: SARS-CoV-2, also known as COVID-19 or Coronavirus, is a viral contagious disease that is caused by a novel coronavirus, and has been rapidly spreading across the globe. It is very important to test and isolate people to reduce spread, and from here comes the need to do this quickly and efficiently. According to some studies, Chest-CT outperforms RT-PCR lab testing, which is the current standard, when diagnosing COVID-19 patients. Due to this, computer vision researchers have developed various deep learning systems that can predict COVID-19 using a Chest-CT scan correctly to a certain degree. The accuracy of these systems is limited since deep learning neural networks such as CNNs (Convolutional Neural Networks) need a significantly large quantity of data for training in order to produce good quality results. Since the disease is relatively recent and more focus has been on CXR (Chest XRay) images, the available chest CT Scan image dataset is much smaller. We propose a method, by utilizing GANs, to generate synthetic chest CT images of both positive and negative COVID-19 patients. Using a pre-built predictive model, we concluded that around 40% of the generated images are correctly predicted as COVID-19 positive. The dataset thus generated can be used to train a CNN-based classifier which can help determine COVID-19 in a patient with greater accuracy.

Submitted 20 May, 2021; originally announced May 2021.

arXiv:2104.11959 [pdf, other]

MultiCruise: Eco-Lane Selection Strategy with Eco-Cruise Control for Connected and Automated Vehicles

Authors: Shunsuke Aoki, Lung En Jan, Junfeng Zhao, Anand Bhat, Chen-Fang Chang, Ragunathan Rajkumar

Abstract: Connected and Automated Vehicles (CAVs) have real-time information from the surrounding environment by using local on-board sensors, V2X (Vehicle-to-Everything) communications, pre-loaded vehicle-specific lookup tables, and map database. CAVs are capable of improving energy efficiency by incorporating this information. In particular, Eco-Cruise and Eco-Lane Selection on highways and/or motorways have immense potential to save energy, because there are generally fewer traffic controllers and the vehicles keep moving in general. In this paper, we present a cooperative and energy-efficient lane-selection strategy named MultiCruise, where each CAV selects one among multiple candidate lanes that allows the most energy-efficient travel. MultiCruise incorporates an Eco-Cruise component to select the most energy-efficient lane. The Eco-Cruise component calculates the driving parameters and prospective energy consumption of the ego vehicle for each candidate lane, and the Eco-Lane Selection component uses these values. As a result, MultiCruise can account for multiple data sources, such as the road curvature and the surrounding vehicles' velocities and accelerations. The eco-autonomous driving strategy, MultiCruise, is designed, tested and verified by using a co-simulation test platform that includes autonomous driving software and realistic road networks to study the performance under realistic driving conditions. Our experimental evaluations show that our eco-autonomous MultiCruise saves up to 8.5% fuel consumption.

Submitted 24 April, 2021; originally announced April 2021.

arXiv:2101.08523 [pdf, other]

Adv-OLM: Generating Textual Adversaries via OLM

Authors: Vijit Malik, Ashwani Bhat, Ashutosh Modi

Abstract: Deep learning models are susceptible to adversarial examples that have imperceptible perturbations in the original input, resulting in adversarial attacks against these models. Analysis of these attacks on the state of the art transformers in NLP can help improve the robustness of these models against such adversarial inputs. In this paper, we present Adv-OLM, a black-box attack method that adapts the idea of Occlusion and Language Models (OLM) to the current state of the art attack methods. OLM is used to rank words of a sentence, which are later substituted using word replacement strategies. We experimentally show that our approach outperforms other attack methods for several text classification tasks.

Submitted 21 January, 2021; originally announced January 2021.

Comments: 5 Pages + 1 Page references + 3 Pages Appendix, Accepted at EACL 2021

arXiv:2010.06804 [pdf, other]

Unsupervised Relation Extraction from Language Models using Constrained Cloze Completion

Authors: Ankur Goswami, Akshata Bhat, Hadar Ohana, Theodoros Rekatsinas

Abstract: We show that state-of-the-art self-supervised language models can be readily used to extract relations from a corpus without the need to train a fine-tuned extractive head. We introduce RE-Flex, a simple framework that performs constrained cloze completion over pretrained language models to perform unsupervised relation extraction. RE-Flex uses contextual matching to ensure that language model predictions match supporting evidence from the input corpus that is relevant to a target relation. We perform an extensive experimental study over multiple relation extraction benchmarks and demonstrate that RE-Flex outperforms competing unsupervised relation extraction methods based on pretrained language models by up to 27.8 $F_1$ points compared to the next-best method. Our results show that constrained inference queries against a language model can enable accurate unsupervised relation extraction.

Submitted 14 October, 2020; originally announced October 2020.

Comments: 14 pages, 5 figures, Accepted to Findings of EMNLP 2020

arXiv:2004.07980 [pdf, other]

Co-simulation Platform for Developing InfoRich Energy-Efficient Connected and Automated Vehicles

Authors: Shunsuke Aoki, Lung En Jan, Junfeng Zhao, Anand Bhat, Ragunathan Rajkumar, Chen-Fang Chang

Abstract: With advances in sensing, computing, and communication technologies, Connected and Automated Vehicles (CAVs) are becoming feasible. The advent of CAVs presents new opportunities to improve the energy efficiency of individual vehicles. However, testing and verifying energy-efficient autonomous driving systems is difficult due to safety and repeatability concerns. In this paper, we present a co-simulation platform to develop and test novel vehicle eco-autonomous driving technologies named InfoRich, which incorporates the information from on-board sensors, V2X communications, and a map database. The co-simulation platform includes eco-autonomous driving software, a vehicle dynamics and powertrain (VD&PT) model, and a traffic environment simulator. We also utilize synthetic drive cycles derived from real-world driving data to test the strategies under realistic driving scenarios. To build road networks from the real-world driving data, we develop an Automated Parser and Calculator for Map/Scenario named AutoPASCAL. Overall, the simulation platform provides a realistic vehicle model, powertrain model, sensor model, traffic model, and road-network model to enable the evaluation of the energy efficiency of eco-autonomous driving.

Submitted 16 April, 2020; originally announced April 2020.

arXiv:2003.00274 [pdf, other]

Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment

Authors: Ajaz A. Bhat, Vishwanathan Mohan

Abstract: Corvids, apes, and children solve The Crow and The Pitcher task (from Aesop's Fables) indicating a causal understanding of the task. By cumulatively interacting with different objects, how can cognitive agents abstract the underlying cause-effect relations to predict affordances of novel objects? We address this question by re-enacting the Aesop's Fable task on a robot and present a) a brain-guided neural model of semantic-episodic memory; with b) four task-agnostic learning rules that compare expectations from recalled past episodes with the current scenario to progressively extract the hidden causal relations. The ensuing robot behaviours illustrate causal learning; and predictions for novel objects converge to Archimedes' principle, independent of both the objects explored during learning and the order of their cumulative exploration.

Submitted 29 February, 2020; originally announced March 2020.

Comments: To appear in ICLR 2020. 4 pages. For associated videos, see https://www.youtube.com/playlist?list=PLIfoHEM1gr24EniCzBuUxZ2tqNpQA8QQm

ACM Class: I.2.6

arXiv:2001.00486 [pdf, other]

Reparo: Publicly Verifiable Layer to Repair Blockchains

Authors: Sri Aravinda Krishnan Thyagarajan, Adithya Bhat, Bernardo Magri, Daniel Tschudi, Aniket Kate

Abstract: Although blockchains aim for immutability as their core feature, several instances have exposed the harms of perfect immutability. The permanence of illicit content inserted in Bitcoin poses a challenge to law enforcement agencies like Interpol, and millions of dollars are lost in buggy smart contracts in Ethereum. A line of research then emerged on redactable blockchains, with the aim of solving the problem of redacting illicit content from both permissioned and permissionless blockchains. However, all the existing proposals follow the build-new-chain approach for redactions, and cannot be integrated with existing systems like Bitcoin and Ethereum. We present Reparo, a generic protocol that acts as a publicly verifiable layer on top of any blockchain to perform repairs, ranging from fixing buggy contracts to removing illicit content from the chain. Reparo facilitates additional functionalities for blockchains while maintaining the same provable security guarantee; thus, Reparo can be integrated with existing blockchains and start performing repairs on the pre-existing data. Any system user may propose a repair, and a deliberation process ensues, resulting in a decision that complies with the repair policy of the chain and is publicly verifiable. Our Reparo layer can be easily tailored to different consensus requirements, does not require heavy cryptographic machinery, and can therefore be efficiently instantiated in any permissioned or permissionless setting.
We demonstrate this by giving efficient instantiations of Reparo on top of Ethereum (with PoS and PoW), Bitcoin, and Cardano. Moreover, we evaluate Reparo with Ethereum mainnet and show that the cost of fixing several prominent smart contract bugs is almost negligible. For instance, the cost of repairing the prominent Parity Multisig wallet bug with Reparo is as low as 0.000000018% of the Ethers that can be retrieved after the fix.

Submitted 10 March, 2021; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: Appeared in Financial Cryptography 2021 (https://fc21.ifca.ai/program.php#abstract-talk-66)

arXiv:1909.09662 [pdf, other]

Mine Tunnel Exploration using Multiple Quadrupedal Robots

Authors: Ian D. Miller, Fernando Cladera, Anthony Cowley, Shreyas S. Shivakumar, Elijah S. Lee, Laura Jarin-Lipschitz, Akhilesh Bhat, Neil Rodrigues, Alex Zhou, Avraham Cohen, Adarsh Kulkarni, James Laney, Camillo Jose Taylor, Vijay Kumar

Abstract: Robotic exploration of underground environments is a particularly challenging problem due to communication, endurance, and traversability constraints which necessitate high degrees of autonomy and agility. These challenges are further exacerbated by the need to minimize human intervention for practical applications. While legged robots have the ability to traverse extremely challenging terrain, they also engender new challenges for planning, estimation, and control. In this work, we describe a fully autonomous system for multi-robot mine exploration and mapping using legged quadrupeds, as well as a distributed database mesh networking system for reporting data. In addition, we show results from the DARPA Subterranean Challenge (SubT) Tunnel Circuit demonstrating localization of artifacts after traversals of hundreds of meters. These experiments describe fully autonomous exploration of an unknown Global Navigation Satellite System (GNSS)-denied environment undertaken by legged robots.

Submitted 3 February, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

Comments: Accompanying video: https://www.youtube.com/watch?v=jGXuOCHKC8E

arXiv:1902.05085 [pdf, ps, other]

Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling

Authors: Riyaz Ahmad Bhat, Irshad Ahmad Bhat, Dipti Misra Sharma

Abstract: We investigate the problem of parsing conversational data of morphologically-rich languages such as Hindi where argument scrambling occurs frequently. We evaluate a state-of-the-art non-linear transition-based parsing system on a new dataset containing 506 dependency trees for sentences from Bollywood (Hindi) movie scripts and Twitter posts of Hindi monolingual speakers. We show that a dependency parser trained on a newswire treebank is strongly biased towards the canonical structures and degrades when applied to conversational data. Inspired by Transformational Generative Grammar, we mitigate the sampling bias by generating all theoretically possible alternative word orders of a clause from the existing (kernel) structures in the treebank. Training our parser on canonical and transformed structures improves performance on conversational data by around 9% LAS over the baseline newswire parser.

Submitted 13 February, 2019; originally announced February 2019.

Comments: Proceedings of the 15th International Conference on Parsing Technologies, pages 61-66, Pisa, Italy; September 20-22, 2017. Association for Computational Linguistics

Journal ref: Proceedings of the 15th International Conference on Parsing Technologies, pages 61-66, Pisa, Italy; September 20-22, 2017. Association for Computational Linguistics

arXiv:1902.04132 [pdf, other]

Automatic Inspection of Utility Scale Solar Power Plants using Deep Learning

Authors: Alekh Karkada Ashok, Chandan G, Adithya Bhat, Kausthubh Karnataki, Ganesh Shankar

Abstract: Solar energy has the potential to become the backbone energy source for the world. Utility scale solar power plants (more than 50 MW) could have more than 100K individual solar modules and be spread over more than 200 acres of land. Traditional methods of monitoring each module become too costly at utility scale. We demonstrate an alternative that uses recent advances in deep learning to automatically analyze drone footage. We show that this can be a quick and reliable alternative. We show that it can save large amounts of power and have a large impact on the developing world.

Submitted 20 December, 2018; originally announced February 2019.

Comments: Presented at NIPS 2018 Workshop on Machine Learning for the Developing World

arXiv:1808.07269 [pdf, other]

doi 10.1103/PhysRevD.99.092001

A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber

Authors: MicroBooNE collaboration, C. Adams, M. Alrashed, R. An, J. Anthony, J. Asaadi, A. Ashkenazi, M. Auger, S. Balasubramanian, B. Baller, C. Barnes, G. Barr, M. Bass, F. Bay, A. Bhat, K. Bhattacharya, M. Bishai, A. Blake, T. Bolton, L. Camilleri, D. Caratelli, I. Caro Terrazas, R. Carr, R. Castillo Fernandez, F. Cavanna, et al. (148 additional authors not shown)

Abstract: We have developed a convolutional neural network (CNN) that can, for the first time, make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC). We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stopping muon and $\nu_\mu$ charged-current neutral pion data samples.

Submitted 22 August, 2018; originally announced August 2018.

Journal ref: Phys. Rev. D 99, 092001 (2019)

arXiv:1804.05868 [pdf, other]

Universal Dependency Parsing for Hindi-English Code-switching

Authors: Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma

Abstract: Code-switching is a phenomenon of mixing grammatical structures of two or more languages under varied social constraints. Code-switching data differ so radically from the benchmark corpora used in the NLP community that the application of standard technologies to these data degrades their performance sharply. Unlike standard corpora, these data often need to go through additional processes such as language identification, normalization and/or back-transliteration for their efficient processing. In this paper, we investigate these indispensable processes and other problems associated with syntactic parsing of code-switching data and propose methods to mitigate their effects. In particular, we study dependency parsing of code-switching data of Hindi and English multilingual speakers from Twitter. We present a treebank of Hindi-English code-switching tweets under the Universal Dependencies scheme and propose a neural stacking model for parsing that efficiently leverages part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks. We also present normalization and back-transliteration models with a decoding process tailored for code-switching data. Results show that our neural stacking parser is 1.5% LAS points better than the augmented parsing model and our decoding process improves results by 3.8% LAS points over the first-best normalization and/or back-transliteration.

Submitted 24 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

arXiv:1703.10772 [pdf, ps, other]

Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data

Authors: Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma

Abstract: In this paper, we propose efficient and less resource-intensive strategies for parsing of code-mixed data. These strategies are not constrained by in-domain annotations; rather, they leverage pre-existing monolingual annotated resources for training. We show that these methods can produce significantly better results than an informed baseline. We also present a data set of 450 Hindi and English code-mixed tweets of Hindi multilingual speakers for evaluation. The data set is manually annotated with Universal Dependencies.

Submitted 31 March, 2017; originally announced March 2017.

Comments: 5 pages, EACL 2017 short paper
