subscribe to arXiv mailings

A Survey on Privacy Attacks Against Digital Twin Systems in AI-Robotics

Authors: Ivan A. Fernandez, Subash Neupane, Trisha Chakraborty, Shaswata Mitra, Sudip Mittal, Nisha Pillai, Jingdao Chen, Shahram Rahimi

Abstract: Industry 4.0 has witnessed the rise of complex robots fueled by the integration of Artificial Intelligence/Machine Learning (AI/ML) and Digital Twin (DT) technologies. While these technologies offer numerous benefits, they also introduce potential privacy and security risks. This paper surveys privacy attacks targeting robots enabled by AI and DT models. Exfiltration and data leakage of ML models… ▽ More Industry 4.0 has witnessed the rise of complex robots fueled by the integration of Artificial Intelligence/Machine Learning (AI/ML) and Digital Twin (DT) technologies. While these technologies offer numerous benefits, they also introduce potential privacy and security risks. This paper surveys privacy attacks targeting robots enabled by AI and DT models. Exfiltration and data leakage of ML models are discussed in addition to the potential extraction of models derived from first-principles (e.g., physics-based). We also discuss design considerations with DT-integrated robotics touching on the impact of ML model training, responsible AI and DT safeguards, data governance and ethical considerations on the effectiveness of these attacks. We advocate for a trusted autonomy approach, emphasizing the need to combine robotics, AI, and DT technologies with robust ethical frameworks and trustworthiness principles for secure and reliable AI robotic systems. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: 10 pages, 3 figures, 1 table

arXiv:2406.16993 [pdf, other]

Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation?

Authors: Pallabi Dutta, Soham Bose, Swalpa Kumar Roy, Sushmita Mitra

Abstract: The advancement of developing efficient medical image segmentation has evolved from initial dependence on Convolutional Neural Networks (CNNs) to the present investigation of hybrid models that combine CNNs with Vision Transformers. Furthermore, there is an increasing focus on creating architectures that are both high-performing in medical image segmentation tasks and computationally efficient to… ▽ More The advancement of developing efficient medical image segmentation has evolved from initial dependence on Convolutional Neural Networks (CNNs) to the present investigation of hybrid models that combine CNNs with Vision Transformers. Furthermore, there is an increasing focus on creating architectures that are both high-performing in medical image segmentation tasks and computationally efficient to be deployed on systems with limited resources. Although transformers have several advantages like capturing global dependencies in the input data, they face challenges such as high computational and memory complexity. This paper investigates the integration of CNNs and Vision Extended Long Short-Term Memory (Vision-xLSTM) models by introducing a novel approach called UVixLSTM. The Vision-xLSTM blocks captures temporal and global relationships within the patches extracted from the CNN feature maps. The convolutional feature reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. Our primary objective is to propose that Vision-xLSTM forms a reliable backbone for medical image segmentation tasks, offering excellent segmentation performance and reduced computational complexity. UVixLSTM exhibits superior performance compared to state-of-the-art networks on the publicly-available Synapse dataset. Code is available at: https://github.com/duttapallabi2907/UVixLSTM △ Less

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.04654 [pdf, other]

GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models

Authors: Diptanu De, Shankhanil Mitra, Rajiv Soundararajan

Abstract: The design of no-reference (NR) image quality assessment (IQA) algorithms is extremely important to benchmark and calibrate user experiences in modern visual systems. A major drawback of state-of-the-art NR-IQA methods is their limited ability to generalize across diverse IQA settings with reasonable distribution shifts. Recent text-to-image generative models such as latent diffusion models genera… ▽ More The design of no-reference (NR) image quality assessment (IQA) algorithms is extremely important to benchmark and calibrate user experiences in modern visual systems. A major drawback of state-of-the-art NR-IQA methods is their limited ability to generalize across diverse IQA settings with reasonable distribution shifts. Recent text-to-image generative models such as latent diffusion models generate meaningful visual concepts with fine details related to text concepts. In this work, we leverage the denoising process of such diffusion models for generalized IQA by understanding the degree of alignment between learnable quality-aware text prompts and images. In particular, we learn cross-attention maps from intermediate layers of the denoiser of latent diffusion models to capture quality-aware representations of images. In addition, we also introduce learnable quality-aware text prompts that enable the cross-attention features to be better quality-aware. Our extensive cross database experiments across various user-generated, synthetic, and low-light content-based benchmarking databases show that latent diffusion models can achieve superior generalization in IQA when compared to other methods in the literature. △ Less

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2404.19341 [pdf, other]

Reliable or Deceptive? Investigating Gated Features for Smooth Visual Explanations in CNNs

Authors: Soham Mitra, Atri Sukul, Swalpa Kumar Roy, Pravendra Singh, Vinay Verma

Abstract: Deep learning models have achieved remarkable success across diverse domains. However, the intricate nature of these models often impedes a clear understanding of their decision-making processes. This is where Explainable AI (XAI) becomes indispensable, offering intuitive explanations for model decisions. In this work, we propose a simple yet highly effective approach, ScoreCAM++, which introduces… ▽ More Deep learning models have achieved remarkable success across diverse domains. However, the intricate nature of these models often impedes a clear understanding of their decision-making processes. This is where Explainable AI (XAI) becomes indispensable, offering intuitive explanations for model decisions. In this work, we propose a simple yet highly effective approach, ScoreCAM++, which introduces modifications to enhance the promising ScoreCAM method for visual explainability. Our proposed approach involves altering the normalization function within the activation layer utilized in ScoreCAM, resulting in significantly improved results compared to previous efforts. Additionally, we apply an activation function to the upsampled activation layers to enhance interpretability. This improvement is achieved by selectively gating lower-priority values within the activation layer. Through extensive experiments and qualitative comparisons, we demonstrate that ScoreCAM++ consistently achieves notably superior performance and fairness in interpreting the decision-making process compared to both ScoreCAM and previous methods. △ Less

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.10547 [pdf, other]

A/B testing under Interference with Partial Network Information

Authors: Shiv Shankar, Ritwik Sinha, Yash Chandak, Saayan Mitra, Madalina Fiterau

Abstract: A/B tests are often required to be conducted on subjects that might have social connections. For e.g., experiments on social media, or medical and social interventions to control the spread of an epidemic. In such settings, the SUTVA assumption for randomized-controlled trials is violated due to network interference, or spill-over effects, as treatments to group A can potentially also affect the c… ▽ More A/B tests are often required to be conducted on subjects that might have social connections. For e.g., experiments on social media, or medical and social interventions to control the spread of an epidemic. In such settings, the SUTVA assumption for randomized-controlled trials is violated due to network interference, or spill-over effects, as treatments to group A can potentially also affect the control group B. When the underlying social network is known exactly, prior works have demonstrated how to conduct A/B tests adequately to estimate the global average treatment effect (GATE). However, in practice, it is often impossible to obtain knowledge about the exact underlying network. In this paper, we present UNITE: a novel estimator that relax this assumption and can identify GATE while only relying on knowledge of the superset of neighbors for any subject in the graph. Through theoretical analysis and extensive experiments, we show that the proposed approach performs better in comparison to standard estimators. △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: AISTATS 2024

arXiv:2403.08607 [pdf, other]

MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models

Authors: Subash Neupane, Shaswata Mitra, Sudip Mittal, Noorbakhsh Amiri Golilarz, Shahram Rahimi, Amin Amirlatifi

Abstract: Large Language Models (LLMs) have shown impressive capabilities in generating human-like responses. However, their lack of domain-specific knowledge limits their applicability in healthcare settings, where contextual and comprehensive responses are vital. To address this challenge and enable the generation of patient-centric responses that are contextually relevant and comprehensive, we propose Me… ▽ More Large Language Models (LLMs) have shown impressive capabilities in generating human-like responses. However, their lack of domain-specific knowledge limits their applicability in healthcare settings, where contextual and comprehensive responses are vital. To address this challenge and enable the generation of patient-centric responses that are contextually relevant and comprehensive, we propose MedInsight:a novel retrieval augmented framework that augments LLM inputs (prompts) with relevant background information from multiple sources. MedInsight extracts pertinent details from the patient's medical record or consultation transcript. It then integrates information from authoritative medical textbooks and curated web resources based on the patient's health history and condition. By constructing an augmented context combining the patient's record with relevant medical knowledge, MedInsight generates enriched, patient-specific responses tailored for healthcare applications such as diagnosis, treatment recommendations, or patient education. Experiments on the MTSamples dataset validate MedInsight's effectiveness in generating contextually appropriate medical responses. Quantitative evaluation using the Ragas metric and TruLens for answer similarity and answer correctness demonstrates the model's efficacy. Furthermore, human evaluation studies involving Subject Matter Expert (SMEs) confirm MedInsight's utility, with moderate inter-rater agreement on the relevance and correctness of the generated responses. △ Less

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.01882 [pdf, other]

Using Virtual Reality for Detection and Intervention of Depression -- A Systematic Literature Review

Authors: Mohammad Waqas, Y Pawankumar Gururaj, V D Shanmukha Mitra, Sai Anirudh Karri, Raghu Reddy, Syed Azeemuddin

Abstract: The use of emerging technologies like Virtual Reality (VR) in therapeutic settings has increased in the past few years. By incorporating VR, a mental health condition like depression can be assessed effectively, while also providing personalized motivation and meaningful engagement for treatment purposes. The integration of external sensors further enhances the engagement of the subjects with the… ▽ More The use of emerging technologies like Virtual Reality (VR) in therapeutic settings has increased in the past few years. By incorporating VR, a mental health condition like depression can be assessed effectively, while also providing personalized motivation and meaningful engagement for treatment purposes. The integration of external sensors further enhances the engagement of the subjects with the VR scenes. This paper presents a comprehensive review of existing literature on the detection and treatment of depression using VR. It explores various types of VR scenes, external hardware, innovative metrics, and targeted user studies conducted by researchers and professionals in the field. The paper also discusses potential requirements for designing VR scenes specifically tailored for depression assessment and treatment, with the aim of guiding future practitioners in this area. △ Less

Submitted 4 March, 2024; originally announced March 2024.

Comments: 8 pages, 2 figures, 3 tables, Conference full paper

arXiv:2403.01693 [pdf, other]

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Authors: Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

Abstract: Text-to-image generative models can generate high-quality humans, but realism is lost when generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers of fingers, and physically implausible finger orientations. To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings… ▽ More Text-to-image generative models can generate high-quality humans, but realism is lost when generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers of fingers, and physically implausible finger orientations. To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process. HanDiffuser consists of two components: a Text-to-Hand-Params diffusion model to generate SMPL-Body and MANO-Hand parameters from input text prompts, and a Text-Guided Hand-Params-to-Image diffusion model to synthesize images by conditioning on the prompts and hand parameters generated by the previous component. We incorporate multiple aspects of hand representation, including 3D shapes and joint-level finger positions, orientations and articulations, for robust learning and reliable performance during inference. We conduct extensive quantitative and qualitative experiments and perform user studies to demonstrate the efficacy of our method in generating images with high-quality hands. △ Less

Submitted 21 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

Comments: Revisions: 1. Added a link to project page in the abstract, 2. Updated references and related work, 3. Fixed some grammatical errors

arXiv:2402.17067 [pdf, ps, other]

On Independent Samples Along the Langevin Diffusion and the Unadjusted Langevin Algorithm

Authors: Jiaming Liang, Siddharth Mitra, Andre Wibisono

Abstract: We study the rate at which the initial and current random variables become independent along a Markov chain, focusing on the Langevin diffusion in continuous time and the Unadjusted Langevin Algorithm (ULA) in discrete time. We measure the dependence between random variables via their mutual information. For the Langevin diffusion, we show the mutual information converges to $0$ exponentially fast… ▽ More We study the rate at which the initial and current random variables become independent along a Markov chain, focusing on the Langevin diffusion in continuous time and the Unadjusted Langevin Algorithm (ULA) in discrete time. We measure the dependence between random variables via their mutual information. For the Langevin diffusion, we show the mutual information converges to $0$ exponentially fast when the target is strongly log-concave, and at a polynomial rate when the target is weakly log-concave. These rates are analogous to the mixing time of the Langevin diffusion under similar assumptions. For the ULA, we show the mutual information converges to $0$ exponentially fast when the target is strongly log-concave and smooth. We prove our results by developing the mutual version of the mixing time analyses of these Markov chains. We also provide alternative proofs based on strong data processing inequalities for the Langevin diffusion and the ULA, and by showing regularity results for these processes in mutual information. △ Less

Submitted 26 February, 2024; originally announced February 2024.

Comments: 41 pages

arXiv:2402.16766 [pdf]

doi 10.1109/FIE58773.2023.10342966

The Paradox of Industrial Involvement in Engineering Higher Education

Authors: Srinjoy Mitra, Jean-Pierre Raskin

Abstract: This paper discusses the importance of reflective and socially conscious education in engineering schools, particularly within the EE/CS sector. While most engineering disciplines have historically aligned themselves with the demands of the technology industry, the lack of critical examination of industry practices and their impact on justice, equality, and sustainability is self-evident. Today, t… ▽ More This paper discusses the importance of reflective and socially conscious education in engineering schools, particularly within the EE/CS sector. While most engineering disciplines have historically aligned themselves with the demands of the technology industry, the lack of critical examination of industry practices and their impact on justice, equality, and sustainability is self-evident. Today, the for-profit engineering/technology companies, some of which are among the largest in the world, also shape the narrative of engineering education and research in universities. As engineering graduates form the largest cohorts within STEM disciplines in Western countries, they become future professionals who will work, lead, or even establish companies in this industry. Unfortunately, the curriculum within engineering education often lacks a deep understanding of social realities, an essential component of a comprehensive university education. Here we establish this unusual connection with the industry that has driven engineering higher education for several decades and its obvious negative impacts to society. We analyse this nexus and highlight the need for engineering schools to hold a more critical viewpoint. Given the wealth and power of modern technology companies, particularly in the ICT domain, questioning their techno-solutionism narrative is essential within the institutes of higher education. △ Less

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.04541 [pdf, other]

BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception

Authors: Aniket Roy, Anirban Roy, Soma Mitra, Kuntal Ghosh

Abstract: Visual illusions play a significant role in understanding visual perception. Current methods in understanding and evaluating visual illusions are mostly deterministic filtering based approach and they evaluate on a handful of visual illusions, and the conclusions therefore, are not generic. To this end, we generate a large-scale dataset of 22,366 images (BRI3L: BRightness Illusion Image dataset fo… ▽ More Visual illusions play a significant role in understanding visual perception. Current methods in understanding and evaluating visual illusions are mostly deterministic filtering based approach and they evaluate on a handful of visual illusions, and the conclusions therefore, are not generic. To this end, we generate a large-scale dataset of 22,366 images (BRI3L: BRightness Illusion Image dataset for Identification and Localization of illusory perception) of the five types of brightness illusions and benchmark the dataset using data-driven neural network based approaches. The dataset contains label information - (1) whether a particular image is illusory/nonillusory, (2) the segmentation mask of the illusory region of the image. Hence, both the classification and segmentation task can be evaluated using this dataset. We follow the standard psychophysical experiments involving human subjects to validate the dataset. To the best of our knowledge, this is the first attempt to develop a dataset of visual illusions and benchmark using data-driven approach for illusion classification and localization. We consider five well-studied types of brightness illusions: 1) Hermann grid, 2) Simultaneous Brightness Contrast, 3) White illusion, 4) Grid illusion, and 5) Induced Grating illusion. Benchmarking on the dataset achieves 99.56% accuracy in illusion identification and 84.37% pixel accuracy in illusion localization. The application of deep learning model, it is shown, also generalizes over unseen brightness illusions like brightness assimilation to contrast transitions. We also test the ability of state-of-theart diffusion models to generate brightness illusions. We have provided all the code, dataset, instructions etc in the github repo: https://github.com/aniket004/BRI3L △ Less

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.10036 [pdf, other]

LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge

Authors: Shaswata Mitra, Subash Neupane, Trisha Chakraborty, Sudip Mittal, Aritran Piplai, Manas Gaur, Shahram Rahimi

Abstract: Security Operations Center (SoC) analysts gather threat reports from openly accessible global threat databases and customize them manually to suit a particular organization's needs. These analysts also depend on internal repositories, which act as private local knowledge database for an organization. Credible cyber intelligence, critical operational details, and relevant organizational information… ▽ More Security Operations Center (SoC) analysts gather threat reports from openly accessible global threat databases and customize them manually to suit a particular organization's needs. These analysts also depend on internal repositories, which act as private local knowledge database for an organization. Credible cyber intelligence, critical operational details, and relevant organizational information are all stored in these local knowledge databases. Analysts undertake a labor intensive task utilizing these global and local knowledge databases to manually create organization's unique threat response and mitigation strategies. Recently, Large Language Models (LLMs) have shown the capability to efficiently process large diverse knowledge sources. We leverage this ability to process global and local knowledge databases to automate the generation of organization-specific threat intelligence. In this work, we present LOCALINTEL, a novel automated knowledge contextualization system that, upon prompting, retrieves threat reports from the global threat repositories and uses its local knowledge database to contextualize them for a specific organization. LOCALINTEL comprises of three key phases: global threat intelligence retrieval, local knowledge retrieval, and contextualized completion generation. The former retrieves intelligence from global threat repositories, while the second retrieves pertinent knowledge from the local knowledge database. Finally, the fusion of these knowledge sources is orchestrated through a generator to produce a contextualized completion. △ Less

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.05680 [pdf, other]

Use of Graph Neural Networks in Aiding Defensive Cyber Operations

Authors: Shaswata Mitra, Trisha Chakraborty, Subash Neupane, Aritran Piplai, Sudip Mittal

Abstract: In an increasingly interconnected world, where information is the lifeblood of modern society, regular cyber-attacks sabotage the confidentiality, integrity, and availability of digital systems and information. Additionally, cyber-attacks differ depending on the objective and evolve rapidly to disguise defensive systems. However, a typical cyber-attack demonstrates a series of stages from attack i… ▽ More In an increasingly interconnected world, where information is the lifeblood of modern society, regular cyber-attacks sabotage the confidentiality, integrity, and availability of digital systems and information. Additionally, cyber-attacks differ depending on the objective and evolve rapidly to disguise defensive systems. However, a typical cyber-attack demonstrates a series of stages from attack initiation to final resolution, called an attack life cycle. These diverse characteristics and the relentless evolution of cyber attacks have led cyber defense to adopt modern approaches like Machine Learning to bolster defensive measures and break the attack life cycle. Among the adopted ML approaches, Graph Neural Networks have emerged as a promising approach for enhancing the effectiveness of defensive measures due to their ability to process and learn from heterogeneous cyber threat data. In this paper, we look into the application of GNNs in aiding to break each stage of one of the most renowned attack life cycles, the Lockheed Martin Cyber Kill Chain. We address each phase of CKC and discuss how GNNs contribute to preparing and preventing an attack from a defensive standpoint. Furthermore, We also discuss open research areas and further improvement scopes. △ Less

Submitted 11 January, 2024; originally announced January 2024.

Comments: 35 pages, 9 figures, 8 tables

arXiv:2401.00013 [pdf, other]

HITSnDIFFs: From Truth Discovery to Ability Discovery by Recovering Matrices with the Consecutive Ones Property

Authors: Zixuan Chen, Subhodeep Mitra, R Ravi, Wolfgang Gatterbauer

Abstract: We analyze a general problem in a crowd-sourced setting where one user asks a question (also called item) and other users return answers (also called labels) for this question. Different from existing crowd sourcing work which focuses on finding the most appropriate label for the question (the "truth"), our problem is to determine a ranking of the users based on their ability to answer questions.… ▽ More We analyze a general problem in a crowd-sourced setting where one user asks a question (also called item) and other users return answers (also called labels) for this question. Different from existing crowd sourcing work which focuses on finding the most appropriate label for the question (the "truth"), our problem is to determine a ranking of the users based on their ability to answer questions. We call this problem "ability discovery" to emphasize the connection to and duality with the more well-studied problem of "truth discovery". To model items and their labels in a principled way, we draw upon Item Response Theory (IRT) which is the widely accepted theory behind standardized tests such as SAT and GRE. We start from an idealized setting where the relative performance of users is consistent across items and better users choose better fitting labels for each item. We posit that a principled algorithmic solution to our more general problem should solve this ideal setting correctly and observe that the response matrices in this setting obey the Consecutive Ones Property (C1P). While C1P is well understood algorithmically with various discrete algorithms, we devise a novel variant of the HITS algorithm which we call "HITSNDIFFS" (or HND), and prove that it can recover the ideal C1P-permutation in case it exists. Unlike fast combinatorial algorithms for finding the consecutive ones permutation (if it exists), HND also returns an ordering when such a permutation does not exist. Thus it provides a principled heuristic for our problem that is guaranteed to return the correct answer in the ideal setting. Our experiments show that HND produces user rankings with robustly high accuracy compared to state-of-the-art truth discovery methods. We also show that our novel variant of HITS scales better in the number of users than ABH, the only prior spectral C1P reconstruction algorithm. △ Less

Submitted 21 December, 2023; originally announced January 2024.

Comments: 22 pages, 14 figures, long version of of ICDE 2024 conference paper

arXiv:2312.15425 [pdf, other]

Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos

Authors: Shankhanil Mitra, Rajiv Soundararajan

Abstract: Perceptual quality assessment of user generated content (UGC) videos is challenging due to the requirement of large scale human annotated videos for training. In this work, we address this challenge by first designing a self-supervised Spatio-Temporal Visual Quality Representation Learning (ST-VQRL) framework to generate robust quality aware features for videos. Then, we propose a dual-model based… ▽ More Perceptual quality assessment of user generated content (UGC) videos is challenging due to the requirement of large scale human annotated videos for training. In this work, we address this challenge by first designing a self-supervised Spatio-Temporal Visual Quality Representation Learning (ST-VQRL) framework to generate robust quality aware features for videos. Then, we propose a dual-model based Semi Supervised Learning (SSL) method specifically designed for the Video Quality Assessment (SSL-VQA) task, through a novel knowledge transfer of quality predictions between the two models. Our SSL-VQA method uses the ST-VQRL backbone to produce robust performances across various VQA datasets including cross-database settings, despite being learned with limited human annotated videos. Our model improves the state-of-the-art performance when trained only with limited data by around 10%, and by around 15% when unlabelled data is also used in SSL. Source codes and checkpoints are available at https://github.com/Shankhanil006/SSL-VQA. △ Less

Submitted 24 December, 2023; originally announced December 2023.

Comments: Accepted to 38th AAAI conference on AI (AAAI 24)

arXiv:2312.15036 [pdf, other]

doi 10.1145/3583740.3626617

SODA: Protecting Proprietary Information in On-Device Machine Learning Models

Authors: Akanksha Atrey, Ritwik Sinha, Saayan Mitra, Prashant Shenoy

Abstract: The growth of low-end hardware has led to a proliferation of machine learning-based services in edge applications. These applications gather contextual information about users and provide some services, such as personalized offers, through a machine learning (ML) model. A growing practice has been to deploy such ML models on the user's device to reduce latency, maintain user privacy, and minimize… ▽ More The growth of low-end hardware has led to a proliferation of machine learning-based services in edge applications. These applications gather contextual information about users and provide some services, such as personalized offers, through a machine learning (ML) model. A growing practice has been to deploy such ML models on the user's device to reduce latency, maintain user privacy, and minimize continuous reliance on a centralized source. However, deploying ML models on the user's edge device can leak proprietary information about the service provider. In this work, we investigate on-device ML models that are used to provide mobile services and demonstrate how simple attacks can leak proprietary information of the service provider. We show that different adversaries can easily exploit such models to maximize their profit and accomplish content theft. Motivated by the need to thwart such attacks, we present an end-to-end framework, SODA, for deploying and serving on edge devices while defending against adversarial usage. Our results demonstrate that SODA can detect adversarial usage with 89% accuracy in less than 50 queries with minimal impact on service performance, latency, and storage. △ Less

Submitted 22 December, 2023; originally announced December 2023.

Journal ref: ACM/IEEE Symposium on Edge Computing 2023

arXiv:2312.13427 [pdf, other]

doi 10.1145/3626762

R2D2: Reducing Redundancy and Duplication in Data Lakes

Authors: Raunak Shah, Koyel Mukherjee, Atharv Tyagi, Sai Keerthana Karnam, Dhruv Joshi, Shivam Bhosale, Subrata Mitra

Abstract: Enterprise data lakes often suffer from substantial amounts of duplicate and redundant data, with data volumes ranging from terabytes to petabytes. This leads to both increased storage costs and unnecessarily high maintenance costs for these datasets. In this work, we focus on identifying and reducing redundancy in enterprise data lakes by addressing the problem of 'dataset containment'. To the be… ▽ More Enterprise data lakes often suffer from substantial amounts of duplicate and redundant data, with data volumes ranging from terabytes to petabytes. This leads to both increased storage costs and unnecessarily high maintenance costs for these datasets. In this work, we focus on identifying and reducing redundancy in enterprise data lakes by addressing the problem of 'dataset containment'. To the best of our knowledge, this is one of the first works that addresses table-level containment at a large scale. We propose R2D2: a three-step hierarchical pipeline that efficiently identifies almost all instances of containment by progressively reducing the search space in the data lake. It first builds (i) a schema containment graph, followed by (ii) statistical min-max pruning, and finally, (iii) content level pruning. We further propose minimizing the total storage and access costs by optimally identifying redundant datasets that can be deleted (and reconstructed on demand) while respecting latency constraints. We implement our system on Azure Databricks clusters using Apache Spark for enterprise data stored in ADLS Gen2, and on AWS clusters for open-source data. In contrast to existing modified baselines that are inaccurate or take several days to run, our pipeline can process an enterprise customer data lake at the TB scale in approximately 5 hours with high accuracy. We present theoretical results as well as extensive empirical validation on both enterprise (scale of TBs) and open-source datasets (scale of MBs - GBs), which showcase the effectiveness of our pipeline. △ Less

Submitted 20 December, 2023; originally announced December 2023.

Comments: The first two authors contributed equally. 25 pages, accepted to the International Conference on Management of Data (SIGMOD) 2024. ©Raunak Shah | ACM 2023. This is the author's version of the work. Not for redistribution. The definitive Version of Record was published in Proceedings of the ACM on Management of Data (PACMMOD), http://dx.doi.org/10.1145/3626762

Journal ref: Proc. ACM Manag. Data 1, 4, Article 268 (December 2023), 25 pages

arXiv:2312.04838 [pdf, other]

Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment

Authors: Suhas Srinath, Shankhanil Mitra, Shika Rao, Rajiv Soundararajan

Abstract: No-reference (NR) image quality assessment (IQA) is an important tool in enhancing the user experience in diverse visual applications. A major drawback of state-of-the-art NR-IQA techniques is their reliance on a large number of human annotations to train models for a target IQA application. To mitigate this requirement, there is a need for unsupervised learning of generalizable quality representa… ▽ More No-reference (NR) image quality assessment (IQA) is an important tool in enhancing the user experience in diverse visual applications. A major drawback of state-of-the-art NR-IQA techniques is their reliance on a large number of human annotations to train models for a target IQA application. To mitigate this requirement, there is a need for unsupervised learning of generalizable quality representations that capture diverse distortions. We enable the learning of low-level quality features agnostic to distortion types by introducing a novel quality-aware contrastive loss. Further, we leverage the generalizability of vision-language models by fine-tuning one such model to extract high-level image quality information through relevant text prompts. The two sets of features are combined to effectively predict quality by training a simple regressor with very few samples on a target dataset. Additionally, we design zero-shot quality predictions from both pathways in a completely blind setting. Our experiments on diverse datasets encompassing various distortions show the generalizability of the features and their superior performance in the data-efficient and zero-shot settings. Code will be made available at https://github.com/suhas-srinath/GRepQ. △ Less

Submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted to IEEE/CVF WACV 2024

arXiv:2312.04429 [pdf, other]

Approximate Caching for Efficiently Serving Diffusion Models

Authors: Shubham Agarwal, Subrata Mitra, Sarthak Chakraborty, Srikrishna Karanam, Koyel Mukherjee, Shiv Saini

Abstract: Text-to-image generation using diffusion models has seen explosive popularity owing to their ability in producing high quality images adhering to text prompts. However, production-grade diffusion model serving is a resource intensive task that not only require high-end GPUs which are expensive but also incurs considerable latency. In this paper, we introduce a technique called approximate-caching… ▽ More Text-to-image generation using diffusion models has seen explosive popularity owing to their ability in producing high quality images adhering to text prompts. However, production-grade diffusion model serving is a resource intensive task that not only require high-end GPUs which are expensive but also incurs considerable latency. In this paper, we introduce a technique called approximate-caching that can reduce such iterative denoising steps for an image generation based on a prompt by reusing intermediate noise states created during a prior image generation for similar prompts. Based on this idea, we present an end to end text-to-image system, Nirvana, that uses the approximate-caching with a novel cache management-policy Least Computationally Beneficial and Frequently Used (LCBFU) to provide % GPU compute savings, 19.8% end-to-end latency reduction and 19% dollar savings, on average, on two real production workloads. We further present an extensive characterization of real production text-to-image prompts from the perspective of caching, popularity and reuse of intermediate states in a large production environment. △ Less

Submitted 7 December, 2023; originally announced December 2023.

Comments: Accepted at NSDI'24

arXiv:2311.11509 [pdf, other]

Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information

Authors: Zhengmian Hu, Gang Wu, Saayan Mitra, Ruiyi Zhang, Tong Sun, Heng Huang, Viswanathan Swaminathan

Abstract: In recent years, Large Language Models (LLM) have emerged as pivotal tools in various applications. However, these models are susceptible to adversarial prompt attacks, where attackers can carefully curate input strings that mislead LLMs into generating incorrect or undesired outputs. Previous work has revealed that with relatively simple yet effective attacks based on discrete optimization, it is… ▽ More In recent years, Large Language Models (LLM) have emerged as pivotal tools in various applications. However, these models are susceptible to adversarial prompt attacks, where attackers can carefully curate input strings that mislead LLMs into generating incorrect or undesired outputs. Previous work has revealed that with relatively simple yet effective attacks based on discrete optimization, it is possible to generate adversarial prompts that bypass moderation and alignment of the models. This vulnerability to adversarial prompts underscores a significant concern regarding the robustness and reliability of LLMs. Our work aims to address this concern by introducing a novel approach to detecting adversarial prompts at a token level, leveraging the LLM's capability to predict the next token's probability. We measure the degree of the model's perplexity, where tokens predicted with high probability are considered normal, and those exhibiting high perplexity are flagged as adversarial. Additionaly, our method also integrates context understanding by incorporating neighboring token information to encourage the detection of contiguous adversarial prompt sequences. To this end, we design two algorithms for adversarial prompt detection: one based on optimization techniques and another on Probabilistic Graphical Models (PGM). Both methods are equipped with efficient solving methods, ensuring efficient adversarial prompt detection. Our token-level detection result can be visualized as heatmap overlays on the text sequence, allowing for a clearer and more intuitive representation of which part of the text may contain adversarial prompts. △ Less

Submitted 18 February, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

arXiv:2311.11429 [pdf, other]

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

Authors: Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou

Abstract: In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem~(\cite{prr89}): Given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exact $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that… ▽ More In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem~(\cite{prr89}): Given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exact $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i,b_i \rangle \geq ρ\cdot d$, for a threshold $ρ\in (0,1)$, the goal is to identify those $k$ heavy inner products. We provide an algorithm that runs in $O(n^{2 ω/ 3+ o(1)})$ time to find the $k$ inner product pairs that surpass $ρ\cdot d$ threshold with high probability, where $ω$ is the current matrix multiplication exponent. By solving this problem, our method speed up the training of neural networks with ReLU activation function. △ Less

Submitted 19 November, 2023; originally announced November 2023.

Comments: IEEE BigData 2023

arXiv:2311.10811 [pdf, other]

A novel post-hoc explanation comparison metric and applications

Authors: Shreyan Mitra, Leilani Gilpin

Abstract: Explanatory systems make the behavior of machine learning models more transparent, but are often inconsistent. To quantify the differences between explanatory systems, this paper presents the Shreyan Distance, a novel metric based on the weighted difference between ranked feature importance lists produced by such systems. This paper uses the Shreyan Distance to compare two explanatory systems, SHA… ▽ More Explanatory systems make the behavior of machine learning models more transparent, but are often inconsistent. To quantify the differences between explanatory systems, this paper presents the Shreyan Distance, a novel metric based on the weighted difference between ranked feature importance lists produced by such systems. This paper uses the Shreyan Distance to compare two explanatory systems, SHAP and LIME, for both regression and classification learning tasks. Because we find that the average Shreyan Distance varies significantly between these two tasks, we conclude that consistency between explainers not only depends on inherent properties of the explainers themselves, but also the type of learning task. This paper further contributes the XAISuite library, which integrates the Shreyan distance algorithm into machine learning pipelines. △ Less

Submitted 17 November, 2023; originally announced November 2023.

Comments: 8 pages, 4 figures, 2 tables, and 1 listing. arXiv admin note: substantial text overlap with arXiv:2304.08499

MSC Class: I.2

arXiv:2311.08652 [pdf, other]

Refining Perception Contracts: Case Studies in Vision-based Safe Auto-landing

Authors: Yangge Li, Benjamin C Yang, Yixuan Jia, Daniel Zhuang, Sayan Mitra

Abstract: Perception contracts provide a method for evaluating safety of control systems that use machine learning for perception. A perception contract is a specification for testing the ML components, and it gives a method for proving end-to-end system-level safety requirements. The feasibility of contract-based testing and assurance was established earlier in the context of straight lane keeping: a 3-dim… ▽ More Perception contracts provide a method for evaluating safety of control systems that use machine learning for perception. A perception contract is a specification for testing the ML components, and it gives a method for proving end-to-end system-level safety requirements. The feasibility of contract-based testing and assurance was established earlier in the context of straight lane keeping: a 3-dimensional system with relatively simple dynamics. This paper presents the analysis of two 6 and 12-dimensional flight control systems that use multi-stage, heterogeneous, ML-enabled perception. The paper advances methodology by introducing an algorithm for constructing data and requirement guided refinement of perception contracts (DaRePC). The resulting analysis provides testable contracts which establish the state and environment conditions under which an aircraft can safety touchdown on the runway and a drone can safely pass through a sequence of gates. It can also discover conditions (e.g., low-horizon sun) that can possibly violate the safety of the vision-based control system. △ Less

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.04817 [pdf, other]

Decentralized Personalized Online Federated Learning

Authors: Renzhi Wu, Saayan Mitra, Xiang Chen, Anup Rao

Abstract: Vanilla federated learning does not support learning in an online environment, learning a personalized model on each client, and learning in a decentralized setting. There are existing methods extending federated learning in each of the three aspects. However, some important applications on enterprise edge servers (e.g. online item recommendation at global scale) involve the three aspects at the s… ▽ More Vanilla federated learning does not support learning in an online environment, learning a personalized model on each client, and learning in a decentralized setting. There are existing methods extending federated learning in each of the three aspects. However, some important applications on enterprise edge servers (e.g. online item recommendation at global scale) involve the three aspects at the same time. Therefore, we propose a new learning setting \textit{Decentralized Personalized Online Federated Learning} that considers all the three aspects at the same time. In this new setting for learning, the first technical challenge is how to aggregate the shared model parameters from neighboring clients to obtain a personalized local model with good performance on each client. We propose to directly learn an aggregation by optimizing the performance of the local model with respect to the aggregation weights. This not only improves personalization of each local model but also helps the local model adapting to potential data shift by intelligently incorporating the right amount of information from its neighbors. The second challenge is how to select the neighbors for each client. We propose a peer selection method based on the learned aggregation weights enabling each client to select the most helpful neighbors and reduce communication cost at the same time. We verify the effectiveness and robustness of our proposed method on three real-world item recommendation datasets and one air quality prediction dataset. △ Less

Submitted 8 November, 2023; originally announced November 2023.

Journal ref: IEEE BigData 2023

arXiv:2310.13227 [pdf, other]

ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search

Authors: Yuchen Zhuang, Xiang Chen, Tong Yu, Saayan Mitra, Victor Bursztyn, Ryan A. Rossi, Somdeb Sarkhel, Chao Zhang

Abstract: Large language models (LLMs) have demonstrated powerful decision-making and planning capabilities in solving complicated real-world problems. LLM-based autonomous agents can interact with diverse tools (e.g., functional APIs) and generate solution plans that execute a series of API function calls in a step-by-step manner. The multitude of candidate API function calls significantly expands the acti… ▽ More Large language models (LLMs) have demonstrated powerful decision-making and planning capabilities in solving complicated real-world problems. LLM-based autonomous agents can interact with diverse tools (e.g., functional APIs) and generate solution plans that execute a series of API function calls in a step-by-step manner. The multitude of candidate API function calls significantly expands the action space, amplifying the critical need for efficient action space navigation. However, existing methods either struggle with unidirectional exploration in expansive action spaces, trapped into a locally optimal solution, or suffer from exhaustively traversing all potential actions, causing inefficient navigation. To address these issues, we propose ToolChain*, an efficient tree search-based planning algorithm for LLM-based agents. It formulates the entire action space as a decision tree, where each node represents a possible API function call involved in a solution plan. By incorporating the A* search algorithm with task-specific cost function design, it efficiently prunes high-cost branches that may involve incorrect actions, identifying the most low-cost valid path as the solution. Extensive experiments on multiple tool-use and reasoning tasks demonstrate that ToolChain* efficiently balances exploration and exploitation within an expansive action space. It outperforms state-of-the-art baselines on planning and reasoning tasks by 3.1% and 3.5% on average while requiring 7.35x and 2.31x less time, respectively. △ Less

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.08565 [pdf, other]

Security Considerations in AI-Robotics: A Survey of Current Methods, Challenges, and Opportunities

Authors: Subash Neupane, Shaswata Mitra, Ivan A. Fernandez, Swayamjit Saha, Sudip Mittal, Jingdao Chen, Nisha Pillai, Shahram Rahimi

Abstract: Robotics and Artificial Intelligence (AI) have been inextricably intertwined since their inception. Today, AI-Robotics systems have become an integral part of our daily lives, from robotic vacuum cleaners to semi-autonomous cars. These systems are built upon three fundamental architectural elements: perception, navigation and planning, and control. However, while the integration of AI-Robotics sys… ▽ More Robotics and Artificial Intelligence (AI) have been inextricably intertwined since their inception. Today, AI-Robotics systems have become an integral part of our daily lives, from robotic vacuum cleaners to semi-autonomous cars. These systems are built upon three fundamental architectural elements: perception, navigation and planning, and control. However, while the integration of AI-Robotics systems has enhanced the quality our lives, it has also presented a serious problem - these systems are vulnerable to security attacks. The physical components, algorithms, and data that make up AI-Robotics systems can be exploited by malicious actors, potentially leading to dire consequences. Motivated by the need to address the security concerns in AI-Robotics systems, this paper presents a comprehensive survey and taxonomy across three dimensions: attack surfaces, ethical and legal concerns, and Human-Robot Interaction (HRI) security. Our goal is to provide users, developers and other stakeholders with a holistic understanding of these areas to enhance the overall AI-Robotics system security. We begin by surveying potential attack surfaces and provide mitigating defensive strategies. We then delve into ethical issues, such as dependency and psychological impact, as well as the legal concerns regarding accountability for these systems. Besides, emerging trends such as HRI are discussed, considering privacy, integrity, safety, trustworthiness, and explainability concerns. Finally, we present our vision for future research directions in this dynamic and promising field. △ Less

Submitted 25 January, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.04288 [pdf, other]

Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning

Authors: Kristina Miller, Christopher K. Zeitler, William Shen, Kerianne Hobbs, Sayan Mitra, John Schierman, Mahesh Viswanathan

Abstract: A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup (or safety) controller. The relevant computational design problem is to create a logic that assures safety by switching to the safety controller as needed, while maximizing some performance criteria, such as the utilization of the untrusted controll… ▽ More A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup (or safety) controller. The relevant computational design problem is to create a logic that assures safety by switching to the safety controller as needed, while maximizing some performance criteria, such as the utilization of the untrusted controller. Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present a new approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee safety and leverage machine learning technologies for scalability. We have implemented this algorithm and present experimental results comparing our approach with state-of-the-art reachability and simulation-based RTA approaches in a number of scenarios using aircraft models in 3D space with complex safety requirements. Our approach can guarantee safety while increasing utilization of the experimental controller over existing approaches. △ Less

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.13515 [pdf, other]

Learning-based Inverse Perception Contracts and Applications

Authors: Dawei Sun, Benjamin C. Yang, Sayan Mitra

Abstract: Perception modules are integral in many modern autonomous systems, but their accuracy can be subject to the vagaries of the environment. In this paper, we propose a learning-based approach that can automatically characterize the error of a perception module from data and use this for safe control. The proposed approach constructs an inverse perception contract (IPC) which generates a set that cont… ▽ More Perception modules are integral in many modern autonomous systems, but their accuracy can be subject to the vagaries of the environment. In this paper, we propose a learning-based approach that can automatically characterize the error of a perception module from data and use this for safe control. The proposed approach constructs an inverse perception contract (IPC) which generates a set that contains the ground-truth value that is being estimated by the perception module, with high probability. We apply the proposed approach to study a vision pipeline deployed on a quadcopter. With the proposed approach, we successfully constructed an IPC for the vision pipeline. We then designed a control algorithm that utilizes the learned IPC, with the goal of landing the quadcopter safely on a landing pad. Experiments show that with the learned IPC, the control algorithm safely landed the quadcopter despite the error from the perception module, while the baseline algorithm without using the learned IPC failed to do so. △ Less

Submitted 3 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.12355 [pdf]

Role of ICT Innovation in Perpetuating the Myth of Techno-Solutionism

Authors: Srinjoy Mitra, Jean-Pierre Raskin, Mario Pansera

Abstract: Innovation in Information and Communication Technology has become one of the key economic drivers of our technology dependent world. In popular notion, the tech industry or how ICT is often known has become synonymous to all technologies that drive modernity. Digital technologies have become so pervasive that it is hard to imagine new technology developments that are not totally or partially influ… ▽ More Innovation in Information and Communication Technology has become one of the key economic drivers of our technology dependent world. In popular notion, the tech industry or how ICT is often known has become synonymous to all technologies that drive modernity. Digital technologies have become so pervasive that it is hard to imagine new technology developments that are not totally or partially influenced by ICT innovations. Furthermore, the pace of innovation in ICT sector over the last few decades has been unprecedented in human history. In this paper we argue that, not only ICT had a tremendous impact on the way we communicate and produce but this innovation paradigm has crucially shaped collective expectations and imagination about what technology more broadly can actually deliver. These expectations have often crystalised into a widespread acceptance, among general public and policy makers, of technosolutionism. This is a belief that technology not restricted to ICT alone can solve all problems humanity is facing from poverty and inequality to ecosystem loss and climate change. In this paper we show the many impacts of relentless ICT innovation. The spectacular advances in this sector, coupled with corporate power that benefits from them have facilitated the uptake by governments and industries of an uncritical narrative of techno-optimist that neglects the complexity of the wicked problems that affect the present and future of humanity. △ Less

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2309.10878 [pdf, other]

DeepliteRT: Computer Vision at the Edge

Authors: Saad Ashfaq, Alexander Hoffman, Saptarshi Mitra, Sudhakar Sah, MohammadHossein AskariHemmat, Ehsan Saboori

Abstract: The proliferation of edge devices has unlocked unprecedented opportunities for deep learning model deployment in computer vision applications. However, these complex models require considerable power, memory and compute resources that are typically not available on edge platforms. Ultra low-bit quantization presents an attractive solution to this problem by scaling down the model weights and activ… ▽ More The proliferation of edge devices has unlocked unprecedented opportunities for deep learning model deployment in computer vision applications. However, these complex models require considerable power, memory and compute resources that are typically not available on edge platforms. Ultra low-bit quantization presents an attractive solution to this problem by scaling down the model weights and activations from 32-bit to less than 8-bit. We implement highly optimized ultra low-bit convolution operators for ARM-based targets that outperform existing methods by up to 4.34x. Our operator is implemented within Deeplite Runtime (DeepliteRT), an end-to-end solution for the compilation, tuning, and inference of ultra low-bit models on ARM devices. Compiler passes in DeepliteRT automatically convert a fake-quantized model in full precision to a compact ultra low-bit representation, easing the process of quantized model deployment on commodity hardware. We analyze the performance of DeepliteRT on classification and detection models against optimized 32-bit floating-point, 8-bit integer, and 2-bit baselines, achieving significant speedups of up to 2.20x, 2.33x and 2.17x, respectively. △ Less

Submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted at British Machine Vision Conference (BMVC) 2023

arXiv:2309.08814 [pdf, other]

URA*: Uncertainty-aware Path Planning using Image-based Aerial-to-Ground Traversability Estimation for Off-road Environments

Authors: Charles Moore, Shaswata Mitra, Nisha Pillai, Marc Moore, Sudip Mittal, Cindy Bethel, Jingdao Chen

Abstract: A major challenge with off-road autonomous navigation is the lack of maps or road markings that can be used to plan a path for autonomous robots. Classical path planning methods mostly assume a perfectly known environment without accounting for the inherent perception and sensing uncertainty from detecting terrain and obstacles in off-road environments. Recent work in computer vision and deep neur… ▽ More A major challenge with off-road autonomous navigation is the lack of maps or road markings that can be used to plan a path for autonomous robots. Classical path planning methods mostly assume a perfectly known environment without accounting for the inherent perception and sensing uncertainty from detecting terrain and obstacles in off-road environments. Recent work in computer vision and deep neural networks has advanced the capability of terrain traversability segmentation from raw images; however, the feasibility of using these noisy segmentation maps for navigation and path planning has not been adequately explored. To address this problem, this research proposes an uncertainty-aware path planning method, URA* using aerial images for autonomous navigation in off-road environments. An ensemble convolutional neural network (CNN) model is first used to perform pixel-level traversability estimation from aerial images of the region of interest. The traversability predictions are represented as a grid of traversal probability values. An uncertainty-aware planner is then applied to compute the best path from a start point to a goal point given these noisy traversal probability estimates. The proposed planner also incorporates replanning techniques to allow rapid replanning during online robot operation. The proposed method is evaluated on the Massachusetts Road Dataset, the DeepGlobe dataset, as well as a dataset of aerial images from off-road proving grounds at Mississippi State University. Results show that the proposed image segmentation and planning methods outperform conventional planning algorithms in terms of the quality and feasibility of the initial path, as well as the quality of replanned paths. △ Less

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2308.11395 [pdf, other]

ULGss: A Strategy to construct a Library of Universal Logic Gates for $N$-variable Boolean Logic beyond NAND and NOR

Authors: Aadarsh G. Goenka, Shyamali Mitra, Mrinal K. Naskar, Nibaran Das

Abstract: In literature, NAND and NOR are two logic gates that display functional completeness, hence regarded as Universal gates. So, the present effort is focused on exploring a library of universal gates in binary that are still unexplored in literature along with a broad and systematic approach to classify the logic connectives. The study shows that the number of Universal Gates in any logic system grow… ▽ More In literature, NAND and NOR are two logic gates that display functional completeness, hence regarded as Universal gates. So, the present effort is focused on exploring a library of universal gates in binary that are still unexplored in literature along with a broad and systematic approach to classify the logic connectives. The study shows that the number of Universal Gates in any logic system grows exponentially with the number of input variables $N$. It is revealed that there are $56$ Universal gates in binary for $N=3$. It is shown that the ratio of the count of Universal gates to the total number of Logic gates is $\approx $ $\frac{1}{4}$ or 0.25. Adding constants $0,1$ allow for the creation of $4$ additional (for $N=2$) and $169$ additional Universal Gates (for $N=3$). In this article, the mathematical and logical underpinnings of the concept of universal logic gates are presented, along with a search strategy $ULG_{SS}$ exploring multiple paths leading to their identification. A fast-track approach has been introduced that uses the hexadecimal representation of a logic gate to quickly ascertain its attribute. △ Less

Submitted 22 August, 2023; originally announced August 2023.

Comments: 8 pages 10 tables 11 figures

arXiv:2307.14735 [pdf, other]

Test Time Adaptation for Blind Image Quality Assessment

Authors: Subhadeep Roy, Shankhanil Mitra, Soma Biswas, Rajiv Soundararajan

Abstract: While the design of blind image quality assessment (IQA) algorithms has improved significantly, the distribution shift between the training and testing scenarios often leads to a poor performance of these methods at inference time. This motivates the study of test time adaptation (TTA) techniques to improve their performance at inference time. Existing auxiliary tasks and loss functions used for T… ▽ More While the design of blind image quality assessment (IQA) algorithms has improved significantly, the distribution shift between the training and testing scenarios often leads to a poor performance of these methods at inference time. This motivates the study of test time adaptation (TTA) techniques to improve their performance at inference time. Existing auxiliary tasks and loss functions used for TTA may not be relevant for quality-aware adaptation of the pre-trained model. In this work, we introduce two novel quality-relevant auxiliary tasks at the batch and sample levels to enable TTA for blind IQA. In particular, we introduce a group contrastive loss at the batch level and a relative rank loss at the sample level to make the model quality aware and adapt to the target data. Our experiments reveal that even using a small batch of images from the test distribution helps achieve significant improvement in performance by updating the batch normalization statistics of the source model. △ Less

Submitted 26 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

Comments: Accepted to ICCV 2023

arXiv:2307.13901 [pdf, other]

YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems

Authors: Ivan Lazarevich, Matteo Grimaldi, Ravish Kumar, Saptarshi Mitra, Shahrukh Khan, Sudhakar Sah

Abstract: We present YOLOBench, a benchmark comprised of 550+ YOLO-based object detection models on 4 different datasets and 4 different embedded hardware platforms (x86 CPU, ARM CPU, Nvidia GPU, NPU). We collect accuracy and latency numbers for a variety of YOLO-based one-stage detectors at different model scales by performing a fair, controlled comparison of these detectors with a fixed training environme… ▽ More We present YOLOBench, a benchmark comprised of 550+ YOLO-based object detection models on 4 different datasets and 4 different embedded hardware platforms (x86 CPU, ARM CPU, Nvidia GPU, NPU). We collect accuracy and latency numbers for a variety of YOLO-based one-stage detectors at different model scales by performing a fair, controlled comparison of these detectors with a fixed training environment (code and training hyperparameters). Pareto-optimality analysis of the collected data reveals that, if modern detection heads and training techniques are incorporated into the learning process, multiple architectures of the YOLO series achieve a good accuracy-latency trade-off, including older models like YOLOv3 and YOLOv4. We also evaluate training-free accuracy estimators used in neural architecture search on YOLOBench and demonstrate that, while most state-of-the-art zero-cost accuracy estimators are outperformed by a simple baseline like MAC count, some of them can be effectively used to predict Pareto-optimal detection models. We showcase that by using a zero-cost proxy to identify a YOLO architecture competitive against a state-of-the-art YOLOv8 model on a Raspberry Pi 4 CPU. The code and data are available at https://github.com/Deeplite/deeplite-torch-zoo △ Less

Submitted 21 August, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

arXiv:2306.11797 [pdf, other]

doi 10.1088/2632-2153/ad0938

Towards a robust and reliable deep learning approach for detection of compact binary mergers in gravitational wave data

Authors: Shreejit Jadhav, Mihir Shrivastava, Sanjit Mitra

Abstract: The ability of deep learning (DL) approaches to learn generalised signal and noise models, coupled with their fast inference on GPUs, holds great promise for enhancing gravitational-wave (GW) searches in terms of speed, parameter space coverage, and search sensitivity. However, the opaque nature of DL models severely harms their reliability. In this work, we meticulously develop a DL model stage-w… ▽ More The ability of deep learning (DL) approaches to learn generalised signal and noise models, coupled with their fast inference on GPUs, holds great promise for enhancing gravitational-wave (GW) searches in terms of speed, parameter space coverage, and search sensitivity. However, the opaque nature of DL models severely harms their reliability. In this work, we meticulously develop a DL model stage-wise and work towards improving its robustness and reliability. First, we address the problems in maintaining the purity of training data by deriving a new metric that better reflects the visual strength of the 'chirp' signal features in the data. Using a reduced, smooth representation obtained through a variational auto-encoder (VAE), we build a classifier to search for compact binary coalescence (CBC) signals. Our tests on real LIGO data show an impressive performance of the model. However, upon probing the robustness of the model through adversarial attacks, its simple failure modes were identified, underlining how such models can still be highly fragile. As a first step towards bringing robustness, we retrain the model in a novel framework involving a generative adversarial network (GAN). Over the course of training, the model learns to eliminate the primary modes of failure identified by the adversaries. Although absolute robustness is practically impossible to achieve, we demonstrate some fundamental improvements earned through such training, like sparseness and reduced degeneracy in the extracted features at different layers inside the model. We show that these gains are achieved at practically zero loss in terms of model performance on real LIGO data before and after GAN training. Through a direct search on 8.8 days of LIGO data, we recover two significant CBC events from GWTC-2.1, GW190519_153544 and GW190521_074359. We also report the search sensitivity obtained from an injection study. △ Less

Submitted 13 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: 22 pages, 22 figures

Journal ref: Mach. Learn.: Sci. Technol. 4 045028 (2023)

arXiv:2306.08803 [pdf, other]

Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning

Authors: Amin Karbasi, Nikki Lijing Kuang, Yi-An Ma, Siddharth Mitra

Abstract: Thompson sampling (TS) is widely used in sequential decision making due to its ease of use and appealing empirical performance. However, many existing analytical and empirical results for TS rely on restrictive assumptions on reward distributions, such as belonging to conjugate families, which limits their applicability in realistic scenarios. Moreover, sequential decision making problems are ofte… ▽ More Thompson sampling (TS) is widely used in sequential decision making due to its ease of use and appealing empirical performance. However, many existing analytical and empirical results for TS rely on restrictive assumptions on reward distributions, such as belonging to conjugate families, which limits their applicability in realistic scenarios. Moreover, sequential decision making problems are often carried out in a batched manner, either due to the inherent nature of the problem or to serve the purpose of reducing communication and computation costs. In this work, we jointly study these problems in two popular settings, namely, stochastic multi-armed bandits (MABs) and infinite-horizon reinforcement learning (RL), where TS is used to learn the unknown reward distributions and transition dynamics, respectively. We propose batched $\textit{Langevin Thompson Sampling}$ algorithms that leverage MCMC methods to sample from approximate posteriors with only logarithmic communication costs in terms of batches. Our algorithms are computationally efficient and maintain the same order-optimal regret guarantees of $\mathcal{O}(\log T)$ for stochastic MABs, and $\mathcal{O}(\sqrt{T})$ for RL. We complement our theoretical findings with experimental results. △ Less

Submitted 14 June, 2023; originally announced June 2023.

Comments: ICML 2023

ACM Class: G.3; I.2.0

arXiv:2306.04585 [pdf, other]

RTAEval: A framework for evaluating runtime assurance logic

Authors: Kristina Miller, Christopher K. Zeitler, William Shen, Mahesh Viswanathan, Sayan Mitra

Abstract: Runtime assurance (RTA) addresses the problem of keeping an autonomous system safe while using an untrusted (or experimental) controller. This can be done via logic that explicitly switches between the untrusted controller and a safety controller, or logic that filters the input provided by the untrusted controller. While several tools implement specific instances of RTAs, there is currently no fr… ▽ More Runtime assurance (RTA) addresses the problem of keeping an autonomous system safe while using an untrusted (or experimental) controller. This can be done via logic that explicitly switches between the untrusted controller and a safety controller, or logic that filters the input provided by the untrusted controller. While several tools implement specific instances of RTAs, there is currently no framework for evaluating different approaches. Given the importance of the RTA problem in building safe autonomous systems, an evaluation tool is needed. In this paper, we present the RTAEval framework as a low code framework that can be used to quickly evaluate different RTA logics for different types of agents in a variety of scenarios. RTAEval is designed to quickly create scenarios, run different RTA logics, and collect data that can be used to evaluate and visualize performance. In this paper, we describe different components of RTAEval and show how it can be used to create and evaluate scenarios involving multiple aircraft models. △ Less

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2305.08993 [pdf, other]

Survey of Malware Analysis through Control Flow Graph using Machine Learning

Authors: Shaswata Mitra, Stephen A. Torri, Sudip Mittal

Abstract: Malware is a significant threat to the security of computer systems and networks which requires sophisticated techniques to analyze the behavior and functionality for detection. Traditional signature-based malware detection methods have become ineffective in detecting new and unknown malware due to their rapid evolution. One of the most promising techniques that can overcome the limitations of sig… ▽ More Malware is a significant threat to the security of computer systems and networks which requires sophisticated techniques to analyze the behavior and functionality for detection. Traditional signature-based malware detection methods have become ineffective in detecting new and unknown malware due to their rapid evolution. One of the most promising techniques that can overcome the limitations of signature-based detection is to use control flow graphs (CFGs). CFGs leverage the structural information of a program to represent the possible paths of execution as a graph, where nodes represent instructions and edges represent control flow dependencies. Machine learning (ML) algorithms are being used to extract these features from CFGs and classify them as malicious or benign. In this survey, we aim to review some state-of-the-art methods for malware detection through CFGs using ML, focusing on the different ways of extracting, representing, and classifying. Specifically, we present a comprehensive overview of different types of CFG features that have been used as well as different ML algorithms that have been applied to CFG-based malware detection. We provide an in-depth analysis of the challenges and limitations of these approaches, as well as suggest potential solutions to address some open problems and promising future directions for research in this field. △ Less

Submitted 20 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

arXiv:2304.14993 [pdf, ps, other]

ChatGPT in the Classroom: An Analysis of Its Strengths and Weaknesses for Solving Undergraduate Computer Science Questions

Authors: Ishika Joshi, Ritvik Budhiraja, Harshal Dev, Jahnvi Kadia, M. Osama Ataullah, Sayan Mitra, Dhruv Kumar, Harshal D. Akolekar

Abstract: ChatGPT is an AI language model developed by OpenAI that can understand and generate human-like text. It can be used for a variety of use cases such as language generation, question answering, text summarization, chatbot development, language translation, sentiment analysis, content creation, personalization, text completion, and storytelling. While ChatGPT has garnered significant positive attent… ▽ More ChatGPT is an AI language model developed by OpenAI that can understand and generate human-like text. It can be used for a variety of use cases such as language generation, question answering, text summarization, chatbot development, language translation, sentiment analysis, content creation, personalization, text completion, and storytelling. While ChatGPT has garnered significant positive attention, it has also generated a sense of apprehension and uncertainty in academic circles. There is concern that students may leverage ChatGPT to complete take-home assignments and exams and obtain favorable grades without genuinely acquiring knowledge. This paper adopts a quantitative approach to demonstrate ChatGPT's high degree of unreliability in answering a diverse range of questions pertaining to topics in undergraduate computer science. Our analysis shows that students may risk self-sabotage by blindly depending on ChatGPT to complete assignments and exams. We build upon this analysis to provide constructive recommendations to both students and instructors. △ Less

Submitted 6 October, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

Comments: Accepted in SIGCSE TS 2024

arXiv:2304.09049 [pdf, other]

DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables

Authors: Darshan C. Ganji, Saad Ashfaq, Ehsan Saboori, Sudhakar Sah, Saptarshi Mitra, MohammadHossein AskariHemmat, Alexander Hoffman, Ahmed Hassanien, Mathieu Léonardon

Abstract: A lot of recent progress has been made in ultra low-bit quantization, promising significant improvements in latency, memory footprint and energy consumption on edge devices. Quantization methods such as Learned Step Size Quantization can achieve model accuracy that is comparable to full-precision floating-point baselines even with sub-byte quantization. However, it is extremely challenging to depl… ▽ More A lot of recent progress has been made in ultra low-bit quantization, promising significant improvements in latency, memory footprint and energy consumption on edge devices. Quantization methods such as Learned Step Size Quantization can achieve model accuracy that is comparable to full-precision floating-point baselines even with sub-byte quantization. However, it is extremely challenging to deploy these ultra low-bit quantized models on mainstream CPU devices because commodity SIMD (Single Instruction, Multiple Data) hardware typically supports no less than 8-bit precision. To overcome this limitation, we propose DeepGEMM, a lookup table based approach for the execution of ultra low-precision convolutional neural networks on SIMD hardware. The proposed method precomputes all possible products of weights and activations, stores them in a lookup table, and efficiently accesses them at inference time to avoid costly multiply-accumulate operations. Our 2-bit implementation outperforms corresponding 8-bit integer kernels in the QNNPACK framework by up to 1.74x on x86 platforms. △ Less

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2304.08499 [pdf, other]

The XAISuite framework and the implications of explanatory system dissonance

Authors: Shreyan Mitra, Leilani Gilpin

Abstract: Explanatory systems make machine learning models more transparent. However, they are often inconsistent. In order to quantify and isolate possible scenarios leading to this discrepancy, this paper compares two explanatory systems, SHAP and LIME, based on the correlation of their respective importance scores using 14 machine learning models (7 regression and 7 classification) and 4 tabular datasets… ▽ More Explanatory systems make machine learning models more transparent. However, they are often inconsistent. In order to quantify and isolate possible scenarios leading to this discrepancy, this paper compares two explanatory systems, SHAP and LIME, based on the correlation of their respective importance scores using 14 machine learning models (7 regression and 7 classification) and 4 tabular datasets (2 regression and 2 classification). We make two novel findings. Firstly, the magnitude of importance is not significant in explanation consistency. The correlations between SHAP and LIME importance scores for the most important features may or may not be more variable than the correlation between SHAP and LIME importance scores averaged across all features. Secondly, the similarity between SHAP and LIME importance scores cannot predict model accuracy. In the process of our research, we construct an open-source library, XAISuite, that unifies the process of training and explaining models. Finally, this paper contributes a generalized framework to better explain machine learning models and optimize their performance. △ Less

Submitted 15 April, 2023; originally announced April 2023.

Comments: 41 pages, 23 figures

arXiv:2302.06155 [pdf, other]

Identifying Semantically Difficult Samples to Improve Text Classification

Authors: Shashank Mujumdar, Stuti Mehta, Hima Patel, Suman Mitra

Abstract: In this paper, we investigate the effect of addressing difficult samples from a given text dataset on the downstream text classification task. We define difficult samples as being non-obvious cases for text classification by analysing them in the semantic embedding space; specifically - (i) semantically similar samples that belong to different classes and (ii) semantically dissimilar samples that… ▽ More In this paper, we investigate the effect of addressing difficult samples from a given text dataset on the downstream text classification task. We define difficult samples as being non-obvious cases for text classification by analysing them in the semantic embedding space; specifically - (i) semantically similar samples that belong to different classes and (ii) semantically dissimilar samples that belong to the same class. We propose a penalty function to measure the overall difficulty score of every sample in the dataset. We conduct exhaustive experiments on 13 standard datasets to show a consistent improvement of up to 9% and discuss qualitative results to show effectiveness of our approach in identifying difficult samples for a text classification model. △ Less

Submitted 13 February, 2023; originally announced February 2023.

arXiv:2301.11767 [pdf, other]

CAPoW: Context-Aware AI-Assisted Proof of Work based DDoS Defense

Authors: Trisha Chakraborty, Shaswata Mitra, Sudip Mittal

Abstract: Critical servers can be secured against distributed denial of service (DDoS) attacks using proof of work (PoW) systems assisted by an Artificial Intelligence (AI) that learns contextual network request patterns. In this work, we introduce CAPoW, a context-aware anti-DDoS framework that injects latency adaptively during communication by utilizing context-aware PoW puzzles. In CAPoW, a security prof… ▽ More Critical servers can be secured against distributed denial of service (DDoS) attacks using proof of work (PoW) systems assisted by an Artificial Intelligence (AI) that learns contextual network request patterns. In this work, we introduce CAPoW, a context-aware anti-DDoS framework that injects latency adaptively during communication by utilizing context-aware PoW puzzles. In CAPoW, a security professional can define relevant request context attributes which can be learned by the AI system. These contextual attributes can include information about the user request, such as IP address, time, flow-level information, etc., and are utilized to generate a contextual score for incoming requests that influence the hardness of a PoW puzzle. These puzzles need to be solved by a user before the server begins to process their request. Solving puzzles slow down the volume of incoming adversarial requests. Additionally, the framework compels the adversary to incur a cost per request, hence making it expensive for an adversary to prolong a DDoS attack. We include the theoretical foundations of the CAPoW framework along with a description of its implementation and evaluation. △ Less

Submitted 27 January, 2023; originally announced January 2023.

Comments: 8 pages

Journal ref: 20th International Conference on Security and Cryptography (SECRYPT 2023)

arXiv:2301.08714 [pdf, other]

Verse: A Python library for reasoning about multi-agent hybrid system scenarios

Authors: Yangge Li, Haoqing Zhu, Katherine Braught, Keyi Shen, Sayan Mitra

Abstract: We present the Verse library with the aim of making hybrid system verification more usable for multi-agent scenarios. In Verse, decision making agents move in a map and interact with each other through sensors. The decision logic for each agent is written in a subset of Python and the continuous dynamics is given by a black-box simulator. Multiple agents can be instantiated and they can be ported… ▽ More We present the Verse library with the aim of making hybrid system verification more usable for multi-agent scenarios. In Verse, decision making agents move in a map and interact with each other through sensors. The decision logic for each agent is written in a subset of Python and the continuous dynamics is given by a black-box simulator. Multiple agents can be instantiated and they can be ported to different maps for creating scenarios. Verse provides functions for simulating and verifying such scenarios using existing reachability analysis algorithms. We illustrate several capabilities and use cases of the library with heterogeneous agents, incremental verification, different sensor models, and the flexibility of plugging in different subroutines for post computations. △ Less

Submitted 22 January, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

Comments: 26 pages, 16 figures

arXiv:2301.06961 [pdf, other]

Composite Deep Network with Feature Weighting for Improved Delineation of COVID Infection in Lung CT

Authors: Pallabi Dutta, Sushmita Mitra

Abstract: An early effective screening and grading of COVID-19 has become imperative towards optimizing the limited available resources of the medical facilities. An automated segmentation of the infected volumes in lung CT is expected to significantly aid in the diagnosis and care of patients. However, an accurate demarcation of lesions remains problematic due to their irregular structure and location(s) w… ▽ More An early effective screening and grading of COVID-19 has become imperative towards optimizing the limited available resources of the medical facilities. An automated segmentation of the infected volumes in lung CT is expected to significantly aid in the diagnosis and care of patients. However, an accurate demarcation of lesions remains problematic due to their irregular structure and location(s) within the lung. A novel deep learning architecture, Composite Deep network with Feature Weighting (CDNetFW), is proposed for efficient delineation of infected regions from lung CT images. Initially a coarser-segmentation is performed directly at shallower levels, thereby facilitating discovery of robust and discriminatory characteristics in the hidden layers. The novel feature weighting module helps prioritise relevant feature maps to be probed, along with those regions containing crucial information within these maps. This is followed by estimating the severity of the disease.The deep network CDNetFW has been shown to outperform several state-of-the-art architectures in the COVID-19 lesion segmentation task, as measured by experimental results on CT slices from publicly available datasets, especially when it comes to defining structures involving complex geometries. △ Less

Submitted 17 February, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

arXiv:2212.12264 [pdf, other]

Collective Intelligent Strategy for Improved Segmentation of COVID-19 from CT

Authors: Surochita Pal Das, Sushmita Mitra, B. Uma Shankar

Abstract: The devastation caused by the coronavirus pandemic makes it imperative to design automated techniques for a fast and accurate detection. We propose a novel non-invasive tool, using deep learning and imaging, for delineating COVID-19 infection in lungs. The Ensembling Attention-based Multi-scaled Convolution network (EAMC), employing Leave-One-Patient-Out (LOPO) training, exhibits high sensitivity… ▽ More The devastation caused by the coronavirus pandemic makes it imperative to design automated techniques for a fast and accurate detection. We propose a novel non-invasive tool, using deep learning and imaging, for delineating COVID-19 infection in lungs. The Ensembling Attention-based Multi-scaled Convolution network (EAMC), employing Leave-One-Patient-Out (LOPO) training, exhibits high sensitivity and precision in outlining infected regions along with assessment of severity. The Attention module combines contextual with local information, at multiple scales, for accurate segmentation. Ensemble learning integrates heterogeneity of decision through different base classifiers. The superiority of EAMC, even with severe class imbalance, is established through comparison with existing state-of-the-art learning models over four publicly-available COVID-19 datasets. The results are suggestive of the relevance of deep learning in providing assistive intelligence to medical practitioners, when they are overburdened with patients as in pandemics. Its clinical significance lies in its unprecedented scope in providing low-cost decision-making for patients lacking specialized healthcare at remote locations. △ Less

Submitted 23 December, 2022; originally announced December 2022.

arXiv:2212.06225 [pdf, other]

Reinforced Approximate Exploratory Data Analysis

Authors: Shaddy Garg, Subrata Mitra, Tong Yu, Yash Gadhia, Arjun Kashettiwar

Abstract: Exploratory data analytics (EDA) is a sequential decision making process where analysts choose subsequent queries that might lead to some interesting insights based on the previous queries and corresponding results. Data processing systems often execute the queries on samples to produce results with low latency. Different downsampling strategy preserves different statistics of the data and have di… ▽ More Exploratory data analytics (EDA) is a sequential decision making process where analysts choose subsequent queries that might lead to some interesting insights based on the previous queries and corresponding results. Data processing systems often execute the queries on samples to produce results with low latency. Different downsampling strategy preserves different statistics of the data and have different magnitude of latency reductions. The optimum choice of sampling strategy often depends on the particular context of the analysis flow and the hidden intent of the analyst. In this paper, we are the first to consider the impact of sampling in interactive data exploration settings as they introduce approximation errors. We propose a Deep Reinforcement Learning (DRL) based framework which can optimize the sample selection in order to keep the analysis and insight generation flow intact. Evaluations with 3 real datasets show that our technique can preserve the original insight generation flow while improving the interaction latency, compared to baseline methods. △ Less

Submitted 12 December, 2022; originally announced December 2022.

Comments: Appears in the 37th AAAI Conference on Artificial Intelligence (AAAI), 2023

arXiv:2211.03758 [pdf, other]

Privacy Aware Experiments without Cookies

Authors: Shiv Shankar, Ritwik Sinha, Saayan Mitra, Viswanathan Swaminathan, Sridhar Mahadevan, Moumita Sinha

Abstract: Consider two brands that want to jointly test alternate web experiences for their customers with an A/B test. Such collaborative tests are today enabled using \textit{third-party cookies}, where each brand has information on the identity of visitors to another website. With the imminent elimination of third-party cookies, such A/B tests will become untenable. We propose a two-stage experimental de… ▽ More Consider two brands that want to jointly test alternate web experiences for their customers with an A/B test. Such collaborative tests are today enabled using \textit{third-party cookies}, where each brand has information on the identity of visitors to another website. With the imminent elimination of third-party cookies, such A/B tests will become untenable. We propose a two-stage experimental design, where the two brands only need to agree on high-level aggregate parameters of the experiment to test the alternate experiences. Our design respects the privacy of customers. We propose an estimater of the Average Treatment Effect (ATE), show that it is unbiased and theoretically compute its variance. Our demonstration describes how a marketer for a brand can design such an experiment and analyze the results. On real and simulated data, we show that the approach provides valid estimate of the ATE with low variance and is robust to the proportion of visitors overlapping across the brands. △ Less

Submitted 6 February, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

Comments: Technical report supplementing paper accepted to WSDM 23

arXiv:2210.15571 [pdf, other]

Full-scale Deeply Supervised Attention Network for Segmenting COVID-19 Lesions

Authors: Pallabi Dutta, Sushmita Mitra

Abstract: Automated delineation of COVID-19 lesions from lung CT scans aids the diagnosis and prognosis for patients. The asymmetric shapes and positioning of the infected regions make the task extremely difficult. Capturing information at multiple scales will assist in deciphering features, at global and local levels, to encompass lesions of variable size and texture. We introduce the Full-scale Deeply Sup… ▽ More Automated delineation of COVID-19 lesions from lung CT scans aids the diagnosis and prognosis for patients. The asymmetric shapes and positioning of the infected regions make the task extremely difficult. Capturing information at multiple scales will assist in deciphering features, at global and local levels, to encompass lesions of variable size and texture. We introduce the Full-scale Deeply Supervised Attention Network (FuDSA-Net), for efficient segmentation of corona-infected lung areas in CT images. The model considers activation responses from all levels of the encoding path, encompassing multi-scalar features acquired at different levels of the network. This helps segment target regions (lesions) of varying shape, size and contrast. Incorporation of the entire gamut of multi-scalar characteristics into the novel attention mechanism helps prioritize the selection of activation responses and locations containing useful information. Determining robust and discriminatory features along the decoder path is facilitated with deep supervision. Connections in the decoder arm are remodeled to handle the issue of vanishing gradient. As observed from the experimental results, FuDSA-Net surpasses other state-of-the-art architectures; especially, when it comes to characterizing complicated geometries of the lesions. △ Less

Submitted 27 October, 2022; originally announced October 2022.

arXiv:2210.08974 [pdf]

Coordinated Science Laboratory 70th Anniversary Symposium: The Future of Computing

Authors: Klara Nahrstedt, Naresh Shanbhag, Vikram Adve, Nancy Amato, Romit Roy Choudhury, Carl Gunter, Nam Sung Kim, Olgica Milenkovic, Sayan Mitra, Lav Varshney, Yurii Vlasov, Sarita Adve, Rashid Bashir, Andreas Cangellaris, James DiCarlo, Katie Driggs-Campbell, Nick Feamster, Mattia Gazzola, Karrie Karahalios, Sanmi Koyejo, Paul Kwiat, Bo Li, Negar Mehr, Ravish Mehra, Andrew Miller , et al. (3 additional authors not shown)

Abstract: In 2021, the Coordinated Science Laboratory CSL, an Interdisciplinary Research Unit at the University of Illinois Urbana-Champaign, hosted the Future of Computing Symposium to celebrate its 70th anniversary. CSL's research covers the full computing stack, computing's impact on society and the resulting need for social responsibility. In this white paper, we summarize the major technological points… ▽ More In 2021, the Coordinated Science Laboratory CSL, an Interdisciplinary Research Unit at the University of Illinois Urbana-Champaign, hosted the Future of Computing Symposium to celebrate its 70th anniversary. CSL's research covers the full computing stack, computing's impact on society and the resulting need for social responsibility. In this white paper, we summarize the major technological points, insights, and directions that speakers brought forward during the Future of Computing Symposium. Participants discussed topics related to new computing paradigms, technologies, algorithms, behaviors, and research challenges to be expected in the future. The symposium focused on new computing paradigms that are going beyond traditional computing and the research needed to support their realization. These needs included stressing security and privacy, the end to end human cyber physical systems and with them the analysis of the end to end artificial intelligence needs. Furthermore, advances that enable immersive environments for users, the boundaries between humans and machines will blur and become seamless. Particular integration challenges were made clear in the final discussion on the integration of autonomous driving, robo taxis, pedestrians, and future cities. Innovative approaches were outlined to motivate the next generation of researchers to work on these challenges. The discussion brought out the importance of considering not just individual research areas, but innovations at the intersections between computing research efforts and relevant application domains, such as health care, transportation, energy systems, and manufacturing. △ Less

Submitted 4 October, 2022; originally announced October 2022.

Showing 1–50 of 132 results for author: Mitra, S