subscribe to arXiv mailings

Reliable Quantum Memories with Unreliable Components

Authors: Anuj K. Nayak, Eric Chitambar, Lav R. Varshney

Abstract: Quantum memory systems are vital in quantum information processing for dependable storage and retrieval of quantum states. Inspired by classical reliability theories that synthesize reliable computing systems from unreliable components, we formalize the problem of reliable storage of quantum information using noisy components. We introduce the notion of stable quantum memories and define the stora… ▽ More Quantum memory systems are vital in quantum information processing for dependable storage and retrieval of quantum states. Inspired by classical reliability theories that synthesize reliable computing systems from unreliable components, we formalize the problem of reliable storage of quantum information using noisy components. We introduce the notion of stable quantum memories and define the storage rate as the ratio of the number of logical qubits to the total number of physical qubits, as well as the circuit complexity of the decoder, which includes both quantum gates and measurements. We demonstrate that a strictly positive storage rate can be achieved by constructing a quantum memory system with quantum expander codes. Moreover, by reducing the reliable storage problem to reliable quantum communication, we provide upper bounds on the achievable storage capacity. In the case of physical qubits corrupted by noise satisfying hypercontractivity conditions, we provide a tighter upper bound on storage capacity using an entropy dissipation argument. Furthermore, observing that the time complexity of the decoder scales non-trivially with the number of physical qubits, achieving asymptotic rates may not be possible due to the induced dependence of the noise on the number of physical qubits. In this constrained non-asymptotic setting, we derive upper bounds on storage capacity using finite blocklength communication bounds. Finally, we numerically analyze the gap between upper and lower bounds in both asymptotic and non-asymptotic cases, and provide suggestions to tighten the gap. △ Less

Submitted 8 June, 2024; originally announced June 2024.

Comments: 15 pages, 3 figures

arXiv:2405.04437 [pdf, other]

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

Authors: Ramya Prabhu, Ajay Nayak, Jayashree Mohan, Ramachandran Ramjee, Ashish Panwar

Abstract: Efficient management of GPU memory is essential for high throughput LLM inference. Prior systems used to reserve KV-cache memory ahead-of-time that resulted in wasted capacity due to internal fragmentation. Inspired by demand paging, vLLM proposed PagedAttention to enable dynamic memory allocation for KV-cache. This approach eliminates fragmentation and improves serving throughout. However, to be… ▽ More Efficient management of GPU memory is essential for high throughput LLM inference. Prior systems used to reserve KV-cache memory ahead-of-time that resulted in wasted capacity due to internal fragmentation. Inspired by demand paging, vLLM proposed PagedAttention to enable dynamic memory allocation for KV-cache. This approach eliminates fragmentation and improves serving throughout. However, to be able to allocate physical memory dynamically, PagedAttention changes the layout of KV-cache from contiguous virtual memory to non-contiguous virtual memory. As a consequence, one needs to rewrite the attention kernels to support paging, and implement a memory manager in the serving framework. This results in both performance and programming overheads, as well as portability challenges in adopting state-of-the-art attention kernels. In this paper, we propose vAttention, a new approach for dynamic KV-cache memory management. In contrast to PagedAttention, vAttention stores KV-cache in contiguous virtual memory and leverages OS support for on-demand allocation of physical memory. vAttention thus enables one to use state-of-the art attention kernels out-of-the-box by adding support for dynamic allocation of physical memory without having to re-write their code. We implement vAttention in the vLLM serving stack to show that it also helps improve decode throughput by up to 1.99x over vLLM, and the end-to-end serving throughput by up to 1.22x and 1.29x, compared to using the state-of-the-art PagedAttention based kernels of FlashAttention and FlashInfer. △ Less

Submitted 12 July, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: 14 pages, 13 figures, 10 tables

arXiv:2404.12415 [pdf]

Soil Fertility Prediction Using Combined USB-microscope Based Soil Image, Auxiliary Variables, and Portable X-Ray Fluorescence Spectrometry

Authors: Shubhadip Dasgupta, Satwik Pate, Divya Rathore, L. G. Divyanth, Ayan Das, Anshuman Nayak, Subhadip Dey, Asim Biswas, David C. Weindorf, Bin Li, Sergio Henrique Godinho Silva, Bruno Teixeira Ribeiro, Sanjay Srivastava, Somsubhra Chakraborty

Abstract: This study explored the application of portable X-ray fluorescence (PXRF) spectrometry and soil image analysis to rapidly assess soil fertility, focusing on critical parameters such as available B, organic carbon (OC), available Mn, available S, and the sulfur availability index (SAI). Analyzing 1,133 soil samples from various agro-climatic zones in Eastern India, the research combined color and t… ▽ More This study explored the application of portable X-ray fluorescence (PXRF) spectrometry and soil image analysis to rapidly assess soil fertility, focusing on critical parameters such as available B, organic carbon (OC), available Mn, available S, and the sulfur availability index (SAI). Analyzing 1,133 soil samples from various agro-climatic zones in Eastern India, the research combined color and texture features from microscopic soil images, PXRF data, and auxiliary soil variables (AVs) using a Random Forest model. Results indicated that integrating image features (IFs) with auxiliary variables (AVs) significantly enhanced prediction accuracy for available B (R^2 = 0.80) and OC (R^2 = 0.88). A data fusion approach, incorporating IFs, AVs, and PXRF data, further improved predictions for available Mn and SAI with R^2 values of 0.72 and 0.70, respectively. The study demonstrated how these integrated technologies have the potential to provide quick and affordable options for soil testing, opening up access to more sophisticated prediction models and a better comprehension of the fertility and health of the soil. Future research should focus on the application of deep learning models on a larger dataset of soil images, developed using soils from a broader range of agro-climatic zones under field condition. △ Less

Submitted 17 April, 2024; originally announced April 2024.

Comments: 37 pages, 10 figures; manuscript under peer-review for publication in the jounral 'Computers and Electronics in Agriculture'

arXiv:2403.03295 [pdf, ps, other]

Proper vs Improper Quantum PAC learning

Authors: Ashwin Nayak, Pulkit Sinha

Abstract: A basic question in the PAC model of learning is whether proper learning is harder than improper learning. In the classical case, there are examples of concept classes with VC dimension $d$ that have sample complexity $Ω\left(\frac dε\log\frac1ε\right)$ for proper learning with error $ε$, while the complexity for improper learning is O$\!\left(\frac dε\right)$. One such example arises from the Cou… ▽ More A basic question in the PAC model of learning is whether proper learning is harder than improper learning. In the classical case, there are examples of concept classes with VC dimension $d$ that have sample complexity $Ω\left(\frac dε\log\frac1ε\right)$ for proper learning with error $ε$, while the complexity for improper learning is O$\!\left(\frac dε\right)$. One such example arises from the Coupon Collector problem. Motivated by the efficiency of proper versus improper learning with quantum samples, Arunachalam, Belovs, Childs, Kothari, Rosmanis, and de Wolf (TQC 2020) studied an analogue, the Quantum Coupon Collector problem. Curiously, they discovered that for learning size $k$ subsets of $[n]$ the problem has sample complexity $Θ(k\log\min\{k,n-k+1\})$, in contrast with the complexity of $Θ(k\log k)$ for Coupon Collector. This effectively negates the possibility of a separation between the two modes of learning via the quantum problem, and Arunachalam et al.\ posed the possibility of such a separation as an open question. In this work, we first present an algorithm for the Quantum Coupon Collector problem with sample complexity that matches the sharper lower bound of $(1-o_k(1))k\ln\min\{k,n-k+1\}$ shown recently by Bab Hadiashar, Nayak, and Sinha (IEEE TIT 2024), for the entire range of the parameter $k$. Next, we devise a variant of the problem, the Quantum Padded Coupon Collector. We prove that its sample complexity matches that of the classical Coupon Collector problem for both modes of learning, thereby exhibiting the same asymptotic separation between proper and improper quantum learning as mentioned above. The techniques we develop in the process can be directly applied to any form of padded quantum data. We hope that padding can more generally lift other forms of classical learning behaviour to the quantum setting. △ Less

Submitted 5 March, 2024; originally announced March 2024.

Comments: 23 Pages

arXiv:2312.05003 [pdf, other]

On the Regret of Online Coded Caching

Authors: Anupam Nayak, Sheel Shah, Nikhil Karamchandani

Abstract: We consider the widely studied problem of coded caching under non-uniform requests where users independently request files according to some underlying popularity distribution in each slot. This work is a first step towards analyzing this framework through the lens of online learning. We consider the case where the underlying request distribution is apriori unknown and propose an online policy as… ▽ More We consider the widely studied problem of coded caching under non-uniform requests where users independently request files according to some underlying popularity distribution in each slot. This work is a first step towards analyzing this framework through the lens of online learning. We consider the case where the underlying request distribution is apriori unknown and propose an online policy as well as study its regret with respect to an oracle which knows the underlying distribution and employs a well-known order-optimal placement and coded delivery strategy. We also bound the switching cost of this strategy and also discuss a lower bound on the regret of any online scheme in a restricted but natural class of policies. △ Less

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.03733 [pdf]

Methods to Estimate Large Language Model Confidence

Authors: Maia Kotelanski, Robert Gallo, Ashwin Nayak, Thomas Savage

Abstract: Large Language Models have difficulty communicating uncertainty, which is a significant obstacle to applying LLMs to complex medical tasks. This study evaluates methods to measure LLM confidence when suggesting a diagnosis for challenging clinical vignettes. GPT4 was asked a series of challenging case questions using Chain of Thought and Self Consistency prompting. Multiple methods were investigat… ▽ More Large Language Models have difficulty communicating uncertainty, which is a significant obstacle to applying LLMs to complex medical tasks. This study evaluates methods to measure LLM confidence when suggesting a diagnosis for challenging clinical vignettes. GPT4 was asked a series of challenging case questions using Chain of Thought and Self Consistency prompting. Multiple methods were investigated to assess model confidence and evaluated on their ability to predict the models observed accuracy. The methods evaluated were Intrinsic Confidence, SC Agreement Frequency and CoT Response Length. SC Agreement Frequency correlated with observed accuracy, yielding a higher Area under the Receiver Operating Characteristic Curve compared to Intrinsic Confidence and CoT Length analysis. SC agreement is the most useful proxy for model confidence, especially for medical diagnosis. Model Intrinsic Confidence and CoT Response Length exhibit a weaker ability to differentiate between correct and incorrect answers, preventing them from being reliable and interpretable markers for model confidence. We conclude GPT4 has a limited ability to assess its own diagnostic accuracy. SC Agreement Frequency is the most useful method to measure GPT4 confidence. △ Less

Submitted 8 December, 2023; v1 submitted 28 November, 2023; originally announced December 2023.

arXiv:2312.03296 [pdf, other]

Cooperative Probabilistic Trajectory Forecasting under Occlusion

Authors: Anshul Nayak, Azim Eskandarian

Abstract: Perception and planning under occlusion is essential for safety-critical tasks. Occlusion-aware planning often requires communicating the information of the occluded object to the ego agent for safe navigation. However, communicating rich sensor information under adverse conditions during communication loss and limited bandwidth may not be always feasible. Further, in GPS denied environments and i… ▽ More Perception and planning under occlusion is essential for safety-critical tasks. Occlusion-aware planning often requires communicating the information of the occluded object to the ego agent for safe navigation. However, communicating rich sensor information under adverse conditions during communication loss and limited bandwidth may not be always feasible. Further, in GPS denied environments and indoor navigation, localizing and sharing of occluded objects can be challenging. To overcome this, relative pose estimation between connected agents sharing a common field of view can be a computationally effective way of communicating information about surrounding objects. In this paper, we design an end-to-end network that cooperatively estimates the current states of occluded pedestrian in the reference frame of ego agent and then predicts the trajectory with safety guarantees. Experimentally, we show that the uncertainty-aware trajectory prediction of occluded pedestrian by the ego agent is almost similar to the ground truth trajectory assuming no occlusion. The current research holds promise for uncertainty-aware navigation among multiple connected agents under occlusion. △ Less

Submitted 6 December, 2023; originally announced December 2023.

Comments: 10 pages, 13 figures, 1 table

arXiv:2311.12825 [pdf, ps, other]

A PSO Based Method to Generate Actionable Counterfactuals for High Dimensional Data

Authors: Shashank Shekhar, Asif Salim, Adesh Bansode, Vivaswan Jinturkar, Anirudha Nayak

Abstract: Counterfactual explanations (CFE) are methods that explain a machine learning model by giving an alternate class prediction of a data point with some minimal changes in its features. It helps the users to identify their data attributes that caused an undesirable prediction like a loan or credit card rejection. We describe an efficient and an actionable counterfactual (CF) generation method based o… ▽ More Counterfactual explanations (CFE) are methods that explain a machine learning model by giving an alternate class prediction of a data point with some minimal changes in its features. It helps the users to identify their data attributes that caused an undesirable prediction like a loan or credit card rejection. We describe an efficient and an actionable counterfactual (CF) generation method based on particle swarm optimization (PSO). We propose a simple objective function for the optimization of the instance-centric CF generation problem. The PSO brings in a lot of flexibility in terms of carrying out multi-objective optimization in large dimensions, capability for multiple CF generation, and setting box constraints or immutability of data attributes. An algorithm is proposed that incorporates these features and it enables greater control over the proximity and sparsity properties over the generated CFs. The proposed algorithm is evaluated with a set of action-ability metrics in real-world datasets, and the results were superior compared to that of the state-of-the-arts. △ Less

Submitted 30 November, 2023; v1 submitted 30 September, 2023; originally announced November 2023.

Comments: Accepted in IEEE CSDE 2023

arXiv:2310.04271 [pdf, other]

Compositional Servoing by Recombining Demonstrations

Authors: Max Argus, Abhijeet Nayak, Martin Büchner, Silvio Galesso, Abhinav Valada, Thomas Brox

Abstract: Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, bu… ▽ More Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. In order to traverse the demonstration graph in the inference case, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-respective success. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach. △ Less

Submitted 6 October, 2023; originally announced October 2023.

Comments: http://compservo.cs.uni-freiburg.de

arXiv:2309.09875 [pdf, other]

RaLF: Flow-based Global and Metric Radar Localization in LiDAR Maps

Authors: Abhijeet Nayak, Daniele Cattaneo, Abhinav Valada

Abstract: Localization is paramount for autonomous robots. While camera and LiDAR-based approaches have been extensively investigated, they are affected by adverse illumination and weather conditions. Therefore, radar sensors have recently gained attention due to their intrinsic robustness to such conditions. In this paper, we propose RaLF, a novel deep neural network-based approach for localizing radar sca… ▽ More Localization is paramount for autonomous robots. While camera and LiDAR-based approaches have been extensively investigated, they are affected by adverse illumination and weather conditions. Therefore, radar sensors have recently gained attention due to their intrinsic robustness to such conditions. In this paper, we propose RaLF, a novel deep neural network-based approach for localizing radar scans in a LiDAR map of the environment, by jointly learning to address both place recognition and metric localization. RaLF is composed of radar and LiDAR feature encoders, a place recognition head that generates global descriptors, and a metric localization head that predicts the 3-DoF transformation between the radar scan and the map. We tackle the place recognition task by learning a shared embedding space between the two modalities via cross-modal metric learning. Additionally, we perform metric localization by predicting pixel-level flow vectors that align the query radar scan with the LiDAR map. We extensively evaluate our approach on multiple real-world driving datasets and show that RaLF achieves state-of-the-art performance for both place recognition and metric localization. Moreover, we demonstrate that our approach can effectively generalize to different cities and sensor setups than the ones used during training. We make the code and trained models publicly available at http://ralf.cs.uni-freiburg.de. △ Less

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2308.16321 [pdf, other]

Exposing and Addressing Security Vulnerabilities in Browser Text Input Fields

Authors: Asmit Nayak, Rishabh Khandelwal, Kassem Fawaz

Abstract: In this work, we perform a comprehensive analysis of the security of text input fields in web browsers. We find that browsers' coarse-grained permission model violates two security design principles: least privilege and complete mediation. We further uncover two vulnerabilities in input fields, including the alarming discovery of passwords in plaintext within the HTML source code of the web page.… ▽ More In this work, we perform a comprehensive analysis of the security of text input fields in web browsers. We find that browsers' coarse-grained permission model violates two security design principles: least privilege and complete mediation. We further uncover two vulnerabilities in input fields, including the alarming discovery of passwords in plaintext within the HTML source code of the web page. To demonstrate the real-world impact of these vulnerabilities, we design a proof-of-concept extension, leveraging techniques from static and dynamic code injection attacks to bypass the web store review process. Our measurements and case studies reveal that these vulnerabilities are prevalent across various websites, with sensitive user information, such as passwords, exposed in the HTML source code of even high-traffic sites like Google and Cloudflare. We find that a significant percentage (12.5\%) of extensions possess the necessary permissions to exploit these vulnerabilities and identify 190 extensions that directly access password fields. Finally, we propose two countermeasures to address these risks: a bolt-on JavaScript package for immediate adoption by website developers allowing them to protect sensitive input fields, and a browser-level solution that alerts users when an extension accesses sensitive input fields. Our research highlights the urgent need for improved security measures to protect sensitive user information online. △ Less

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.14089 [pdf, other]

MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

Authors: Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo P. Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju S. Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott J. Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang , et al. (5 additional authors not shown)

Abstract: The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture… ▽ More The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture the complexity of information needs and documentation burdens experienced by clinicians. To address these challenges, we introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data. MedAlign is curated by 15 clinicians (7 specialities), includes clinician-written reference responses for 303 instructions, and provides 276 longitudinal EHRs for grounding instruction-response pairs. We used MedAlign to evaluate 6 general domain LLMs, having clinicians rank the accuracy and quality of each LLM response. We found high error rates, ranging from 35% (GPT-4) to 68% (MPT-7B-Instruct), and an 8.3% drop in accuracy moving from 32k to 2k context lengths for GPT-4. Finally, we report correlations between clinician rankings and automated natural language generation metrics as a way to rank LLMs without human review. We make MedAlign available under a research data use agreement to enable LLM evaluations on tasks aligned with clinician needs and preferences. △ Less

Submitted 24 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

arXiv:2308.13207 [pdf, other]

LLM2KB: Constructing Knowledge Bases using instruction tuned context aware Large Language Models

Authors: Anmol Nayak, Hari Prasad Timmapathini

Abstract: The advent of Large Language Models (LLM) has revolutionized the field of natural language processing, enabling significant progress in various applications. One key area of interest is the construction of Knowledge Bases (KB) using these powerful models. Knowledge bases serve as repositories of structured information, facilitating information retrieval and inference tasks. Our paper proposes LLM2… ▽ More The advent of Large Language Models (LLM) has revolutionized the field of natural language processing, enabling significant progress in various applications. One key area of interest is the construction of Knowledge Bases (KB) using these powerful models. Knowledge bases serve as repositories of structured information, facilitating information retrieval and inference tasks. Our paper proposes LLM2KB, a system for constructing knowledge bases using large language models, with a focus on the Llama 2 architecture and the Wikipedia dataset. We perform parameter efficient instruction tuning for Llama-2-13b-chat and StableBeluga-13B by training small injection models that have only 0.05 % of the parameters of the base models using the Low Rank Adaptation (LoRA) technique. These injection models have been trained with prompts that are engineered to utilize Wikipedia page contexts of subject entities fetched using a Dense Passage Retrieval (DPR) algorithm, to answer relevant object entities for a given subject entity and relation. Our best performing model achieved an average F1 score of 0.6185 across 21 relations in the LM-KBC challenge held at the ISWC 2023 conference. △ Less

Submitted 25 August, 2023; originally announced August 2023.

Comments: 16 pages, 1 figure, LM-KBC 2023 Challenge at International Semantic Web Conference 2023

arXiv:2308.06834 [pdf]

Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine

Authors: Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen

Abstract: One of the major barriers to using large language models (LLMs) in medicine is the perception they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop novel diagnostic reasoning prompts to study whether LLMs can perform clinical reasoning to accurately form a diagnosis. We find that GPT4 can… ▽ More One of the major barriers to using large language models (LLMs) in medicine is the perception they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop novel diagnostic reasoning prompts to study whether LLMs can perform clinical reasoning to accurately form a diagnosis. We find that GPT4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can use clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether LLMs can be trusted for patient care. Novel prompting methods have the potential to expose the black box of LLMs, bringing them one step closer to safe and effective use in medicine. △ Less

Submitted 13 August, 2023; originally announced August 2023.

arXiv:2306.08111 [pdf, other]

Unpacking Privacy Labels: A Measurement and Developer Perspective on Google's Data Safety Section

Authors: Rishabh Khandelwal, Asmit Nayak, Paul Chung, Kassem Fawaz

Abstract: Google has mandated developers to use Data Safety Sections (DSS) to increase transparency in data collection and sharing practices. In this paper, we present a comprehensive analysis of Google's Data Safety Section (DSS) using both quantitative and qualitative methods. We conduct the first large-scale measurement study of DSS using apps from Android Play store (n=1.1M). We find that there are inte… ▽ More Google has mandated developers to use Data Safety Sections (DSS) to increase transparency in data collection and sharing practices. In this paper, we present a comprehensive analysis of Google's Data Safety Section (DSS) using both quantitative and qualitative methods. We conduct the first large-scale measurement study of DSS using apps from Android Play store (n=1.1M). We find that there are internal inconsistencies within the reported practices. We also find trends of both over and under-reporting practices in the DSSs. Next, we conduct a longitudinal study of DSS to explore how the reported practices evolve over time, and find that the developers are still adjusting their practices. To contextualize these findings, we conduct a developer study, uncovering the process that app developers undergo when working with DSS. We highlight the challenges faced and strategies employed by developers for DSS submission, and the factors contributing to changes in the DSS. Our research contributes valuable insights into the complexities of implementing and maintaining privacy labels, underlining the need for better resources, tools, and guidelines to aid developers. This understanding is crucial as the accuracy and reliability of privacy labels directly impact their effectiveness. △ Less

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2305.16620 [pdf, other]

Pedestrian Trajectory Forecasting Using Deep Ensembles Under Sensing Uncertainty

Authors: Anshul Nayak, Azim Eskandarian, Zachary Doerzaph, Prasenjit Ghorai

Abstract: One of the fundamental challenges in the prediction of dynamic agents is robustness. Usually, most predictions are deterministic estimates of future states which are over-confident and prone to error. Recently, few works have addressed capturing uncertainty during forecasting of future states. However, these probabilistic estimation methods fail to account for the upstream noise in perception data… ▽ More One of the fundamental challenges in the prediction of dynamic agents is robustness. Usually, most predictions are deterministic estimates of future states which are over-confident and prone to error. Recently, few works have addressed capturing uncertainty during forecasting of future states. However, these probabilistic estimation methods fail to account for the upstream noise in perception data during tracking. Sensors always have noise and state estimation becomes even more difficult under adverse weather conditions and occlusion. Traditionally, Bayes filters have been used to fuse information from noisy sensors to update states with associated belief. But, they fail to address non-linearities and long-term predictions. Therefore, we propose an end-to-end estimator that can take noisy sensor measurements and make robust future state predictions with uncertainty bounds while simultaneously taking into consideration the upstream perceptual uncertainty. For the current research, we consider an encoder-decoder based deep ensemble network for capturing both perception and predictive uncertainty simultaneously. We compared the current model to other approximate Bayesian inference methods. Overall, deep ensembles provided more robust predictions and the consideration of upstream uncertainty further increased the estimation accuracy for the model. △ Less

Submitted 26 May, 2023; originally announced May 2023.

Comments: 13 pages, 11 figures, 3 tables

arXiv:2303.08213 [pdf, other]

The Overview of Privacy Labels and their Compatibility with Privacy Policies

Authors: Rishabh Khandelwal, Asmit Nayak, Paul Chung, Kassem Fawaz

Abstract: Privacy nutrition labels provide a way to understand an app's key data practices without reading the long and hard-to-read privacy policies. Recently, the app distribution platforms for iOS(Apple) and Android(Google) have implemented mandates requiring app developers to fill privacy nutrition labels highlighting their privacy practices such as data collection, data sharing, and security practices.… ▽ More Privacy nutrition labels provide a way to understand an app's key data practices without reading the long and hard-to-read privacy policies. Recently, the app distribution platforms for iOS(Apple) and Android(Google) have implemented mandates requiring app developers to fill privacy nutrition labels highlighting their privacy practices such as data collection, data sharing, and security practices. These privacy labels contain very fine-grained information about the apps' data practices such as the data types and purposes associated with each data type. This provides us with a unique vantage point from which we can understand apps' data practices at scale. △ Less

Submitted 24 April, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

arXiv:2302.06951 [pdf, other]

Few-shot learning approaches for classifying low resource domain specific software requirements

Authors: Anmol Nayak, Hari Prasad Timmapathini, Vidhya Murali, Atul Anil Gohad

Abstract: With the advent of strong pre-trained natural language processing models like BERT, DeBERTa, MiniLM, T5, the data requirement for industries to fine-tune these models to their niche use cases has drastically reduced (typically to a few hundred annotated samples for achieving a reasonable performance). However, the availability of even a few hundred annotated samples may not always be guaranteed in… ▽ More With the advent of strong pre-trained natural language processing models like BERT, DeBERTa, MiniLM, T5, the data requirement for industries to fine-tune these models to their niche use cases has drastically reduced (typically to a few hundred annotated samples for achieving a reasonable performance). However, the availability of even a few hundred annotated samples may not always be guaranteed in low resource domains like automotive, which often limits the usage of such deep learning models in an industrial setting. In this paper we aim to address the challenge of fine-tuning such pre-trained models with only a few annotated samples, also known as Few-shot learning. Our experiments focus on evaluating the performance of a diverse set of algorithms and methodologies to achieve the task of classifying BOSCH automotive domain textual software requirements into 3 categories, while utilizing only 15 annotated samples per category for fine-tuning. We find that while SciBERT and DeBERTa based models tend to be the most accurate at 15 training samples, their performance improvement scales minimally as the number of annotated samples is increased to 50 in comparison to Siamese and T5 based models. △ Less

Submitted 14 February, 2023; originally announced February 2023.

Comments: 6 pages, 1 figure

arXiv:2301.10987 [pdf, other]

A Decentralized Policy for Minimization of Age of Incorrect Information in Slotted ALOHA Systems

Authors: Anupam Nayak, Anders E. Kalør, Federico Chiariotti, Petar Popovski

Abstract: The Age of Incorrect Information (AoII) is a metric that can combine the freshness of the information available to a gateway in an Internet of Things (IoT) network with the accuracy of that information. As such, minimizing the AoII can allow the operators of IoT systems to have a more precise and up-to-date picture of the environment in which the sensors are deployed. However, most IoT systems do… ▽ More The Age of Incorrect Information (AoII) is a metric that can combine the freshness of the information available to a gateway in an Internet of Things (IoT) network with the accuracy of that information. As such, minimizing the AoII can allow the operators of IoT systems to have a more precise and up-to-date picture of the environment in which the sensors are deployed. However, most IoT systems do not allow for centralized scheduling or explicit coordination, as sensors need to be extremely simple and consume as little power as possible. Finding a decentralized policy to minimize the AoII can be extremely challenging in this setting. This paper presents a heuristic to optimize AoII for a slotted ALOHA system, starting from a threshold-based policy and using dual methods to converge to a better solution. This method can significantly outperform state-independent policies, finding an efficient balance between frequent updates and a low number of packet collisions. △ Less

Submitted 26 January, 2023; originally announced January 2023.

Comments: Accepted to IEEE ICC 2023

arXiv:2301.02227 [pdf, ps, other]

doi 10.1109/TIT.2023.3324527

Optimal lower bounds for Quantum Learning via Information Theory

Authors: Shima Bab Hadiashar, Ashwin Nayak, Pulkit Sinha

Abstract: Although a concept class may be learnt more efficiently using quantum samples as compared with classical samples in certain scenarios, Arunachalam and de Wolf (JMLR, 2018) proved that quantum learners are asymptotically no more efficient than classical ones in the quantum PAC and Agnostic learning models. They established lower bounds on sample complexity via quantum state identification and Fouri… ▽ More Although a concept class may be learnt more efficiently using quantum samples as compared with classical samples in certain scenarios, Arunachalam and de Wolf (JMLR, 2018) proved that quantum learners are asymptotically no more efficient than classical ones in the quantum PAC and Agnostic learning models. They established lower bounds on sample complexity via quantum state identification and Fourier analysis. In this paper, we derive optimal lower bounds for quantum sample complexity in both the PAC and agnostic models via an information-theoretic approach. The proofs are arguably simpler, and the same ideas can potentially be used to derive optimal bounds for other problems in quantum learning theory. We then turn to a quantum analogue of the Coupon Collector problem, a classic problem from probability theory also of importance in the study of PAC learning. Arunachalam, Belovs, Childs, Kothari, Rosmanis, and de Wolf (TQC, 2020) characterized the quantum sample complexity of this problem up to constant factors. First, we show that the information-theoretic approach mentioned above provably does not yield the optimal lower bound. As a by-product, we get a natural ensemble of pure states in arbitrarily high dimensions which are not easily (simultaneously) distinguishable, while the ensemble has close to maximal Holevo information. Second, we discover that the information-theoretic approach yields an asymptotically optimal bound for an approximation variant of the problem. Finally, we derive a sharper lower bound for the Quantum Coupon Collector problem, via the generalized Holevo-Curlander bounds on the distinguishability of an ensemble. All the aspects of the Quantum Coupon Collector problem we study rest on properties of the spectrum of the associated Gram matrix, which may be of independent interest. △ Less

Submitted 27 February, 2024; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: v3: 40 pages; Added references; edited extensively; simplified the proof of Theorem 3.2; results unchanged. A preliminary version of the results in Section 3 was included in the S.B.H.'s PhD thesis at University of Waterloo (Dec. 2020). An extended abstract of the results in Section 4 was included in the P.S.' bachelor's project report at Indian Institute of Science (Apr. 2022)

ACM Class: F.2.2

Journal ref: IEEE Transactions on Information Theory, vol. 70, no. 3, pp. 1876-1896, March 2024

arXiv:2211.00697 [pdf, other]

doi 10.22331/q-2023-08-16-1087

A Converse for Fault-tolerant Quantum Computation

Authors: Uthirakalyani G, Anuj K. Nayak, Avhishek Chatterjee

Abstract: As techniques for fault-tolerant quantum computation keep improving, it is natural to ask: what is the fundamental lower bound on redundancy? In this paper, we obtain a lower bound on the redundancy required for $ε$-accurate implementation of a large class of operations that includes unitary operators. For the practically relevant case of sub-exponential depth and sub-linear gate size, our bound o… ▽ More As techniques for fault-tolerant quantum computation keep improving, it is natural to ask: what is the fundamental lower bound on redundancy? In this paper, we obtain a lower bound on the redundancy required for $ε$-accurate implementation of a large class of operations that includes unitary operators. For the practically relevant case of sub-exponential depth and sub-linear gate size, our bound on redundancy is tighter than the known lower bounds. We obtain this bound by connecting fault-tolerant computation with a set of finite blocklength quantum communication problems whose accuracy requirements satisfy a joint constraint. The lower bound on redundancy obtained here leads to a strictly smaller upper bound on the noise threshold for non-degradable noise. Our bound directly extends to the case where noise at the outputs of a gate are non-i.i.d. but noise across gates are i.i.d. △ Less

Submitted 9 August, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: Some changes were made, and results changed in the lower order term only (scaling factor added), Edited license

Journal ref: Quantum 7, 1087 (2023)

arXiv:2207.14438 [pdf, ps, other]

Lower bounds for learning quantum states with single-copy measurements

Authors: Angus Lowe, Ashwin Nayak

Abstract: We study the problems of quantum tomography and shadow tomography using measurements performed on individual, identical copies of an unknown $d$-dimensional state. We first revisit a known lower bound due to Haah et al. (2017) on quantum tomography with accuracy $ε$ in trace distance, when the measurements choices are independent of previously observed outcomes (i.e., they are nonadaptive). We giv… ▽ More We study the problems of quantum tomography and shadow tomography using measurements performed on individual, identical copies of an unknown $d$-dimensional state. We first revisit a known lower bound due to Haah et al. (2017) on quantum tomography with accuracy $ε$ in trace distance, when the measurements choices are independent of previously observed outcomes (i.e., they are nonadaptive). We give a succinct proof of this result. This leads to stronger lower bounds when the learner uses measurements with a constant number of outcomes. In particular, this rigorously establishes the optimality of the folklore ``Pauli tomography" algorithm in terms of its sample complexity. We also derive novel bounds of $Ω(r^2 d/ε^2)$ and $Ω(r^2 d^2/ε^2)$ for learning rank $r$ states using arbitrary and constant-outcome measurements, respectively, in the nonadaptive case. In addition to the sample complexity, a resource of practical significance for learning quantum states is the number of different measurements used by an algorithm. We extend our lower bounds to the case where the learner performs possibly adaptive measurements from a fixed set of $\exp(O(d))$ measurements. This implies in particular that adaptivity does not give us any advantage using single-copy measurements that are efficiently implementable. We also obtain a similar bound in the case where the goal is to predict the expectation values of a given sequence of observables, a task known as shadow tomography. Finally, in the case of adaptive, single-copy measurements implementable with polynomial-size circuits, we prove that a straightforward strategy based on computing sample means of the given observables is optimal. △ Less

Submitted 1 August, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

Comments: v2. 43 pages. Source files updated to correct PDF generation error. Added some details on prior work; results unchanged. v1. 43 pages. Most of the results in this article were included in the first author's Master's thesis at U. Waterloo (Oct. 2021) and were presented at QIP 2022 (Mar. 2022)

arXiv:2207.10629 [pdf, other]

A Solution to Adaptive Mobile Manipulator Throwing

Authors: Yang Liu, Aradhana Nayak, Aude Billard

Abstract: Mobile manipulator throwing is a promising method to increase the flexibility and efficiency of dynamic manipulation in factories. Its major challenge is to efficiently plan a feasible throw under a wide set of task specifications. We show that the mobile manipulator throwing problem can be simplified to a planar problem, hence greatly reducing the computational costs. Using machine learning appro… ▽ More Mobile manipulator throwing is a promising method to increase the flexibility and efficiency of dynamic manipulation in factories. Its major challenge is to efficiently plan a feasible throw under a wide set of task specifications. We show that the mobile manipulator throwing problem can be simplified to a planar problem, hence greatly reducing the computational costs. Using machine learning approaches, we build a model of the object's inverted flying dynamics and the robot's kinematic feasibility, which enables throwing motion generation within 1 ms for given query of target position. Thanks to the computational efficiency of our method, we show that the system is adaptive under disturbance, via replanning on the fly for alternative solutions, instead of sticking to the original throwing plan. △ Less

Submitted 3 August, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

Comments: Accepted at IROS 2022. Code is available at: https://github.com/epfl-lasa/mobile-throwing

arXiv:2207.09127 [pdf, ps, other]

Smart Contract Assisted Blockchain based PKI System

Authors: Amrutanshu Panigrahi, Ajit Kumar Nayak, Rourab Paul

Abstract: The proposed smart contract can prevent seven cyber attacks, such as Denial of Service (DoS), Man in the Middle Attack (MITM), Distributed Denial of Service (DDoS), 51\%, Injection attacks, Routing Attack, and Eclipse attack. The Delegated Proof of Stake (DPoS) consensus algorithm used in this model reduces the number of validators for each transaction which makes it suitable for lightweight appli… ▽ More The proposed smart contract can prevent seven cyber attacks, such as Denial of Service (DoS), Man in the Middle Attack (MITM), Distributed Denial of Service (DDoS), 51\%, Injection attacks, Routing Attack, and Eclipse attack. The Delegated Proof of Stake (DPoS) consensus algorithm used in this model reduces the number of validators for each transaction which makes it suitable for lightweight applications. The timing complexity of key/certificate validation and signature/certificate revocation processes do not depend on the number of transactions. The comparisons of various timing parameters with existing solutions show that the proposed PKI is competitively better. △ Less

Submitted 19 July, 2022; originally announced July 2022.

Comments: manuscript

arXiv:2207.08714 [pdf, other]

Learning High Dimensional Demonstrations Using Laplacian Eigenmaps

Authors: Sthithpragya Gupta, Aradhana Nayak, Aude Billard

Abstract: This article proposes a novel methodology to learn a stable robot control law driven by dynamical systems. The methodology requires a single demonstration and can deduce a stable dynamics in arbitrary high dimensions. The method relies on the idea that there exists a latent space in which the nonlinear dynamics appears quasi linear. The original nonlinear dynamics is mapped into a stable linear DS… ▽ More This article proposes a novel methodology to learn a stable robot control law driven by dynamical systems. The methodology requires a single demonstration and can deduce a stable dynamics in arbitrary high dimensions. The method relies on the idea that there exists a latent space in which the nonlinear dynamics appears quasi linear. The original nonlinear dynamics is mapped into a stable linear DS, by leveraging on the properties of graph embeddings. We show that the eigendecomposition of the Graph Laplacian results in linear embeddings in two dimensions and quasi-linear in higher dimensions. The nonlinear terms vanish, exponentially as the number of datapoints increase, and for large density of points, the embedding appears linear. We show that this new embedding enables to model highly nonlinear dynamics in high dimension and overcomes alternative techniques in both precision of reconstruction and number of parameters required for the embedding. We demonstrate its applicability to control real robot tasked to perform complex free motion in space. △ Less

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2205.01887 [pdf, other]

doi 10.1109/OJITS.2022.3205504

Uncertainty estimation of pedestrian future trajectory using Bayesian approximation

Authors: Anshul Nayak, Azim Eskandarian, Zachary Doerzaph

Abstract: Past research on pedestrian trajectory forecasting mainly focused on deterministic predictions which provide only point estimates of future states. These future estimates can help an autonomous vehicle plan its trajectory and avoid collision. However, under dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy. Rather, estimating the uncertainty associated with… ▽ More Past research on pedestrian trajectory forecasting mainly focused on deterministic predictions which provide only point estimates of future states. These future estimates can help an autonomous vehicle plan its trajectory and avoid collision. However, under dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy. Rather, estimating the uncertainty associated with the predicted states with a certain level of confidence can lead to robust path planning. Hence, the authors propose to quantify this uncertainty during forecasting using stochastic approximation which deterministic approaches fail to capture. The current method is simple and applies Bayesian approximation during inference to standard neural network architectures for estimating uncertainty. The authors compared the predictions between the probabilistic neural network (NN) models with the standard deterministic models. The results indicate that the mean predicted path of probabilistic models was closer to the ground truth when compared with the deterministic prediction. Further, the effect of stochastic dropout of weights and long-term prediction on future state uncertainty has been studied. It was found that the probabilistic models produced better performance metrics like average displacement error (ADE) and final displacement error (FDE). Finally, the study has been extended to multiple datasets providing a comprehensive comparison for each model. △ Less

Submitted 4 May, 2022; originally announced May 2022.

Comments: 12 pages, 17 figures, 1 table

arXiv:2204.11886 [pdf, ps, other]

doi 10.1109/TIT.2023.3269524

Mutually Unbiased Measurements, Hadamard Matrices, and Superdense Coding

Authors: Máté Farkas, Jędrzej Kaniewski, Ashwin Nayak

Abstract: Mutually unbiased bases (MUBs) are highly symmetric bases on complex Hilbert spaces, and the corresponding rank-1 projective measurements are ubiquitous in quantum information theory. In this work, we study a recently introduced generalization of MUBs called mutually unbiased measurements (MUMs). These measurements inherit the essential property of complementarity from MUBs, but the Hilbert space… ▽ More Mutually unbiased bases (MUBs) are highly symmetric bases on complex Hilbert spaces, and the corresponding rank-1 projective measurements are ubiquitous in quantum information theory. In this work, we study a recently introduced generalization of MUBs called mutually unbiased measurements (MUMs). These measurements inherit the essential property of complementarity from MUBs, but the Hilbert space dimension is no longer required to match the number of outcomes. This operational complementarity property renders MUMs highly useful for device-independent quantum information processing. It has been shown that MUMs are strictly more general than MUBs. In this work we provide a complete proof of the characterization of MUMs that are direct sums of MUBs. We then proceed to construct new examples of MUMs that are not direct sums of MUBs. A crucial technical tool for these construction is a correspondence with quaternionic Hadamard matrices, which allows us to map known examples of such matrices to MUMs that are not direct sums of MUBs. Furthermore, we show that -- in stark contrast with MUBs -- the number of MUMs for a fixed outcome number is unbounded. Next, we focus on the use of MUMs in quantum communication. We demonstrate how any pair of MUMs with d outcomes defines a d-dimensional superdense coding protocol. Using MUMs that are not direct sums of MUBs, we disprove a recent conjecture due to Nayak and Yuen on the rigidity of superdense coding for infinitely many dimensions. The superdense coding protocols arising in the refutation reveal how shared entanglement may be used in a manner heretofore unknown. △ Less

Submitted 12 October, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: v2: Added some references and related discussion. v1: 20 pages. Comments welcome!

Journal ref: IEEE Transactions on Information Theory, vol. 69, no. 6, pp. 3814--3824, June 2023

arXiv:2204.04221 [pdf, other]

CookieEnforcer: Automated Cookie Notice Analysis and Enforcement

Authors: Rishabh Khandelwal, Asmit Nayak, Hamza Harkous, Kassem Fawaz

Abstract: Online websites use cookie notices to elicit consent from the users, as required by recent privacy regulations like the GDPR and the CCPA. Prior work has shown that these notices use dark patterns to manipulate users into making website-friendly choices which put users' privacy at risk. In this work, we develop CookieEnforcer, a new system for automatically discovering cookie notices and deciding… ▽ More Online websites use cookie notices to elicit consent from the users, as required by recent privacy regulations like the GDPR and the CCPA. Prior work has shown that these notices use dark patterns to manipulate users into making website-friendly choices which put users' privacy at risk. In this work, we develop CookieEnforcer, a new system for automatically discovering cookie notices and deciding on the options that result in disabling all non-essential cookies. In order to achieve this, we first build an automatic cookie notice detector that utilizes the rendering pattern of the HTML elements to identify the cookie notices. Next, CookieEnforcer analyzes the cookie notices and predicts the set of actions required to disable all unnecessary cookies. This is done by modeling the problem as a sequence-to-sequence task, where the input is a machine-readable cookie notice and the output is the set of clicks to make. We demonstrate the efficacy of CookieEnforcer via an end-to-end accuracy evaluation, showing that it can generate the required steps in 91% of the cases. Via a user study, we show that CookieEnforcer can significantly reduce the user effort. Finally, we use our system to perform several measurements on the top 5k websites from the Tranco list (as accessed from the US and the UK), drawing comparisons and observations at scale. △ Less

Submitted 14 April, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

arXiv:2204.02559 [pdf]

doi 10.1109/MITS.2022.3168801

X-CAR: An Experimental Vehicle Platform for Connected Autonomy Research Powered by CARMA

Authors: Goodarz Mehr, Prasenjit Ghorai, Ce Zhang, Anshul Nayak, Darshit Patel, Shathushan Sivashangaran, Azim Eskandarian

Abstract: Autonomous vehicles promise a future with a safer, cleaner, more efficient, and more reliable transportation system. However, the current approach to autonomy has focused on building small, disparate intelligences that are closed off to the rest of the world. Vehicle connectivity has been proposed as a solution, relying on a vision of the future where a mix of connected autonomous and human-driven… ▽ More Autonomous vehicles promise a future with a safer, cleaner, more efficient, and more reliable transportation system. However, the current approach to autonomy has focused on building small, disparate intelligences that are closed off to the rest of the world. Vehicle connectivity has been proposed as a solution, relying on a vision of the future where a mix of connected autonomous and human-driven vehicles populate the road. Developed by the U.S. Department of Transportation Federal Highway Administration as a reusable, extensible platform for controlling connected autonomous vehicles, the CARMA Platform is one of the technologies enabling this connected future. Nevertheless, the adoption of the CARMA Platform has been slow, with a contributing factor being the limited, expensive, and relatively old vehicle configurations that are officially supported. To alleviate this problem, we propose X-CAR (eXperimental vehicle platform for Connected Autonomy Research). By implementing the CARMA Platform on more affordable, high quality hardware, X-CAR aims to increase the versatility of the CARMA Platform and facilitate its adoption for research and development of connected driving automation. △ Less

Submitted 25 May, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: ©2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2202.12205 [pdf, ps, other]

doi 10.3233/SW-223228

Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review

Authors: Kyle Hamilton, Aparna Nayak, Bojan Božić, Luca Longo

Abstract: Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive… ▽ More Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP), would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors such as knowledge representation, or type of neural architecture do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human level reasoning, which impact decisions about model architectures and drive conclusions which are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on github for further analysis. △ Less

Submitted 30 June, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

Comments: Survey

Journal ref: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2022

arXiv:2112.00283 [pdf, other]

Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition

Authors: Anmol Nayak, Hari Prasad Timmapathini

Abstract: While transfer learning has become a ubiquitous technique used across Natural Language Processing (NLP) tasks, it is often unable to replicate the performance of pre-trained models on text of niche domains like Automotive. In this paper we aim to understand the main characteristics of the distribution shift with automotive domain text (describing technical functionalities such as Cruise Control) a… ▽ More While transfer learning has become a ubiquitous technique used across Natural Language Processing (NLP) tasks, it is often unable to replicate the performance of pre-trained models on text of niche domains like Automotive. In this paper we aim to understand the main characteristics of the distribution shift with automotive domain text (describing technical functionalities such as Cruise Control) and attempt to explain the potential reasons for the gap in performance. We focus on performing the Named Entity Recognition (NER) task as it requires strong lexical, syntactic and semantic understanding by the model. Our experiments with 2 different encoders, namely BERT-Base-Uncased and SciBERT-Base-Scivocab-Uncased have lead to interesting findings that showed: 1) The performance of SciBERT is better than BERT when used for automotive domain, 2) Fine-tuning the language models with automotive domain text did not make significant improvements to the NER performance, 3) The distribution shift is challenging as it is characterized by lack of repeating contexts, sparseness of entities, large number of Out-Of-Vocabulary (OOV) words and class overlap due to domain specific nuances. △ Less

Submitted 1 December, 2021; originally announced December 2021.

Comments: 6 pages, 1 figure

arXiv:2111.14535 [pdf, other]

Enabling Reusable Physical Design Flows with Modular Flow Generators

Authors: Alex Carsello, James Thomas, Ankita Nayak, Po-Han Chen, Mark Horowitz, Priyanka Raina, Christopher Torng

Abstract: Achieving high code reuse in physical design flows is challenging but increasingly necessary to build complex systems. Unfortunately, existing approaches based on parameterized Tcl generators support very limited reuse and struggle to preserve reusable code as designers customize flows for specific designs and technologies. We present a vision and framework based on modular flow generators that en… ▽ More Achieving high code reuse in physical design flows is challenging but increasingly necessary to build complex systems. Unfortunately, existing approaches based on parameterized Tcl generators support very limited reuse and struggle to preserve reusable code as designers customize flows for specific designs and technologies. We present a vision and framework based on modular flow generators that encapsulates coarse-grain and fine-grain reusable code in modular nodes and assembles them into complete flows. The key feature is a flow consistency and instrumentation layer embedded in Python, which supports mechanisms for rapid and early feedback on inconsistent composition. The approach gradually types the Tcl language and allows both automatic and user-annotated static assertion checks. We evaluate the design flows of successive generations of silicon prototypes designed in TSMC16, TSMC28, TSMC40, SKY130, and IBM180 technologies, showing how our approach can enable significant code reuse in future flows. △ Less

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2109.01655 [pdf, other]

Instabilities in Plug-and-Play (PnP) algorithms from a learned denoiser

Authors: Abinash Nayak

Abstract: It's well-known that inverse problems are ill-posed and to solve them meaningfully, one has to employ regularization methods. Traditionally, popular regularization methods are the penalized Variational approaches. In recent years, the classical regularization approaches have been outclassed by the so-called plug-and-play (PnP) algorithms, which copy the proximal gradient minimization processes, su… ▽ More It's well-known that inverse problems are ill-posed and to solve them meaningfully, one has to employ regularization methods. Traditionally, popular regularization methods are the penalized Variational approaches. In recent years, the classical regularization approaches have been outclassed by the so-called plug-and-play (PnP) algorithms, which copy the proximal gradient minimization processes, such as ADMM or FISTA, but with any general denoiser. However, unlike the traditional proximal gradient methods, the theoretical underpinnings, convergence, and stability results have been insufficient for these PnP-algorithms. Hence, the results obtained from these algorithms, though empirically outstanding, can't always be completely trusted, as they may contain certain instabilities or (hallucinated) features arising from the denoiser, especially when using a pre-trained learned denoiser. In fact, in this paper, we show that a PnP-algorithm can induce hallucinated features, when using a pre-trained deep-learning-based (DnCNN) denoiser. We show that such instabilities are quite different than the instabilities inherent to an ill-posed problem. We also present methods to subdue these instabilities and significantly improve the recoveries. We compare the advantages and disadvantages of a learned denoiser over a classical denoiser (here, BM3D), as well as, the effectiveness of the FISTA-PnP algorithm vs. the ADMM-PnP algorithm. In addition, we also provide an algorithm to combine these two denoisers, the learned and the classical, in a weighted fashion to produce even better results. We conclude with numerical results which validate the developed theories. △ Less

Submitted 17 August, 2021; originally announced September 2021.

Comments: arXiv admin note: text overlap with arXiv:2106.07795

MSC Class: Primary 65K05; 65K10; Secondary 65R30; 65R32

arXiv:2108.13551 [pdf, other]

Regularizing Instabilities in Image Reconstruction Arising from Learned Denoisers

Authors: Abinash Nayak

Abstract: It's well-known that inverse problems are ill-posed and to solve them meaningfully one has to employ regularization methods. Traditionally, popular regularization methods have been the penalized Variational approaches. In recent years, the classical regularized-reconstruction approaches have been outclassed by the (deep-learning-based) learned reconstruction algorithms. However, unlike the traditi… ▽ More It's well-known that inverse problems are ill-posed and to solve them meaningfully one has to employ regularization methods. Traditionally, popular regularization methods have been the penalized Variational approaches. In recent years, the classical regularized-reconstruction approaches have been outclassed by the (deep-learning-based) learned reconstruction algorithms. However, unlike the traditional regularization methods, the theoretical underpinnings, such as stability and regularization, have been insufficient for such learned reconstruction algorithms. Hence, the results obtained from such algorithms, though empirically outstanding, can't always be completely trusted, as they may contain certain instabilities or (hallucinated) features arising from the learned process. In fact, it has been shown that such learning algorithms are very susceptible to small (adversarial) noises in the data and can lead to severe instabilities in the recovered solution, which can be quite different than the inherent instabilities of the ill-posed (inverse) problem. Whereas, the classical regularization methods can handle such (adversarial) noises very well and can produce stable recovery. Here, we try to present certain regularization methods to stabilize such (unstable) learned reconstruction methods and recover a regularized solution, even in the presence of adversarial noises. For this, we need to extend the classical notion of regularization and incorporate it in the learned reconstruction algorithms. We also present some regularization techniques to regularize two of the most popular learning reconstruction algorithms, the Learned Post-Processing Reconstruction and the Learned Unrolling Reconstruction. △ Less

Submitted 15 February, 2022; v1 submitted 21 August, 2021; originally announced August 2021.

MSC Class: Primary 65K05; 65K10; Secondary 65R30; 65R32

arXiv:2108.11860 [pdf, ps, other]

Auxiliary Heuristics for Frontier Based Planners

Authors: Arsh Tangri, Dhruv Joshi, Ashalatha Nayak

Abstract: Autonomous exploration of unknown environments is a vital function for robots and has applications in a wide variety of scenarios. Our focus primarily lies in its application for the task of efficient coverage of unknown environments. Various methods have been proposed for this task and frontier based methods are an efficient category in this class of methods. Efficiency is of utmost importance in… ▽ More Autonomous exploration of unknown environments is a vital function for robots and has applications in a wide variety of scenarios. Our focus primarily lies in its application for the task of efficient coverage of unknown environments. Various methods have been proposed for this task and frontier based methods are an efficient category in this class of methods. Efficiency is of utmost importance in exploration and heuristics play a critical role in guiding our search. In this work we demonstrate the ability of heuristics that are learnt by imitating clairvoyant oracles. These learnt heuristics can be used to predict the expected future return from selected states without building search trees, which are inefficient and limited by on-board compute. We also propose an additional filter-based heuristic which results in an enhancement in the performance of the frontier-based planner with respect to certain tasks such as coverage planning. △ Less

Submitted 26 August, 2021; originally announced August 2021.

arXiv:2107.14036 [pdf, ps, other]

Self-Driving Cars and Driver Alertness

Authors: Nguyen H Tran, Abhaya C Nayak

Abstract: Recent years have seen growing interest in the development of self-driving vehicles that promise (or threaten) to replace human drivers with intelligent software. However, current self-driving cars still require human supervision and prompt takeover of control when necessary. Poor alertness while controlling self-driving cars could hinder the drivers' ability to intervene during unpredictable situ… ▽ More Recent years have seen growing interest in the development of self-driving vehicles that promise (or threaten) to replace human drivers with intelligent software. However, current self-driving cars still require human supervision and prompt takeover of control when necessary. Poor alertness while controlling self-driving cars could hinder the drivers' ability to intervene during unpredictable situations, thus increasing the risk of avoidable accidents. In this paper we examine the key factors that contribute to drivers' poor alertness, and the potential solutions that have been proposed to address them. Based on this examination we make some recommendations for various stakeholders, such as researchers, drivers, industry and policy makers. △ Less

Submitted 20 July, 2021; originally announced July 2021.

Comments: 12 pages. Planned to be submitted to the 34th Australasian Joint Conference on Artificial Intelligence (AJCAI) 2021

arXiv:2107.04140 [pdf, other]

First-Generation Inference Accelerator Deployment at Facebook

Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu , et al. (90 additional authors not shown)

Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the in… ▽ More In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through Pytorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enables this platform to serve production traffic at Facebook. We also share deployment challenges, lessons learned during performance optimization, as well as provide guidance for future inference hardware co-design. △ Less

Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2106.07349 [pdf, other]

Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT

Authors: Anmol Nayak, Hari Prasad Timmapathini

Abstract: Linguistic Acceptability is the task of determining whether a sentence is grammatical or ungrammatical. It has applications in several use cases like Question-Answering, Natural Language Generation, Neural Machine Translation, where grammatical correctness is crucial. In this paper we aim to understand the decision-making process of BERT (Devlin et al., 2019) in distinguishing between Linguistical… ▽ More Linguistic Acceptability is the task of determining whether a sentence is grammatical or ungrammatical. It has applications in several use cases like Question-Answering, Natural Language Generation, Neural Machine Translation, where grammatical correctness is crucial. In this paper we aim to understand the decision-making process of BERT (Devlin et al., 2019) in distinguishing between Linguistically Acceptable sentences (LA) and Linguistically Unacceptable sentences (LUA). We leverage Layer Integrated Gradients Attribution Scores (LIG) to explain the Linguistic Acceptability criteria that are learnt by BERT on the Corpus of Linguistic Acceptability (CoLA) (Warstadt et al., 2018) benchmark dataset. Our experiments on 5 categories of sentences lead to the following interesting findings: 1) LIG for LA are significantly smaller in comparison to LUA, 2) There are specific subtrees of the Constituency Parse Tree (CPT) for LA and LUA which contribute larger LIG, 3) Across the different categories of sentences we observed around 88% to 100% of the Correctly classified sentences had positive LIG, indicating a strong positive relationship to the prediction confidence of the model, and 4) Around 43% of the Misclassified sentences had negative LIG, which we believe can become correctly classified sentences if the LIG are parameterized in the loss function of the model. △ Less

Submitted 8 March, 2022; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: Accepted at International Conference on Natural Language Processing (ICON) 2021. 6 pages, 3 figures

arXiv:2104.14436 [pdf, ps, other]

doi 10.26421/QIC22.9-10-3

Deterministic Algorithms for the Hidden Subgroup Problem

Authors: Ashwin Nayak

Abstract: We present deterministic algorithms for the Hidden Subgroup Problem. The first algorithm, for abelian groups, achieves the same asymptotic worst-case query complexity as the optimal randomized algorithm, namely O($\sqrt{ n}\,$), where $n$ is the order of the group. The analogous algorithm for non-abelian groups comes within a $\sqrt{ \log n}$ factor of the optimal randomized query complexity. The… ▽ More We present deterministic algorithms for the Hidden Subgroup Problem. The first algorithm, for abelian groups, achieves the same asymptotic worst-case query complexity as the optimal randomized algorithm, namely O($\sqrt{ n}\,$), where $n$ is the order of the group. The analogous algorithm for non-abelian groups comes within a $\sqrt{ \log n}$ factor of the optimal randomized query complexity. The best known randomized algorithm for the Hidden Subgroup Problem has expected query complexity that is sensitive to the input, namely O($\sqrt{ n/m}\,$), where $m$ is the order of the hidden subgroup. In the first version of this article (arXiv:2104.14436v1 [cs.DS]), we asked if there is a deterministic algorithm whose query complexity has a similar dependence on the order of the hidden subgroup. Prompted by this question, Ye and Li (arXiv:2110.00827v1 [cs.DS]) present deterministic algorithms for abelian groups which solve the problem with O($\sqrt{ n/m }\,$ ) queries, and find the hidden subgroup with O($\sqrt{ n (\log m) / m} + \log m$) queries. Moreover, they exhibit instances which show that in general, the deterministic query complexity of the problem may be o($\sqrt{ n/m } \,$), and that of finding the entire subgroup may also be o($\sqrt{ n/m } \,$) or even $ω(\sqrt{ n/m } \,)$. We present a different deterministic algorithm for the Hidden Subgroup Problem that also has query complexity O($\sqrt{ n/m }\,$) for abelian groups. The algorithm is arguably simpler. Moreover, it works for non-abelian groups, and has query complexity O($\sqrt{ (n/m) \log (n/m) }\,$) for a large class of instances, such as those over supersolvable groups. We build on this to design deterministic algorithms to find the hidden subgroup for all abelian and some non-abelian instances, at the cost of a $\log m$ multiplicative factor increase in the query complexity. △ Less

Submitted 10 June, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: v3: 14 pages. Added some details. Minor edits for clarity. Close to published version. v2: 14 pages. Added references to past and recent related work, added new algorithms partially resolving an open question from the previous version. v1: 7 pages

MSC Class: 68Q25; 68Q12 (Primary) 20D99 (Secondary) ACM Class: F.2.2

Journal ref: Quantum Information and Computation, Vol. 22, No. 9&10 (2022), pp. 755--769

arXiv:2104.08753 [pdf, ps, other]

doi 10.1109/TIT.2023.3271316

One-shot quantum state redistribution and quantum Markov chains

Authors: Anurag Anshu, Shima Bab Hadiashar, Rahul Jain, Ashwin Nayak, Dave Touchette

Abstract: We revisit the task of quantum state redistribution in the one-shot setting, and design a protocol for this task with communication cost in terms of a measure of distance from quantum Markov chains. More precisely, the distance is defined in terms of quantum max-relative entropy and quantum hypothesis testing entropy. Our result is the first to operationally connect quantum state redistribution… ▽ More We revisit the task of quantum state redistribution in the one-shot setting, and design a protocol for this task with communication cost in terms of a measure of distance from quantum Markov chains. More precisely, the distance is defined in terms of quantum max-relative entropy and quantum hypothesis testing entropy. Our result is the first to operationally connect quantum state redistribution and quantum Markov chains, and can be interpreted as an operational interpretation for a possible one-shot analogue of quantum conditional mutual information. The communication cost of our protocol is lower than all previously known ones and asymptotically achieves the well-known rate of quantum conditional mutual information. Thus, our work takes a step towards the important open question of near-optimal characterization of the one-shot quantum state redistribution. △ Less

Submitted 12 October, 2023; v1 submitted 18 April, 2021; originally announced April 2021.

Comments: v2: Exposition improved, references added, typos corrected. 32 pages, 2 figures. v1: 26 pages, 2 figures

Journal ref: IEEE Transactions on Information Theory, vol. 69, no. 9, pp. 5788--5804, Sept. 2023

arXiv:2104.05158 [pdf, other]

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Authors: Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng , et al. (28 additional authors not shown)

Abstract: Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pa… ▽ More Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 Trillion parameters and show that we can attain 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row, column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates (v) leveraging reduced precision communications, multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for the robust and efficient end-to-end training in production environments. △ Less

Submitted 26 February, 2023; v1 submitted 11 April, 2021; originally announced April 2021.

arXiv:2011.10036 [pdf, other]

On the Dynamics of Training Attention Models

Authors: Haoye Lu, Yongyi Mao, Amiya Nayak

Abstract: The attention mechanism has been widely used in deep neural networks as a model component. By now, it has become a critical building block in many state-of-the-art natural language models. Despite its great success established empirically, the working mechanism of attention has not been investigated at a sufficient theoretical depth to date. In this paper, we set up a simple text classification ta… ▽ More The attention mechanism has been widely used in deep neural networks as a model component. By now, it has become a critical building block in many state-of-the-art natural language models. Despite its great success established empirically, the working mechanism of attention has not been investigated at a sufficient theoretical depth to date. In this paper, we set up a simple text classification task and study the dynamics of training a simple attention-based classification model using gradient descent. In this setting, we show that, for the discriminative words that the model should attend to, a persisting identity exists relating its embedding and the inner product of its key and the query. This allows us to prove that training must converge to attending to the discriminative words when the attention output is classified by a linear classifier. Experiments are performed, which validate our theoretical analysis and provide further insights. △ Less

Submitted 18 March, 2021; v1 submitted 19 November, 2020; originally announced November 2020.

arXiv:2006.09815 [pdf, other]

A Deep Neural Network for Audio Classification with a Classifier Attention Mechanism

Authors: Haoye Lu, Haolong Zhang, Amit Nayak

Abstract: Audio classification is considered as a challenging problem in pattern recognition. Recently, many algorithms have been proposed using deep neural networks. In this paper, we introduce a new attention-based neural network architecture called Classifier-Attention-Based Convolutional Neural Network (CAB-CNN). The algorithm uses a newly designed architecture consisting of a list of simple classifiers… ▽ More Audio classification is considered as a challenging problem in pattern recognition. Recently, many algorithms have been proposed using deep neural networks. In this paper, we introduce a new attention-based neural network architecture called Classifier-Attention-Based Convolutional Neural Network (CAB-CNN). The algorithm uses a newly designed architecture consisting of a list of simple classifiers and an attention mechanism as a classifier selector. This design significantly reduces the number of parameters required by the classifiers and thus their complexities. In this way, it becomes easier to train the classifiers and achieve a high and steady performance. Our claims are corroborated by the experimental results. Compared to the state-of-the-art algorithms, our algorithm achieves more than 10% improvements on all selected test scores. △ Less

Submitted 14 June, 2020; originally announced June 2020.

arXiv:2002.11795 [pdf, other]

doi 10.1145/3512751

Quantum Distributed Complexity of Set Disjointness on a Line

Authors: Frederic Magniez, Ashwin Nayak

Abstract: Set Disjointness on a Line is a variant of the Set Disjointness problem in a distributed computing scenario with $d+1$ processors arranged on a path of length $d$. It was introduced by Le Gall and Magniez (PODC 2018) for proving lower bounds on the quantum distributed complexity of computing the diameter of an arbitrary network in the CONGEST model. However, they were only able to provide a lower… ▽ More Set Disjointness on a Line is a variant of the Set Disjointness problem in a distributed computing scenario with $d+1$ processors arranged on a path of length $d$. It was introduced by Le Gall and Magniez (PODC 2018) for proving lower bounds on the quantum distributed complexity of computing the diameter of an arbitrary network in the CONGEST model. However, they were only able to provide a lower bound when the local memory used by the processors on the intermediate vertices of the path consists of O$( \log n)$ qubits for $n$-bit inputs. We prove an unconditional lower bound of $\widetildeΩ\big(\sqrt[3]{n d^2}+\sqrt{n} \, \big)$ rounds for Set Disjointness on a Line with $d + 1$ processors. The result gives us a new lower bound of $\widetildeΩ \big( \sqrt[3]{nδ^2}+\sqrt{n} \, \big)$ on the number of rounds required for computing the diameter $δ$ of any $n$-node network with quantum messages of size O$(\log n)$ in the CONGEST model. We draw a connection between the distributed computing scenario above and a new model of query complexity. The information-theoretic technique we use for deriving the round lower bound for Set Disjointness on a Line also applies to the number of rounds in this query model. We provide an algorithm for Set Disjointness in this query model with round complexity that matches the round lower bound stated above, up to a polylogarithmic factor. This presents a barrier for obtaining a better round lower bound for Set Disjointness on the Line. At the same time, it hints at the possibility of better communication protocols for the problem. △ Less

Submitted 8 March, 2022; v1 submitted 26 February, 2020; originally announced February 2020.

Comments: 24 pages, 2 figures. In v3, background on Theorem 3.5 (from prior work) was included in an appendix. In v4, a detail in the statement of Theorem 3.5 has been corrected, and corresponding changes have been made in the rest of the paper. The results remain unchanged. In v5, extended main theorems to the entanglement-assisted setting, added explanations and clarifications, corrected typos

Journal ref: ACM Transactions on Computation Theory, Volume 14, Issue 1, March 2022, Article No.: 5, pp 1-22

arXiv:2001.02818 [pdf, other]

Capacity Approaching Coding for Low Noise Interactive Quantum Communication, Part I: Large Alphabets

Authors: Debbie Leung, Ashwin Nayak, Ala Shayeghi, Dave Touchette, Penghui Yao, Nengkun Yu

Abstract: We consider the problem of implementing two-party interactive quantum communication over noisy channels, a necessary endeavor if we wish to fully reap quantum advantages for communication. For an arbitrary protocol with $n$ messages, designed for a noiseless qudit channel over a $\mathrm{poly}(n)$ size alphabet, our main result is a simulation method that fails with probability less than… ▽ More We consider the problem of implementing two-party interactive quantum communication over noisy channels, a necessary endeavor if we wish to fully reap quantum advantages for communication. For an arbitrary protocol with $n$ messages, designed for a noiseless qudit channel over a $\mathrm{poly}(n)$ size alphabet, our main result is a simulation method that fails with probability less than $2^{-Θ(nε)}$ and uses a qudit channel over the same alphabet $n\left(1+Θ\left(\sqrtε\right)\right)$ times, of which an $ε$ fraction can be corrupted adversarially. The simulation is thus capacity achieving to leading order, and we conjecture that it is optimal up to a constant factor in the $\sqrtε$ term. Furthermore, the simulation is in a model that does not require pre-shared resources such as randomness or entanglement between the communicating parties. Our work improves over the best previously known quantum result where the overhead is a non-explicit large constant [Brassard et al., FOCS'14] for low $ε$. △ Less

Submitted 8 January, 2020; originally announced January 2020.

Comments: 94 pages, 7 figures

arXiv:1907.01692 [pdf, other]

An Approximation Algorithm for a Task Allocation, Sequencing and Scheduling Problem involving a Human-Robot Team

Authors: Sai Krishna Hari, Abhishek Nayak, Sivakumar Rathinam

Abstract: This article presents an approximation algorithm for a task allocation, sequencing and scheduling problem involving a team of human operators and robots. Specifically, we present an algorithm with an approximation ratio as a function of the number of human operators ($m$) and the number of robots ($k$) in the team. The approximation ratios are $\frac{7}{2} -\frac{5}{2k}$,… ▽ More This article presents an approximation algorithm for a task allocation, sequencing and scheduling problem involving a team of human operators and robots. Specifically, we present an algorithm with an approximation ratio as a function of the number of human operators ($m$) and the number of robots ($k$) in the team. The approximation ratios are $\frac{7}{2} -\frac{5}{2k}$, $\frac{5}{2} -\frac{1}{k}$ and $\frac{7}{2} -\frac{1}{k}$ when $m=1$, $m\geq k\geq 2$ and $k>m\geq 2$ respectively. We also present computational results to corroborate the performance of the proposed approximation algorithm. △ Less

Submitted 11 September, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

arXiv:1907.01682 [pdf, ps, other]

On Conforming and Conflicting Values

Authors: Kinzang Chhogyal, Abhaya Nayak, Aditya Ghose, Mehmet Orgun, Hoa Dam

Abstract: Values are things that are important to us. Actions activate values - they either go against our values or they promote our values. Values themselves can either be conforming or conflicting depending on the action that is taken. In this short paper, we argue that values may be classified as one of two types - conflicting and inherently conflicting values. They are distinguished by the fact that th… ▽ More Values are things that are important to us. Actions activate values - they either go against our values or they promote our values. Values themselves can either be conforming or conflicting depending on the action that is taken. In this short paper, we argue that values may be classified as one of two types - conflicting and inherently conflicting values. They are distinguished by the fact that the latter in some sense can be thought of as being independent of actions. This allows us to do two things: i) check whether a set of values is consistent and ii) check whether it is in conflict with other sets of values. △ Less

Submitted 7 July, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

Comments: AI for Social Good Workshop, IJCAI 2019

arXiv:1905.13380 [pdf, ps, other]

A Value-based Trust Assessment Model for Multi-agent Systems

Authors: Kinzang Chhogyal, Abhaya Nayak, Aditya Ghose, Hoa Khanh Dam

Abstract: An agent's assessment of its trust in another agent is commonly taken to be a measure of the reliability/predictability of the latter's actions. It is based on the trustor's past observations of the behaviour of the trustee and requires no knowledge of the inner-workings of the trustee. However, in situations that are new or unfamiliar, past observations are of little help in assessing trust. In s… ▽ More An agent's assessment of its trust in another agent is commonly taken to be a measure of the reliability/predictability of the latter's actions. It is based on the trustor's past observations of the behaviour of the trustee and requires no knowledge of the inner-workings of the trustee. However, in situations that are new or unfamiliar, past observations are of little help in assessing trust. In such cases, knowledge about the trustee can help. A particular type of knowledge is that of values - things that are important to the trustor and the trustee. In this paper, based on the premise that the more values two agents share, the more they should trust one another, we propose a simple approach to trust assessment between agents based on values, taking into account if agents trust cautiously or boldly, and if they depend on others in carrying out a task. △ Less

Submitted 30 May, 2019; originally announced May 2019.

arXiv:1905.02110 [pdf, other]

doi 10.22331/q-2020-06-25-286

On the Entanglement Cost of One-Shot Compression

Authors: Shima Bab Hadiashar, Ashwin Nayak

Abstract: We revisit the task of visible compression of an ensemble of quantum states with entanglement assistance in the one-shot setting. The protocols achieving the best compression use many more qubits of shared entanglement than the number of qubits in the states in the ensemble. Other compression protocols, with potentially larger communication cost, have entanglement cost bounded by the number of qub… ▽ More We revisit the task of visible compression of an ensemble of quantum states with entanglement assistance in the one-shot setting. The protocols achieving the best compression use many more qubits of shared entanglement than the number of qubits in the states in the ensemble. Other compression protocols, with potentially larger communication cost, have entanglement cost bounded by the number of qubits in the given states. This motivates the question as to whether entanglement is truly necessary for compression, and if so, how much of it is needed. Motivated by questions in communication complexity, we lift certain restrictions that are placed on compression protocols in tasks such as state-splitting and channel simulation. We show that an ensemble of the form designed by Jain, Radhakrishnan, and Sen (ICALP'03) saturates the known bounds on the sum of communication and entanglement costs, even with the relaxed compression protocols we study. The ensemble and the associated one-way communication protocol have several remarkable properties. The ensemble is incompressible by more than a constant number of qubits without shared entanglement, even when constant error is allowed. Moreover, in the presence of shared entanglement, the communication cost of compression can be arbitrarily smaller than the entanglement cost. The quantum information cost of the protocol can thus be arbitrarily smaller than the cost of compression without shared entanglement. The ensemble can also be used to show the impossibility of reducing, via compression, the shared entanglement used in two-party protocols for computing Boolean functions. △ Less

Submitted 2 July, 2020; v1 submitted 6 May, 2019; originally announced May 2019.

Comments: 23 pages, 1 figure. The paper is the same as v3; only metadata are updated. In v3: Main body reorganized; Thm 1.1 rephrased; a discussion of the entanglement measure added to Sec. 4; several improvements to the text throughout; Cor. 3.5 and Cor. 3.6 merged; Appendix created for the proofs of Cor. 3.5 and another statement in the main body; typos fixed; three references added

Journal ref: Quantum 4, 286 (2020)

arXiv:1901.00209 [pdf, other]

Smart Information Spreading for Opinion Maximization in Social Networks

Authors: Anuj Nayak, Seyyedali Hosseinalipour, Huaiyu Dai

Abstract: The goal of opinion maximization is to maximize the positive view towards a product, an ideology or any entity among the individuals in social networks. So far, opinion maximization is mainly studied as finding a set of influential nodes for fast content dissemination in a social network. In this paper, we propose a novel approach to solve the problem, where opinion maximization is achieved throug… ▽ More The goal of opinion maximization is to maximize the positive view towards a product, an ideology or any entity among the individuals in social networks. So far, opinion maximization is mainly studied as finding a set of influential nodes for fast content dissemination in a social network. In this paper, we propose a novel approach to solve the problem, where opinion maximization is achieved through efficient information spreading. In our model, multiple sources inject information continuously into the network, while the regular nodes with heterogeneous social learning abilities spread the information to their acquaintances through gossip mechanism. One of the sources employs smart information spreading and the rest spread information randomly. We model the social interactions and evolution of opinions as a dynamic Bayesian network (DBN), using which the opinion maximization is formulated as a sequential decision problem. Since the problem is intractable, we develop multiple variants of centralized and decentralized learning algorithms to obtain approximate solutions. Through simulations in synthetic and real-world networks, we demonstrate two key results: 1) the proposed methods perform better than random spreading by a large margin, and 2) even though the smart source (that spreads the desired content) is unfavorably located in the network, it can outperform the contending random sources located at favorable positions. △ Less

Submitted 1 January, 2019; originally announced January 2019.

Comments: 13 pages, 11 figures, INFOCOM extended version

Showing 1–50 of 85 results for author: Nayak, A