subscribe to arXiv mailings

Resilience of the Electric Grid through Trustable IoT-Coordinated Assets

Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, H. Vincent Poor, Anuradha M. Annaswamy

Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. Ho… ▽ More The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. However, they can introduce new vulnerabilities in the form of cyberattacks, which can cause significant challenges in ensuring grid resilience. %, i.e. the ability to rapidly restore grid services in the face of severe disruptions. We propose a framework in this paper for achieving grid resilience through suitably coordinated assets including a network of Internet of Things (IoT) devices. A local electricity market is proposed to identify trustable assets and carry out this coordination. Situational Awareness (SA) of locally available DERs with the ability to inject power or reduce consumption is enabled by the market, together with a monitoring procedure for their trustability and commitment. With this SA, we show that a variety of cyberattacks can be mitigated using local trustable resources without stressing the bulk grid. The demonstrations are carried out using a variety of platforms with a high-fidelity co-simulation platform, real-time hardware-in-the-loop validation, and a utility-friendly simulator. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: Submitted to the Proceedings of the National Academy of Sciences (PNAS), under review

arXiv:2406.04517 [pdf, other]

FOX: Coverage-guided Fuzzing as Online Stochastic Control

Authors: Dongdong She, Adam Storek, Yuchong Xie, Seoyoung Kweon, Prashast Srivastava, Suman Jana

Abstract: Fuzzing is an effective technique for discovering software vulnerabilities by generating random test inputs and executing them against the target program. However, fuzzing large and complex programs remains challenging due to difficulties in uncovering deeply hidden vulnerabilities. This paper addresses the limitations of existing coverage-guided fuzzers, focusing on the scheduler and mutator comp… ▽ More Fuzzing is an effective technique for discovering software vulnerabilities by generating random test inputs and executing them against the target program. However, fuzzing large and complex programs remains challenging due to difficulties in uncovering deeply hidden vulnerabilities. This paper addresses the limitations of existing coverage-guided fuzzers, focusing on the scheduler and mutator components. Existing schedulers suffer from information sparsity and the inability to handle fine-grained feedback metrics. The mutators are agnostic of target program branches, leading to wasted computation and slower coverage exploration. To overcome these issues, we propose an end-to-end online stochastic control formulation for coverage-guided fuzzing. Our approach incorporates a novel scheduler and custom mutator that can adapt to branch logic, maximizing aggregate edge coverage achieved over multiple stages. The scheduler utilizes fine-grained branch distance measures to identify frontier branches, where new coverage is likely to be achieved. The mutator leverages branch distance information to perform efficient and targeted seed mutations, leading to robust progress with minimal overhead. We present FOX, a proof-of-concept implementation of our control-theoretic approach, and compare it to industry-standard coverage-guided fuzzers. 6 CPU-years of extensive evaluations on the FuzzBench dataset and complex real-world programs (a total of 38 test programs) demonstrate that FOX outperforms existing state-of-the-art fuzzers, achieving average coverage improvements up to 26.45% in real-world standalone programs and 6.59% in FuzzBench programs over the state-of-the-art AFL++. In addition, it uncovers 20 unique bugs in popular real-world applications including eight that are previously unknown, showcasing real-world security impact. △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: To Appear in Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (CCS '24)

arXiv:2405.16639 [pdf, ps, other]

A unified law of robustness for Bregman divergence losses

Authors: Santanu Das, Jatin Batra, Piyush Srivastava

Abstract: In contemporary deep learning practice, models are often trained to near zero loss i.e. to nearly interpolate the training data. However, the number of parameters in the model is usually far more than the number of data points $n$, the theoretical minimum needed for interpolation: a phenomenon referred to as overparameterization. In an interesting piece of work that contributes to the considerable… ▽ More In contemporary deep learning practice, models are often trained to near zero loss i.e. to nearly interpolate the training data. However, the number of parameters in the model is usually far more than the number of data points $n$, the theoretical minimum needed for interpolation: a phenomenon referred to as overparameterization. In an interesting piece of work that contributes to the considerable research that has been devoted to understand overparameterization, Bubeck, and Sellke showed that for a broad class of covariate distributions (specifically those satisfying a natural notion of concentration of measure), overparameterization is necessary for robust interpolation i.e. if the interpolating function is required to be Lipschitz. However, their robustness results were proved only in the setting of regression with square loss. In practice, however many other kinds of losses are used, e.g. cross entropy loss for classification. In this work, we generalize Bubeck and Selke's result to Bregman divergence losses, which form a common generalization of square loss and cross-entropy loss. Our generalization relies on identifying a bias variance-type decomposition that lies at the heart of the proof and Bubeck and Sellke. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: 16 pages

arXiv:2405.16048 [pdf, ps, other]

A New Construction of Optimal Symmetrical ZCCS

Authors: Rajen Kumar, Prashant Kumar Srivastava, Sudhan Majhi

Abstract: We propose new constructions for a two-dimensional ($2$D) perfect array, complete complementary code (CCC), and multiple CCCs as an optimal symmetrical $Z$-complementary code set (ZCCS). We propose a method to generate a two-dimensional perfect array and CCC. By utilising mutually orthogonal sequences, we developed a method to extend the length of a CCC without affecting the set or code size. Addi… ▽ More We propose new constructions for a two-dimensional ($2$D) perfect array, complete complementary code (CCC), and multiple CCCs as an optimal symmetrical $Z$-complementary code set (ZCCS). We propose a method to generate a two-dimensional perfect array and CCC. By utilising mutually orthogonal sequences, we developed a method to extend the length of a CCC without affecting the set or code size. Additionally, this concept is extended to include the development of multiple CCCs, and the correlation characteristics of these multiple CCCs are identical with the characteristics of optimal symmetrical ZCCS. △ Less

Submitted 25 May, 2024; originally announced May 2024.

Comments: This paper has been accepted in 'IEEE International Symposium on Information Theory (ISIT 2024)'

arXiv:2403.10644 [pdf, ps, other]

Multiple Spectrally Null Constrained Complete Complementary Codes of Various Lengths Over Small Alphabet

Authors: Rajen Kumar, Palash Sarkar, Prashant Kumar Srivastava, Sudhan Majhi

Abstract: Complete complementary codes (CCCs) are highly valuable in the fields of information security, radar and communication. The spectrally null constrained (SNC) problem arises in radar and modern communication systems due to the reservation or prohibition of specific spectrums from transmission. The literature on SNC-CCCs is somewhat limited in comparison to the literature on traditional CCCs. The ma… ▽ More Complete complementary codes (CCCs) are highly valuable in the fields of information security, radar and communication. The spectrally null constrained (SNC) problem arises in radar and modern communication systems due to the reservation or prohibition of specific spectrums from transmission. The literature on SNC-CCCs is somewhat limited in comparison to the literature on traditional CCCs. The main objective of this paper is to discover several configurations of SNC-CCCs that possess more flexibility in their parameters. The proposed construction utilised the existing CCCs and mutually orthogonal sequences. The proposed construction can cover almost all lengths with the smallest alphabets $\{-1,0,1\}$. Further, the idea of SNC-CCC is extended to multiple SNC-CCC with an inter-set zero cross-correlation zone (ZCCZ). Based on our construction, we can also control the correlation value outside the ZCCZ. The beauty of the obtained codes have aperiodic and periodic inter-set ZCCZ and low cross-correlation side-lobs. △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2402.11194 [pdf, other]

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

Authors: Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth

Abstract: Large Language Models (LLMs), excel in natural language understanding, but their capability for complex mathematical reasoning with an amalgamation of structured tables and unstructured text is uncertain. This study explores LLMs' mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models… ▽ More Large Language Models (LLMs), excel in natural language understanding, but their capability for complex mathematical reasoning with an amalgamation of structured tables and unstructured text is uncertain. This study explores LLMs' mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models and prompting techniques, we assess how LLMs adapt to complex tables and mathematical tasks. We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps. The results provide insights into LLMs' capabilities and limitations in handling complex mathematical scenarios for semi-structured tables. Ultimately, we introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines in performance while providing a nuanced understanding of LLMs abilities for such a task. △ Less

Submitted 29 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: 25 pages, 17 figures

arXiv:2402.06733 [pdf, other]

NICE: To Optimize In-Context Examples or Not?

Authors: Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma

Abstract: Recent work shows that in-context learning and optimization of in-context examples (ICE) can significantly improve the accuracy of large language models (LLMs) on a wide range of tasks, leading to an apparent consensus that ICE optimization is crucial for better performance. However, most of these studies assume a fixed or no instruction provided in the prompt. We challenge this consensus by inves… ▽ More Recent work shows that in-context learning and optimization of in-context examples (ICE) can significantly improve the accuracy of large language models (LLMs) on a wide range of tasks, leading to an apparent consensus that ICE optimization is crucial for better performance. However, most of these studies assume a fixed or no instruction provided in the prompt. We challenge this consensus by investigating the necessity of optimizing ICE when task-specific instructions are provided and find that there are many tasks for which it yields diminishing returns. In particular, using a diverse set of tasks and a systematically created instruction set with gradually added details, we find that as the prompt instruction becomes more detailed, the returns on ICE optimization diminish. To characterize this behavior, we introduce a task-specific metric called Normalized Invariability to Choice of Examples (NICE) that quantifies the learnability of tasks from a given instruction, and provides a heuristic to help decide whether to optimize instructions or ICE for a new task. Given a task, the proposed metric can reliably predict the utility of optimizing ICE compared to using random ICE. Our code is available at https://github.com/microsoft/nice-icl. △ Less

Submitted 6 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: Accepted as a full paper (9 pages) at ACL 2024 (Main)

Journal ref: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics 2024 (Volume 1: Long Papers)

arXiv:2402.06086 [pdf, other]

Rhizomes and Diffusions for Processing Highly Skewed Graphs on Fine-Grain Message-Driven Systems

Authors: Bibrak Qamar Chandio, Prateek Srivastava, Maciej Brodowicz, Martin Swany, Thomas Sterling

Abstract: The paper provides a unified co-design of 1) a programming and execution model that allows spawning tasks from within the vertex data at runtime, 2) language constructs for \textit{actions} that send work to where the data resides, combining parallel expressiveness of local control objects (LCOs) to implement asynchronous graph processing primitives, 3) and an innovative vertex-centric data-struct… ▽ More The paper provides a unified co-design of 1) a programming and execution model that allows spawning tasks from within the vertex data at runtime, 2) language constructs for \textit{actions} that send work to where the data resides, combining parallel expressiveness of local control objects (LCOs) to implement asynchronous graph processing primitives, 3) and an innovative vertex-centric data-structure, using the concept of Rhizomes, that parallelizes both the out and in-degree load of vertex objects across many cores and yet provides a single programming abstraction to the vertex objects. The data structure hierarchically parallelizes the out-degree load of vertices and the in-degree load laterally. The rhizomes internally communicate and remain consistent, using event-driven synchronization mechanisms, to provide a unified and correct view of the vertex. Simulated experimental results show performance gains for BFS, SSSP, and Page Rank on large chip sizes for the tested input graph datasets containing highly skewed degree distribution. The improvements come from the ability to express and create fine-grain dynamic computing task in the form of \textit{actions}, language constructs that aid the compiler to generate code that the runtime system uses to optimally schedule tasks, and the data structure that shares both in and out-degree compute workload among memory-processing elements. △ Less

Submitted 7 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: arXiv admin note: text overlap with arXiv:2402.02576

ACM Class: C.1.4; C.3; C.4; D.1.3

arXiv:2402.03702 [pdf, ps, other]

On Learning Spatial Provenance in Privacy-Constrained Wireless Networks

Authors: Manish Bansal, Pramsu Srivastava, J. Harshan

Abstract: In Vehicle-to-Everything networks that involve multi-hop communication, the Road Side Units (RSUs) typically aim to collect location information from the participating vehicles to provide security and network diagnostics features. While the vehicles commonly use the Global Positioning System (GPS) for navigation, they may refrain from sharing their precise GPS coordinates with the RSUs due to priv… ▽ More In Vehicle-to-Everything networks that involve multi-hop communication, the Road Side Units (RSUs) typically aim to collect location information from the participating vehicles to provide security and network diagnostics features. While the vehicles commonly use the Global Positioning System (GPS) for navigation, they may refrain from sharing their precise GPS coordinates with the RSUs due to privacy concerns. Therefore, to jointly address the high localization requirements by the RSUs as well as the vehicles' privacy, we present a novel spatial-provenance framework wherein each vehicle uses Bloom filters to embed their partial location information when forwarding the packets. In this framework, the RSUs and the vehicles agree upon fragmenting the coverage area into several smaller regions so that the vehicles can embed the identity of their regions through Bloom filters. Given the probabilistic nature of Bloom filters, we derive an analytical expression on the error-rates in provenance recovery and then pose an optimization problem to choose the underlying parameters. With the help of extensive simulation results, we show that our method offers near-optimal Bloom filter parameters in learning spatial provenance. Some interesting trade-offs between the communication-overhead, spatial privacy of the vehicles and the error rates in provenance recovery are also discussed. △ Less

Submitted 5 February, 2024; originally announced February 2024.

Comments: To be presented in IEEE WCNC 2024

arXiv:2312.06071 [pdf, other]

Precipitation Downscaling with Spatiotemporal Video Diffusion

Authors: Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt

Abstract: In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround where a low-resolution prediction is improved using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require… ▽ More In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround where a low-resolution prediction is improved using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require capturing the accurate conditional distribution of high-resolution given low-resolution patterns to assure reliable ensemble averages and unbiased estimates of extreme events, such as heavy rain. This work extends recent video diffusion models to precipitation super-resolution, employing a deterministic downscaler followed by a temporally-conditioned diffusion model to capture noise characteristics and high-frequency patterns. We test our approach on FV3GFS output, an established large-scale global atmosphere model, and compare it against six state-of-the-art baselines. Our analysis, capturing CRPS, MSE, precipitation distributions, and qualitative aspects using California and the Himalayas as examples, establishes our method as a new standard for data-driven precipitation downscaling. △ Less

Submitted 20 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

arXiv:2311.02103 [pdf, other]

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

Authors: Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen

Abstract: Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven demand for deploying them to a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces first-class symbolic shape… ▽ More Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven demand for deploying them to a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces first-class symbolic shape annotations to track dynamic shape computations globally across the program. It also introduces a cross-level abstraction that encapsulates computational graphs, loop-level tensor programs, and library calls in a single representation to enable cross-level optimizations. We build an end-to-end compilation framework using the proposed approach to optimize dynamic shape models. Experimental results on large language models show that Relax delivers performance competitive with state-of-the-art hand-optimized systems across platforms and enables deployment of emerging dynamic models to a broader set of environments, including mobile phones, embedded devices, and web browsers. △ Less

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.10691 [pdf, other]

Enhancing ML model accuracy for Digital VLSI circuits using diffusion models: A study on synthetic data generation

Authors: Prasha Srivastava, Pawan Kumar, Zia Abbas

Abstract: Generative AI has seen remarkable growth over the past few years, with diffusion models being state-of-the-art for image generation. This study investigates the use of diffusion models in generating artificial data generation for electronic circuits for enhancing the accuracy of subsequent machine learning models in tasks such as performance assessment, design, and testing when training data is us… ▽ More Generative AI has seen remarkable growth over the past few years, with diffusion models being state-of-the-art for image generation. This study investigates the use of diffusion models in generating artificial data generation for electronic circuits for enhancing the accuracy of subsequent machine learning models in tasks such as performance assessment, design, and testing when training data is usually known to be very limited. We utilize simulations in the HSPICE design environment with 22nm CMOS technology nodes to obtain representative real training data for our proposed diffusion model. Our results demonstrate the close resemblance of synthetic data using diffusion model to real data. We validate the quality of generated data, and demonstrate that data augmentation certainly effective in predictive analysis of VLSI design for digital circuits. △ Less

Submitted 15 October, 2023; originally announced October 2023.

Comments: 7 pages, submitted to NeurIPS workshop 2023

arXiv:2307.10200 [pdf, other]

Disentangling Societal Inequality from Model Biases: Gender Inequality in Divorce Court Proceedings

Authors: Sujan Dutta, Parth Srivastava, Vaishnavi Solunke, Swaprava Nath, Ashiqur R. KhudaBukhsh

Abstract: Divorce is the legal dissolution of a marriage by a court. Since this is usually an unpleasant outcome of a marital union, each party may have reasons to call the decision to quit which is generally documented in detail in the court proceedings. Via a substantial corpus of 17,306 court proceedings, this paper investigates gender inequality through the lens of divorce court proceedings. While emerg… ▽ More Divorce is the legal dissolution of a marriage by a court. Since this is usually an unpleasant outcome of a marital union, each party may have reasons to call the decision to quit which is generally documented in detail in the court proceedings. Via a substantial corpus of 17,306 court proceedings, this paper investigates gender inequality through the lens of divorce court proceedings. While emerging data sources (e.g., public court records) on sensitive societal issues hold promise in aiding social science research, biases present in cutting-edge natural language processing (NLP) methods may interfere with or affect such studies. We thus require a thorough analysis of potential gaps and limitations present in extant NLP resources. In this paper, on the methodological side, we demonstrate that existing NLP resources required several non-trivial modifications to quantify societal inequalities. On the substantive side, we find that while a large number of court cases perhaps suggest changing norms in India where women are increasingly challenging patriarchy, AI-powered analyses of these court proceedings indicate striking gender inequality with women often subjected to domestic violence. △ Less

Submitted 8 July, 2023; originally announced July 2023.

Comments: This paper is accepted at IJCAI 2023 (AI for good track)

arXiv:2305.04433 [pdf, other]

Accelerated Algorithms for a Class of Optimization Problems with Equality and Box Constraints

Authors: Anjali Parashar, Priyank Srivastava, Anuradha M. Annaswamy

Abstract: Convex optimization with equality and inequality constraints is a ubiquitous problem in several optimization and control problems in large-scale systems. Recently there has been a lot of interest in establishing accelerated convergence of the loss function. A class of high-order tuners was recently proposed in an effort to lead to accelerated convergence for the case when no constraints are pres… ▽ More Convex optimization with equality and inequality constraints is a ubiquitous problem in several optimization and control problems in large-scale systems. Recently there has been a lot of interest in establishing accelerated convergence of the loss function. A class of high-order tuners was recently proposed in an effort to lead to accelerated convergence for the case when no constraints are present. In this paper, we propose a new high-order tuner that can accommodate the presence of equality constraints. In order to accommodate the underlying box constraints, time-varying gains are introduced in the high-order tuner which leverage convexity and ensure anytime feasibility of the constraints. Numerical examples are provided to support the theoretical derivations. △ Less

Submitted 7 May, 2023; originally announced May 2023.

Comments: 6 pages, accepted in ACC 2023 (American Control Conference, 2023)

arXiv:2305.01290 [pdf, other]

A Construction of Arbitrarily Large Type-II $Z$ Complementary Code Set

Authors: Rajen Kumar, Prashant Kumar Srivastava, Sudhan Majhi

Abstract: For a type-I $(K,M,Z,N)$-ZCCS, it follows $K \leq M \left\lfloor \frac{N}{Z}\right\rfloor$. In this paper, we propose a construction of type-II $(p^{k+n},p^k,p^{n+r}-p^r+1,p^{n+r})$-$Z$ complementary code set (ZCCS) using an extended Boolean function, its properties of Hamiltonian paths and the concept of isolated vertices, where $p\ge 2$. However, the proposed type-II ZCCS provides… ▽ More For a type-I $(K,M,Z,N)$-ZCCS, it follows $K \leq M \left\lfloor \frac{N}{Z}\right\rfloor$. In this paper, we propose a construction of type-II $(p^{k+n},p^k,p^{n+r}-p^r+1,p^{n+r})$-$Z$ complementary code set (ZCCS) using an extended Boolean function, its properties of Hamiltonian paths and the concept of isolated vertices, where $p\ge 2$. However, the proposed type-II ZCCS provides $K = M(N-Z+1)$ codes, where as for type-I $(K,M,N,Z)$-ZCCS, it is $K \leq M \left\lfloor \frac{N}{Z}\right\rfloor$. Therefore, the proposed type-II ZCCS provides a larger number of codes compared to type-I ZCCS. Further, as a special case of the proposed construction, $(p^k,p^k,p^n)$-CCC can be generated, for any integral value of $p\ge2$ and $k\le n$. △ Less

Submitted 14 May, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

arXiv:2302.13906 [pdf]

Argument Mining using BERT and Self-Attention based Embeddings

Authors: Pranjal Srivastava, Pranav Bhatnagar, Anurag Goel

Abstract: Argument mining automatically identifies and extracts the structure of inference and reasoning conveyed in natural language arguments. To the best of our knowledge, most of the state-of-the-art works in this field have focused on using tree-like structures and linguistic modeling. But, these approaches are not able to model more complex structures which are often found in online forums and real wo… ▽ More Argument mining automatically identifies and extracts the structure of inference and reasoning conveyed in natural language arguments. To the best of our knowledge, most of the state-of-the-art works in this field have focused on using tree-like structures and linguistic modeling. But, these approaches are not able to model more complex structures which are often found in online forums and real world argumentation structures. In this paper, a novel methodology for argument mining is proposed which employs attention-based embeddings for link prediction to model the causational hierarchies in typical argument structures prevalent in online discourse. △ Less

Submitted 27 February, 2023; originally announced February 2023.

Comments: 2022 4th International Conference on Advances in Computing, Communication Control and Networking (ICAC3N)

arXiv:2302.07566 [pdf, other]

Qualitative Data Augmentation for Performance Prediction in VLSI circuits

Authors: Prasha Srivastava, Pawan Kumar, Zia Abbas

Abstract: Various studies have shown the advantages of using Machine Learning (ML) techniques for analog and digital IC design automation and optimization. Data scarcity is still an issue for electronic designs, while training highly accurate ML models. This work proposes generating and evaluating artificial data using generative adversarial networks (GANs) for circuit data to aid and improve the accuracy o… ▽ More Various studies have shown the advantages of using Machine Learning (ML) techniques for analog and digital IC design automation and optimization. Data scarcity is still an issue for electronic designs, while training highly accurate ML models. This work proposes generating and evaluating artificial data using generative adversarial networks (GANs) for circuit data to aid and improve the accuracy of ML models trained with a small training data set. The training data is obtained by various simulations in the Cadence Virtuoso, HSPICE, and Microcap design environment with TSMC 180nm and 22nm CMOS technology nodes. The artificial data is generated and tested for an appropriate set of analog and digital circuits. The experimental results show that the proposed artificial data generation significantly improves ML models and reduces the percentage error by more than 50\% of the original percentage error, which were previously trained with insufficient data. Furthermore, this research aims to contribute to the extensive application of AI/ML in the field of VLSI design and technology by relieving the training data availability-related challenges. △ Less

Submitted 15 February, 2023; originally announced February 2023.

Comments: 14 pages, 13 figures

arXiv:2301.08695 [pdf, other]

Baechi: Fast Device Placement of Machine Learning Graphs

Authors: Beomyeol Jeon, Linda Cai, Chirag Shetty, Pallavi Srivastava, Jintao Jiang, Xiaolan Ke, Yitao Meng, Cong Xie, Indranil Gupta

Abstract: Machine Learning graphs (or models) can be challenging or impossible to train when either devices have limited memory, or models are large. To split the model across devices, learning-based approaches are still popular. While these result in model placements that train fast on data (i.e., low step times), learning-based model-parallelism is time-consuming, taking many hours or days to create a pla… ▽ More Machine Learning graphs (or models) can be challenging or impossible to train when either devices have limited memory, or models are large. To split the model across devices, learning-based approaches are still popular. While these result in model placements that train fast on data (i.e., low step times), learning-based model-parallelism is time-consuming, taking many hours or days to create a placement plan of operators on devices. We present the Baechi system, the first to adopt an algorithmic approach to the placement problem for running machine learning training graphs on small clusters of memory-constrained devices. We integrate our implementation of Baechi into two popular open-source learning frameworks: TensorFlow and PyTorch. Our experimental results using GPUs show that: (i) Baechi generates placement plans 654 X - 206K X faster than state-of-the-art learning-based approaches, and (ii) Baechi-placed model's step (training) time is comparable to expert placements in PyTorch, and only up to 6.2% worse than expert placements in TensorFlow. We prove mathematically that our two algorithms are within a constant factor of the optimal. Our work shows that compared to learning-based approaches, algorithmic approaches can face different challenges for adaptation to Machine learning systems, but also they offer proven bounds, and significant performance benefits. △ Less

Submitted 20 January, 2023; originally announced January 2023.

Comments: Extended version of SoCC 2020 paper: https://dl.acm.org/doi/10.1145/3419111.3421302

arXiv:2211.16958 [pdf, ps, other]

How to (virtually) train your speaker localizer

Authors: Prerak Srivastava, Antoine Deleforge, Archontis Politis, Emmanuel Vincent

Abstract: Learning-based methods have become ubiquitous in speaker localization. Existing systems rely on simulated training sets for the lack of sufficiently large, diverse and annotated real datasets. Most room acoustics simulators used for this purpose rely on the image source method (ISM) because of its computational efficiency. This paper argues that carefully extending the ISM to incorporate more real… ▽ More Learning-based methods have become ubiquitous in speaker localization. Existing systems rely on simulated training sets for the lack of sufficiently large, diverse and annotated real datasets. Most room acoustics simulators used for this purpose rely on the image source method (ISM) because of its computational efficiency. This paper argues that carefully extending the ISM to incorporate more realistic surface, source and microphone responses into training sets can significantly boost the real-world performance of speaker localization systems. It is shown that increasing the training-set realism of a state-of-the-art direction-of-arrival estimator yields consistent improvements across three different real test sets featuring human speakers in a variety of rooms and various microphone arrays. An ablation study further reveals that every added layer of realism contributes positively to these improvements. △ Less

Submitted 25 May, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: Published in INTERSPEECH 2023

arXiv:2211.04439 [pdf, other]

Sampling from convex sets with a cold start using multiscale decompositions

Authors: Hariharan Narayanan, Amit Rajaraman, Piyush Srivastava

Abstract: Running a random walk in a convex body $K\subseteq\mathbb{R}^n$ is a standard approach to sample approximately uniformly from the body. The requirement is that from a suitable initial distribution, the distribution of the walk comes close to the uniform distribution $π_K$ on $K$ after a number of steps polynomial in $n$ and the aspect ratio $R/r$ (i.e., when $rB_2 \subseteq K \subseteq RB_{2}$).… ▽ More Running a random walk in a convex body $K\subseteq\mathbb{R}^n$ is a standard approach to sample approximately uniformly from the body. The requirement is that from a suitable initial distribution, the distribution of the walk comes close to the uniform distribution $π_K$ on $K$ after a number of steps polynomial in $n$ and the aspect ratio $R/r$ (i.e., when $rB_2 \subseteq K \subseteq RB_{2}$). Proofs of rapid mixing of such walks often require the probability density $η_0$ of the initial distribution with respect to $π_K$ to be at most $\mathrm{poly}(n)$: this is called a "warm start". Achieving a warm start often requires non-trivial pre-processing before starting the random walk. This motivates proving rapid mixing from a "cold start", wherein $η_0$ can be as high as $\exp(\mathrm{poly}(n))$. Unlike warm starts, a cold start is usually trivial to achieve. However, a random walk need not mix rapidly from a cold start: an example being the well-known "ball walk". On the other hand, Lovász and Vempala proved that the "hit-and-run" random walk mixes rapidly from a cold start. For the related coordinate hit-and-run (CHR) walk, which has been found to be promising in computational experiments, rapid mixing from a warm start was proved only recently but the question of rapid mixing from a cold start remained open. We construct a family of random walks inspired by classical decompositions of subsets of $\mathbb{R}^n$ into countably many axis-aligned dyadic cubes. We show that even with a cold start, the mixing times of these walks are bounded by a polynomial in $n$ and the aspect ratio. Our main technical ingredient is an isoperimetric inequality for $K$ for a metric that magnifies distances between points close to the boundary of $K$. As a corollary, we show that the CHR walk also mixes rapidly both from a cold start and from a point not too close to the boundary of $K$. △ Less

Submitted 30 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: Changes from v1: Added a corollary on mixing of coordinate hit-and-run from a point. Also includes some minor corrections/simplifications and explanations

arXiv:2210.17284 [pdf, other]

Towards Zero-Shot and Few-Shot Table Question Answering using GPT-3

Authors: Pragya Srivastava, Tanuja Ganu, Saikat Guha

Abstract: We present very early results on using GPT-3 to perform question answering on tabular data. We find that stock pre-trained GPT-3 is able to zero-shot learn the table structure from a serialized JSON array-of-arrays representation, and able to answer lookup queries and simple comparison questions in natural language without any fine-tuning. We further find that simple prompt engineering to include… ▽ More We present very early results on using GPT-3 to perform question answering on tabular data. We find that stock pre-trained GPT-3 is able to zero-shot learn the table structure from a serialized JSON array-of-arrays representation, and able to answer lookup queries and simple comparison questions in natural language without any fine-tuning. We further find that simple prompt engineering to include few-shot static Q&A examples significantly improves accuracy. Lastly, we find that intermixing passage text improves accuracy even further on heterogeneous data. We apply our approach on a novel dataset of simple tables in newspaper infographics with promising results. Overall, we find much cause for optimism in this basic approach. △ Less

Submitted 31 October, 2022; originally announced October 2022.

Comments: 7 pages

MSC Class: 14J60 (Primary)

arXiv:2207.09133 [pdf, other]

Realistic sources, receivers and walls improve the generalisability of virtually-supervised blind acoustic parameter estimators

Authors: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

Abstract: Blind acoustic parameter estimation consists in inferring the acoustic properties of an environment from recordings of unknown sound sources. Recent works in this area have utilized deep neural networks trained either partially or exclusively on simulated data, due to the limited availability of real annotated measurements. In this paper, we study whether a model purely trained using a fast image-… ▽ More Blind acoustic parameter estimation consists in inferring the acoustic properties of an environment from recordings of unknown sound sources. Recent works in this area have utilized deep neural networks trained either partially or exclusively on simulated data, due to the limited availability of real annotated measurements. In this paper, we study whether a model purely trained using a fast image-source room impulse response simulator can generalize to real data. We present an ablation study on carefully crafted simulated training sets that account for different levels of realism in source, receiver and wall responses. The extent of realism is controlled by the sampling of wall absorption coefficients and by applying measured directivity patterns to microphones and sources. A state-of-the-art model trained on these datasets is evaluated on the task of jointly estimating the room's volume, total surface area, and octave-band reverberation times from multiple, multichannel speech recordings. Results reveal that every added layer of simulation realism at train time significantly improves the estimation of all quantities on real signals. △ Less

Submitted 19 July, 2022; originally announced July 2022.

arXiv:2207.02552 [pdf, other]

doi 10.1016/j.dsp.2024.104570

A Construction of Type-II ZCCS for the MC-CDMA System with Low PMEPR

Authors: Rajen Kumar, Sushant Kumar Jha, Prashant Kumar Srivastava, Sudhan Majhi

Abstract: In this letter, we propose a novel construction of type-II $Z$-complementary code set (ZCCS) having arbitrary sequence length using the Kronecker product between a complete complementary code (CCC) and mutually orthogonal uni-modular sequences. In this construction, Barker sequences are used to reduce row sequence peak-to-mean envelope power ratio (PMEPR) for some specific lengths sequence and col… ▽ More In this letter, we propose a novel construction of type-II $Z$-complementary code set (ZCCS) having arbitrary sequence length using the Kronecker product between a complete complementary code (CCC) and mutually orthogonal uni-modular sequences. In this construction, Barker sequences are used to reduce row sequence peak-to-mean envelope power ratio (PMEPR) for some specific lengths sequence and column sequence PMEPR for some specific sizes of codes. The column sequence PMEPR of the proposed type-II ZCCS is upper bounded by a number smaller than $2$. The proposed construction also contributes new lengths of type-II $Z$-complementary pair (ZCP) and type-II $Z$-complementary set (ZCS). Furthermore, the PMEPR of these new type-II ZCPs is also lower than existing type-II ZCPs. △ Less

Submitted 22 August, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

arXiv:2206.09812 [pdf, other]

ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets

Authors: Kristian Schultz, Saptarshi Bej, Waldemar Hahn, Markus Wolfien, Prashant Srivastava, Olaf Wolkenhauer

Abstract: Data is commonly stored in tabular format. Several fields of research are prone to small imbalanced tabular data. Supervised Machine Learning on such data is often difficult due to class imbalance. Synthetic data generation, i.e., oversampling, is a common remedy used to improve classifier performance. State-of-the-art linear interpolation approaches, such as LoRAS and ProWRAS can be used to gener… ▽ More Data is commonly stored in tabular format. Several fields of research are prone to small imbalanced tabular data. Supervised Machine Learning on such data is often difficult due to class imbalance. Synthetic data generation, i.e., oversampling, is a common remedy used to improve classifier performance. State-of-the-art linear interpolation approaches, such as LoRAS and ProWRAS can be used to generate synthetic samples from the convex space of the minority class to improve classifier performance in such cases. Deep generative networks are common deep learning approaches for synthetic sample generation, widely used for synthetic image generation. However, their scope on synthetic tabular data generation in the context of imbalanced classification is not adequately explored. In this article, we show that existing deep generative models perform poorly compared to linear interpolation based approaches for imbalanced classification problems on smaller tabular datasets. To overcome this, we propose a deep generative model, ConvGeN that combines the idea of convex space learning with deep generative models. ConvGeN learns the coefficients for the convex combinations of the minority class samples, such that the synthetic data is distinct enough from the majority class. Our benchmarking experiments demonstrate that our proposed model ConvGeN improves imbalanced classification on such small datasets, as compared to existing deep generative models, while being at-par with the existing linear interpolation approaches. Moreover, we discuss how our model can be used for synthetic tabular data generation in general, even outside the scope of data imbalance and thus, improves the overall applicability of convex space learning. △ Less

Submitted 13 July, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

arXiv:2204.04868 [pdf, other]

On complex roots of the independence polynomial

Authors: Ferenc Bencs, Péter Csikvári, Piyush Srivastava, Jan Vondrák

Abstract: It is known from the work of Shearer (1985) (and also Scott and Sokal (2005)) that the independence polynomial $Z_G(λ)$ of a graph $G$ of maximum degree at most $d+1$ does not vanish provided that $\vertλ\vert \leq \frac{d^d}{(d+1)^{d+1}}$. Significant extensions of this result have recently been given in the case $\Re λ\geq 0$ by Peters and Regts (2019) and Bencs and Csikvári (arxiv:1807.08963).… ▽ More It is known from the work of Shearer (1985) (and also Scott and Sokal (2005)) that the independence polynomial $Z_G(λ)$ of a graph $G$ of maximum degree at most $d+1$ does not vanish provided that $\vertλ\vert \leq \frac{d^d}{(d+1)^{d+1}}$. Significant extensions of this result have recently been given in the case $\Re λ\geq 0$ by Peters and Regts (2019) and Bencs and Csikvári (arxiv:1807.08963). In this paper, our motivation is to further extend these results and find zero free regions when $\Re λ\leq 0$. We begin by giving new geometric criteria for establishing zero-free regions as well as for carrying out semi-rigorous numerical explorations. We then provide two examples of the (rigorous) use of these criteria, by establishing two new zero-free regions in the left-half plane. We also improve upon the results of Bencs and Csikvári (arxiv:1807.08963) for the right half-plane using our framework. By a direct application of the interpolation method of Barvinok, combined with extensions due to Patel and Regts, these results also imply deterministic polynomial time approximation algorithms for the independence polynomial of bounded degree graphs in the new zero-free regions. △ Less

Submitted 13 November, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: Extended version., to appear in proceedings of SODA 2023

arXiv:2203.09481 [pdf, other]

Diffusion Probabilistic Modeling for Video Generation

Authors: Ruihan Yang, Prakhar Srivastava, Stephan Mandt

Abstract: Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural… ▽ More Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against five baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality for all datasets. Furthermore, by introducing a scalable version of the Continuous Ranked Probability Score (CRPS) applicable to video, we show that our model also outperforms existing approaches in their probabilistic frame forecasting ability. △ Less

Submitted 7 December, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

arXiv:2202.09418

Uniting Control and Data Parallelism: Towards Scalable Memory-Driven Dynamic Graph Processing

Authors: Bibrak Qamar Chandio, Thomas Sterling, Prateek Srivastava

Abstract: Control parallelism and data parallelism is mostly reasoned and optimized as separate functions. Because of this, workloads that are irregular, fine-grain and dynamic such as dynamic graph processing become very hard to scale. An experimental research approach to computer architecture that synthesizes prior techniques of parallel computing along with new innovations is proposed in this paper. We e… ▽ More Control parallelism and data parallelism is mostly reasoned and optimized as separate functions. Because of this, workloads that are irregular, fine-grain and dynamic such as dynamic graph processing become very hard to scale. An experimental research approach to computer architecture that synthesizes prior techniques of parallel computing along with new innovations is proposed in this paper. We establish the background and motivation of the research undertaking and provide a detailed description of the proposed omputing system that is highly parallel non-von Neumann, memory-centric and memory-driven. We also present a message-driven (or even-driven) programming model called "diffusive computation" and provide insights into its properties using SSSP and Triangle Counting problems as examples. △ Less

Submitted 7 March, 2023; v1 submitted 18 February, 2022; originally announced February 2022.

Comments: The paper did not publish and we are working on a new paper that is very different than this one but contains some information that is in this paper

arXiv:2202.00471

Causal effect of racial bias in data and machine learning algorithms on user persuasiveness & discriminatory decision making: An Empirical Study

Authors: Kinshuk Sengupta, Praveen Ranjan Srivastava

Abstract: Language data and models demonstrate various types of bias, be it ethnic, religious, gender, or socioeconomic. AI/NLP models, when trained on the racially biased dataset, AI/NLP models instigate poor model explainability, influence user experience during decision making and thus further magnifies societal biases, raising profound ethical implications for society. The motivation of the study is to… ▽ More Language data and models demonstrate various types of bias, be it ethnic, religious, gender, or socioeconomic. AI/NLP models, when trained on the racially biased dataset, AI/NLP models instigate poor model explainability, influence user experience during decision making and thus further magnifies societal biases, raising profound ethical implications for society. The motivation of the study is to investigate how AI systems imbibe bias from data and produce unexplainable discriminatory outcomes and influence an individual's articulateness of system outcome due to the presence of racial bias features in datasets. The design of the experiment involves studying the counterfactual impact of racial bias features present in language datasets and its associated effect on the model outcome. A mixed research methodology is adopted to investigate the cross implication of biased model outcome on user experience, effect on decision-making through controlled lab experimentation. The findings provide foundation support for correlating the implication of carry-over an artificial intelligence model solving NLP task due to biased concept presented in the dataset. Further, the research outcomes justify the negative influence on users' persuasiveness that leads to alter the decision-making quotient of an individual when trying to rely on the model outcome to act. The paper bridges the gap across the harm caused in establishing poor customer trustworthiness due to an inequitable system design and provides strong support for researchers, policymakers, and data scientists to build responsible AI frameworks within organizations. △ Less

Submitted 25 November, 2022; v1 submitted 22 January, 2022; originally announced February 2022.

Comments: Fresh experiments need to be added to the design of experiments

arXiv:2112.09047 [pdf, other]

doi 10.1038/s41567-022-01770-1

Citation inequity and gendered citation practices in contemporary physics

Authors: Erin G. Teich, Jason Z. Kim, Christopher W. Lynn, Samantha C. Simon, Andrei A. Klishin, Karol P. Szymula, Pragya Srivastava, Lee C. Bassett, Perry Zurn, Jordan D. Dworkin, Dani S. Bassett

Abstract: The historical and contemporary under-attribution of women's contributions to scientific scholarship is well-known and well-studied, with effects that are felt today in myriad ways by women scientists. One measure of this under-attribution is the so-called citation gap between men and women: the under-citation of papers authored by women relative to expected rates coupled with a corresponding over… ▽ More The historical and contemporary under-attribution of women's contributions to scientific scholarship is well-known and well-studied, with effects that are felt today in myriad ways by women scientists. One measure of this under-attribution is the so-called citation gap between men and women: the under-citation of papers authored by women relative to expected rates coupled with a corresponding over-citation of papers authored by men relative to expected rates. We explore the citation gap in contemporary physics, analyzing over one million articles published over the last 25 years in 35 physics journals that span a wide range of subfields. Using a model that predicts papers' expected citation rates according to a set of characteristics separate from author gender, we find a global bias wherein papers authored by women are significantly under-cited, and papers authored by men are significantly over-cited. Moreover, we find that citation behavior varies along several dimensions, such that imbalances differ according to who is citing, where they are citing, and how they are citing. Specifically, citation imbalance in favor of man-authored papers is highest for papers authored by men, papers published in general physics journals, and papers likely to be less familiar to citing authors. Our results suggest that, although deciding which papers to cite is an individual choice, the cumulative effects of these choices needlessly harm a subset of scholars. We discuss several strategies for the mitigation of these effects, including conscious behavioral changes at the individual, journal, and community levels. △ Less

Submitted 16 December, 2021; originally announced December 2021.

arXiv:2111.15208 [pdf]

doi 10.1007/s42979-022-01023-1

HRNET: AI on Edge for mask detection and social distancing

Authors: Kinshuk Sengupta, Praveen Ranjan Srivastava

Abstract: The purpose of the paper is to provide innovative emerging technology framework for community to combat epidemic situations. The paper proposes a unique outbreak response system framework based on artificial intelligence and edge computing for citizen centric services to help track and trace people eluding safety policies like mask detection and social distancing measure in public or workplace set… ▽ More The purpose of the paper is to provide innovative emerging technology framework for community to combat epidemic situations. The paper proposes a unique outbreak response system framework based on artificial intelligence and edge computing for citizen centric services to help track and trace people eluding safety policies like mask detection and social distancing measure in public or workplace setup. The framework further provides implementation guideline in industrial setup as well for governance and contact tracing tasks. The adoption will thus lead in smart city planning and development focusing on citizen health systems contributing to improved quality of life. The conceptual framework presented is validated through quantitative data analysis via secondary data collection from researcher's public websites, GitHub repositories and renowned journals and further benchmarking were conducted for experimental results in Microsoft Azure cloud environment. The study includes selective AI-models for benchmark analysis and were assessed on performance and accuracy in edge computing environment for large scale societal setup. Overall YOLO model Outperforms in object detection task and is faster enough for mask detection and HRNetV2 outperform semantic segmentation problem applied to solve social distancing task in AI-Edge inferencing environmental setup. The paper proposes new Edge-AI algorithm for building technology-oriented solutions for detecting mask in human movement and social distance. The paper enriches the technological advancement in artificial intelligence and edge-computing applied to problems in society and healthcare systems. The framework further equips government agency, system providers to design and constructs technology-oriented models in community setup to Increase the quality of life using emerging technologies into smart urban environments. △ Less

Submitted 3 February, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

Report number: Volume 3, Issue 2, March 2022

Journal ref: SN Computer Science, 2022

arXiv:2111.05070 [pdf, other]

Universal Lower Bound for Learning Causal DAGs with Atomic Interventions

Authors: Vibhor Porwal, Piyush Srivastava, Gaurav Sinha

Abstract: A well-studied challenge that arises in the structure learning problem of causal directed acyclic graphs (DAG) is that using observational data, one can only learn the graph up to a "Markov equivalence class" (MEC). The remaining undirected edges have to be oriented using interventions, which can be very expensive to perform in applications. Thus, the problem of minimizing the number of interventi… ▽ More A well-studied challenge that arises in the structure learning problem of causal directed acyclic graphs (DAG) is that using observational data, one can only learn the graph up to a "Markov equivalence class" (MEC). The remaining undirected edges have to be oriented using interventions, which can be very expensive to perform in applications. Thus, the problem of minimizing the number of interventions needed to fully orient the MEC has received a lot of recent attention, and is also the focus of this work. Our first result is a new universal lower bound on the number of single-node interventions that any algorithm (whether active or passive) would need to perform in order to orient a given MEC. Our second result shows that this bound is, in fact, within a factor of two of the size of the smallest set of single-node interventions that can orient the MEC. Our lower bound is provably better than previously known lower bounds. Further, using simulations on synthetic graphs and by giving examples of special graph families, we show that our bound is often significantly better. To prove our lower bound, we develop the notion of clique-block shared-parents (CBSP) orderings, which are topological orderings of DAGs without v-structures and satisfy certain special properties. We also use the techniques developed here to extend our results to the setting of multi-node interventions. △ Less

Submitted 19 May, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: Extended version of AISTATS 2022 paper. Added results for multi-node interventions, and shortened title

arXiv:2110.03785 [pdf, other]

Addressing practical challenges in Active Learning via a hybrid query strategy

Authors: Deepesh Agarwal, Pravesh Srivastava, Sergio Martin-del-Campo, Balasubramaniam Natarajan, Babji Srinivasan

Abstract: Active Learning (AL) is a powerful tool to address modern machine learning problems with significantly fewer labeled training instances. However, implementation of traditional AL methodologies in practical scenarios is accompanied by multiple challenges due to the inherent assumptions. There are several hindrances, such as unavailability of labels for the AL algorithm at the beginning; unreliable… ▽ More Active Learning (AL) is a powerful tool to address modern machine learning problems with significantly fewer labeled training instances. However, implementation of traditional AL methodologies in practical scenarios is accompanied by multiple challenges due to the inherent assumptions. There are several hindrances, such as unavailability of labels for the AL algorithm at the beginning; unreliable external source of labels during the querying process; or incompatible mechanisms to evaluate the performance of Active Learner. Inspired by these practical challenges, we present a hybrid query strategy-based AL framework that addresses three practical challenges simultaneously: cold-start, oracle uncertainty and performance evaluation of Active Learner in the absence of ground truth. While a pre-clustering approach is employed to address the cold-start problem, the uncertainty surrounding the expertise of labeler and confidence in the given labels is incorporated to handle oracle uncertainty. The heuristics obtained during the querying process serve as the fundamental premise for accessing the performance of Active Learner. The robustness of the proposed AL framework is evaluated across three different environments and industrial settings. The results demonstrate the capability of the proposed framework to tackle practical challenges during AL implementation in real-world scenarios. △ Less

Submitted 7 October, 2021; originally announced October 2021.

Comments: 15 pages, 4 figures, 6 tables

arXiv:2107.13832 [pdf, other]

Blind Room Parameter Estimation Using Multiple-Multichannel Speech Recordings

Authors: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

Abstract: Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation or audio forensics. In this paper, we study the problem of jointly estimating the total surface area, the volume, as well as the frequency-dependent reverberation time and mean surface absorption of a room in a blind fashion, based on two-channel noisy speech… ▽ More Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation or audio forensics. In this paper, we study the problem of jointly estimating the total surface area, the volume, as well as the frequency-dependent reverberation time and mean surface absorption of a room in a blind fashion, based on two-channel noisy speech recordings from multiple, unknown source-receiver positions. A novel convolutional neural network architecture leveraging both single- and inter-channel cues is proposed and trained on a large, realistic simulated dataset. Results on both simulated and real data show that using multiple observations in one room significantly reduces estimation errors and variances on all target quantities, and that using two channels helps the estimation of surface and volume. The proposed model outperforms a recently proposed blind volume estimation method on the considered datasets. △ Less

Submitted 29 July, 2021; originally announced July 2021.

Comments: Accepted In WASPAA 2021 ( IEEE Workshop on Applications of Signal Processing to Audio and Acoustics )

arXiv:2107.11066 [pdf, other]

SALADnet: Self-Attentive multisource Localization in the Ambisonics Domain

Authors: Pierre-Amaury Grumiaux, Srdan Kitic, Prerak Srivastava, Laurent Girin, Alexandre Guérin

Abstract: In this work, we propose a novel self-attention based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network, we investigate the benefit of replacing the recurrent layers by self-attention encoders, inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data,… ▽ More In this work, we propose a novel self-attention based neural network for robust multi-speaker localization from Ambisonics recordings. Starting from a state-of-the-art convolutional recurrent neural network, we investigate the benefit of replacing the recurrent layers by self-attention encoders, inherited from the Transformer architecture. We evaluate these models on synthetic and real-world data, with up to 3 simultaneous speakers. The obtained results indicate that the majority of the proposed architectures either perform on par, or outperform the CRNN baseline, especially in the multisource scenario. Moreover, by avoiding the recurrent layers, the proposed models lend themselves to parallel computing, which is shown to produce considerable savings in execution time. △ Less

Submitted 23 July, 2021; originally announced July 2021.

Comments: Accepted to Workshop on Applications of Signal Processing to Audio and Acoustics

arXiv:2107.07349 [pdf, other]

A multi-schematic classifier-independent oversampling approach for imbalanced datasets

Authors: Saptarshi Bej, Kristian Schultz, Prashant Srivastava, Markus Wolfien, Olaf Wolkenhauer

Abstract: Over 85 oversampling algorithms, mostly extensions of the SMOTE algorithm, have been built over the past two decades, to solve the problem of imbalanced datasets. However, it has been evident from previous studies that different oversampling algorithms have different degrees of efficiency with different classifiers. With numerous algorithms available, it is difficult to decide on an oversampling a… ▽ More Over 85 oversampling algorithms, mostly extensions of the SMOTE algorithm, have been built over the past two decades, to solve the problem of imbalanced datasets. However, it has been evident from previous studies that different oversampling algorithms have different degrees of efficiency with different classifiers. With numerous algorithms available, it is difficult to decide on an oversampling algorithm for a chosen classifier. Here, we overcome this problem with a multi-schematic and classifier-independent oversampling approach: ProWRAS(Proximity Weighted Random Affine Shadowsampling). ProWRAS integrates the Localized Random Affine Shadowsampling (LoRAS)algorithm and the Proximity Weighted Synthetic oversampling (ProWSyn) algorithm. By controlling the variance of the synthetic samples, as well as a proximity-weighted clustering system of the minority classdata, the ProWRAS algorithm improves performance, compared to algorithms that generate synthetic samples through modelling high dimensional convex spaces of the minority class. ProWRAS has four oversampling schemes, each of which has its unique way to model the variance of the generated data. Most importantly, the performance of ProWRAS with proper choice of oversampling schemes, is independent of the classifier used. We have benchmarked our newly developed ProWRAS algorithm against five sate-of-the-art oversampling models and four different classifiers on 20 publicly available datasets. ProWRAS outperforms other oversampling algorithms in a statistically significant way, in terms of both F1-score and Kappa-score. Moreover, we have introduced a novel measure for classifier independence I-score, and showed quantitatively that ProWRAS performs better, independent of the classifier used. In practice, ProWRAS customizes synthetic sample generation according to a classifier of choice and thereby reduces benchmarking efforts. △ Less

Submitted 15 July, 2021; originally announced July 2021.

Comments: 12 tables, 6 figures

arXiv:2106.06999 [pdf, other]

A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection

Authors: Archontis Politis, Sharath Adavanne, Daniel Krause, Antoine Deleforge, Prerak Srivastava, Tuomas Virtanen

Abstract: This report presents the dataset and baseline of Task 3 of the DCASE2021 Challenge on Sound Event Localization and Detection (SELD). The dataset is based on emulation of real recordings of static or moving sound events under real conditions of reverberation and ambient noise, using spatial room impulse responses captured in a variety of rooms and delivered in two spatial formats. The acoustical sy… ▽ More This report presents the dataset and baseline of Task 3 of the DCASE2021 Challenge on Sound Event Localization and Detection (SELD). The dataset is based on emulation of real recordings of static or moving sound events under real conditions of reverberation and ambient noise, using spatial room impulse responses captured in a variety of rooms and delivered in two spatial formats. The acoustical synthesis remains the same as in the previous iteration of the challenge, however the new dataset brings more challenging conditions of polyphony and overlapping instances of the same class. The most important difference of the new dataset is the introduction of directional interferers, meaning sound events that are localized in space but do not belong to the target classes to be detected and are not annotated. Since such interfering events are expected in every real-world scenario of SELD, the new dataset aims to promote systems that deal with this condition effectively. A modified SELDnet baseline employing the recent ACCDOA representation of SELD problems accompanies the dataset and it is shown to outperform the previous one. The new dataset is shown to be significantly more challenging for both baselines according to all considered metrics. To investigate the individual and combined effects of ambient noise, interferers, and reverberation, we study the performance of the baseline on different versions of the dataset excluding or including combinations of these factors. The results indicate that by far the most detrimental effects are caused by directional interferers. △ Less

Submitted 4 July, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

arXiv:2102.00509 [pdf, other]

TruthBot: An Automated Conversational Tool for Intent Learning, Curated Information Presenting, and Fake News Alerting

Authors: Ankur Gupta, Yash Varun, Prarthana Das, Nithya Muttineni, Parth Srivastava, Hamim Zafar, Tanmoy Chakraborty, Swaprava Nath

Abstract: We present TruthBot, an all-in-one multilingual conversational chatbot designed for seeking truth (trustworthy and verified information) on specific topics. It helps users to obtain information specific to certain topics, fact-check information, and get recent news. The chatbot learns the intent of a query by training a deep neural network from the data of the previous intents and responds appropr… ▽ More We present TruthBot, an all-in-one multilingual conversational chatbot designed for seeking truth (trustworthy and verified information) on specific topics. It helps users to obtain information specific to certain topics, fact-check information, and get recent news. The chatbot learns the intent of a query by training a deep neural network from the data of the previous intents and responds appropriately when it classifies the intent in one of the classes above. Each class is implemented as a separate module that uses either its own curated knowledge-base or searches the web to obtain the correct information. The topic of the chatbot is currently set to COVID-19. However, the bot can be easily customized to any topic-specific responses. Our experimental results show that each module performs significantly better than its closest competitor, which is verified both quantitatively and through several user-based surveys in multiple languages. TruthBot has been deployed in June 2020 and is currently running. △ Less

Submitted 31 January, 2021; originally announced February 2021.

Comments: 17 pages, 11 images

arXiv:2011.04100 [pdf, other]

Network Optimization via Smooth Exact Penalty Functions Enabled by Distributed Gradient Computation

Authors: Priyank Srivastava, Jorge Cortes

Abstract: This paper proposes a distributed algorithm for a network of agents to solve an optimization problem with separable objective function and locally coupled constraints. Our strategy is based on reformulating the original constrained problem as the unconstrained optimization of a smooth (continuously differentiable) exact penalty function. Computing the gradient of this penalty function in a distrib… ▽ More This paper proposes a distributed algorithm for a network of agents to solve an optimization problem with separable objective function and locally coupled constraints. Our strategy is based on reformulating the original constrained problem as the unconstrained optimization of a smooth (continuously differentiable) exact penalty function. Computing the gradient of this penalty function in a distributed way is challenging even under the separability assumptions on the original optimization problem. Our technical approach shows that the distributed computation problem for the gradient can be formulated as a system of linear algebraic equations defined by separable problem data. To solve it, we design an exponentially fast, input-to-state stable distributed algorithm that does not require the individual agent matrices to be invertible. We employ this strategy to compute the gradient of the penalty function at the current network state. Our distributed algorithmic solver for the original constrained optimization problem interconnects this estimation with the prescription of having the agents follow the resulting direction. Numerical simulations illustrate the convergence and robustness properties of the proposed algorithm. △ Less

Submitted 11 March, 2021; v1 submitted 8 November, 2020; originally announced November 2020.

Comments: 12 pages, 3 figures

arXiv:2009.14004 [pdf, other]

doi 10.1017/S0963548321000328

On the mixing time of coordinate Hit-and-Run

Authors: Hariharan Narayanan, Piyush Srivastava

Abstract: We obtain a polynomial upper bound on the mixing time $T_{CHR}(ε)$ of the coordinate Hit-and-Run random walk on an $n-$dimensional convex body, where $T_{CHR}(ε)$ is the number of steps needed in order to reach within $ε$ of the uniform distribution with respect to the total variation distance, starting from a warm start (i.e., a distribution which has a density with respect to the uniform distrib… ▽ More We obtain a polynomial upper bound on the mixing time $T_{CHR}(ε)$ of the coordinate Hit-and-Run random walk on an $n-$dimensional convex body, where $T_{CHR}(ε)$ is the number of steps needed in order to reach within $ε$ of the uniform distribution with respect to the total variation distance, starting from a warm start (i.e., a distribution which has a density with respect to the uniform distribution on the convex body that is bounded above by a constant). Our upper bound is polynomial in $n, R$ and $\frac{1}ε$, where we assume that the convex body contains the unit $\Vert\cdot\Vert_\infty$-unit ball $B_\infty$ and is contained in its $R$-dilation $R\cdot B_\infty$. Whether coordinate Hit-and-Run has a polynomial mixing time has been an open question. △ Less

Submitted 11 April, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: Close to the final published version

Journal ref: Combinatorics, Probability and Computing, 31(2), 320-332, 2022

arXiv:2008.04849 [pdf, other]

doi 10.1007/s41745-020-00211-3

City-Scale Agent-Based Simulators for the Study of Non-Pharmaceutical Interventions in the Context of the COVID-19 Epidemic

Authors: Shubhada Agrawal, Siddharth Bhandari, Anirban Bhattacharjee, Anand Deo, Narendra M. Dixit, Prahladh Harsha, Sandeep Juneja, Poonam Kesarwani, Aditya Krishna Swamy, Preetam Patil, Nihesh Rathod, Ramprasad Saptharishi, Sharad Shriram, Piyush Srivastava, Rajesh Sundaresan, Nidhin Koshy Vaidhiyan, Sarath Yasodharan

Abstract: We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool… ▽ More We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool kit in our march towards digital health. △ Less

Submitted 11 August, 2020; originally announced August 2020.

Comments: 56 pages

Journal ref: Journal of the Indian Institute of Science, volume 100, pages 809-847, 2020

arXiv:2008.02731 [pdf, other]

Analysing Risk of Coronary Heart Disease through Discriminative Neural Networks

Authors: Ayush Khaneja, Siddharth Srivastava, Astha Rai, A S Cheema, P K Srivastava

Abstract: The application of data mining, machine learning and artificial intelligence techniques in the field of diagnostics is not a new concept, and these techniques have been very successfully applied in a variety of applications, especially in dermatology and cancer research. But, in the case of medical problems that involve tests resulting in true or false (binary classification), the data generally h… ▽ More The application of data mining, machine learning and artificial intelligence techniques in the field of diagnostics is not a new concept, and these techniques have been very successfully applied in a variety of applications, especially in dermatology and cancer research. But, in the case of medical problems that involve tests resulting in true or false (binary classification), the data generally has a class imbalance with samples majorly belonging to one class (ex: a patient undergoes a regular test and the results are false). Such disparity in data causes problems when trying to model predictive systems on the data. In critical applications like diagnostics, this class imbalance cannot be overlooked and must be given extra attention. In our research, we depict how we can handle this class imbalance through neural networks using a discriminative model and contrastive loss using a Siamese neural network structure. Such a model does not work on a probability-based approach to classify samples into labels. Instead it uses a distance-based approach to differentiate between samples classified under different labels. The code is available at https://tinyurl.com/DiscriminativeCHD/ △ Less

Submitted 17 June, 2020; originally announced August 2020.

arXiv:2007.07971 [pdf, other]

Frequency Regulation with Heterogeneous Energy Resources: A Realization using Distributed Control

Authors: Tor Anderson, Manasa Muralidharan, Priyank Srivastava, Hamed Valizadeh Haghi, Jorge Cortes, Jan Kleissl, Sonia Martinez, Byron Washom

Abstract: This paper presents one of the first real-life demonstrations of coordinated and distributed resource control for secondary frequency response in a power distribution grid. We conduct a series of tests with up to 69 heterogeneous active devices consisting of air handling units, unidirectional and bidirectional electric vehicle charging stations, a battery energy storage system, and 107 passive dev… ▽ More This paper presents one of the first real-life demonstrations of coordinated and distributed resource control for secondary frequency response in a power distribution grid. We conduct a series of tests with up to 69 heterogeneous active devices consisting of air handling units, unidirectional and bidirectional electric vehicle charging stations, a battery energy storage system, and 107 passive devices consisting of building loads and photovoltaic generators. Actuation commands for the test devices are obtained by solving an economic dispatch problem at every regulation instant using distributed ratio-consensus, primal-dual, and Newton-like algorithms. The distributed control setup consists of a set of Raspberry Pi end-points exchanging messages via an ethernet switch. The problem formulation minimizes the sum of device costs while tracking the setpoints provided by the system operator. We demonstrate accurate and fast real-time distributed computation of the optimization solution and effective tracking of the regulation signal by measuring physical device outputs over 40-minute time horizons. We also perform an economic benefit analysis which confirms eligibility to participate in an ancillary services market and demonstrates up to $53K of potential annual revenue for the selected population of devices. △ Less

Submitted 4 February, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

arXiv:2006.08266 [pdf, other]

The PSPACE-hardness of understanding neural circuits

Authors: Vidya Sagar Sharma, Piyush Srivastava

Abstract: In neuroscience, an important aspect of understanding the function of a neural circuit is to determine which, if any, of the neurons in the circuit are vital for the biological behavior governed by the neural circuit. A similar problem is to determine whether a given small set of neurons may be enough for the behavior to be displayed, even if all other neurons in the circuit are deactivated. Such… ▽ More In neuroscience, an important aspect of understanding the function of a neural circuit is to determine which, if any, of the neurons in the circuit are vital for the biological behavior governed by the neural circuit. A similar problem is to determine whether a given small set of neurons may be enough for the behavior to be displayed, even if all other neurons in the circuit are deactivated. Such a subset of neurons forms what is called a degenerate circuit for the behavior being studied. Recent advances in experimental techniques have provided researchers with tools to activate and deactivate subsets of neurons with a very high resolution, even in living animals. The data collected from such experiments may be of the following form: when a given subset of neurons is deactivated, is the behavior under study observed? This setting leads to the algorithmic question of determining the minimal vital or degenerate sets of neurons when one is given as input a description of the neural circuit. The algorithmic problem entails both figuring out which subsets of neurons should be perturbed (activated/deactivated), and then using the data from those perturbations to determine the minimal vital or degenerate sets. Given the large number of possible perturbations, and the recurrent nature of neural circuits, the possibility of a combinatorial explosion in such an approach has been recognized in the biology and the neuroscience literature. In this paper, we prove that the problems of finding minimal or minimum-size degenerate sets, and of finding the set of vital neurons, of a neural circuit given as input, are in fact PSPACE-hard. More importantly, we prove our hardness results by showing that a simpler problem, that of simulating such neural circuits, is itself PSPACE-hard. △ Less

Submitted 22 June, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

Comments: 2 figures

arXiv:2006.03375 [pdf, other]

COVID-19 Epidemic Study II: Phased Emergence From the Lockdown in Mumbai

Authors: Prahladh Harsha, Sandeep Juneja, Preetam Patil, Nihesh Rathod, Ramprasad Saptharishi, A. Y. Sarath, Sharad Sriram, Piyush Srivastava, Rajesh Sundaresan, Nidhin Koshy Vaidhiyan

Abstract: The nation-wide lockdown starting 25 March 2020, aimed at suppressing the spread of the COVID-19 disease, was extended until 31 May 2020 in three subsequent orders by the Government of India. The extended lockdown has had significant social and economic consequences and `lockdown fatigue' has likely set in. Phased reopening began from 01 June 2020 onwards. Mumbai, one of the most crowded cities in… ▽ More The nation-wide lockdown starting 25 March 2020, aimed at suppressing the spread of the COVID-19 disease, was extended until 31 May 2020 in three subsequent orders by the Government of India. The extended lockdown has had significant social and economic consequences and `lockdown fatigue' has likely set in. Phased reopening began from 01 June 2020 onwards. Mumbai, one of the most crowded cities in the world, has witnessed both the largest number of cases and deaths among all the cities in India (41986 positive cases and 1368 deaths as of 02 June 2020). Many tough decisions are going to be made on re-opening in the next few days. In an earlier IISc-TIFR Report, we presented an agent-based city-scale simulator(ABCS) to model the progression and spread of the infection in large metropolises like Mumbai and Bengaluru. As discussed in IISc-TIFR Report 1, ABCS is a useful tool to model interactions of city residents at an individual level and to capture the impact of non-pharmaceutical interventions on the infection spread. In this report we focus on Mumbai. Using our simulator, we consider some plausible scenarios for phased emergence of Mumbai from the lockdown, 01 June 2020 onwards. These include phased and gradual opening of the industry, partial opening of public transportation (modelling of infection spread in suburban trains), impact of containment zones on controlling infections, and the role of compliance with respect to various intervention measures including use of masks, case isolation, home quarantine, etc. The main takeaway of our simulation results is that a phased opening of workplaces, say at a conservative attendance level of 20 to 33\%, is a good way to restart economic activity while ensuring that the city's medical care capacity remains adequate to handle the possible rise in the number of COVID-19 patients in June and July. △ Less

Submitted 5 June, 2020; originally announced June 2020.

Comments: 34 pages

arXiv:1912.07546 [pdf, other]

A Robust Spectral Clustering Algorithm for Sub-Gaussian Mixture Models with Outliers

Authors: Prateek R. Srivastava, Purnamrita Sarkar, Grani A. Hanasusanto

Abstract: We consider the problem of clustering datasets in the presence of arbitrary outliers. Traditional clustering algorithms such as k-means and spectral clustering are known to perform poorly for datasets contaminated with even a small number of outliers. In this paper, we develop a provably robust spectral clustering algorithm that applies a simple rounding scheme to denoise a Gaussian kernel matrix… ▽ More We consider the problem of clustering datasets in the presence of arbitrary outliers. Traditional clustering algorithms such as k-means and spectral clustering are known to perform poorly for datasets contaminated with even a small number of outliers. In this paper, we develop a provably robust spectral clustering algorithm that applies a simple rounding scheme to denoise a Gaussian kernel matrix built from the data points and uses vanilla spectral clustering to recover the cluster labels of data points. We analyze the performance of our algorithm under the assumption that the "good" data points are generated from a mixture of sub-gaussians (we term these "inliers"), while the outlier points can come from any arbitrary probability distribution. For this general class of models, we show that the misclassification error decays at an exponential rate in the signal-to-noise ratio, provided the number of outliers is a small fraction of the inlier points. Surprisingly, this derived error bound matches with the best-known bound for semidefinite programs (SDPs) under the same setting without outliers. We conduct extensive experiments on a variety of simulated and real-world datasets to demonstrate that our algorithm is less sensitive to outliers compared to other state-of-the-art algorithms proposed in the literature. △ Less

Submitted 31 January, 2021; v1 submitted 16 December, 2019; originally announced December 2019.

Comments: 54 pages, 8 figures

arXiv:1906.01228 [pdf, other]

doi 10.1137/20M1317384

Correlation decay and partition function zeros: Algorithms and phase transitions

Authors: Jingcheng Liu, Alistair Sinclair, Piyush Srivastava

Abstract: We explore connections between the phenomenon of correlation decay and the location of Lee-Yang and Fisher zeros for various spin systems. In particular we show that, in many instances, proofs showing that weak spatial mixing on the Bethe lattice (infinite $Δ$-regular tree) implies strong spatial mixing on all graphs of maximum degree $Δ$ can be lifted to the complex plane, establishing the absenc… ▽ More We explore connections between the phenomenon of correlation decay and the location of Lee-Yang and Fisher zeros for various spin systems. In particular we show that, in many instances, proofs showing that weak spatial mixing on the Bethe lattice (infinite $Δ$-regular tree) implies strong spatial mixing on all graphs of maximum degree $Δ$ can be lifted to the complex plane, establishing the absence of zeros of the associated partition function in a complex neighborhood of the region in parameter space corresponding to strong spatial mixing. This allows us to give unified proofs of several recent results of this kind, including the resolution by Peters and Regts of the Sokal conjecture for the partition function of the hard core lattice gas. It also allows us to prove new results on the location of Lee-Yang zeros of the anti-ferromagnetic Ising model. We show further that our methods extend to the case when weak spatial mixing on the Bethe lattice is not known to be equivalent to strong spatial mixing on all graphs. In particular, we show that results on strong spatial mixing in the anti-ferromagnetic Potts model can be lifted to the complex plane to give new zero-freeness results for the associated partition function. This extension allows us to give the first deterministic FPTAS for counting the number of $q$-colorings of a graph of maximum degree $Δ$ provided only that $q\ge 2Δ$. This matches the natural bound for randomized algorithms obtained by a straightforward application of Markov chain Monte Carlo. We also give an improved version of this result for triangle-free graphs. △ Less

Submitted 30 March, 2022; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: Incorporated reviewers' comments, and fixed typos

arXiv:1807.06577 [pdf, other]

doi 10.1063/1.5082552

Fisher zeros and correlation decay in the Ising model

Authors: Jingcheng Liu, Alistair Sinclair, Piyush Srivastava

Abstract: We study the complex zeros of the partition function of the Ising model, viewed as a polynomial in the "interaction parameter"; these are known as Fisher zeros in light of their introduction by Fisher in 1965. While the zeros of the partition function as a polynomial in the "field" parameter have been extensively studied since the classical work of Lee and Yang, comparatively little is known about… ▽ More We study the complex zeros of the partition function of the Ising model, viewed as a polynomial in the "interaction parameter"; these are known as Fisher zeros in light of their introduction by Fisher in 1965. While the zeros of the partition function as a polynomial in the "field" parameter have been extensively studied since the classical work of Lee and Yang, comparatively little is known about Fisher zeros for general graphs. Our main result shows that the zero-field Ising model has no Fisher zeros in a complex neighborhood of the entire region of parameters where the model exhibits correlation decay. In addition to shedding light on Fisher zeros themselves, this result also establishes a formal connection between two distinct notions of phase transition for the Ising model: the absence of complex zeros (analyticity of the free energy, or the logarithm of the partition function) and decay of correlations with distance. We also discuss the consequences of our result for efficient deterministic approximation of the partition function. Our proof relies heavily on algorithmic techniques, notably Weitz's self-avoiding walk tree, and as such belongs to a growing body of work that uses algorithmic methods to resolve classical questions in statistical physics. △ Less

Submitted 25 November, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

Comments: 21 pages, 1 figure

arXiv:1708.00897 [pdf, other]

Domain Aware Neural Dialog System

Authors: Sajal Choudhary, Prerna Srivastava, Lyle Ungar, João Sedoc

Abstract: We investigate the task of building a domain aware chat system which generates intelligent responses in a conversation comprising of different domains. The domain, in this case, is the topic or theme of the conversation. To achieve this, we present DOM-Seq2Seq, a domain aware neural network model based on the novel technique of using domain-targeted sequence-to-sequence models (Sutskever et al., 2… ▽ More We investigate the task of building a domain aware chat system which generates intelligent responses in a conversation comprising of different domains. The domain, in this case, is the topic or theme of the conversation. To achieve this, we present DOM-Seq2Seq, a domain aware neural network model based on the novel technique of using domain-targeted sequence-to-sequence models (Sutskever et al., 2014) and a domain classifier. The model captures features from current utterance and domains of the previous utterances to facilitate the formation of relevant responses. We evaluate our model on automatic metrics and compare our performance with the Seq2Seq model. △ Less

Submitted 2 August, 2017; originally announced August 2017.

arXiv:1707.05199 [pdf, other]

doi 10.1109/TIT.2019.2919632

Online codes for analog signals

Authors: Leonard J. Schulman, Piyush Srivastava

Abstract: This paper revisits a classical scenario in communication theory: a waveform sampled at regular intervals is to be encoded so as to minimize distortion in its reconstruction, despite noise. This transformation must be online (causal), to enable real-time signaling; and should use no more power than the original signal. The noise model we consider is an "atomic norm" convex relaxation of the standa… ▽ More This paper revisits a classical scenario in communication theory: a waveform sampled at regular intervals is to be encoded so as to minimize distortion in its reconstruction, despite noise. This transformation must be online (causal), to enable real-time signaling; and should use no more power than the original signal. The noise model we consider is an "atomic norm" convex relaxation of the standard (discrete alphabet) Hamming-weight-bounded model: namely, adversarial $\ell_1$-bounded. In the "block coding" (noncausal) setting, such encoding is possible due to the existence of large almost-Euclidean sections in $\ell_1$ spaces, a notion first studied in the work of Dvoretzky in 1961. Our main result is that an analogous result is achievable even causally. Equivalently, our work may be seen as a "lower triangular" version of $\ell_1$ Dvoretzky theorems. In terms of communication, the guarantees are expressed in terms of certain time-weighted norms: the time-weighted $\ell_2$ norm imposed on the decoder forces increasingly accurate reconstruction of the distant past signal, while the time-weighted $\ell_1$ norm on the noise ensures vanishing interference from distant past noise. Encoding is linear (hence easy to implement in analog hardware). Decoding is performed by an LP analogous to those used in compressed sensing. △ Less

Submitted 1 June, 2019; v1 submitted 17 July, 2017; originally announced July 2017.

Journal ref: IEEE Trans. Inf. Theory. "Early access", 2019. DOI: 10.1109/TIT.2019.2919632

arXiv:1704.06493 [pdf, other]

doi 10.1007/s10955-018-2199-2

The Ising Partition Function: Zeros and Deterministic Approximation

Authors: Jingcheng Liu, Alistair Sinclair, Piyush Srivastava

Abstract: We study the problem of approximating the partition function of the ferromagnetic Ising model in graphs and hypergraphs. Our first result is a deterministic approximation scheme (an FPTAS) for the partition function in bounded degree graphs that is valid over the entire range of parameters $β$ (the interaction) and $λ$ (the external field), except for the case $\vertλ\vert=1$ (the "zero-field" cas… ▽ More We study the problem of approximating the partition function of the ferromagnetic Ising model in graphs and hypergraphs. Our first result is a deterministic approximation scheme (an FPTAS) for the partition function in bounded degree graphs that is valid over the entire range of parameters $β$ (the interaction) and $λ$ (the external field), except for the case $\vertλ\vert=1$ (the "zero-field" case). A randomized algorithm (FPRAS) for all graphs, and all $β,λ$, has long been known. Unlike most other deterministic approximation algorithms for problems in statistical physics and counting, our algorithm does not rely on the "decay of correlations" property. Rather, we exploit and extend machinery developed recently by Barvinok, and Patel and Regts, based on the location of the complex zeros of the partition function, which can be seen as an algorithmic realization of the classical Lee-Yang approach to phase transitions. Our approach extends to the more general setting of the Ising model on hypergraphs of bounded degree and edge size, where no previous algorithms (even randomized) were known for a wide range of parameters. In order to achieve this extension, we establish a tight version of the Lee-Yang theorem for the Ising model on hypergraphs, improving a classical result of Suzuki and Fisher. △ Less

Submitted 28 June, 2018; v1 submitted 21 April, 2017; originally announced April 2017.

Comments: clarified presentation of combinatorial arguments, added new results on optimality of univariate Lee-Yang theorems

Showing 1–50 of 60 results for author: Srivastava, P