subscribe to arXiv mailings

AI Agents That Matter

Authors: Sayash Kapoor, Benedikt Stroebl, Zachary S. Siegel, Nitya Nadgir, Arvind Narayanan

Abstract: AI agents are an exciting new research direction, and agent development is driven by benchmarks. Our analysis of current agent benchmarks and evaluation practices reveals several shortcomings that hinder their usefulness in real-world applications. First, there is a narrow focus on accuracy without attention to other metrics. As a result, SOTA agents are needlessly complex and costly, and the comm… ▽ More AI agents are an exciting new research direction, and agent development is driven by benchmarks. Our analysis of current agent benchmarks and evaluation practices reveals several shortcomings that hinder their usefulness in real-world applications. First, there is a narrow focus on accuracy without attention to other metrics. As a result, SOTA agents are needlessly complex and costly, and the community has reached mistaken conclusions about the sources of accuracy gains. Our focus on cost in addition to accuracy motivates the new goal of jointly optimizing the two metrics. We design and implement one such optimization, showing its potential to greatly reduce cost while maintaining accuracy. Second, the benchmarking needs of model and downstream developers have been conflated, making it hard to identify which agent would be best suited for a particular application. Third, many agent benchmarks have inadequate holdout sets, and sometimes none at all. This has led to agents that are fragile because they take shortcuts and overfit to the benchmark in various ways. We prescribe a principled framework for avoiding overfitting. Finally, there is a lack of standardization in evaluation practices, leading to a pervasive lack of reproducibility. We hope that the steps we introduce for addressing these shortcomings will spur the development of agents that are useful in the real world and not just accurate on benchmarks. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.19670 [pdf, other]

doi 10.1145/3664646.3664759

Function+Data Flow: A Framework to Specify Machine Learning Pipelines for Digital Twinning

Authors: Eduardo de Conto, Blaise Genest, Arvind Easwaran

Abstract: The development of digital twins (DTs) for physical systems increasingly leverages artificial intelligence (AI), particularly for combining data from different sources or for creating computationally efficient, reduced-dimension models. Indeed, even in very different application domains, twinning employs common techniques such as model order reduction and modelization with hybrid data (that is, da… ▽ More The development of digital twins (DTs) for physical systems increasingly leverages artificial intelligence (AI), particularly for combining data from different sources or for creating computationally efficient, reduced-dimension models. Indeed, even in very different application domains, twinning employs common techniques such as model order reduction and modelization with hybrid data (that is, data sourced from both physics-based models and sensors). Despite this apparent generality, current development practices are ad-hoc, making the design of AI pipelines for digital twinning complex and time-consuming. Here we propose Function+Data Flow (FDF), a domain-specific language (DSL) to describe AI pipelines within DTs. FDF aims to facilitate the design and validation of digital twins. Specifically, FDF treats functions as first-class citizens, enabling effective manipulation of models learned with AI. We illustrate the benefits of FDF on two concrete use cases from different domains: predicting the plastic strain of a structure and modeling the electromagnetic behavior of a bearing. △ Less

Submitted 8 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

Comments: 9 pages, 10 figures, to be published in AIware'24

arXiv:2406.16746 [pdf, other]

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,… ▽ More Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation, frameworks, guides, and practical tools) that support informed data selection, processing, and understanding, precise and limitation-aware artifact documentation, efficient model training, advance awareness of the environmental impact from training, careful model evaluation of capabilities, risks, and claims, as well as responsible model release, licensing and deployment practices. We hope this curated collection of resources helps guide more responsible development. The process of curating this list, enabled us to review the AI development ecosystem, revealing what tools are critically missing, misused, or over-used in existing practices. We find that (i) tools for data sourcing, model evaluation, and monitoring are critically under-serving ethical and real-world needs, (ii) evaluations for model safety, capabilities, and environmental impact all lack reproducibility and transparency, (iii) text and particularly English-centric analyses continue to dominate over multilingual and multi-modal analyses, and (iv) evaluation of systems, rather than just models, is needed so that capabilities and impact are assessed in context. △ Less

Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.14657 [pdf, other]

OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset

Authors: Allen Roush, Yusuf Shabazz, Arvind Balaji, Peter Zhang, Stefano Mezza, Markus Zhang, Sanjay Basu, Sriram Vishwanath, Mehdi Fatemi, Ravid Shwartz-Ziv

Abstract: We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence. OpenDebateEvidence captures the complexity of arguments in high school and college debates, providing valuable r… ▽ More We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence. OpenDebateEvidence captures the complexity of arguments in high school and college debates, providing valuable resources for training and evaluation. Our extensive experiments demonstrate the efficacy of fine-tuning state-of-the-art large language models for argumentative abstractive summarization across various methods, models, and datasets. By providing this comprehensive resource, we aim to advance computational argumentation and support practical applications for debaters, educators, and researchers. OpenDebateEvidence is publicly available to support further research and innovation in computational argumentation. Access it here: https://huggingface.co/datasets/Yusuf5/OpenCaselist △ Less

Submitted 5 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted for Publication to ARGMIN 2024 at ACL2024

arXiv:2406.09351 [pdf, ps, other]

On the Expressibility of the Reconstructional Color Refinement

Authors: V. Arvind, Johannes Köbler, Oleg Verbitsky

Abstract: One of the most basic facts related to the famous Ulam reconstruction conjecture is that the connectedness of a graph can be determined by the deck of its vertex-deleted subgraphs, which are considered up to isomorphism. We strengthen this result by proving that connectedness can still be determined when the subgraphs in the deck are given up to equivalence under the color refinement isomorphism t… ▽ More One of the most basic facts related to the famous Ulam reconstruction conjecture is that the connectedness of a graph can be determined by the deck of its vertex-deleted subgraphs, which are considered up to isomorphism. We strengthen this result by proving that connectedness can still be determined when the subgraphs in the deck are given up to equivalence under the color refinement isomorphism test. Consequently, this implies that connectedness is recognizable by Reconstruction Graph Neural Networks, a recently introduced GNN architecture inspired by the reconstruction conjecture (Cotta, Morris, Ribeiro 2021). △ Less

Submitted 13 June, 2024; originally announced June 2024.

Comments: 9 pages

arXiv:2405.19524 [pdf, other]

AI Risk Management Should Incorporate Both Safety and Security

Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this pape… ▽ More The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this paper, we advocate that stakeholders in AI risk management should be aware of the nuances, synergies, and interplay between safety and security, and unambiguously take into account the perspectives of both disciplines in order to devise mostly effective and holistic risk mitigation approaches. Unfortunately, this vision is often obfuscated, as the definitions of the basic concepts of "safety" and "security" themselves are often inconsistent and lack consensus across communities. With AI risk management being increasingly cross-disciplinary, this issue is particularly salient. In light of this conceptual challenge, we introduce a unified reference framework to clarify the differences and interplay between AI safety and AI security, aiming to facilitate a shared understanding and effective collaboration across communities. △ Less

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.07264 [pdf, other]

Information Rates Over Multi-View Channels

Authors: V. Arvind Rameshwar, Nir Weinberger

Abstract: We investigate the fundamental limits of reliable communication over multi-view channels, in which the channel output is comprised of a large number of independent noisy views of a transmitted symbol. We consider first the setting of multi-view discrete memoryless channels and then extend our results to general multi-view channels (using multi-letter formulas). We argue that the channel capacity a… ▽ More We investigate the fundamental limits of reliable communication over multi-view channels, in which the channel output is comprised of a large number of independent noisy views of a transmitted symbol. We consider first the setting of multi-view discrete memoryless channels and then extend our results to general multi-view channels (using multi-letter formulas). We argue that the channel capacity and dispersion of such multi-view channels converge exponentially fast in the number of views to the entropy and varentropy of the input distribution, respectively. We identify the exact rate of convergence as the smallest Chernoff information between two conditional distributions of the output, conditioned on unequal inputs. For the special case of the deletion channel, we compute upper bounds on this Chernoff information. Finally, we present a new channel model we term the Poisson approximation channel -- of possible independent interest -- whose capacity closely approximates the capacity of the multi-view binary symmetric channel for any fixed number of views. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: 33 pages, 1 figure, submitted to the IEEE

arXiv:2405.06261 [pdf, other]

Improving the Privacy Loss Under User-Level DP Composition for Fixed Estimation Error

Authors: V. Arvind Rameshwar, Anshoo Tandon

Abstract: This paper considers the private release of statistics of several disjoint subsets of a datasets, under user-level $ε$-differential privacy (DP). In particular, we consider the user-level differentially private release of sample means and variances of speed values in several grids in a city, in a potentially sequential manner. Traditional analysis of the privacy loss due to the sequential composit… ▽ More This paper considers the private release of statistics of several disjoint subsets of a datasets, under user-level $ε$-differential privacy (DP). In particular, we consider the user-level differentially private release of sample means and variances of speed values in several grids in a city, in a potentially sequential manner. Traditional analysis of the privacy loss due to the sequential composition of queries necessitates a privacy loss degradation by a factor that equals the total number of grids. Our main contribution is an iterative, instance-dependent algorithm, based on clipping the number of user contributions, which seeks to reduce the overall privacy loss degradation under a canonical Laplace mechanism, while not increasing the {worst} estimation error among the different grids. We test the performance of our algorithm on synthetic datasets and demonstrate improvements in the privacy loss degradation factor via our algorithm. We also demonstrate improvements in the worst-case error using a simple extension of a pseudo-user creation-based mechanism. An important component of this analysis is our exact characterization of the sensitivities and the worst-case estimation errors of sample means and variances incurred by clipping user contributions in an arbitrary fashion, which we believe is of independent interest. △ Less

Submitted 10 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures, to be submitted to the ACM

arXiv:2405.05530 [pdf, other]

NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi

Abstract: Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for c… ▽ More Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for contactless newborn anthropometry in the community. We propose NurtureNet, a multi-task model that fuses visual information (a video taken with a low-cost smartphone) with tabular inputs to regress multiple anthropometry estimates including weight, length, head circumference, and chest circumference. We show that visual proxy tasks of segmentation and keypoint prediction further improve performance. We establish the efficacy of the model through several experiments and achieve a relative error of 3.9% and mean absolute error of 114.3 g for weight estimation. Model compression to 15 MB also allows offline deployment to low-cost smartphones. △ Less

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted at CVPM Workshop at CVPR 2024

arXiv:2405.05506 [pdf, other]

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

Authors: Shan Chen, Jack Gallifant, Mingye Gao, Pedro Moreira, Nikolaj Munch, Ajay Muthukkumar, Arvind Rajan, Jaya Kolluri, Amelia Fiske, Janna Hastings, Hugo Aerts, Brian Anthony, Leo Anthony Celi, William G. La Cava, Danielle S. Bitterman

Abstract: Large language models (LLMs) are increasingly essential in processing natural languages, yet their application is frequently compromised by biases and inaccuracies originating in their training data. In this study, we introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real world knowledge in LLMs, specifically focusing on the representation of disease prevalence… ▽ More Large language models (LLMs) are increasingly essential in processing natural languages, yet their application is frequently compromised by biases and inaccuracies originating in their training data. In this study, we introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real world knowledge in LLMs, specifically focusing on the representation of disease prevalence across diverse demographic groups. We systematically evaluate how demographic biases embedded in pre-training corpora like $ThePile$ influence the outputs of LLMs. We expose and quantify discrepancies by juxtaposing these biases against actual disease prevalences in various U.S. demographic groups. Our results highlight substantial misalignment between LLM representation of disease prevalence and real disease prevalence rates across demographic subgroups, indicating a pronounced risk of bias propagation and a lack of real-world grounding for medical applications of LLMs. Furthermore, we observe that various alignment methods minimally resolve inconsistencies in the models' representation of disease prevalence across different languages. For further exploration and analysis, we make all data and a data visualization tool available at: www.crosscare.net. △ Less

Submitted 24 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: Submitted for review, data visualization tool available at: www.crosscare.net

arXiv:2404.19109 [pdf, other]

The Shape of Money Laundering: Subgraph Representation Learning on the Blockchain with the Elliptic2 Dataset

Authors: Claudio Bellei, Muhua Xu, Ross Phillips, Tom Robinson, Mark Weber, Tim Kaler, Charles E. Leiserson, Arvind, Jie Chen

Abstract: Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks. Enabled by recent developments in scalable Graph Neural Networks (GNNs), this approach encodes relational information at a subgroup level (multiple connected nodes) rather than at a node level of abstraction. We posit that certain domain applications, such as anti-money laundering (A… ▽ More Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks. Enabled by recent developments in scalable Graph Neural Networks (GNNs), this approach encodes relational information at a subgroup level (multiple connected nodes) rather than at a node level of abstraction. We posit that certain domain applications, such as anti-money laundering (AML), are inherently subgraph problems and mainstream graph techniques have been operating at a suboptimal level of abstraction. This is due in part to the scarcity of annotated datasets of real-world size and complexity, as well as the lack of software tools for managing subgraph GNN workflows at scale. To enable work in fundamental algorithms as well as domain applications in AML and beyond, we introduce Elliptic2, a large graph dataset containing 122K labeled subgraphs of Bitcoin clusters within a background graph consisting of 49M node clusters and 196M edge transactions. The dataset provides subgraphs known to be linked to illicit activity for learning the set of "shapes" that money laundering exhibits in cryptocurrency and accurately classifying new criminal activity. Along with the dataset we share our graph techniques, software tooling, promising early experimental results, and new domain insights already gleaned from this approach. Taken together, we find immediate practical value in this approach and the potential for a new standard in anti-money laundering and forensic analytics in cryptocurrencies and other financial networks. △ Less

Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16382 [pdf, ps, other]

A Multivariate to Bivariate Reduction for Noncommutative Rank and Related Results

Authors: Vikraman Arvind, Pushkar S Joglekar

Abstract: We study the noncommutative rank problem, ncRANK, of computing the rank of matrices with linear entries in $n$ noncommuting variables and the problem of noncommutative Rational Identity Testing, RIT, which is to decide if a given rational formula in $n$ noncommuting variables is zero on its domain of definition. Motivated by the question whether these problems have deterministic NC algorithms, we… ▽ More We study the noncommutative rank problem, ncRANK, of computing the rank of matrices with linear entries in $n$ noncommuting variables and the problem of noncommutative Rational Identity Testing, RIT, which is to decide if a given rational formula in $n$ noncommuting variables is zero on its domain of definition. Motivated by the question whether these problems have deterministic NC algorithms, we revisit their interrelationship from a parallel complexity point of view. We show the following results: 1. Based on Cohn's embedding theorem \cite{Co90,Cohnfir} we show deterministic NC reductions from multivariate ncRANK to bivariate ncRANK and from multivariate RIT to bivariate RIT. 2. We obtain a deterministic NC-Turing reduction from bivariate $\RIT$ to bivariate ncRANK, thereby proving that a deterministic NC algorithm for bivariate ncRANK would imply that both multivariate RIT and multivariate ncRANK are in deterministic NC. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: 31 pages

arXiv:2404.10732 [pdf, other]

Attention-Aware Visualization: Tracking and Responding to User Perception Over Time

Authors: Arvind Srinivasan, Johannes Ellemose, Peter W. S. Butcher, Panagiotis D. Ritsos, Niklas Elmqvist

Abstract: We propose the notion of Attention-Aware Visualizations (AAVs) that track the user's perception of a visual representation over time and feed this information back to the visualization. Such context awareness is particularly useful for ubiquitous and immersive analytics where knowing which embedded visualizations the user is looking at can be used to make visualizations react appropriately to the… ▽ More We propose the notion of Attention-Aware Visualizations (AAVs) that track the user's perception of a visual representation over time and feed this information back to the visualization. Such context awareness is particularly useful for ubiquitous and immersive analytics where knowing which embedded visualizations the user is looking at can be used to make visualizations react appropriately to the user's attention: for example, by highlighting data the user has not yet seen. We can separate the approach into three components: (1) measuring the user's gaze on a visualization and its parts; (2) tracking the user's attention over time; and (3) reactively modifying the visual representation based on the current attention metric. In this paper, we present two separate implementations of AAV: a 2D data-agnostic method for web-based visualizations that can use an embodied eyetracker to capture the user's gaze, and a 3D data-aware one that uses the stencil buffer to track the visibility of each individual mark in a visualization. Both methods provide similar mechanisms for accumulating attention over time and changing the appearance of marks in response. We also present results from a qualitative evaluation studying visual feedback and triggering mechanisms for capturing and revisualizing attention. △ Less

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.10085

The Average Spectrum Norm and Near-Optimal Tensor Completion

Authors: Oscar López, Richard Lehoucq, Carlos Llosa-Vite, Arvind Prasadan, Daniel M. Dunlavy

Abstract: We introduce a new tensor norm, the average spectrum norm, to study sample complexity of tensor completion problems based on the canonical polyadic decomposition (CPD). Properties of the average spectrum norm and its dual norm are investigated, demonstrating their utility for low-rank tensor recovery analysis. Our novel approach significantly reduces the provable sample rate for CPD-based noisy te… ▽ More We introduce a new tensor norm, the average spectrum norm, to study sample complexity of tensor completion problems based on the canonical polyadic decomposition (CPD). Properties of the average spectrum norm and its dual norm are investigated, demonstrating their utility for low-rank tensor recovery analysis. Our novel approach significantly reduces the provable sample rate for CPD-based noisy tensor completion, providing the best bounds to date on the number of observed noisy entries required to produce an arbitrarily accurate estimate of an underlying mean value tensor. Under Poisson and Bernoulli multivariate distributions, we show that an $N$-way CPD rank-$R$ parametric tensor $\boldsymbol{\mathscr{M}}\in\mathbb{R}^{I\times \cdots\times I}$ generating noisy observations can be approximated by large likelihood estimators from $\mathcal{O}(IR^2\log^{N+2}(I))$ revealed entries. Furthermore, under nonnegative and orthogonal versions of the CPD we improve the result to depend linearly on the rank, achieving the near-optimal rate $\mathcal{O}(IR\log^{N+2}(I))$. △ Less

Submitted 17 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: Error, in Section 2.1.2

arXiv:2404.09091 [pdf, other]

Semantic In-Domain Product Identification for Search Queries

Authors: Sanat Sharma, Jayant Kumar, Twisha Naik, Zhaoyu Lu, Arvind Srikantan, Tracy Holloway King

Abstract: Accurate explicit and implicit product identification in search queries is critical for enhancing user experiences, especially at a company like Adobe which has over 50 products and covers queries across hundreds of tools. In this work, we present a novel approach to training a product classifier from user behavioral data. Our semantic model led to >25% relative improvement in CTR (click through r… ▽ More Accurate explicit and implicit product identification in search queries is critical for enhancing user experiences, especially at a company like Adobe which has over 50 products and covers queries across hundreds of tools. In this work, we present a novel approach to training a product classifier from user behavioral data. Our semantic model led to >25% relative improvement in CTR (click through rate) across the deployed surfaces; a >50% decrease in null rate; a 2x increase in the app cards surfaced, which helps drive product visibility. △ Less

Submitted 29 May, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.07986 [pdf, ps, other]

Trading Determinism for Noncommutativity in Edmonds' Problem

Authors: V. Arvind, Abhranil Chatterjee, Partha Mukhopadhyay

Abstract: Let $X=X_1\sqcup X_2\sqcup\ldots\sqcup X_k$ be a partitioned set of variables such that the variables in each part $X_i$ are noncommuting but for any $i\neq j$, the variables $x\in X_i$ commute with the variables $x'\in X_j$. Given as input a square matrix $T$ whose entries are linear forms over $\mathbb{Q}\langle{X}\rangle$, we consider the problem of checking if $T$ is invertible or not over the… ▽ More Let $X=X_1\sqcup X_2\sqcup\ldots\sqcup X_k$ be a partitioned set of variables such that the variables in each part $X_i$ are noncommuting but for any $i\neq j$, the variables $x\in X_i$ commute with the variables $x'\in X_j$. Given as input a square matrix $T$ whose entries are linear forms over $\mathbb{Q}\langle{X}\rangle$, we consider the problem of checking if $T$ is invertible or not over the universal skew field of fractions of the partially commutative polynomial ring $\mathbb{Q}\langle{X}\rangle$ [Klep-Vinnikov-Volcic (2020)]. In this paper, we design a deterministic polynomial-time algorithm for this problem for constant $k$. The special case $k=1$ is the noncommutative Edmonds' problem (NSINGULAR) which has a deterministic polynomial-time algorithm by recent results [Garg-Gurvits-Oliveira-Wigderson (2016), Ivanyos-Qiao-Subrahmanyam (2018), Hamada-Hirai (2021)]. En-route, we obtain the first deterministic polynomial-time algorithm for the equivalence testing problem of $k$-tape \emph{weighted} automata (for constant $k$) resolving a long-standing open problem [Harju and Karhum"{a}ki(1991), Worrell (2013)]. Algebraically, the equivalence problem reduces to testing whether a partially commutative rational series over the partitioned set $X$ is zero or not [Worrell (2013)]. Decidability of this problem was established by Harju and Karhumäki (1991). Prior to this work, a \emph{randomized} polynomial-time algorithm for this problem was given by Worrell (2013) and, subsequently, a deterministic quasipolynomial-time algorithm was also developed [Arvind et al. (2021)]. △ Less

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.00487 [pdf, other]

doi 10.1145/3613905.3650767

Contextual AI Journaling: Integrating LLM and Time Series Behavioral Sensing Technology to Promote Self-Reflection and Well-being using the MindScape App

Authors: Subigya Nepal, Arvind Pillai, William Campbell, Talie Massachi, Eunsol Soul Choi, Orson Xu, Joanna Kuc, Jeremy Huckins, Jason Holden, Colin Depp, Nicholas Jacobson, Mary Czerwinski, Eric Granholm, Andrew T. Campbell

Abstract: MindScape aims to study the benefits of integrating time series behavioral patterns (e.g., conversational engagement, sleep, location) with Large Language Models (LLMs) to create a new form of contextual AI journaling, promoting self-reflection and well-being. We argue that integrating behavioral sensing in LLMs will likely lead to a new frontier in AI. In this Late-Breaking Work paper, we discuss… ▽ More MindScape aims to study the benefits of integrating time series behavioral patterns (e.g., conversational engagement, sleep, location) with Large Language Models (LLMs) to create a new form of contextual AI journaling, promoting self-reflection and well-being. We argue that integrating behavioral sensing in LLMs will likely lead to a new frontier in AI. In this Late-Breaking Work paper, we discuss the MindScape contextual journal App design that uses LLMs and behavioral sensing to generate contextual and personalized journaling prompts crafted to encourage self-reflection and emotional development. We also discuss the MindScape study of college students based on a preliminary user study and our upcoming study to assess the effectiveness of contextual AI journaling in promoting better well-being on college campuses. MindScape represents a new application class that embeds behavioral intelligence in AI. △ Less

Submitted 30 March, 2024; originally announced April 2024.

ACM Class: H.5.0; H.5.3; H.5.m; J.0

arXiv:2403.17277 [pdf, other]

Relational Network Verification

Authors: Xieyang Xu, Yifei Yuan, Zachary Kincaid, Arvind Krishnamurthy, Ratul Mahajan, David Walker, Ennan Zhai

Abstract: Relational network verification is a new approach to validating network changes. In contrast to traditional network verification, which analyzes specifications for a single network snapshot, relational network verification analyzes specifications concerning two network snapshots (e.g., pre- and post-change snapshots) and captures their similarities and differences. Relational change specifications… ▽ More Relational network verification is a new approach to validating network changes. In contrast to traditional network verification, which analyzes specifications for a single network snapshot, relational network verification analyzes specifications concerning two network snapshots (e.g., pre- and post-change snapshots) and captures their similarities and differences. Relational change specifications are compact and precise because they specify the flows or paths that change between snapshots and then simply mandate that other behaviors of the network "stay the same", without enumerating them. To achieve similar guarantees, single-snapshot specifications need to enumerate all flow and path behaviors that are not expected to change, so we can check that nothing has accidentally changed. Thus, precise single-snapshot specifications are proportional to network size, which makes them impractical to generate for many real-world networks. To demonstrate the value of relational reasoning, we develop a high-level relational specification language and a tool called Rela to validate network changes. Rela first compiles input specifications and network snapshot representations to finite state transducers. It then checks compliance using decision procedures for automaton equivalence. Our experiments using data on complex changes to a global backbone (with over 10^3 routers) find that Rela specifications need fewer than 10 terms for 93% of them and it validates 80% of them within 20 minutes. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.13411 [pdf, other]

Optimal Fixed Priority Scheduling in Multi-Stage Multi-Resource Distributed Real-Time Systems

Authors: Niraj Kumar, Chuanchao Gao, Arvind Easwaran

Abstract: This work studies fixed priority (FP) scheduling of real-time jobs with end-to-end deadlines in a distributed system. Specifically, given a multi-stage pipeline with multiple heterogeneous resources of the same type at each stage, the problem is to assign priorities to a set of real-time jobs with different release times to access a resource at each stage of the pipeline subject to the end-to-end… ▽ More This work studies fixed priority (FP) scheduling of real-time jobs with end-to-end deadlines in a distributed system. Specifically, given a multi-stage pipeline with multiple heterogeneous resources of the same type at each stage, the problem is to assign priorities to a set of real-time jobs with different release times to access a resource at each stage of the pipeline subject to the end-to-end deadline constraints. Note, in such a system, jobs may compete with different sets of jobs at different stages of the pipeline depending on the job-to-resource mapping. To this end, following are the two major contributions of this work. We show that an OPA-compatible schedulability test based on the delay composition algebra can be constructed, which we then use with an optimal priority assignment algorithm to compute a priority ordering. Further, we establish the versatility of pairwise priority assignment in such a multi-stage multi-resource system, compared to a total priority ordering. In particular, we show that a pairwise priority assignment may be feasible even if a priority ordering does not exist. We propose an integer linear programming formulation and a scalable heuristic to compute a pairwise priority assignment. We also show through simulation experiments that the proposed approaches can be used for the holistic scheduling of real-time jobs in edge computing systems. △ Less

Submitted 20 March, 2024; originally announced March 2024.

Comments: Accepted in DATE (Design, Automation and Test in Europe Conference) 2024

arXiv:2403.11411 [pdf, other]

Laconic: Streamlined Load Balancers for SmartNICs

Authors: Tianyi Cui, Chenxingyu Zhao, Wei Zhang, Kaiyuan Zhang, Arvind Krishnamurthy

Abstract: Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on improving their efficiency by implementing Layer-4 load-balancing logic within the kernel or using hardware acceleration. This work explores whether the more comp… ▽ More Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on improving their efficiency by implementing Layer-4 load-balancing logic within the kernel or using hardware acceleration. This work explores whether the more complex and connection-oriented Layer-7 load-balancing capability can also benefit from hardware acceleration. In particular, we target the offloading of load-balancing capability onto programmable SmartNICs. We fully leverage the cost and energy efficiency of SmartNICs using three key ideas. First, we argue that a full and complex TCP/IP stack is not required for Layer-7 load balancers and instead propose a lightweight forwarding agent on the SmartNIC. Second, we develop connection management data structures with a high degree of concurrency with minimal synchronization when executed on multi-core SmartNICs. Finally, we describe how the load-balancing logic could be accelerated using custom packet-processing accelerators on SmartNICs. We prototype Laconic on two types of SmartNIC hardware, achieving over 150 Gbps throughput using all cores on BlueField-2, while a single SmartNIC core achieves 8.7x higher throughput and comparable latency to Nginx on a single x86 core. △ Less

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.10401 [pdf, other]

SculptDiff: Learning Robotic Clay Sculpting from Humans with Goal Conditioned Diffusion Policy

Authors: Alison Bartsch, Arvind Car, Charlotte Avra, Amir Barati Farimani

Abstract: Manipulating deformable objects remains a challenge within robotics due to the difficulties of state estimation, long-horizon planning, and predicting how the object will deform given an interaction. These challenges are the most pronounced with 3D deformable objects. We propose SculptDiff, a goal-conditioned diffusion-based imitation learning framework that works with point cloud state observatio… ▽ More Manipulating deformable objects remains a challenge within robotics due to the difficulties of state estimation, long-horizon planning, and predicting how the object will deform given an interaction. These challenges are the most pronounced with 3D deformable objects. We propose SculptDiff, a goal-conditioned diffusion-based imitation learning framework that works with point cloud state observations to directly learn clay sculpting policies for a variety of target shapes. To the best of our knowledge this is the first real-world method that successfully learns manipulation policies for 3D deformable objects. For sculpting videos and access to our dataset and hardware CAD models, see the project website: https://sites.google.com/andrew.cmu.edu/imitation-sculpting/home △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.07918 [pdf, other]

On the Societal Impact of Open Foundation Models

Authors: Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan

Abstract: Foundation models are powerful technologies: how they are released publicly directly shapes their societal impact. In this position paper, we focus on open foundation models, defined here as those with broadly available model weights (e.g. Llama 2, Stable Diffusion XL). We identify five distinctive properties (e.g. greater customizability, poor monitoring) of open foundation models that lead to bo… ▽ More Foundation models are powerful technologies: how they are released publicly directly shapes their societal impact. In this position paper, we focus on open foundation models, defined here as those with broadly available model weights (e.g. Llama 2, Stable Diffusion XL). We identify five distinctive properties (e.g. greater customizability, poor monitoring) of open foundation models that lead to both their benefits and risks. Open foundation models present significant benefits, with some caveats, that span innovation, competition, the distribution of decision-making power, and transparency. To understand their risks of misuse, we design a risk assessment framework for analyzing their marginal risk. Across several misuse vectors (e.g. cyberattacks, bioweapons), we find that current research is insufficient to effectively characterize the marginal risk of open foundation models relative to pre-existing technologies. The framework helps explain why the marginal risk is low in some cases, clarifies disagreements about misuse risks by revealing that past work has focused on different subsets of the framework with different assumptions, and articulates a way forward for more constructive debate. Overall, our work helps support a more grounded assessment of the societal impact of open foundation models by outlining what research is needed to empirically validate their theoretical benefits and risks. △ Less

Submitted 27 February, 2024; originally announced March 2024.

arXiv:2403.05893 [pdf, other]

Estimating the Weight Enumerators of Reed-Muller Codes via Sampling

Authors: Shreyas Jain, V. Arvind Rameshwar, Navin Kashyap

Abstract: This paper develops an algorithmic approach for obtaining estimates of the weight enumerators of Reed-Muller (RM) codes. Our algorithm is based on a technique for estimating the partition functions of spin systems, which in turn employs a sampler that produces codewords according to a suitably defined Gibbs distribution. We apply our method to moderate-blocklength RM codes and derive approximate v… ▽ More This paper develops an algorithmic approach for obtaining estimates of the weight enumerators of Reed-Muller (RM) codes. Our algorithm is based on a technique for estimating the partition functions of spin systems, which in turn employs a sampler that produces codewords according to a suitably defined Gibbs distribution. We apply our method to moderate-blocklength RM codes and derive approximate values of their weight enumerators. We observe that the rates of the weight enumerator estimates returned by our method are close to the true rates when these rates are either known or computable by brute-force search; in other cases, our computations provide provably robust estimates. As a byproduct, our sampling algorithm also allows us to obtain estimates of the weight spectra of RM codes. We illustrate our methods by providing estimates of the hitherto unknown weight enumerators of the RM$(11,5)$ code and the exact weight spectra of the RM$(10,3)$ and RM$(10,4)$ codes. △ Less

Submitted 9 March, 2024; originally announced March 2024.

Comments: 8 pages, 1 figure, 4 tables; submitted to the IEEE for possible publication. arXiv admin note: substantial text overlap with arXiv:2309.08907

arXiv:2403.04893 [pdf, other]

A Safe Harbor for AI Evaluation and Red Teaming

Authors: Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

Abstract: Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse have disincentives on good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensio… ▽ More Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse have disincentives on good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal. Although some companies offer researcher access programs, they are an inadequate substitute for independent research access, as they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspensions or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests, without exacerbating model misuse. We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI. △ Less

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.00106 [pdf, other]

doi 10.1145/3613904.3641996.

Umwelt: Accessible Structured Editing of Multimodal Data Representations

Authors: Jonathan Zong, Isabella Pedraza Pineros, Mengzhu Katie Chen, Daniel Hajas, Arvind Satyanarayan

Abstract: We present Umwelt, an authoring environment for interactive multimodal data representations. In contrast to prior approaches, which center the visual modality, Umwelt treats visualization, sonification, and textual description as coequal representations: they are all derived from a shared abstract data model, such that no modality is prioritized over the others. To simplify specification, Umwelt e… ▽ More We present Umwelt, an authoring environment for interactive multimodal data representations. In contrast to prior approaches, which center the visual modality, Umwelt treats visualization, sonification, and textual description as coequal representations: they are all derived from a shared abstract data model, such that no modality is prioritized over the others. To simplify specification, Umwelt evaluates a set of heuristics to generate default multimodal representations that express a dataset's functional relationships. To support smoothly moving between representations, Umwelt maintains a shared query predicated that is reified across all modalities -- for instance, navigating the textual description also highlights the visualization and filters the sonification. In a study with 5 blind / low-vision expert users, we found that Umwelt's multimodal representations afforded complementary overview and detailed perspectives on a dataset, allowing participants to fluidly shift between task- and representation-oriented ways of thinking. △ Less

Submitted 3 March, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

Comments: ACM CHI 2024

arXiv:2402.16268 [pdf, other]

Foundation Model Transparency Reports

Authors: Rishi Bommasani, Kevin Klyman, Shayne Longpre, Betty Xiong, Sayash Kapoor, Nestor Maslej, Arvind Narayanan, Percy Liang

Abstract: Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how foundation model developers should provide transparency about the development and deployment of their models, we propose Foundation Model Transparency Reports, drawing upon the transparency reporting practices in social media. While external documentation of societal harm… ▽ More Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how foundation model developers should provide transparency about the development and deployment of their models, we propose Foundation Model Transparency Reports, drawing upon the transparency reporting practices in social media. While external documentation of societal harms prompted social media transparency reports, our objective is to institutionalize transparency reporting for foundation models while the industry is still nascent. To design our reports, we identify 6 design principles given the successes and shortcomings of social media transparency reporting. To further schematize our reports, we draw upon the 100 transparency indicators from the Foundation Model Transparency Index. Given these indicators, we measure the extent to which they overlap with the transparency requirements included in six prominent government policies (e.g., the EU AI Act, the US Executive Order on Safe, Secure, and Trustworthy AI). Well-designed transparency reports could reduce compliance costs, in part due to overlapping regulatory requirements across different jurisdictions. We encourage foundation model developers to regularly publish transparency reports, building upon recommendations from the G7 and the White House. △ Less

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.16182 [pdf, other]

doi 10.1145/3613904.3642680

MoodCapture: Depression Detection Using In-the-Wild Smartphone Images

Authors: Subigya Nepal, Arvind Pillai, Weichen Wang, Tess Griffin, Amanda C. Collins, Michael Heinz, Damien Lekkas, Shayan Mirjafari, Matthew Nemesure, George Price, Nicholas C. Jacobson, Andrew T. Campbell

Abstract: MoodCapture presents a novel approach that assesses depression based on images automatically captured from the front-facing camera of smartphones as people go about their daily lives. We collect over 125,000 photos in the wild from N=177 participants diagnosed with major depressive disorder for 90 days. Images are captured naturalistically while participants respond to the PHQ-8 depression survey… ▽ More MoodCapture presents a novel approach that assesses depression based on images automatically captured from the front-facing camera of smartphones as people go about their daily lives. We collect over 125,000 photos in the wild from N=177 participants diagnosed with major depressive disorder for 90 days. Images are captured naturalistically while participants respond to the PHQ-8 depression survey question: \textit{``I have felt down, depressed, or hopeless''}. Our analysis explores important image attributes, such as angle, dominant colors, location, objects, and lighting. We show that a random forest trained with face landmarks can classify samples as depressed or non-depressed and predict raw PHQ-8 scores effectively. Our post-hoc analysis provides several insights through an ablation study, feature importance analysis, and bias assessment. Importantly, we evaluate user concerns about using MoodCapture to detect depression based on sharing photos, providing critical insights into privacy concerns that inform the future design of in-the-wild image-based mental health assessment tools. △ Less

Submitted 25 February, 2024; originally announced February 2024.

ACM Class: H.5.0; H.5.3; H.5.m; J.0

arXiv:2402.06787 [pdf, other]

ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics

Authors: Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang, Arvind Krishnamurthy

Abstract: As modern DNN models grow ever larger, collective communications between the accelerators (allreduce, etc.) emerge as a significant performance bottleneck. Designing efficient communication schedules is challenging given today's highly diverse and heterogeneous network fabrics. In this paper, we present ForestColl, a tool that generates efficient schedules for any network topology. ForestColl cons… ▽ More As modern DNN models grow ever larger, collective communications between the accelerators (allreduce, etc.) emerge as a significant performance bottleneck. Designing efficient communication schedules is challenging given today's highly diverse and heterogeneous network fabrics. In this paper, we present ForestColl, a tool that generates efficient schedules for any network topology. ForestColl constructs broadcast/aggregation spanning trees as the communication schedule, achieving theoretically minimum network congestion. Its schedule generation runs in strongly polynomial time and is highly scalable. ForestColl supports any network fabrics, including both switching fabrics and direct connections, as well as any network graph structure. We evaluated ForestColl on multi-cluster AMD MI250 and NVIDIA A100 platforms. ForestColl's schedules achieved up to 52\% higher performance compared to the vendors' own optimized communication libraries, RCCL and NCCL. ForestColl also outperforms other state-of-the-art schedule generation techniques with both up to 61\% more efficient generated schedules and orders of magnitude faster schedule generation speed. △ Less

Submitted 9 February, 2024; originally announced February 2024.

Comments: arXiv admin note: text overlap with arXiv:2305.18461

arXiv:2402.01656 [pdf, other]

Promises and pitfalls of artificial intelligence for legal applications

Authors: Sayash Kapoor, Peter Henderson, Arvind Narayanan

Abstract: Is AI set to redefine the legal profession? We argue that this claim is not supported by the current evidence. We dive into AI's increasingly prevalent roles in three types of legal tasks: information processing; tasks involving creativity, reasoning, or judgment; and predictions about the future. We find that the ease of evaluating legal applications varies greatly across legal tasks, based on th… ▽ More Is AI set to redefine the legal profession? We argue that this claim is not supported by the current evidence. We dive into AI's increasingly prevalent roles in three types of legal tasks: information processing; tasks involving creativity, reasoning, or judgment; and predictions about the future. We find that the ease of evaluating legal applications varies greatly across legal tasks, based on the ease of identifying correct answers and the observability of information relevant to the task at hand. Tasks that would lead to the most significant changes to the legal profession are also the ones most prone to overoptimism about AI capabilities, as they are harder to evaluate. We make recommendations for better evaluation and deployment of AI in legal contexts. △ Less

Submitted 10 January, 2024; originally announced February 2024.

arXiv:2401.15906 [pdf, other]

Mean Estimation with User-Level Privacy for Spatio-Temporal IoT Datasets

Authors: V. Arvind Rameshwar, Anshoo Tandon, Prajjwal Gupta, Aditya Vikram Singh, Novoneel Chakraborty, Abhay Sharma

Abstract: This paper considers the problem of the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms that incorporate carefully chosen parameter values to ensure low estimation errors on real-world datasets, while ensuring privacy. We test our algorithms on ITMS (Intelligent Traffic Management System)… ▽ More This paper considers the problem of the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms that incorporate carefully chosen parameter values to ensure low estimation errors on real-world datasets, while ensuring privacy. We test our algorithms on ITMS (Intelligent Traffic Management System) data from an Indian city, where the speeds of different buses are drawn in a potentially non-i.i.d. manner from an unknown distribution, and where the number of speed samples contributed by different buses is potentially different. We then apply our algorithms to large synthetic datasets, generated based on the ITMS data. Here, we provide theoretical justification for the observed performance trends, and also provide recommendations for the choices of algorithm subroutines that result in low estimation errors. Finally, we characterize the best performance of pseudo-user creation-based algorithms on worst-case datasets via a minimax approach; this then gives rise to a novel procedure for the creation of pseudo-users, which optimizes the worst-case total estimation error. The algorithms discussed in the paper are readily applicable to general spatio-temporal IoT datasets for releasing a differentially private mean of a desired value. △ Less

Submitted 25 April, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 14 pages, 5 figures, submitted to the ACM for possible publication

arXiv:2401.11037 [pdf, other]

Equivariant Graph Neural Operator for Modeling 3D Dynamics

Authors: Minkai Xu, Jiaqi Han, Aaron Lou, Jean Kossaifi, Arvind Ramanathan, Kamyar Azizzadenesheli, Jure Leskovec, Stefano Ermon, Anima Anandkumar

Abstract: Modeling the complex three-dimensional (3D) dynamics of relational systems is an important problem in the natural sciences, with applications ranging from molecular simulations to particle mechanics. Machine learning methods have achieved good success by learning graph neural networks to model spatial interactions. However, these approaches do not faithfully capture temporal correlations since the… ▽ More Modeling the complex three-dimensional (3D) dynamics of relational systems is an important problem in the natural sciences, with applications ranging from molecular simulations to particle mechanics. Machine learning methods have achieved good success by learning graph neural networks to model spatial interactions. However, these approaches do not faithfully capture temporal correlations since they only model next-step predictions. In this work, we propose Equivariant Graph Neural Operator (EGNO), a novel and principled method that directly models dynamics as trajectories instead of just next-step prediction. Different from existing methods, EGNO explicitly learns the temporal evolution of 3D dynamics where we formulate the dynamics as a function over time and learn neural operators to approximate it. To capture the temporal correlations while keeping the intrinsic SE(3)-equivariance, we develop equivariant temporal convolutions parameterized in the Fourier space and build EGNO by stacking the Fourier layers over equivariant networks. EGNO is the first operator learning framework that is capable of modeling solution dynamics functions over time while retaining 3D equivariance. Comprehensive experiments in multiple domains, including particle simulations, human motion capture, and molecular dynamics, demonstrate the significantly superior performance of EGNO against existing methods, thanks to the equivariant temporal modeling. Our code is available at https://github.com/MinkaiXu/egno. △ Less

Submitted 2 June, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

arXiv:2311.10302 [pdf, other]

Social Isolation and Serious Mental Illness: The Role of Context-Aware Mobile Interventions

Authors: Subigya Nepal, Arvind Pillai, Emma M. Parrish, Jason Holden, Colin Depp, Andrew T. Campbell, Eric Granholm

Abstract: Social isolation is a common problem faced by individuals with serious mental illness (SMI), and current intervention approaches have limited effectiveness. This paper presents a blended intervention approach, called mobile Social Interaction Therapy by Exposure (mSITE), to address social isolation in individuals with serious mental illness. The approach combines brief in-person cognitive-behavior… ▽ More Social isolation is a common problem faced by individuals with serious mental illness (SMI), and current intervention approaches have limited effectiveness. This paper presents a blended intervention approach, called mobile Social Interaction Therapy by Exposure (mSITE), to address social isolation in individuals with serious mental illness. The approach combines brief in-person cognitive-behavioral therapy (CBT) with context-triggered mobile CBT interventions that are personalized using mobile sensing data. Our approach targets social behavior and is the first context-aware intervention for improving social outcomes in serious mental illness. △ Less

Submitted 16 November, 2023; originally announced November 2023.

ACM Class: H.5.0; H.5.m; J.0; J.3; J.4; J.m

arXiv:2311.07753 [pdf, other]

Bringing Reconfigurability to the Network Stack

Authors: Akshay Narayan, Aurojit Panda, Mohammad Alizadeh, Hari Balakrishnan, Arvind Krishnamurthy, Scott Shenker

Abstract: Reconfiguring the network stack allows applications to specialize the implementations of communication libraries depending on where they run, the requests they serve, and the performance they need to provide. Specializing applications in this way is challenging because developers need to choose the libraries they use when writing a program and cannot easily change them at runtime. This paper intro… ▽ More Reconfiguring the network stack allows applications to specialize the implementations of communication libraries depending on where they run, the requests they serve, and the performance they need to provide. Specializing applications in this way is challenging because developers need to choose the libraries they use when writing a program and cannot easily change them at runtime. This paper introduces Bertha, which allows these choices to be changed at runtime without limiting developer flexibility in the choice of network and communication functions. Bertha allows applications to safely use optimized communication primitives (including ones with deployment limitations) without limiting deployability. Our evaluation shows cases where this results in 16x higher throughput and 63% lower latency than current portable approaches while imposing minimal overheads when compared to a hand-optimized versions that use deployment-specific communication primitives. △ Less

Submitted 13 November, 2023; originally announced November 2023.

Comments: 12 pages, 10 figures

arXiv:2311.02332 [pdf, other]

Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects

Authors: Elisa Warner, Joonsang Lee, William Hsu, Tanveer Syeda-Mahmood, Charles Kahn, Olivier Gevaert, Arvind Rao

Abstract: Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ML, focusing on its profound impact on medical image analysis and clinical decision support systems. Emphasizing challenges and innovations in addressing… ▽ More Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ML, focusing on its profound impact on medical image analysis and clinical decision support systems. Emphasizing challenges and innovations in addressing multimodal representation, fusion, translation, alignment, and co-learning, the paper explores the transformative potential of multimodal models for clinical predictions. It also highlights the need for principled assessments and practical implementation of such models, bringing attention to the dynamics between decision support systems and healthcare providers and personnel. Despite advancements, challenges such as data biases and the scarcity of "big data" in many biomedical domains persist. We conclude with a discussion on principled innovation and collaborative efforts to further the mission of seamless integration of multimodal ML models into biomedical practice. △ Less

Submitted 19 January, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

arXiv:2310.19102 [pdf, other]

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Authors: Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

Abstract: The growing demand for Large Language Models (LLMs) in applications such as content generation, intelligent chatbots, and sentiment analysis poses considerable challenges for LLM service providers. To efficiently use GPU resources and boost throughput, batching multiple requests has emerged as a popular paradigm; to further speed up batching, LLM quantization techniques reduce memory consumption a… ▽ More The growing demand for Large Language Models (LLMs) in applications such as content generation, intelligent chatbots, and sentiment analysis poses considerable challenges for LLM service providers. To efficiently use GPU resources and boost throughput, batching multiple requests has emerged as a popular paradigm; to further speed up batching, LLM quantization techniques reduce memory consumption and increase computing capacity. However, prevalent quantization schemes (e.g., 8-bit weight-activation quantization) cannot fully leverage the capabilities of modern GPUs, such as 4-bit integer operators, resulting in sub-optimal performance. To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss. Atom significantly boosts serving throughput by using low-bit operators and considerably reduces memory consumption via low-bit quantization. It attains high accuracy by applying a novel mixed-precision and fine-grained quantization process. We evaluate Atom on 4-bit weight-activation quantization in the serving context. Atom improves end-to-end throughput (token/s) by up to $7.7\times$ compared to the FP16 and by $2.5\times$ compared to INT8 quantization, while maintaining the same latency target. △ Less

Submitted 16 April, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

arXiv:2310.18547 [pdf, other]

Punica: Multi-Tenant LoRA Serving

Authors: Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

Abstract: Low-rank adaptation (LoRA) has become an important and popular method to adapt pre-trained models to specific domains. We present Punica, a system to serve multiple LoRA models in a shared GPU cluster. Punica contains a new CUDA kernel design that allows batching of GPU operations for different LoRA models. This allows a GPU to hold only a single copy of the underlying pre-trained model when servi… ▽ More Low-rank adaptation (LoRA) has become an important and popular method to adapt pre-trained models to specific domains. We present Punica, a system to serve multiple LoRA models in a shared GPU cluster. Punica contains a new CUDA kernel design that allows batching of GPU operations for different LoRA models. This allows a GPU to hold only a single copy of the underlying pre-trained model when serving multiple, different LoRA models, significantly enhancing GPU efficiency in terms of both memory and computation. Our scheduler consolidates multi-tenant LoRA serving workloads in a shared GPU cluster. With a fixed-sized GPU cluster, our evaluations show that Punica achieves 12x higher throughput in serving multiple LoRA models compared to state-of-the-art LLM serving systems while only adding 2ms latency per token. Punica is open source at https://github.com/punica-ai/punica . △ Less

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2310.04727 [pdf, other]

Task Aware Modulation using Representation Learning: An Approach for Few Shot Learning in Heterogeneous Systems

Authors: Arvind Renganathan, Rahul Ghosh, Ankush Khandelwal, Vipin Kumar

Abstract: We present a Task-aware modulation using Representation Learning (TAM-RL) framework that enhances personalized predictions in few-shot settings for heterogeneous systems when individual task characteristics are not known. TAM-RL extracts embeddings representing the actual inherent characteristics of these entities and uses these characteristics to personalize the predictions for each entity/task.… ▽ More We present a Task-aware modulation using Representation Learning (TAM-RL) framework that enhances personalized predictions in few-shot settings for heterogeneous systems when individual task characteristics are not known. TAM-RL extracts embeddings representing the actual inherent characteristics of these entities and uses these characteristics to personalize the predictions for each entity/task. Using real-world hydrological and flux tower benchmark data sets, we show that TAM-RL can significantly outperform existing baseline approaches such as MAML and multi-modal MAML (MMAML) while being much faster and simpler to train due to less complexity. Specifically, TAM-RL eliminates the need for sensitive hyper-parameters like inner loop steps and inner loop learning rate, which are crucial for model convergence in MAML, MMAML. We further present an empirical evaluation via synthetic data to explore the impact of heterogeneity amongst the entities on the relative performance of MAML, MMAML, and TAM-RL. We show that TAM-RL significantly improves predictive performance for cases where it is possible to learn distinct representations for different tasks. △ Less

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2310.04610 [pdf, other]

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri , et al. (67 additional authors not shown)

Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique… ▽ More In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique capabilities through AI system technology innovations to help domain experts to unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research. △ Less

Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2310.04391 [pdf, ps, other]

On a Hierarchy of Spectral Invariants for Graphs

Authors: V. Arvind, Frank Fuhlbrück, Johannes Köbler, Oleg Verbitsky

Abstract: We consider a hierarchy of graph invariants that naturally extends the spectral invariants defined by Fürer (Lin. Alg. Appl. 2010) based on the angles formed by the set of standard basis vectors and their projections onto eigenspaces of the adjacency matrix. We provide a purely combinatorial characterization of this hierarchy in terms of the walk counts. This allows us to give a complete answer to… ▽ More We consider a hierarchy of graph invariants that naturally extends the spectral invariants defined by Fürer (Lin. Alg. Appl. 2010) based on the angles formed by the set of standard basis vectors and their projections onto eigenspaces of the adjacency matrix. We provide a purely combinatorial characterization of this hierarchy in terms of the walk counts. This allows us to give a complete answer to Fürer's question about the strength of his invariants in distinguishing non-isomorphic graphs in comparison to the 2-dimensional Weisfeiler-Leman algorithm, extending the recent work of Rattan and Seppelt (SODA 2023). As another application of the characterization, we prove that almost all graphs are determined up to isomorphism in terms of the spectrum and the angles, which is of interest in view of the long-standing open problem whether almost all graphs are determined by their eigenvalues alone. Finally, we describe the exact relationship between the hierarchy and the Weisfeiler-Leman algorithms for small dimensions, as also some other important spectral characteristics of a graph such as the generalized and the main spectra. △ Less

Submitted 30 May, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: 32 pages, 1 diagram, 1 figure. A preliminary version of this paper appeared in the Proceedings of the 41st International Symposium on Theoretical Aspects of Computer Science (STACS'24), published in LIPIcs Vol. 289, Schloss Dagstuhl, Leibniz-Zentrum für Informatik, 2024

arXiv:2310.02193 [pdf, other]

Uncertainty Quantification in Inverse Models in Hydrology

Authors: Somya Sharma Chatterjee, Rahul Ghosh, Arvind Renganathan, Xiang Li, Snigdhansu Chatterjee, John Nieber, Christopher Duffy, Vipin Kumar

Abstract: In hydrology, modeling streamflow remains a challenging task due to the limited availability of basin characteristics information such as soil geology and geomorphology. These characteristics may be noisy due to measurement errors or may be missing altogether. To overcome this challenge, we propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics fr… ▽ More In hydrology, modeling streamflow remains a challenging task due to the limited availability of basin characteristics information such as soil geology and geomorphology. These characteristics may be noisy due to measurement errors or may be missing altogether. To overcome this challenge, we propose a knowledge-guided, probabilistic inverse modeling method for recovering physical characteristics from streamflow and weather data, which are more readily available. We compare our framework with state-of-the-art inverse models for estimating river basin characteristics. We also show that these estimates offer improvement in streamflow modeling as opposed to using the original basin characteristic values. Our inverse model offers 3\% improvement in R$^2$ for the inverse model (basin characteristic estimation) and 6\% for the forward model (streamflow prediction). Our framework also offers improved explainability since it can quantify uncertainty in both the inverse and the forward model. Uncertainty quantification plays a pivotal role in improving the explainability of machine learning models by providing additional insights into the reliability and limitations of model predictions. In our analysis, we assess the quality of the uncertainty estimates. Compared to baseline uncertainty quantification methods, our framework offers 10\% improvement in the dispersion of epistemic uncertainty and 13\% improvement in coverage rate. This information can help stakeholders understand the level of uncertainty associated with the predictions and provide a more comprehensive view of the potential outcomes. △ Less

Submitted 3 October, 2023; originally announced October 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2210.06213

arXiv:2309.16882 [pdf, other]

Message Propagation Through Time: An Algorithm for Sequence Dependency Retention in Time Series Modeling

Authors: Shaoming Xu, Ankush Khandelwal, Arvind Renganathan, Vipin Kumar

Abstract: Time series modeling, a crucial area in science, often encounters challenges when training Machine Learning (ML) models like Recurrent Neural Networks (RNNs) using the conventional mini-batch training strategy that assumes independent and identically distributed (IID) samples and initializes RNNs with zero hidden states. The IID assumption ignores temporal dependencies among samples, resulting in… ▽ More Time series modeling, a crucial area in science, often encounters challenges when training Machine Learning (ML) models like Recurrent Neural Networks (RNNs) using the conventional mini-batch training strategy that assumes independent and identically distributed (IID) samples and initializes RNNs with zero hidden states. The IID assumption ignores temporal dependencies among samples, resulting in poor performance. This paper proposes the Message Propagation Through Time (MPTT) algorithm to effectively incorporate long temporal dependencies while preserving faster training times relative to the stateful solutions. MPTT utilizes two memory modules to asynchronously manage initial hidden states for RNNs, fostering seamless information exchange between samples and allowing diverse mini-batches throughout epochs. MPTT further implements three policies to filter outdated and preserve essential information in the hidden states to generate informative initial hidden states for RNNs, facilitating robust training. Experimental results demonstrate that MPTT outperforms seven strategies on four climate datasets with varying levels of temporal dependencies. △ Less

Submitted 28 September, 2023; originally announced September 2023.

arXiv:2309.16654 [pdf, other]

Novel Deep Learning Pipeline for Automatic Weapon Detection

Authors: Haribharathi Sivakumar, Vijay Arvind. R, Pawan Ragavendhar V, G. Balamurugan

Abstract: Weapon and gun violence have recently become a pressing issue today. The degree of these crimes and activities has risen to the point of being termed as an epidemic. This prevalent misuse of weapons calls for an automatic system that detects weapons in real-time. Real-time surveillance video is captured and recorded in almost all public forums and places. These videos contain abundant raw data whi… ▽ More Weapon and gun violence have recently become a pressing issue today. The degree of these crimes and activities has risen to the point of being termed as an epidemic. This prevalent misuse of weapons calls for an automatic system that detects weapons in real-time. Real-time surveillance video is captured and recorded in almost all public forums and places. These videos contain abundant raw data which can be extracted and processed into meaningful information. This paper proposes a novel pipeline consisting of an ensemble of convolutional neural networks with distinct architectures. Each neural network is trained with a unique mini-batch with little to no overlap in the training samples. This paper will present several promising results using multiple datasets associated with comparing the proposed architecture and state-of-the-art (SoA) models. The proposed pipeline produced an average increase of 5% in accuracy, specificity, and recall compared to the SoA systems. △ Less

Submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for presentation at the IEEE 2nd International Conference on Automation, Robotics and Computer Engineering

arXiv:2309.15647 [pdf, ps, other]

Black-Box Identity Testing of Noncommutative Rational Formulas in Deterministic Quasipolynomial Time

Authors: V. Arvind, Abhranil Chatterjee, Partha Mukhopadhyay

Abstract: Rational Identity Testing (RIT) is the decision problem of determining whether or not a noncommutative rational formula computes zero in the free skew field. It admits a deterministic polynomial-time white-box algorithm [Garg, Gurvits, Oliveira, and Wigderson (2016); Ivanyos, Qiao, Subrahmanyam (2018); Hamada and Hirai (2021)], and a randomized polynomial-time algorithm [Derksen and Makam (2017)]… ▽ More Rational Identity Testing (RIT) is the decision problem of determining whether or not a noncommutative rational formula computes zero in the free skew field. It admits a deterministic polynomial-time white-box algorithm [Garg, Gurvits, Oliveira, and Wigderson (2016); Ivanyos, Qiao, Subrahmanyam (2018); Hamada and Hirai (2021)], and a randomized polynomial-time algorithm [Derksen and Makam (2017)] in the black-box setting, via singularity testing of linear matrices over the free skew field. Indeed, a randomized NC algorithm for RIT in the white-box setting follows from the result of Derksen and Makam (2017). Designing an efficient deterministic black-box algorithm for RIT and understanding the parallel complexity of RIT are major open problems in this area. Despite being open since the work of Garg, Gurvits, Oliveira, and Wigderson (2016), these questions have seen limited progress. In fact, the only known result in this direction is the construction of a quasipolynomial-size hitting set for rational formulas of only inversion height two [Arvind, Chatterjee, Mukhopadhyay (2022)]. In this paper, we significantly improve the black-box complexity of this problem and obtain the first quasipolynomial-size hitting set for all rational formulas of polynomial size. Our construction also yields the first deterministic quasi-NC upper bound for RIT in the white-box setting. △ Less

Submitted 6 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.13541 [pdf, other]

Efficient All-to-All Collective Communication Schedules for Direct-Connect Topologies

Authors: Prithwish Basu, Liangyu Zhao, Jason Fantl, Siddharth Pal, Arvind Krishnamurthy, Joud Khoury

Abstract: The all-to-all collective communications primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This paper takes a holistic approach to optimize the perform… ▽ More The all-to-all collective communications primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This paper takes a holistic approach to optimize the performance of all-to-all collective communications on supercomputer-scale direct-connect interconnects. We address several algorithmic and practical challenges in developing efficient and bandwidth-optimal all-to-all schedules for any topology and lowering the schedules to various runtimes and interconnect technologies. We also propose a novel topology that delivers near-optimal all-to-all performance. △ Less

Submitted 25 April, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

Comments: HPDC '24

arXiv:2309.12624 [pdf, other]

Quark: A High-Performance Secure Container Runtime for Serverless Computing

Authors: Chenxingyu Zhao, Yulin Sun, Ying Xiong, Arvind Krishnamurthy

Abstract: Secure container runtimes serve as the foundational layer for creating and running containers, which is the bedrock of emerging computing paradigms like microservices and serverless computing. Although existing secure container runtimes indeed enhance security via running containers over a guest kernel and a Virtual Machine Monitor (VMM or Hypervisor), they incur performance penalties in critical… ▽ More Secure container runtimes serve as the foundational layer for creating and running containers, which is the bedrock of emerging computing paradigms like microservices and serverless computing. Although existing secure container runtimes indeed enhance security via running containers over a guest kernel and a Virtual Machine Monitor (VMM or Hypervisor), they incur performance penalties in critical areas such as networking, container startup, and I/O system calls. In our practice of operating microservices and serverless computing, we build a high-performance secure container runtime named Quark. Unlike existing solutions that rely on traditional VM technologies by importing Linux for the guest kernel and QEMU for the VMM, we take a different approach to building Quark from the ground up, paving the way for extreme customization to unlock high performance. Our development centers on co-designing a custom guest kernel and a VMM for secure containers. To this end, we build a lightweight guest OS kernel named QKernel and a specialized VMM named QVisor. The QKernel-QVisor codesign allows us to deliver three key advancements: high-performance RDMA-based container networking, fast container startup mode, and efficient mechanisms for executing I/O syscalls. In our practice with real-world apps like Redis, Quark cuts down P95 latency by 79.3% and increases throughput by 2.43x compared to Kata. Moreover, Quark container startup achieves 96.5% lower latency than the cold-start mode while saving 81.3% memory cost to the keep-warm mode. Quark is open-source with an industry-standard codebase in Rust. △ Less

Submitted 6 October, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:2305.10621. The paper on arXiv:2305.10621 presents a detailed version of the TSoR module in Quark

arXiv:2309.10291 [pdf, other]

Koopman Invertible Autoencoder: Leveraging Forward and Backward Dynamics for Temporal Modeling

Authors: Kshitij Tayal, Arvind Renganathan, Rahul Ghosh, Xiaowei Jia, Vipin Kumar

Abstract: Accurate long-term predictions are the foundations for many machine learning applications and decision-making processes. However, building accurate long-term prediction models remains challenging due to the limitations of existing temporal models like recurrent neural networks (RNNs), as they capture only the statistical connections in the training data and may fail to learn the underlying dynamic… ▽ More Accurate long-term predictions are the foundations for many machine learning applications and decision-making processes. However, building accurate long-term prediction models remains challenging due to the limitations of existing temporal models like recurrent neural networks (RNNs), as they capture only the statistical connections in the training data and may fail to learn the underlying dynamics of the target system. To tackle this challenge, we propose a novel machine learning model based on Koopman operator theory, which we call Koopman Invertible Autoencoders (KIA), that captures the inherent characteristic of the system by modeling both forward and backward dynamics in the infinite-dimensional Hilbert space. This enables us to efficiently learn low-dimensional representations, resulting in more accurate predictions of long-term system behavior. Moreover, our method's invertibility design guarantees reversibility and consistency in both forward and inverse operations. We illustrate the utility of KIA on pendulum and climate datasets, demonstrating 300% improvements in long-term prediction capability for pendulum while maintaining robustness against noise. Additionally, our method excels in long-term climate prediction, further validating our method's effectiveness. △ Less

Submitted 18 September, 2023; originally announced September 2023.

Comments: Accepted at IEEE International Conference on Data Mining (ICDM) 2023

arXiv:2309.09944 [pdf, other]

DiffusionWorldViewer: Exposing and Broadening the Worldview Reflected by Generative Text-to-Image Models

Authors: Zoe De Simone, Angie Boggust, Arvind Satyanarayan, Ashia Wilson

Abstract: Generative text-to-image (TTI) models produce high-quality images from short textual descriptions and are widely used in academic and creative domains. Like humans, TTI models have a worldview, a conception of the world learned from their training data and task that influences the images they generate for a given prompt. However, the worldviews of TTI models are often hidden from users, making it… ▽ More Generative text-to-image (TTI) models produce high-quality images from short textual descriptions and are widely used in academic and creative domains. Like humans, TTI models have a worldview, a conception of the world learned from their training data and task that influences the images they generate for a given prompt. However, the worldviews of TTI models are often hidden from users, making it challenging for users to build intuition about TTI outputs, and they are often misaligned with users' worldviews, resulting in output images that do not match user expectations. In response, we introduce DiffusionWorldViewer, an interactive interface that exposes a TTI model's worldview across output demographics and provides editing tools for aligning output images with user perspectives. In a user study with 18 diverse TTI users, we find that DiffusionWorldViewer helps users represent their varied viewpoints in generated images and challenge the limited worldview reflected in current TTI models. △ Less

Submitted 5 February, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: 20 pages, 8 figures

arXiv:2309.09191 [pdf, other]

End-to-End Optimized Pipeline for Prediction of Protein Folding Kinetics

Authors: Vijay Arvind. R, Haribharathi Sivakumar, Brindha. R

Abstract: Protein folding is the intricate process by which a linear sequence of amino acids self-assembles into a unique three-dimensional structure. Protein folding kinetics is the study of pathways and time-dependent mechanisms a protein undergoes when it folds. Understanding protein kinetics is essential as a protein needs to fold correctly for it to perform its biological functions optimally, and a mis… ▽ More Protein folding is the intricate process by which a linear sequence of amino acids self-assembles into a unique three-dimensional structure. Protein folding kinetics is the study of pathways and time-dependent mechanisms a protein undergoes when it folds. Understanding protein kinetics is essential as a protein needs to fold correctly for it to perform its biological functions optimally, and a misfolded protein can sometimes be contorted into shapes that are not ideal for a cellular environment giving rise to many degenerative, neuro-degenerative disorders and amyloid diseases. Monitoring at-risk individuals and detecting protein discrepancies in a protein's folding kinetics at the early stages could majorly result in public health benefits, as preventive measures can be taken. This research proposes an efficient pipeline for predicting protein folding kinetics with high accuracy and low memory footprint. The deployed machine learning (ML) model outperformed the state-of-the-art ML models by 4.8% in terms of accuracy while consuming 327x lesser memory and being 7.3% faster. △ Less

Submitted 17 September, 2023; originally announced September 2023.

Comments: Accepted for presentation at the 22nd International Conference on Machine Learning and Applications

arXiv:2309.09175

Imbalanced Data Stream Classification using Dynamic Ensemble Selection

Authors: Priya. S, Haribharathi Sivakumar, Vijay Arvind. R

Abstract: Modern streaming data categorization faces significant challenges from concept drift and class imbalanced data. This negatively impacts the output of the classifier, leading to improper classification. Furthermore, other factors such as the overlapping of multiple classes limit the extent of the correctness of the output. This work proposes a novel framework for integrating data pre-processing and… ▽ More Modern streaming data categorization faces significant challenges from concept drift and class imbalanced data. This negatively impacts the output of the classifier, leading to improper classification. Furthermore, other factors such as the overlapping of multiple classes limit the extent of the correctness of the output. This work proposes a novel framework for integrating data pre-processing and dynamic ensemble selection, by formulating the classification framework for the nonstationary drifting imbalanced data stream, which employs the data pre-processing and dynamic ensemble selection techniques. The proposed framework was evaluated using six artificially generated data streams with differing imbalance ratios in combination with two different types of concept drifts. Each stream is composed of 200 chunks of 500 objects described by eight features and contains five concept drifts. Seven pre-processing techniques and two dynamic ensemble selection methods were considered. According to experimental results, data pre-processing combined with Dynamic Ensemble Selection techniques significantly delivers more accuracy when dealing with imbalanced data streams. △ Less

Submitted 28 September, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: Made an error in the research and need to rectify it

arXiv:2309.08907 [pdf, other]

Sampling-Based Estimates of the Sizes of Constrained Subcodes of Reed-Muller Codes

Authors: V. Arvind Rameshwar, Shreyas Jain, Navin Kashyap

Abstract: This paper develops an algorithmic approach for obtaining approximate, numerical estimates of the sizes of subcodes of Reed-Muller (RM) codes, all of the codewords in which satisfy a given constraint. Our algorithm is based on a statistical physics technique for estimating the partition functions of spin systems, which in turn makes use of a sampler that produces RM codewords according to a Gibbs… ▽ More This paper develops an algorithmic approach for obtaining approximate, numerical estimates of the sizes of subcodes of Reed-Muller (RM) codes, all of the codewords in which satisfy a given constraint. Our algorithm is based on a statistical physics technique for estimating the partition functions of spin systems, which in turn makes use of a sampler that produces RM codewords according to a Gibbs distribution. The Gibbs distribution is designed so that it is biased towards codewords that respect the constraint. We apply our method to approximately compute the sizes of runlength limited (RLL) subcodes and obtain estimates of the weight distribution of moderate-blocklength RM codes. We observe that the estimates returned by our method are close to the true sizes when these sizes are either known or computable by brute-force search; in other cases, our computations provide provably robust estimates. As an illustration of our methods, we provide estimates of the weight distribution of the RM$(9,4)$ code. △ Less

Submitted 19 September, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

Comments: 20 pages, 3 figures, 4 tables, to be submitted to the IEEE for possible publication

Showing 1–50 of 480 results for author: Arvind