Showing 1–50 of 144 results for author: Krishnamurthy, A

  1. arXiv:2407.04180  [pdf, other]

    cs.CV

    Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing

    Authors: Anushrut Jignasu, Kelly O. Marshall, Ankush Kumar Mishra, Lucas Nerone Rillo, Baskar Ganapathysubramanian, Aditya Balu, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: G-code (Geometric code) or RS-274 is the most widely used computer numerical control (CNC) and 3D printing programming language. G-code provides machine instructions for the movement of the 3D printer, especially for the nozzle, stage, and extrusion of material for extrusion-based additive manufacturing. Currently there does not exist a large repository of curated CAD models along with their corre…

    Submitted 11 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Replaced "SLICE-100K" with "Slice-100K", added acknowledgements, and updated main figure to better capture shadows

  2. arXiv:2406.11810  [pdf, ps, other]

    cs.LG cs.RO eess.SY

    Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

    Authors: Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy, Wen Sun

    Abstract: We study computationally and statistically efficient Reinforcement Learning algorithms for the linear Bellman Complete setting, a setting that uses linear function approximation to capture value functions and unifies existing models like linear Markov Decision Processes (MDP) and Linear Quadratic Regulators (LQR). While it is known from the prior works that this setting is statistically tractable,…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2405.21046  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

    Authors: Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin

    Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a central tool for language model alignment. We consider online exploration in RLHF, which exploits interactive access to human or AI feedback by deliberately encouraging the model to produce diverse, maximally informative responses. By allowing RLHF to confidently stray from the pre-trained model, online exploration offers the possi…

    Submitted 31 May, 2024; originally announced May 2024.

  4. arXiv:2405.19269  [pdf, other]

    cs.LG

    Rich-Observation Reinforcement Learning with Continuous Latent Dynamics

    Authors: Yuda Song, Lili Wu, Dylan J. Foster, Akshay Krishnamurthy

    Abstract: Sample-efficiency and reliability remain major bottlenecks toward wide adoption of reinforcement learning algorithms in continuous settings with high-dimensional perceptual inputs. Toward addressing these challenges, we introduce a new theoretical framework, RichCLD (Rich-Observation RL with Continuous Latent Dynamics), in which the agent performs control based on high-dimensional observations, bu…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 63 pages, 4 figures, published at ICML 2024

  5. arXiv:2403.17277  [pdf, other]

    cs.NI

    Relational Network Verification

    Authors: Xieyang Xu, Yifei Yuan, Zachary Kincaid, Arvind Krishnamurthy, Ratul Mahajan, David Walker, Ennan Zhai

    Abstract: Relational network verification is a new approach to validating network changes. In contrast to traditional network verification, which analyzes specifications for a single network snapshot, relational network verification analyzes specifications concerning two network snapshots (e.g., pre- and post-change snapshots) and captures their similarities and differences. Relational change specifications…

    Submitted 25 March, 2024; originally announced March 2024.

  6. arXiv:2403.15371  [pdf, other]

    cs.LG cs.AI cs.CL

    Can large language models explore in-context?

    Authors: Akshay Krishnamurthy, Keegan Harris, Dylan J. Foster, Cyril Zhang, Aleksandrs Slivkins

    Abstract: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. We focus on native performance of existing LLMs, without training interventions. We deploy LLMs as agents in simple multi-armed bandit environments, specifying the environment description and interaction history entirely in-context…

    Submitted 12 July, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: Minor updates, added references to related and concurrent work

  7. arXiv:2403.11411  [pdf, other]

    cs.NI

    Laconic: Streamlined Load Balancers for SmartNICs

    Authors: Tianyi Cui, Chenxingyu Zhao, Wei Zhang, Kaiyuan Zhang, Arvind Krishnamurthy

    Abstract: Load balancers are pervasively used inside today's clouds to scalably distribute network requests across data center servers. Given the extensive use of load balancers and their associated operating costs, several efforts have focused on improving their efficiency by implementing Layer-4 load-balancing logic within the kernel or using hardware acceleration. This work explores whether the more comp…

    Submitted 17 March, 2024; originally announced March 2024.

  8. arXiv:2403.06571  [pdf, other]

    cs.LG math.OC stat.ML

    Scalable Online Exploration via Coverability

    Authors: Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy

    Abstract: Exploration is a major challenge in reinforcement learning, especially for high-dimensional domains that require function approximation. We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration. Within this framework, we introduce a new objective, $L_1$-Coverag…

    Submitted 4 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  9. arXiv:2402.10344  [pdf, other]

    cs.CV

    Evaluating NeRFs for 3D Plant Geometry Reconstruction in Field Conditions

    Authors: Muhammad Arbab Arshad, Talukder Jubery, James Afful, Anushrut Jignasu, Aditya Balu, Baskar Ganapathysubramanian, Soumik Sarkar, Adarsh Krishnamurthy

    Abstract: We evaluate different Neural Radiance Fields (NeRFs) techniques for reconstructing (3D) plants in varied environments, from indoor settings to outdoor fields. Traditional techniques often struggle to capture the complex details of plants, which is crucial for botanical and agricultural understanding. We evaluate three scenarios with increasing complexity and compare the results with the point clou…

    Submitted 15 February, 2024; originally announced February 2024.

  10. arXiv:2402.06787  [pdf, other]

    cs.NI cs.DC cs.LG

    ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics

    Authors: Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang, Arvind Krishnamurthy

    Abstract: As modern DNN models grow ever larger, collective communications between the accelerators (allreduce, etc.) emerge as a significant performance bottleneck. Designing efficient communication schedules is challenging given today's highly diverse and heterogeneous network fabrics. In this paper, we present ForestColl, a tool that generates efficient schedules for any network topology. ForestColl cons…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.18461

  11. arXiv:2402.06091  [pdf, other]

    cs.CV

    Early Fusion of Features for Semantic Segmentation

    Authors: Anupam Gupta, Ashok Krishnamurthy, Lisa Singh

    Abstract: This paper introduces a novel segmentation framework that integrates a classifier network with a reverse HRNet architecture for efficient image segmentation. Our approach utilizes a ResNet-50 backbone, pretrained in a semi-supervised manner, to generate feature maps at various scales. These maps are then processed by a reverse HRNet, which is adapted to handle varying channel dimensions through 1x…

    Submitted 8 February, 2024; originally announced February 2024.

  12. arXiv:2401.12216  [pdf, other]

    stat.ML cs.LG math.OC

    Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning

    Authors: Philip Amortila, Tongyi Cao, Akshay Krishnamurthy

    Abstract: A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ. As distribution shift typically results in a degradation in performance, much attention has been devoted to algorithmic interventions that mitigate these detrimental effects. In this paper, we study the effect of distribution shift in the pres…

    Submitted 22 January, 2024; originally announced January 2024.

  13. arXiv:2311.07753  [pdf, other]

    cs.NI

    Bringing Reconfigurability to the Network Stack

    Authors: Akshay Narayan, Aurojit Panda, Mohammad Alizadeh, Hari Balakrishnan, Arvind Krishnamurthy, Scott Shenker

    Abstract: Reconfiguring the network stack allows applications to specialize the implementations of communication libraries depending on where they run, the requests they serve, and the performance they need to provide. Specializing applications in this way is challenging because developers need to choose the libraries they use when writing a program and cannot easily change them at runtime. This paper intro…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 12 pages, 10 figures

  14. arXiv:2310.19102  [pdf, other]

    cs.LG

    Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

    Authors: Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

    Abstract: The growing demand for Large Language Models (LLMs) in applications such as content generation, intelligent chatbots, and sentiment analysis poses considerable challenges for LLM service providers. To efficiently use GPU resources and boost throughput, batching multiple requests has emerged as a popular paradigm; to further speed up batching, LLM quantization techniques reduce memory consumption a…

    Submitted 16 April, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  15. arXiv:2310.18547  [pdf, other]

    cs.DC cs.LG

    Punica: Multi-Tenant LoRA Serving

    Authors: Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

    Abstract: Low-rank adaptation (LoRA) has become an important and popular method to adapt pre-trained models to specific domains. We present Punica, a system to serve multiple LoRA models in a shared GPU cluster. Punica contains a new CUDA kernel design that allows batching of GPU operations for different LoRA models. This allows a GPU to hold only a single copy of the underlying pre-trained model when servi…

    Submitted 27 October, 2023; originally announced October 2023.

  16. arXiv:2310.11428  [pdf, other]

    cs.LG math.OC stat.ML

    Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

    Authors: Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang

    Abstract: This work studies training instabilities of behavior cloning with deep neural networks. We observe that minibatch SGD updates to the policy network during training result in sharp oscillations in long-horizon rewards, despite negligibly affecting the behavior cloning loss. We empirically disentangle the statistical and computational causes of these oscillations, and find them to stem from the chao…

    Submitted 17 October, 2023; originally announced October 2023.

  17. arXiv:2309.13541  [pdf, other]

    cs.DC cs.NI

    Efficient All-to-All Collective Communication Schedules for Direct-Connect Topologies

    Authors: Prithwish Basu, Liangyu Zhao, Jason Fantl, Siddharth Pal, Arvind Krishnamurthy, Joud Khoury

    Abstract: The all-to-all collective communications primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This paper takes a holistic approach to optimize the perform…

    Submitted 25 April, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: HPDC '24

  18. arXiv:2309.12624  [pdf, other]

    cs.NI

    Quark: A High-Performance Secure Container Runtime for Serverless Computing

    Authors: Chenxingyu Zhao, Yulin Sun, Ying Xiong, Arvind Krishnamurthy

    Abstract: Secure container runtimes serve as the foundational layer for creating and running containers, which is the bedrock of emerging computing paradigms like microservices and serverless computing. Although existing secure container runtimes indeed enhance security via running containers over a guest kernel and a Virtual Machine Monitor (VMM or Hypervisor), they incur performance penalties in critical…

    Submitted 6 October, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: text overlap with arXiv:2305.10621. The paper on arXiv:2305.10621 presents a detailed version of the TSoR module in Quark

  19. arXiv:2309.11601  [pdf, other]

    cs.LG

    Latent Diffusion Models for Structural Component Design

    Authors: Ethan Herron, Jaydeep Rade, Anushrut Jignasu, Baskar Ganapathysubramanian, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy

    Abstract: Recent advances in generative modeling, namely Diffusion models, have revolutionized generative modeling, enabling high-quality image generation tailored to user needs. This paper proposes a framework for the generative design of structural components. Specifically, we employ a Latent Diffusion model to generate potential designs of a component that can satisfy a set of problem-specific loading co…

    Submitted 24 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  20. arXiv:2309.02465  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Towards Foundational AI Models for Additive Manufacturing: Language Models for G-Code Debugging, Manipulation, and Comprehension

    Authors: Anushrut Jignasu, Kelly Marshall, Baskar Ganapathysubramanian, Aditya Balu, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: 3D printing or additive manufacturing is a revolutionary technology that enables the creation of physical objects from digital models. However, the quality and accuracy of 3D printing depend on the correctness and efficiency of the G-code, a low-level numerical control programming language that instructs 3D printers how to move and extrude material. Debugging G-code is a challenging task that requ…

    Submitted 4 September, 2023; originally announced September 2023.

  21. arXiv:2308.07470  [pdf, other]

    cs.DC cs.LG

    Symphony: Optimized DNN Model Serving using Deferred Batch Scheduling

    Authors: Lequn Chen, Weixin Deng, Anirudh Canumalla, Yu Xin, Danyang Zhuo, Matthai Philipose, Arvind Krishnamurthy

    Abstract: Having large batch sizes is one of the most critical aspects of increasing the accelerator efficiency and the performance of DNN model inference. However, existing model serving systems cannot achieve adequate batch sizes while meeting latency objectives as these systems eagerly dispatch requests to accelerators to minimize the accelerator idle time. We propose Symphony, a DNN serving system that…

    Submitted 28 February, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

  22. arXiv:2306.08183  [pdf, other]

    cs.CV

    ZeroForge: Feedforward Text-to-Shape Without 3D Supervision

    Authors: Kelly O. Marshall, Minh Pham, Ameya Joshi, Anushrut Jignasu, Aditya Balu, Adarsh Krishnamurthy, Chinmay Hegde

    Abstract: Current state-of-the-art methods for text-to-shape generation either require supervised training using a labeled dataset of pre-defined 3D shapes, or perform expensive inference-time optimization of implicit neural representations. In this work, we present ZeroForge, an approach for zero-shot text-to-shape generation that avoids both pitfalls. To achieve open-vocabulary shape generation, we requir…

    Submitted 15 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 19 pages, High resolution figures needed to demonstrate 3D results

  23. arXiv:2306.07923  [pdf, other]

    cs.LG

    Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits

    Authors: Lequn Wang, Akshay Krishnamurthy, Aleksandrs Slivkins

    Abstract: We consider offline policy optimization (OPO) in contextual bandits, where one is given a fixed dataset of logged interactions. While pessimistic regularizers are typically used to mitigate distribution shift, prior implementations thereof are either specialized or computationally inefficient. We present the first general oracle-efficient algorithm for pessimistic OPO: it reduces to supervised lea…

    Submitted 25 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

  24. arXiv:2306.00946  [pdf, other]

    cs.LG cs.CL

    Exposing Attention Glitches with Flip-Flop Language Modeling

    Authors: Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

    Abstract: Why do large language models sometimes output factual inaccuracies and exhibit erroneous reasoning? The brittleness of these models, particularly when executing long chains of reasoning, currently seems to be an inevitable price to pay for their advanced capabilities of coherently synthesizing knowledge, pragmatics, and abstract thought. Towards making sense of this fundamentally unsolved problem,…

    Submitted 30 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: v2: NeurIPS 2023 camera-ready + data release

  25. arXiv:2305.18461  [pdf, ps, other]

    cs.NI cs.DC cs.DM cs.LG

    Bandwidth Optimal Pipeline Schedule for Collective Communication

    Authors: Liangyu Zhao, Arvind Krishnamurthy

    Abstract: We present a strongly polynomial-time algorithm to generate bandwidth optimal allgather/reduce-scatter on any network topology, with or without switches. Our algorithm constructs pipeline schedules achieving provably the best possible bandwidth performance on a given topology. To provide a universal solution, we model the network topology as a directed graph with heterogeneous link capacities and…

    Submitted 31 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  26. arXiv:2305.10621  [pdf, other]

    cs.NI cs.DC

    TSoR: TCP Socket over RDMA Container Network for Cloud Native Computing

    Authors: Yulin Sun, Qingming Qu, Chenxingyu Zhao, Arvind Krishnamurthy, Hong Chang, Ying Xiong

    Abstract: Cloud-native containerized applications constantly seek high-performance and easy-to-operate container network solutions. RDMA network is a potential enabler with higher throughput and lower latency than the standard TCP/IP network stack. However, several challenges remain in equipping containerized applications with RDMA network: 1) How to deliver transparent improvements without modifying applic…

    Submitted 17 May, 2023; originally announced May 2023.

  27. arXiv:2305.07120  [pdf, other]

    cs.GR

    Geometric Modeling and Physics Simulation Framework for Building a Digital Twin of Extrusion-based Additive Manufacturing

    Authors: Dhruv Gamdha, Kumar Saurabh, Baskar Ganapathysubramanian, Adarsh Krishnamurthy

    Abstract: Accurate simulation of the printing process is essential for improving print quality, reducing waste, and optimizing the printing parameters of extrusion-based additive manufacturing. Traditional additive manufacturing simulations are very compute-intensive and are not scalable to simulate even moderately-sized geometries. In this paper, we propose a general framework for creating a digital twin o…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 13 pages

  28. arXiv:2303.02535  [pdf, other]

    cs.LG

    Streaming Active Learning with Deep Neural Networks

    Authors: Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash

    Abstract: Active learning is perhaps most naturally posed as an online learning problem. However, prior active learning approaches with deep neural networks assume offline access to the entire dataset ahead of time. This paper proposes VeSSAL, a new algorithm for batch active learning with deep neural networks in streaming settings, which samples groups of points to query for labels at the moment they are e…

    Submitted 6 June, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: ICML 2023

  29. arXiv:2302.14753  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Hidden Markov Models Using Conditional Samples

    Authors: Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang

    Abstract: This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM). Although HMMs are some of the most widely used tools in sequential and time series modeling, they are cryptographically hard to learn in the standard setting where one has access to i.i.d. samples of observation sequences. In this paper, we depart from this setup and consider an interactive access…

    Submitted 24 February, 2024; v1 submitted 28 February, 2023; originally announced February 2023.

  30. arXiv:2302.13934  [pdf, other]

    cs.LG stat.ML

    Statistical Learning under Heterogeneous Distribution Shift

    Authors: Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

    Abstract: This paper studies the prediction of a target $\mathbf{z}$ from a pair of random variables $(\mathbf{x},\mathbf{y})$, where the ground-truth predictor is additive $\mathbb{E}[\mathbf{z} \mid \mathbf{x},\mathbf{y}] = f_\star(\mathbf{x}) +g_{\star}(\mathbf{y})$. We study the performance of empirical risk minimization (ERM) over functions $f+g$, $f \in F$ and $g \in G$, fit on a given training distri…

    Submitted 27 October, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  31. arXiv:2302.09949  [pdf, other]

    cs.LG cs.CV

    SpecXAI -- Spectral interpretability of Deep Learning Models

    Authors: Stefan Druc, Peter Wooldridge, Adarsh Krishnamurthy, Soumik Sarkar, Aditya Balu

    Abstract: Deep learning is becoming increasingly adopted in business and industry due to its ability to transform large quantities of data into high-performing models. These models, however, are generally regarded as black boxes, which, in spite of their performance, could prevent their use. In this context, the field of eXplainable AI attempts to develop techniques that temper the impenetrable nature of th…

    Submitted 20 February, 2023; originally announced February 2023.

  32. arXiv:2211.14662  [pdf, other]

    cs.CV cs.LG eess.IV q-bio.QM

    3D Reconstruction of Protein Complex Structures Using Synthesized Multi-View AFM Images

    Authors: Jaydeep Rade, Soumik Sarkar, Anwesha Sarkar, Adarsh Krishnamurthy

    Abstract: Recent developments in deep learning-based methods demonstrated its potential to predict the 3D protein structures using inputs such as protein sequences, Cryo-Electron microscopy (Cryo-EM) images of proteins, etc. However, these methods struggle to predict the protein complexes (PC), structures with more than one protein. In this work, we explore the atomic force microscope (AFM) assisted deep le…

    Submitted 26 November, 2022; originally announced November 2022.

    Comments: 5 pages, 8 figures, Machine Learning for Structural Biology Workshop, NeurIPS 2022

  33. arXiv:2211.03241  [pdf, other]

    cs.LG math.NA

    Neural PDE Solvers for Irregular Domains

    Authors: Biswajit Khara, Ethan Herron, Zhanhong Jiang, Aditya Balu, Chih-Hsuan Yang, Kumar Saurabh, Anushrut Jignasu, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

    Abstract: Neural network-based approaches for solving partial differential equations (PDEs) have recently received special attention. However, the large majority of neural PDE solvers only apply to rectilinear domains, and do not systematically address the imposition of Dirichlet/Neumann boundary conditions over irregular domain boundaries. In this paper, we present a framework to neurally solve partial dif…

    Submitted 6 November, 2022; originally announced November 2022.

  34. arXiv:2210.10749  [pdf, other]

    cs.LG cs.FL stat.ML

    Transformers Learn Shortcuts to Automata

    Authors: Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

    Abstract: Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer layers than the number of reasoning steps. This raises the question: what solutions are learned by these shallow and non-recurrent models? We find t…

    Submitted 2 May, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

  35. arXiv:2210.06718  [pdf, other]

    cs.LG

    Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

    Authors: Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

    Abstract: We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction. The framework mitigates the challenges that arise in both pure offline and online RL settings, allowing for the design of simple and highly effective algorithms, in both theory and practice. We demonstrate these…

    Submitted 11 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 42 pages, 6 figures. Published at ICLR 2023. Code available at https://github.com/yudasong/HyQ

  36. arXiv:2209.13121  [pdf, other]

    physics.comp-ph cs.MS

    CyRSoXS: A GPU-accelerated virtual instrument for Polarized Resonant Soft X-ray Scattering (P-RSoXS)

    Authors: Kumar Saurabh, Peter J. Dudenas, Eliot Gann, Veronica G. Reynolds, Subhrangsu Mukherjee, Daniel Sunday, Tyler B. Martin, Peter A. Beaucage, Michael L. Chabinyc, Dean M. DeLongchamp, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

    Abstract: Polarized Resonant Soft X-ray scattering (P-RSoXS) has emerged as a powerful synchrotron-based tool that combines principles of X-ray scattering and X-ray spectroscopy. P-RSoXS provides unique sensitivity to molecular orientation and chemical heterogeneity in soft materials such as polymers and biomaterials. Quantitative extraction of orientation information from P-RSoXS pattern data is challengin…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 41 pages, 19 figures

  37. arXiv:2207.08229  [pdf, other]

    cs.LG cs.RO stat.ML

    Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models

    Authors: Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford

    Abstract: In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information. For example, a person walking along a city street who tries to model all aspects of the world would quickly be overwhelmed by a multitude of shops, cars, and people moving in and out of view, each following their own complex…

    Submitted 27 December, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Project Website: https://controllable-latent-state.github.io/

  38. arXiv:2207.00592  [pdf, other]

    cs.DC cs.NI

    Dissecting Service Mesh Overheads

    Authors: Xiangfeng Zhu, Guozhen She, Bowen Xue, Yu Zhang, Yongsu Zhang, Xuan Kelvin Zou, Xiongchun Duan, Peng He, Arvind Krishnamurthy, Matthew Lentz, Danyang Zhuo, Ratul Mahajan

    Abstract: Service meshes play a central role in the modern application ecosystem by providing an easy and flexible way to connect different services that form a distributed application. However, because of the way they interpose on application traffic, they can substantially increase application latency and resource consumption. We develop a decompositional approach and a tool, called MeshInsight, to system…

    Submitted 2 July, 2022; originally announced July 2022.

  39. arXiv:2206.10770  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

    Authors: Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

    Abstract: We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward-Free OLIVE) algorithm for sample-efficient reward-free exploration under minimal structural assumptions, which covers the previously studied settings…

    Submitted 22 October, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

  40. arXiv:2206.04282  [pdf, ps, other]

    cs.LG

    Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

    Authors: Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford

    Abstract: In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand. Learning from high-dimensional observations has been the subject of extensive investigation in supervised learning and statistics (e.g., via sparsity), but analogous issues in reinforcement learning are not well und…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2022

  41. arXiv:2205.02102  [pdf, other]

    cs.CV cs.GR cs.LG

    Concept Activation Vectors for Generating User-Defined 3D Shapes

    Authors: Stefan Druc, Aditya Balu, Peter Wooldridge, Adarsh Krishnamurthy, Soumik Sarkar

    Abstract: We explore the interpretability of 3D geometric deep learning models in the context of Computer-Aided Design (CAD). The field of parametric CAD can be limited by the difficulty of expressing high-level design concepts in terms of a few numeric parameters. In this paper, we use a deep learning architectures to encode high dimensional 3D shapes into a vectorized latent representation that can be use…

    Submitted 29 April, 2022; originally announced May 2022.

  42. arXiv:2203.04236  [pdf, other]

    cs.LG stat.ML

    A Complete Characterization of Linear Estimators for Offline Policy Evaluation

    Authors: Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

    Abstract: Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy. In order to tackle problems with complex, high-dimensional observations, there has been significant interest from theoreticians and practitioners alike in understanding the possibi…

    Submitted 19 December, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: added extensions to misspecified case, comparisons to Bellman residual minimization, 41 pages

  43. arXiv:2203.00410  [pdf, other]

    cs.PF

    Markovian Analysis of Coordination Strategies in Tandem Polling Queues with Setups

    Authors: Ravi Suman, Ananth Krishnamurthy

    Abstract: We analyze a network of tandem polling queues with two stations operating under synchronized polling (SP) and out-of-sync polling (OP) strategies, and with nonzero setups. We conduct an exact analysis using a decomposition approach to compare the performance in terms of throughput and mean waiting times to investigate when one strategy might be preferred over the other. We also numerically investi…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2202.10045

  44. arXiv:2202.14037  [pdf, other]

    cs.LG cs.AI

    Understanding Contrastive Learning Requires Incorporating Inductive Biases

    Authors: Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

    Abstract: Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs. Recent attempts to theoretically explain the success of contrastive learning on downstream classification tasks prove guarantees depending on properties of {\em augmentations} and the value of…

    Submitted 28 February, 2022; originally announced February 2022.

  45. arXiv:2202.10045  [pdf, other]

    cs.PF math.PR

    Analysis of Two-Station Polling Queues with Setups using Continuous Time Markov Chain

    Authors: Ravi Suman, Ananth Krishnamurthy

    Abstract: The paper analyzes the performance of a tandem network of polling queues with setups. For a system with two products and two stations, we propose a new approach based on a partially-collapsible state-space characterization to reduce state-space complexity. In this approach, the size of the state-space is varied depending on the information needed to determine buffer levels and waiting times. We evalu…

    Submitted 21 February, 2022; originally announced February 2022.

  46. arXiv:2202.03983  [pdf, other]

    cs.LG cs.AI

    Provable Reinforcement Learning with a Short-Term Memory

    Authors: Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

    Abstract: Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions. Coping with partial observability in general is extremely challenging, as a number of worst-case statistical and computational barriers are known in learning Partially Observable Markov Dec…

    Submitted 8 February, 2022; originally announced February 2022.

  47. arXiv:2202.03356  [pdf, other]

    cs.NI cs.DC cs.LG

    Efficient Direct-Connect Topologies for Collective Communications

    Authors: Liangyu Zhao, Siddharth Pal, Tapan Chugh, Weiyang Wang, Jason Fantl, Prithwish Basu, Joud Khoury, Arvind Krishnamurthy

    Abstract: We consider the problem of distilling efficient network topologies for collective communications. We provide an algorithmic framework for constructing direct-connect topologies optimized for the latency vs. bandwidth trade-off associated with the workload. Our approach synthesizes many different topologies and schedules for a given cluster size and degree and then identifies the appropriate topolo…

    Submitted 12 May, 2024; v1 submitted 7 February, 2022; originally announced February 2022.

  48. arXiv:2111.12306  [pdf, ps, other]

    cs.LG

    Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability

    Authors: Aadirupa Saha, Akshay Krishnamurthy

    Abstract: We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes \emph{preference-based feedback} suggesting that one decision was better than the other. We focus on the regret minimization problem under realizability, where the feedback is generated by a pairwise preference matr…

    Submitted 24 November, 2021; originally announced November 2021.

  49. arXiv:2111.10919  [pdf, other]

    cs.LG stat.ML

    Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation

    Authors: Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

    Abstract: We consider the offline reinforcement learning problem, where the aim is to learn a decision making policy from logged data. Offline RL -- particularly when coupled with (value) function approximation to allow for generalization in large or continuous state spaces -- is becoming increasingly relevant in practice, because it avoids costly and time-consuming online data collection and is well suited…

    Submitted 30 August, 2022; v1 submitted 21 November, 2021; originally announced November 2021.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2022

  50. arXiv:2111.09353  [pdf, other]

    cs.DC cs.CE cs.CY

    Case study of SARS-CoV-2 transmission risk assessment in indoor environments using cloud computing resources

    Authors: Kumar Saurabh, Santi Adavani, Kendrick Tan, Masado Ishii, Boshun Gao, Adarsh Krishnamurthy, Hari Sundar, Baskar Ganapathysubramanian

    Abstract: Complex flow simulations are conventionally performed on HPC clusters. However, the limited availability of HPC resources and the steep learning curve of executing on traditional supercomputer infrastructure have drawn attention towards deploying flow simulation software on the cloud. We showcase how a complex computational framework -- that can evaluate COVID-19 transmission risk in various indoor cla…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: Accepted for publication at SuperCompCloud: 5th Workshop on Interoperability of Supercomputing and Cloud Technologies