Skip to main content

Showing 1–35 of 35 results for author: Zilberstein, S

  1. arXiv:2306.15909  [pdf, other

    cs.LG cs.AI

    RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$

    Authors: Abhinav Bhatia, Samer B. Nashed, Shlomo Zilberstein

    Abstract: Meta reinforcement learning (meta-RL) methods such as RL$^2$ have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, they show poor asymptotic performance and struggle with out-of-distribution tasks because they rely on sequence models, such as recurrent neural networks or transformers, to process experiences rather than summar… ▽ More

    Submitted 26 March, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  2. arXiv:2206.02380  [pdf, other

    cs.LG cs.AI

    Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL

    Authors: Abhinav Bhatia, Philip S. Thomas, Shlomo Zilberstein

    Abstract: Model-based reinforcement learning promises to learn an optimal policy from fewer interactions with the environment compared to model-free reinforcement learning by learning an intermediate model of the environment in order to predict future interactions. When predicting a sequence of interactions, the rollout length, which limits the prediction horizon, is a critical hyperparameter as accuracy of… ▽ More

    Submitted 7 June, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

  3. arXiv:2206.00705  [pdf, other

    cs.RO cs.AI

    Dense Crowd Flow-Informed Path Planning

    Authors: Emily Pruc, Shlomo Zilberstein, Joydeep Biswas

    Abstract: Both pedestrian and robot comfort are of the highest priority whenever a robot is placed in an environment containing human beings. In the case of pedestrian-unaware mobile robots this desire for safety leads to the freezing robot problem, where a robot confronted with a large dynamic group of obstacles (such as a crowd of pedestrians) would determine all forward navigation unsafe causing the robo… ▽ More

    Submitted 1 June, 2022; originally announced June 2022.

  4. arXiv:2205.15462  [pdf, other

    cs.AI cs.MA

    Causal Explanations for Sequential Decision Making Under Uncertainty

    Authors: Samer B. Nashed, Saaduddin Mahmud, Claudia V. Goldman, Shlomo Zilberstein

    Abstract: We introduce a novel framework for causal explanations of stochastic, sequential decision-making systems built on the well-studied structural causal model paradigm for causal reasoning. This single framework can identify multiple, semantically distinct explanations for agent actions -- something not previously possible. In this paper, we establish exact methods and several approximation techniques… ▽ More

    Submitted 10 January, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 9 pages, 7 figures

  5. arXiv:2109.13974  [pdf, other

    cs.RO cs.CV

    Competence-Aware Path Planning via Introspective Perception

    Authors: Sadegh Rabiee, Connor Basich, Kyle Hollins Wray, Shlomo Zilberstein, Joydeep Biswas

    Abstract: Robots deployed in the real world over extended periods of time need to reason about unexpected failures, learn to predict them, and to proactively take actions to avoid future failures. Existing approaches for competence-aware planning are either model-based, requiring explicit enumeration of known failure modes, or purely statistical, using state- and location-specific failure statistics to infe… ▽ More

    Submitted 17 January, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: Accepted by IEEE Robotics and Automation Letters (8 pages, 8 figures)

    ACM Class: I.2.9; I.2.10

  6. arXiv:2108.00366  [pdf, other

    cs.RO cs.AI cs.MA

    Agent-aware State Estimation in Autonomous Vehicles

    Authors: Shane Parr, Ishan Khatri, Justin Svegliato, Shlomo Zilberstein

    Abstract: Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce agent-aware state estimation -- a framework for calculating indirect estimations of state given observations of the behavior of other agents in the e… ▽ More

    Submitted 1 August, 2021; originally announced August 2021.

    Comments: To appear in IROS 2021

  7. arXiv:2102.07017  [pdf, other

    cs.AI cs.MA cs.RO

    Mitigating Negative Side Effects via Environment Shaping

    Authors: Sandhya Saisubramanian, Shlomo Zilberstein

    Abstract: Agents operating in unstructured environments often produce negative side effects (NSE), which are difficult to identify at design time. While the agent can learn to mitigate the side effects from human feedback, such feedback is often expensive and the rate of learning is sensitive to the agent's state representation. We examine how humans can assist an agent, beyond providing feedback, and explo… ▽ More

    Submitted 13 February, 2021; originally announced February 2021.

    Comments: 9 pages

  8. arXiv:2102.03977  [pdf, other

    stat.ML cs.AI cs.CY cs.DS cs.LG

    Learning to Generate Fair Clusters from Demonstrations

    Authors: Sainyam Galhotra, Sandhya Saisubramanian, Shlomo Zilberstein

    Abstract: Fair clustering is the process of grouping similar entities together, while satisfying a mathematically well-defined fairness metric as a constraint. Due to the practical challenges in precise model specification, the prescribed fairness constraints are often incomplete and act as proxies to the intended fairness requirement, leading to biased outcomes when the system is deployed. We examine how t… ▽ More

    Submitted 7 February, 2021; originally announced February 2021.

  9. arXiv:2010.04914  [pdf, other

    cs.AI cs.RO

    Helpfulness as a Key Metric of Human-Robot Collaboration

    Authors: Richard G. Freedman, Steven J. Levine, Brian C. Williams, Shlomo Zilberstein

    Abstract: As robotic teammates become more common in society, people will assess the robots' roles in their interactions along many dimensions. One such dimension is effectiveness: people will ask whether their robotic partners are trustworthy and effective collaborators. This begs a crucial question: how can we quantitatively measure the helpfulness of a robotic partner for a given task at hand? This paper… ▽ More

    Submitted 10 October, 2020; originally announced October 2020.

    Comments: Accepted for presentation at the AAAI 2020 Fall Symposium Series, in the symposium for Artificial Intelligence for Human-Robot Interaction: Trust & Explainability in Artificial Intelligence for Human-Robot Interaction

  10. arXiv:2008.12146  [pdf, other

    cs.CY cs.AI

    Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems

    Authors: Sandhya Saisubramanian, Shlomo Zilberstein, Ece Kamar

    Abstract: Autonomous agents acting in the real-world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model -- handcrafted or machine acquired -- is inevitable due to practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent's actions may have unexpected, undesirable consequ… ▽ More

    Submitted 18 October, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

    Comments: 9 pages

  11. arXiv:2007.11740  [pdf, other

    cs.AI cs.HC cs.RO

    Improving Competence for Reliable Autonomy

    Authors: Connor Basich, Justin Svegliato, Kyle Hollins Wray, Stefan J. Witwicki, Shlomo Zilberstein

    Abstract: Given the complexity of real-world, unstructured domains, it is often impossible or impractical to design models that include every feature needed to handle all possible scenarios that an autonomous system may encounter. For an autonomous system to be reliable in such domains, it should have the ability to improve its competence online. In this paper, we propose a method for improving the competen… ▽ More

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: In Proceedings AREA 2020, arXiv:2007.11260

    Journal ref: EPTCS 319, 2020, pp. 37-53

  12. arXiv:2003.07745  [pdf, other

    cs.AI cs.RO

    Learning to Optimize Autonomy in Competence-Aware Systems

    Authors: Connor Basich, Justin Svegliato, Kyle Hollins Wray, Stefan Witwicki, Joydeep Biswas, Shlomo Zilberstein

    Abstract: Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of autonomy and offer safety in situations that require human judgment. We propose an introspective model of autonomy that is learned and updated online… ▽ More

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: To be published in Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020). 9 pages

    ACM Class: I.2.8; I.2.9

  13. arXiv:1912.07820  [pdf, other

    stat.ML cs.DS cs.LG

    Balancing the Tradeoff Between Clustering Value and Interpretability

    Authors: Sandhya Saisubramanian, Sainyam Galhotra, Shlomo Zilberstein

    Abstract: Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of… ▽ More

    Submitted 30 January, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

    Comments: Accepted at AIES 2020

  14. arXiv:1909.06427  [pdf, other

    cs.AI

    Responsive Planning and Recognition for Closed-Loop Interaction

    Authors: Richard G. Freedman, Yi Ren Fung, Roman Ganchin, Shlomo Zilberstein

    Abstract: Many intelligent systems currently interact with others using at least one of fixed communication inputs or preset responses, resulting in rigid interaction experiences and extensive efforts developing a variety of scenarios for the system. Fixed inputs limit the natural behavior of the user in order to effectively communicate, and preset responses prevent the system from adapting to the current s… ▽ More

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: Accepted for presentation at the AAAI 2019 Fall Symposium Series, in the symposium for Artificial Intelligence and Human-Robot Interaction for Service Robots in Human Environments

    Report number: AI-HRI/2019/24

  15. arXiv:1905.09355  [pdf, other

    cs.AI

    Minimizing the Negative Side Effects of Planning with Reduced Models

    Authors: Sandhya Saisubramanian, Shlomo Zilberstein

    Abstract: Reduced models of large Markov decision processes accelerate planning by considering a subset of outcomes for each state-action pair. This reduction in reachable states leads to replanning when the agent encounters states without a precomputed action during plan execution. However, not all states are suitable for replanning. In the worst case, the agent may not be able to reach the goal from the n… ▽ More

    Submitted 22 May, 2019; originally announced May 2019.

    Comments: AAAI Workshop on Artificial Intelligence Safety (2019)

  16. arXiv:1903.00750  [pdf, other

    stat.ML cs.AI cs.DS cs.LG

    Lexicographically Ordered Multi-Objective Clustering

    Authors: Sainyam Galhotra, Sandhya Saisubramanian, Shlomo Zilberstein

    Abstract: We introduce a rich model for multi-objective clustering with lexicographic ordering over objectives and a slack. The slack denotes the allowed multiplicative deviation from the optimal objective value of the higher priority objective to facilitate improvement in lower-priority objectives. We then propose an algorithm called Zeus to solve this class of problems, which is characterized by a makeshi… ▽ More

    Submitted 2 March, 2019; originally announced March 2019.

  17. arXiv:1810.08159  [pdf, other

    cs.AI

    Planning in Stochastic Environments with Goal Uncertainty

    Authors: Sandhya Saisubramanian, Kyle Hollins Wray, Luis Pineda, Shlomo Zilberstein

    Abstract: We present the Goal Uncertain Stochastic Shortest Path (GUSSP) problem -- a general framework to model path planning and decision making in stochastic environments with goal uncertainty. The framework extends the stochastic shortest path (SSP) model to dynamic environments in which it is impossible to determine the exact goal states ahead of plan execution. GUSSPs introduce flexibility in goal spe… ▽ More

    Submitted 3 April, 2020; v1 submitted 18 October, 2018; originally announced October 2018.

    Comments: 6 pages, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019

    Journal ref: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 2019, pp. 1649-1654

  18. arXiv:1802.05835  [pdf, other

    cs.AI

    An Anytime Algorithm for Task and Motion MDPs

    Authors: Siddharth Srivastava, Nishant Desai, Richard Freedman, Shlomo Zilberstein

    Abstract: Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex tasks. While high-level strategies require decision making over longer time-horizons and scales, their feasibility depends on low-level constraints based upon the geometries and continuous dynamics… ▽ More

    Submitted 15 February, 2018; originally announced February 2018.

    Comments: 7 pages, 4 figures

  19. arXiv:1705.07381  [pdf, other

    cs.AI

    Generalizing the Role of Determinization in Probabilistic Planning

    Authors: Luis Pineda, Shlomo Zilberstein

    Abstract: The stochastic shortest path problem (SSP) is a highly expressive model for probabilistic planning. The computational hardness of SSPs has sparked interest in determinization-based planners that can quickly solve large problems. However, existing methods employ a simplistic approach to determinization. In particular, they ignore the possibility of tailoring the determinization to the specific char… ▽ More

    Submitted 29 July, 2017; v1 submitted 20 May, 2017; originally announced May 2017.

    Report number: UM-CS-2017-006

  20. arXiv:1612.00104  [pdf, other

    cs.AI

    Robust Optimization for Tree-Structured Stochastic Network Design

    Authors: Xiaojian Wu, Akshat Kumar, Daniel Sheldon, Shlomo Zilberstein

    Abstract: Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption in previous work has been made that network parameters (e.g., probability of species colonization) are precisely known, which i… ▽ More

    Submitted 30 November, 2016; originally announced December 2016.

    Comments: AAAI 2017

  21. A Bilinear Programming Approach for Multiagent Planning

    Authors: Marek Petrik, Shlomo Zilberstein

    Abstract: Multiagent planning and coordination problems are common and known to be computationally hard. We show that a wide range of two-agent problems can be formulated as bilinear programs. We present a successive approximation algorithm that significantly outperforms the coverage set algorithm, which is the state-of-the-art method for this class of multiagent problems. Because the algorithm is formula… ▽ More

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 35, pages 235-274, 2009

  22. Policy Iteration for Decentralized Control of Markov Decision Processes

    Authors: Daniel S. Bernstein, Christopher Amato, Eric A. Hansen, Shlomo Zilberstein

    Abstract: Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorit… ▽ More

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 34, pages 89-132, 2009

  23. arXiv:1301.3836  [pdf

    cs.AI

    The Complexity of Decentralized Control of Markov Decision Processes

    Authors: Daniel S Bernstein, Shlomo Zilberstein, Neil Immerman

    Abstract: Planning for distributed agents with partial state information is considered from a decision- theoretic perspective. We describe generalizations of both the MDP and POMDP models that allow for decentralized control. For even a small number of agents, the finite-horizon problems corresponding to both of our models are complete for nondeterministic exponential time. These complexity results illus… ▽ More

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-32-37

  24. arXiv:1212.2459  [pdf

    cs.AI

    Symbolic Generalization for On-line Planning

    Authors: Zhengzhu Feng, Eric A. Hansen, Shlomo Zilberstein

    Abstract: Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. We show that they can also improve the performance of on-line planners. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interactions required for convergence. We introduce Symbolic Real-Time Dynamic Programming (or sRTDP),… ▽ More

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-209-216

  25. arXiv:1207.4116  [pdf

    cs.AI

    Region-Based Incremental Pruning for POMDPs

    Authors: Zhengzhu Feng, Shlomo Zilberstein

    Abstract: We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dynamic programming (DP) update, a key source of complexity in POMDP algorithms. Instead of reasoning about the whole belief space when pruning the cross-sums, our algorithm divides the belief space into smaller regions and… ▽ More

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-146-153

  26. arXiv:1207.1359  [pdf

    cs.AI

    MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

    Authors: Daniel Szer, Francois Charpillet, Shlomo Zilberstein

    Abstract: We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multirobot coordination, network traffic control, `or distributed… ▽ More

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-576-583

  27. arXiv:1206.5295  [pdf

    cs.AI

    Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs

    Authors: Sven Seuken, Shlomo Zilberstein

    Abstract: Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavi… ▽ More

    Submitted 20 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

    Report number: UAI-P-2007-PG-344-351

  28. arXiv:1206.5258  [pdf

    cs.AI

    Optimizing Memory-Bounded Controllers for Decentralized POMDPs

    Authors: Christopher Amato, Daniel S Bernstein, Shlomo Zilberstein

    Abstract: We present a memory-bounded optimization approach for solving infinite-horizon decentralized POMDPs. Policies for each agent are represented by stochastic finite state controllers. We formulate the problem of optimizing these policies as a nonlinear program, leveraging powerful existing nonlinear optimization techniques for solving the problem. While existing solvers only guarantee locally optimal… ▽ More

    Submitted 20 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

    Report number: UAI-P-2007-PG-1-8

  29. arXiv:1203.3528  [pdf

    cs.AI

    Rollout Sampling Policy Iteration for Decentralized POMDPs

    Authors: Feng Wu, Shlomo Zilberstein, Xiaoping Chen

    Abstract: We present decentralized rollout sampling policy iteration (DecRSPI) - a new algorithm for multi-agent decision problems formalized as DEC-POMDPs. DecRSPI is designed to improve scalability and tackle problems that lack an explicit model. The algorithm uses Monte- Carlo methods to generate a sample of reachable belief states. Then it computes a joint policy for each belief state based on the rollo… ▽ More

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-666-673

  30. arXiv:1203.3490  [pdf

    cs.AI

    Anytime Planning for Decentralized POMDPs using Expectation Maximization

    Authors: Akshat Kumar, Shlomo Zilberstein

    Abstract: Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While fnite-horizon DECPOMDPs have enjoyed signifcant success, progress remains slow for the infnite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infnite-horizon case, which re… ▽ More

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-294-301

  31. arXiv:1202.3739  [pdf

    cs.AI cs.DS stat.CO

    Message-Passing Algorithms for Quadratic Programming Formulations of MAP Estimation

    Authors: Akshat Kumar, Shlomo Zilberstein

    Abstract: Computing maximum a posteriori (MAP) estimation in graphical models is an important inference problem with many applications. We present message-passing algorithms for quadratic programming (QP) formulations of MAP estimation for pairwise Markov random fields. In particular, we use the concave-convex procedure (CCCP) to obtain a locally optimal algorithm for the non-convex QP formulation. A simila… ▽ More

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-428-435

  32. Communication-Based Decomposition Mechanisms for Decentralized MDPs

    Authors: Claudia V. Goldman, Shlomo Zilberstein

    Abstract: Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing, multi-robot coordination and information gathering scenarios can be formalized using this framework. However, finding the optimal solution in the general case is hard, limiting the applicability of recently developed alg… ▽ More

    Submitted 31 October, 2011; originally announced November 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 32, pages 169-202, 2008

  33. Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis

    Authors: C. V. Goldman, S. Zilberstein

    Abstract: Decentralized control of cooperative systems captures the operation of a group of decision makers that share a single global objective. The difficulty in solving optimally such problems arises when the agents lack full observability of the global state of the system when they operate. The general problem has been shown to be NEXP-complete. In this paper, we identify classes of decentralized contr… ▽ More

    Submitted 30 June, 2011; originally announced July 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 22, pages 143-174, 2004

  34. arXiv:1006.2743  [pdf, other

    cs.AI

    Global Optimization for Value Function Approximation

    Authors: Marek Petrik, Shlomo Zilberstein

    Abstract: Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of… ▽ More

    Submitted 14 June, 2010; originally announced June 2010.

  35. arXiv:1005.1860  [pdf, ps, other

    cs.AI

    Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes

    Authors: Marek Petrik, Gavin Taylor, Ron Parr, Shlomo Zilberstein

    Abstract: Approximate dynamic programming has been used successfully in a large variety of domains, but it relies on a small set of provided approximation features to calculate solutions reliably. Large and rich sets of features can cause existing algorithms to overfit because of a limited number of samples. We address this shortcoming using $L_1$ regularization in approximate linear programming. Because th… ▽ More

    Submitted 20 May, 2010; v1 submitted 11 May, 2010; originally announced May 2010.

    Comments: Technical report corresponding to the ICML2010 submission of the same name