Showing 1–50 of 68 results for author: Albrecht, S V

  1. arXiv:2406.04815  [pdf, other]

    cs.LG cs.AI cs.RO

    Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning

    Authors: Xuehui Yu, Mhairi Dunion, Xin Li, Stefano V. Albrecht

    Abstract: Meta-Reinforcement Learning (Meta-RL) agents can struggle to operate across tasks with varying environmental features that require different optimal skills (i.e., different modes of behaviours). Using context encoders based on contrastive learning to enhance the generalisability of Meta-RL agents is now widely studied but faces challenges such as the requirement for a large sample size, also refer…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2405.11727  [pdf, other]

    cs.LG

    Highway Graph to Accelerate Reinforcement Learning

    Authors: Zidu Yin, Zhen Zhang, Dong Gong, Stefano V. Albrecht, Javen Q. Shi

    Abstract: Reinforcement Learning (RL) algorithms often suffer from low training efficiency. A strategy to mitigate this issue is to incorporate a model-based planning algorithm, such as Monte Carlo Tree Search (MCTS) or Value Iteration (VI), into the environmental model. The major limitation of VI is the need to iterate over a large tensor. These still lead to intensive computations. We focus on improving t…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 28 pages, 17 figures, 3 tables, TMLR

  3. arXiv:2404.15583  [pdf, other]

    cs.AI

    Multi-Agent Reinforcement Learning for Energy Networks: Computational Challenges, Progress and Open Problems

    Authors: Sarah Keren, Chaimaa Essayeh, Stefano V. Albrecht, Thomas Morstyn

    Abstract: The rapidly changing architecture and functionality of electrical networks and the increasing penetration of renewable and distributed energy resources have resulted in various technological and managerial challenges. These have rendered traditional centralized energy-market paradigms insufficient due to their inability to support the dynamic and evolving nature of the network. This survey explore…

    Submitted 25 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  4. arXiv:2404.14285  [pdf, other]

    cs.RO cs.AI

    LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

    Authors: Dongge Han, Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Peter Bell, Amos Storkey

    Abstract: Large language models (LLMs) have shown significant potential for robotics applications, particularly task planning, by harnessing their language comprehension and text generation capabilities. However, in applications such as household robotics, a critical gap remains in the personalization of these models to individual user preferences. We introduce LLM-Personalize, a novel framework with an opt…

    Submitted 22 April, 2024; originally announced April 2024.

  5. arXiv:2404.14064  [pdf, other]

    cs.LG cs.CV

    Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras

    Authors: Mhairi Dunion, Stefano V. Albrecht

    Abstract: The performance of image-based Reinforcement Learning (RL) agents can vary depending on the position of the camera used to capture the images. Training on multiple cameras simultaneously, including a first-person egocentric camera, can leverage information from different camera perspectives to improve the performance of RL. However, hardware constraints may limit the availability of multiple camer…

    Submitted 21 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Reinforcement Learning Conference (RLC), 2024

  6. arXiv:2403.08828  [pdf, other]

    cs.HC cs.AI cs.RO

    People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior

    Authors: Balint Gyevnar, Stephanie Droop, Tadeg Quillien, Shay B. Cohen, Neil R. Bramley, Christopher G. Lucas, Stefano V. Albrecht

    Abstract: Cognitive science can help us understand which explanations people might expect, and in which format they frame these explanations, whether causal, counterfactual, or teleological (i.e., purpose-oriented). Understanding the relevance of these concepts is crucial for building good explainable AI (XAI) which offers recourse and actionability. Focusing on autonomous driving, a complex decision-making…

    Submitted 30 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  7. arXiv:2402.10086  [pdf, other]

    cs.RO cs.AI cs.CV cs.HC cs.LG

    Explainable AI for Safe and Trustworthy Autonomous Driving: A Systematic Review

    Authors: Anton Kuznietsov, Balint Gyevnar, Cheng Wang, Steven Peters, Stefano V. Albrecht

    Abstract: Artificial Intelligence (AI) shows promising applications for the perception and planning tasks in autonomous driving (AD) due to its superior performance compared to conventional methods. However, inscrutable AI systems exacerbate the existing challenge of safety assurance of AD. One way to mitigate this challenge is to utilize explainable AI (XAI) techniques. To this end, we present the first co…

    Submitted 3 July, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  8. arXiv:2402.03479  [pdf, other]

    cs.LG cs.AI

    DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design

    Authors: Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

    Abstract: Autonomous agents trained using deep reinforcement learning (RL) often lack the ability to successfully generalise to new environments, even when these environments share characteristics with the ones they have encountered during training. In this work, we investigate how the sampling of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents. W…

    Submitted 11 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: To appear in ICML 2024. A preliminary version of this work (arXiv:2310.03494) was presented at the ALOE workshop, NeurIPS 2023. arXiv admin note: text overlap with arXiv:2310.03494

  9. arXiv:2401.08808  [pdf, other]

    cs.LG

    lpNTK: Better Generalisation with Less Data via Sample Interaction During Learning

    Authors: Shangmin Guo, Yi Ren, Stefano V. Albrecht, Kenny Smith

    Abstract: Although much research has been done on proposing new models or loss functions to improve the generalisation of artificial neural networks (ANNs), less attention has been directed to the impact of the training data on generalisation. In this work, we start from approximating the interaction between samples, i.e. how learning one sample would modify the model's prediction on other samples. Through…

    Submitted 14 May, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ICLR-2024

  10. arXiv:2312.04736  [pdf, other]

    cs.CL cs.AI

    Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning

    Authors: Sabrina McCallum, Max Taylor-Davies, Stefano V. Albrecht, Alessandro Suglia

    Abstract: Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning. One possible way to help bridge this gap is to provide RL agents with richer, more human-like feedback expressed in natural language. To investigate this idea, we first extend BabyAI to automatically generate language feedback from the envi…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at Workshop on Goal-conditioned Reinforcement Learning, NeurIPS 2023

  11. arXiv:2310.05723  [pdf, other]

    cs.LG

    Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning

    Authors: Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Amos Storkey

    Abstract: Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process. In this scenario, we aim to find the best-performing policy within a limited budget of online interactions. Previous work in the OtO setting has focused on correcting for bias introduced by the policy-constraint mechanisms of offline…

    Submitted 21 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 10 pages, 17 figures, published at RLC 2024

  12. arXiv:2310.03494  [pdf, other]

    cs.LG cs.AI

    How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

    Authors: Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

    Abstract: A key limitation preventing the wider adoption of autonomous agents trained via deep reinforcement learning (RL) is their limited ability to generalise to new environments, even when these share similar characteristics with environments encountered during training. In this work, we investigate how a non-uniform sampling strategy of individual environment instances, or levels, affects the zero-shot…

    Submitted 10 December, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Currently under review, 9 pages

  13. arXiv:2307.05209  [pdf, other]

    cs.AI cs.LG

    Contextual Pre-planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning

    Authors: Guy Azran, Mohamad H. Danesh, Stefano V. Albrecht, Sarah Keren

    Abstract: Recent studies show that deep reinforcement learning (DRL) agents tend to overfit to the task on which they were trained and fail to adapt to minor environment changes. To expedite learning when transferring to unseen tasks, we propose a novel approach to representing the current task using reward machines (RMs), state machine abstractions that induce subtasks based on the current task's rewards a…

    Submitted 20 February, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), 2024

  14. arXiv:2305.14133  [pdf, other]

    cs.LG

    Conditional Mutual Information for Disentangled Representations in Reinforcement Learning

    Authors: Mhairi Dunion, Trevor McInroe, Kevin Sebastian Luck, Josiah P. Hanna, Stefano V. Albrecht

    Abstract: Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world…

    Submitted 12 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Conference on Neural Information Processing Systems (NeurIPS), 2023

  15. arXiv:2305.05566  [pdf, other]

    cs.LG cs.AI cs.MA

    SMAClite: A Lightweight Environment for Multi-Agent Reinforcement Learning

    Authors: Adam Michalski, Filippos Christianos, Stefano V. Albrecht

    Abstract: There is a lack of standard benchmarks for Multi-Agent Reinforcement Learning (MARL) algorithms. The StarCraft Multi-Agent Challenge (SMAC) has been widely used in MARL research, but is built on top of a heavy, closed-source computer game, StarCraft II. Thus, SMAC is computationally expensive and requires knowledge and the use of proprietary tools specific to the game for any meaningful alteration…

    Submitted 9 May, 2023; originally announced May 2023.

  16. arXiv:2304.09825  [pdf, other]

    cs.LG cs.AI

    Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments

    Authors: Alain Andres, Lukas Schäfer, Esther Villar-Rodriguez, Stefano V. Albrecht, Javier Del Ser

    Abstract: One of the key challenges of Reinforcement Learning (RL) is the ability of agents to generalise their learned policy to unseen settings. Moreover, training RL agents requires large numbers of interactions with the environment. Motivated by the recent success of Offline RL and Imitation Learning (IL), we conduct a study to investigate whether agents can leverage offline data in the form of trajecto…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: Presented at the Adaptive and Learning Agents Workshop (ALA) at the AAMAS conference 2023

  17. arXiv:2302.11793  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Revisiting the Gumbel-Softmax in MADDPG

    Authors: Callum Rhys Tilbury, Filippos Christianos, Stefano V. Albrecht

    Abstract: MADDPG is an algorithm in multi-agent reinforcement learning (MARL) that extends the popular single-agent method, DDPG, to multi-agent scenarios. Importantly, DDPG is an algorithm designed for continuous action spaces, where the gradient of the state-action value function exists. For this algorithm to work in discrete action spaces, discrete gradient estimation must be performed. For MADDPG, the G…

    Submitted 14 June, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Presented at AAMAS Workshop on Adaptive and Learning Agents, 2023

  18. arXiv:2302.10809  [pdf, other]

    cs.AI cs.RO

    Causal Explanations for Sequential Decision-Making in Multi-Agent Systems

    Authors: Balint Gyevnar, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht

    Abstract: We present CEMA: Causal Explanations in Multi-Agent systems; a framework for creating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems to build more trustworthy autonomous agents. Unlike prior work that assumes a fixed causal structure, CEMA only requires a probabilistic model for forward-simulating the state of the system. Using such a model,…

    Submitted 14 February, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted in 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2024

    ACM Class: I.2.9

  19. arXiv:2302.04944  [pdf, other]

    cs.MA cs.AI cs.LG

    Learning Complex Teamwork Tasks Using a Given Sub-task Decomposition

    Authors: Elliot Fosong, Arrasy Rahman, Ignacio Carlucho, Stefano V. Albrecht

    Abstract: Training a team to complete a complex task via multi-agent reinforcement learning can be difficult due to challenges such as policy search in a large joint policy space, and non-stationarity caused by mutually adapting agents. To facilitate efficient learning of complex multi-agent tasks, we propose an approach which uses an expert-provided decomposition of a task into simpler multi-agent sub-task…

    Submitted 15 February, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

  20. arXiv:2302.03439  [pdf, other]

    cs.MA cs.LG

    Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

    Authors: Lukas Schäfer, Oliver Slumbers, Stephen McAleer, Yali Du, Stefano V. Albrecht, David Mguni

    Abstract: Existing value-based algorithms for cooperative multi-agent reinforcement learning (MARL) commonly rely on random exploration, such as $ε$-greedy, to explore the environment. However, such exploration is inefficient at finding effective joint actions in states that require cooperation of multiple agents. In this work, we propose ensemble value functions for multi-agent exploration (EMAX), a genera…

    Submitted 16 April, 2024; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: Preprint. Previously presented at the Adaptive and Learning Agents Workshop (ALA) at the AAMAS conference 2023

  21. arXiv:2212.11498  [pdf, other]

    cs.LG cs.AI cs.MA cs.RO

    Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers

    Authors: Aleksandar Krnjaic, Raul D. Steleac, Jonathan D. Thomas, Georgios Papoudakis, Lukas Schäfer, Andrew Wing Keung To, Kuan-Ho Lao, Murat Cubuktepe, Matthew Haley, Peter Börsting, Stefano V. Albrecht

    Abstract: We envision a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items within the warehouse. The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance (e.g. order throughput). Established industry methods using heuristic approaches…

    Submitted 7 July, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

  22. arXiv:2210.14584  [pdf, other]

    cs.LG cs.RO

    Planning with Occluded Traffic Agents using Bi-Level Variational Occlusion Models

    Authors: Filippos Christianos, Peter Karkus, Boris Ivanovic, Stefano V. Albrecht, Marco Pavone

    Abstract: Reasoning with occluded traffic agents is a significant open challenge for planning for autonomous vehicles. Recent deep learning models have shown impressive results for predicting occluded agents based on the behaviour of nearby visible agents; however, as we show in experiments, these models are difficult to integrate into downstream planning. To this end, we propose Bi-level Variational Occlus…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: 7 pages, 6 figures

  23. DiPA: Probabilistic Multi-Modal Interactive Prediction for Autonomous Driving

    Authors: Anthony Knittel, Majd Hawasly, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy

    Abstract: Accurate prediction is important for operating an autonomous vehicle in interactive scenarios. Prediction must be fast, to support multiple requests from a planner exploring a range of possible futures. The generated predictions must accurately represent the probabilities of predicted trajectories, while also capturing different modes of behaviour (such as turning left vs continuing straight at a…

    Submitted 8 March, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 8, pp. 4887-4894, Aug. 2023

  24. arXiv:2210.05448  [pdf, other]

    cs.MA cs.AI

    A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning

    Authors: Arrasy Rahman, Ignacio Carlucho, Niklas Höpner, Stefano V. Albrecht

    Abstract: Open ad hoc teamwork is the problem of training a single agent to efficiently collaborate with an unknown group of teammates whose composition may change over time. A variable team composition creates challenges for the agent, such as the requirement to adapt to new team dynamics and dealing with changing state vector sizes. These challenges are aggravated in real-world applications in which the c…

    Submitted 28 October, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  25. arXiv:2209.14344  [pdf, other]

    cs.LG cs.MA

    Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning

    Authors: Filippos Christianos, Georgios Papoudakis, Stefano V. Albrecht

    Abstract: This work focuses on equilibrium selection in no-conflict multi-agent games, where we specifically study the problem of selecting a Pareto-optimal Nash equilibrium among several existing equilibria. It has been shown that many state-of-the-art multi-agent reinforcement learning (MARL) algorithms are prone to converging to Pareto-dominated equilibria due to the uncertainty each agent has about the…

    Submitted 14 October, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR); Reviewed on OpenReview: https://openreview.net/forum?id=3AzqYa18ah

  26. arXiv:2208.01769  [pdf, other]

    cs.MA cs.AI cs.LG

    Deep Reinforcement Learning for Multi-Agent Interaction

    Authors: Ibrahim H. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schäfer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht

    Abstract: The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning.…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: Published in AI Communications Special Issue on Multi-Agent Systems Research in the UK

  27. arXiv:2208.00096  [pdf, other]

    cs.RO cs.MA

    Perspectives on the System-level Design of a Safe Autonomous Driving Stack

    Authors: Majd Hawasly, Jonathan Sadeghi, Morris Antonello, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy

    Abstract: Achieving safe and robust autonomy is the key bottleneck on the path towards broader adoption of autonomous vehicles technology. This motivates going beyond extrinsic metrics such as miles between disengagement, and calls for approaches that embody safety by design. In this paper, we address some aspects of this challenge, with emphasis on issues of motion planning and prediction. We do this throu…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: AI Communications special issue on Multi-agent Systems Research in the UK

  28. arXiv:2207.14138  [pdf, other]

    cs.LG cs.AI

    Generating Teammates for Training Robust Ad Hoc Teamwork Agents via Best-Response Diversity

    Authors: Arrasy Rahman, Elliot Fosong, Ignacio Carlucho, Stefano V. Albrecht

    Abstract: Ad hoc teamwork (AHT) is the challenge of designing a robust learner agent that effectively collaborates with unknown teammates without prior coordination mechanisms. Early approaches address the AHT challenge by training the learner with a diverse set of handcrafted teammate policies, usually designed based on an expert's domain knowledge about the policies the learner may encounter. However, imp…

    Submitted 24 May, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted in Transactions on Machine Learning Research

  29. arXiv:2207.09300  [pdf, ps, other]

    cs.MA cs.AI

    Few-Shot Teamwork

    Authors: Elliot Fosong, Arrasy Rahman, Ignacio Carlucho, Stefano V. Albrecht

    Abstract: We propose the novel few-shot teamwork (FST) problem, where skilled agents trained in a team to complete one task are combined with skilled agents from different tasks, and together must learn to adapt to an unseen but related task. We discuss how the FST problem can be seen as addressing two separate problems: one of reducing the experience required to train a team of agents to complete a complex…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: IJCAI Workshop on Ad Hoc Teamwork, 2022

  30. arXiv:2207.07498  [pdf, other]

    cs.MA

    Cooperative Marine Operations via Ad Hoc Teams

    Authors: Ignacio Carlucho, Arrasy Rahman, William Ard, Elliot Fosong, Corina Barbalata, Stefano V. Albrecht

    Abstract: While research in ad hoc teamwork has great potential for solving real-world robotic applications, most developments so far have been focusing on environments with simple dynamics. In this article, we discuss how the problem of ad hoc teamwork can be of special interest for marine robotics and how it can aid marine operations. Particularly, we present a set of challenges that need to be addressed…

    Submitted 15 July, 2022; originally announced July 2022.

  31. arXiv:2207.05480  [pdf, other]

    cs.LG

    Temporal Disentanglement of Representations for Improved Generalisation in Reinforcement Learning

    Authors: Mhairi Dunion, Trevor McInroe, Kevin Sebastian Luck, Josiah P. Hanna, Stefano V. Albrecht

    Abstract: Reinforcement Learning (RL) agents are often unable to generalise well to environment variations in the state space that were not observed during training. This issue is especially problematic for image-based RL, where a change in just one variable, such as the background colour, can change many pixels in the image. The changed pixels can lead to drastic changes in the agent's latent representatio…

    Submitted 27 February, 2023; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: International Conference on Learning Representations (ICLR), 2023

  32. arXiv:2207.02249  [pdf, other]

    cs.MA cs.AI cs.LG

    Learning Task Embeddings for Teamwork Adaptation in Multi-Agent Reinforcement Learning

    Authors: Lukas Schäfer, Filippos Christianos, Amos Storkey, Stefano V. Albrecht

    Abstract: Successful deployment of multi-agent reinforcement learning often requires agents to adapt their behaviour. In this work, we discuss the problem of teamwork adaptation in which a team of agents needs to adapt their policies to solve novel tasks with limited fine-tuning. Motivated by the intuition that agents need to be able to identify and distinguish tasks in order to adapt their behaviour to the…

    Submitted 20 November, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: To be presented at the Seventh Workshop on Generalization in Planning at the NeurIPS 2023 conference

  33. arXiv:2206.14163  [pdf, other]

    cs.RO cs.LG

    Verifiable Goal Recognition for Autonomous Driving with Occlusions

    Authors: Cillian Brewitt, Massimiliano Tamborski, Cheng Wang, Stefano V. Albrecht

    Abstract: Goal recognition (GR) involves inferring the goals of other vehicles, such as a certain junction exit, which can enable more accurate prediction of their future behaviour. In autonomous driving, vehicles can encounter many different scenarios and the environment may be partially observable due to occlusions. We present a novel GR method named Goal Recognition with Interpretable Trees under Occlusi…

    Submitted 1 August, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

  34. arXiv:2206.11396  [pdf, other]

    cs.LG

    Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning

    Authors: Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

    Abstract: Learning control from pixels is difficult for reinforcement learning (RL) agents because representation learning and policy learning are intertwined. Previous approaches remedy this issue with auxiliary representation learning tasks, but they either do not consider the temporal aspect of the problem or only consider single-step transitions, which may cause learning inefficiencies if important envi…

    Submitted 29 January, 2024; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: Published in TMLR

  35. A Human-Centric Method for Generating Causal Explanations in Natural Language for Autonomous Vehicle Motion Planning

    Authors: Balint Gyevnar, Massimiliano Tamborski, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht

    Abstract: Inscrutable AI systems are difficult to trust, especially if they operate in safety-critical settings like autonomous driving. Therefore, there is a need to build transparent and queryable systems to increase trust levels. We propose a transparent, human-centric explanation generation method for autonomous vehicle motion planning and prediction based on an existing white-box system called IGP2. Ou…

    Submitted 27 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: IJCAI Workshop on Artificial Intelligence for Autonomous Driving (AI4AD), 2022

  36. arXiv:2205.08389  [pdf, other]

    cs.RO

    MIDGARD: A Simulation Platform for Autonomous Navigation in Unstructured Environments

    Authors: Giuseppe Vecchio, Simone Palazzo, Dario C. Guastella, Ignacio Carlucho, Stefano V. Albrecht, Giovanni Muscato, Concetto Spampinato

    Abstract: We present MIDGARD, an open-source simulation platform for autonomous robot navigation in outdoor unstructured environments. MIDGARD is designed to enable the training of autonomous agents (e.g., unmanned ground vehicles) in photorealistic 3D environments, and to support the generalization skills of learning-based agents through the variability in training scenarios. MIDGARD's main features includ…

    Submitted 20 September, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

  37. arXiv:2203.08251  [pdf, other]

    cs.RO

    Flash: Fast and Light Motion Prediction for Autonomous Driving with Bayesian Inverse Planning and Learned Motion Profiles

    Authors: Morris Antonello, Mihai Dobre, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy

    Abstract: Motion prediction of road users in traffic scenes is critical for autonomous driving systems that must take safe and robust decisions in complex dynamic environments. We present a novel motion prediction system for autonomous driving. Our system is based on the Bayesian inverse planning framework, which efficiently orchestrates map-based goal extraction, a classical control-based trajectory genera…

    Submitted 15 August, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022. 8 pages

  38. arXiv:2202.10450  [pdf, ps, other]

    cs.MA cs.AI

    A Survey of Ad Hoc Teamwork Research

    Authors: Reuth Mirsky, Ignacio Carlucho, Arrasy Rahman, Elliot Fosong, William Macke, Mohan Sridharan, Peter Stone, Stefano V. Albrecht

    Abstract: Ad hoc teamwork is the research problem of designing agents that can collaborate with new teammates without prior coordination. This survey makes a two-fold contribution: First, it provides a structured description of the different facets of the ad hoc teamwork problem. Second, it discusses the progress that has been made in the field so far, and identifies the immediate and long-term open problem…

    Submitted 16 August, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: European Conference on Multi-Agent Systems (EUMAS), 2022

  39. arXiv:2111.14552  [pdf, other]

    cs.LG

    Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning

    Authors: Rujie Zhong, Duohan Zhang, Lukas Schäfer, Stefano V. Albrecht, Josiah P. Hanna

    Abstract: Reinforcement learning (RL) algorithms are often categorized as either on-policy or off-policy depending on whether they use data from a target policy of interest or from a different behavior policy. In this paper, we study a subtle distinction between on-policy data and on-policy sampling in the context of the RL sub-problem of policy evaluation. We observe that on-policy sampling may fail to mat…

    Submitted 10 October, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: Published in 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  40. arXiv:2110.04935  [pdf, other]

    cs.LG cs.AI

    Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning

    Authors: Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

    Abstract: Deep reinforcement learning (RL) agents that exist in high-dimensional state spaces, such as those composed of images, have interconnected learning burdens. Agents must learn an action-selection policy that completes their given task, which requires them to learn a representation of the state space that discerns between useful and useless information. The reward function is the only supervised fee…

    Submitted 10 October, 2021; originally announced October 2021.

  41. arXiv:2108.02530  [pdf, other]

    cs.RO

    Interpretable Goal Recognition in the Presence of Occluded Factors for Autonomous Vehicles

    Authors: Josiah P. Hanna, Arrasy Rahman, Elliot Fosong, Francisco Eiras, Mihai Dobre, John Redford, Subramanian Ramamoorthy, Stefano V. Albrecht

    Abstract: Recognising the goals or intentions of observed vehicles is a key step towards predicting the long-term future behaviour of other agents in an autonomous driving scenario. When there are unseen obstacles or occluded vehicles in a scenario, goal recognition may be confounded by the effects of these unseen entities on the behaviour of observed vehicles. Existing prediction algorithms that assume rat…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

  42. arXiv:2107.08966  [pdf, other]

    cs.LG cs.AI

    Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration

    Authors: Lukas Schäfer, Filippos Christianos, Josiah P. Hanna, Stefano V. Albrecht

    Abstract: Intrinsic rewards can improve exploration in reinforcement learning, but the exploration process may suffer from instability caused by non-stationary reward shaping and strong dependency on hyperparameters. In this work, we introduce Decoupled RL (DeRL) as a general framework which trains separate policies for intrinsically-motivated exploration and exploitation. Such decoupling allows DeRL to lev…

    Submitted 9 February, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: Published at the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2022

  43. arXiv:2106.03982  [pdf, other]

    cs.CL

    Expressivity of Emergent Language is a Trade-off between Contextual Complexity and Unpredictability

    Authors: Shangmin Guo, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V. Albrecht, Kenny Smith

    Abstract: Researchers are using deep learning models to explore the emergence of language in various language games, where agents interact and develop an emergent language to solve tasks. We focus on the factors that determine the expressivity of emergent languages, which reflects the amount of information about input spaces those languages are capable of encoding. We measure the expressivity of emergent la…

    Submitted 15 March, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 22 pages, 12 figures, 5 tables

    Journal ref: International Conference on Learning Representations 2022

  44. arXiv:2103.06113  [pdf, other]

    cs.RO cs.MA

    GRIT: Fast, Interpretable, and Verifiable Goal Recognition with Learned Decision Trees for Autonomous Driving

    Authors: Cillian Brewitt, Balint Gyevnar, Samuel Garcin, Stefano V. Albrecht

    Abstract: It is important for autonomous vehicles to have the ability to infer the goals of other vehicles (goal recognition), in order to safely interact with other vehicles and predict their future trajectories. This is a difficult problem, especially in urban environments with interactions between many vehicles. Goal recognition methods must be fast to run in real time and make accurate inferences. As au…

    Submitted 9 August, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  45. arXiv:2102.07475  [pdf, other]

    cs.MA cs.LG

    Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing

    Authors: Filippos Christianos, Georgios Papoudakis, Arrasy Rahman, Stefano V. Albrecht

    Abstract: Sharing parameters in multi-agent deep reinforcement learning has played an essential role in allowing algorithms to scale to a large number of agents. Parameter sharing between agents significantly decreases the number of trainable parameters, shortening training times to tractable levels, and has been linked to more efficient learning. However, having all agents share the same parameters can als…

    Submitted 12 June, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: To be published in Proceedings of the 38th International Conference on Machine Learning (ICML), 2021

  46. arXiv:2011.00509  [pdf, other]

    cs.RO cs.LG

    PILOT: Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving

    Authors: Henry Pulver, Francisco Eiras, Ludovico Carozza, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy

    Abstract: Achieving a proper balance between planning quality, safety and efficiency is a major challenge for autonomous driving. Optimisation-based motion planners are capable of producing safe, smooth and comfortable plans, but often at the cost of runtime efficiency. On the other hand, naively deploying trajectories produced by efficient-to-run deep imitation learning approaches might risk compromising s…

    Submitted 30 July, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021. 8 pages, 7 figures

  47. arXiv:2007.09327  [pdf, ps, other]

    cs.CR cs.LG cs.MA

    Towards Quantum-Secure Authentication and Key Agreement via Abstract Multi-Agent Interaction

    Authors: Ibrahim H. Ahmed, Josiah P. Hanna, Elliot Fosong, Stefano V. Albrecht

    Abstract: Current methods for authentication and key agreement based on public-key cryptography are vulnerable to quantum computing. We propose a novel approach based on artificial intelligence research in which communicating parties are viewed as autonomous agents which interact repeatedly using their private decision models. Authentication and key agreement are decided based on the agents' observed behavi…

    Submitted 9 July, 2021; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: Published at the 19th International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS 2021)

  48. arXiv:2006.10412  [pdf, other]

    cs.LG cs.MA stat.ML

    Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

    Authors: Arrasy Rahman, Niklas Höpner, Filippos Christianos, Stefano V. Albrecht

    Abstract: Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents with different fixed policies to enter and leave the envi…

    Submitted 9 June, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Published in the 38th International Conference on Machine Learning (ICML 2021)

  49. arXiv:2006.09447  [pdf, other]

    cs.LG cs.MA stat.ML

    Agent Modelling under Partial Observability for Deep Reinforcement Learning

    Authors: Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht

    Abstract: Modelling the behaviours of other agents is essential for understanding how agents interact and making effective decisions. Existing methods for agent modelling commonly assume knowledge of the local observations and chosen actions of the modelled agents during execution. To eliminate this assumption, we extract representations from the local information of the controlled agent using encoder-decod…

    Submitted 9 November, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: Published in the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  50. arXiv:2006.07869  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks

    Authors: Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht

    Abstract: Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we provide a systematic evaluation and comparison of three different classes of MARL algorithms (independent learning, centralised multi-agent policy gradient, value decomposition) in a diverse range of cooperative multi-a…

    Submitted 9 November, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: Published in 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks