Showing 1–50 of 92 results for author: James, S

  1. arXiv:2407.07875  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Generative Image as Action Models

    Authors: Mohit Shridhar, Yat Long Lo, Stephen James

    Abstract: Image-generation diffusion models have been fine-tuned to unlock new capabilities such as image-editing and novel view synthesis. Can we similarly unlock image-generation models for visuomotor control? We present GENIMA, a behavior-cloning agent that fine-tunes Stable Diffusion to 'draw joint-actions' as targets on RGB images. These images are fed into a controller that maps the visual targets int…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project website, code, checkpoints: https://genima-robot.github.io/

  2. arXiv:2407.07868  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation

    Authors: Eugene Teoh, Sumit Patidar, Xiao Ma, Stephen James

    Abstract: Generalising vision-based manipulation policies to novel environments remains a challenging area with limited exploration. Current practices involve collecting data in one location, training imitation learning or reinforcement learning policies with this data, and deploying the policy in the same location. However, this approach lacks scalability as it necessitates data collection in multiple loca…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project website: https://greenaug.github.io/

  3. arXiv:2407.07788  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark

    Authors: Nikita Chernyadev, Nicholas Backshall, Xiao Ma, Yunfan Lu, Younggyo Seo, Stephen James

    Abstract: We introduce BiGym, a new benchmark and learning environment for mobile bi-manual demo-driven robotic manipulation. BiGym features 40 diverse tasks set in home environments, ranging from simple target reaching to complex kitchen cleaning. To capture the real-world performance accurately, we provide human-collected demonstrations for each task, reflecting the diverse modalities found in real-world…

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Project webpage: https://chernyadev.github.io/bigym/

  4. arXiv:2407.07787  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Continuous Control with Coarse-to-fine Reinforcement Learning

    Authors: Younggyo Seo, Jafar Uruç, Stephen James

    Abstract: Despite recent advances in improving the sample-efficiency of reinforcement learning (RL) algorithms, designing an RL algorithm that can be practically deployed in real-world environments remains a challenge. In this paper, we present Coarse-to-fine Reinforcement Learning (CRL), a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner, enabling the use of…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project webpage: https://younggyo.me/cqn/

  5. arXiv:2406.10916  [pdf, other]

    cs.RO cs.DC

    M-SET: Multi-Drone Swarm Intelligence Experimentation with Collision Avoidance Realism

    Authors: Chuhao Qin, Alexander Robins, Callum Lillywhite-Roake, Adam Pearce, Hritik Mehta, Scott James, Tsz Ho Wong, Evangelos Pournaras

    Abstract: Distributed sensing by cooperative drone swarms is crucial for several Smart City applications, such as traffic monitoring and disaster response. Using an indoor lab with inexpensive drones, a testbed supports complex and ambitious studies on these systems while maintaining low cost, rigor, and external validity. This paper introduces the Multi-drone Sensing Experimentation Testbed (M-SET), a nove…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 7 pages, 7 figures. This work has been submitted to the IEEE conference

  6. arXiv:2406.04144  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Redundancy-aware Action Spaces for Robot Learning

    Authors: Pietro Mazzaglia, Nicholas Backshall, Xiao Ma, Stephen James

    Abstract: Joint space and task space control are the two dominant action modes for controlling robot arms within the robot learning literature. Actions in joint space provide precise control over the robot's pose, but tend to suffer from inefficient training; actions in task space boast data-efficient training but sacrifice the ability to perform tasks in confined spaces due to limited control over the full…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Published in the RA-L journal

  7. arXiv:2405.18196  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning

    Authors: Vitalis Vosylius, Younggyo Seo, Jafar Uruç, Stephen James

    Abstract: In the field of Robot Learning, the complex mapping between high-dimensional observations such as RGB images and low-level robotic actions, two inherently very different spaces, constitutes a complex learning problem, especially with limited amounts of data. In this work, we introduce Render and Diffuse (R&D), a method that unifies low-level robot actions and RGB observations within the image space…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Robotics: Science and Systems (RSS) 2024. Videos are available on our project webpage at https://vv19.github.io/render-and-diffuse/

  8. arXiv:2405.06686  [pdf, other]

    cs.CL cs.AI

    Word2World: Generating Stories and Worlds through Large Language Models

    Authors: Muhammad U. Nasir, Steven James, Julian Togelius

    Abstract: Large Language Models (LLMs) have proven their worth across a diverse spectrum of disciplines. LLMs have shown great potential in Procedural Content Generation (PCG) as well, but directly generating a level through a pre-trained LLM is still challenging. This work introduces Word2World, a system that enables LLMs to procedurally design playable games through stories, without any task-specific fine…

    Submitted 6 May, 2024; originally announced May 2024.

  9. arXiv:2403.19375  [pdf, other]

    cs.RO cs.MA

    Multi-Agent Team Access Monitoring: Environments that Benefit from Target Information Sharing

    Authors: Andrew Dudash, Scott James, Ryan Rubel

    Abstract: Robotic access monitoring of multiple target areas has applications including checkpoint enforcement, surveillance and containment of fire and flood hazards. Monitoring access for a single target region has been successfully modeled as a minimum-cut problem. We generalize this model to support multiple target areas using two approaches: iterating on individual targets and examining the collections…

    Submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2403.12682  [pdf, other]

    cs.CV cs.RO

    IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model

    Authors: Matteo Bortolon, Theodore Tsesmelis, Stuart James, Fabio Poiesi, Alessio Del Bue

    Abstract: We introduce IFFNeRF to estimate the six degrees-of-freedom (6DoF) camera pose of a given image, building on the Neural Radiance Fields (NeRF) formulation. IFFNeRF is specifically designed to operate in real-time and eliminates the need for an initial pose guess that is proximate to the sought solution. IFFNeRF utilizes the Metropolis-Hastings algorithm to sample surface points from within the NeRF…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted ICRA 2024, Project page: https://mbortolon97.github.io/iffnerf/

  11. arXiv:2403.09830  [pdf, other]

    cs.LG cs.AI

    Towards the Reusability and Compositionality of Causal Representations

    Authors: Davide Talon, Phillip Lippe, Stuart James, Alessio Del Bue, Sara Magliacane

    Abstract: Causal Representation Learning (CRL) aims at identifying high-level causal factors and their relationships from high-dimensional observations, e.g., images. While most CRL works focus on learning causal representations in a single environment, in this work we instead propose a first step towards learning causal representations from temporal sequences of images that can be adapted in a new environm…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted to the 3rd Conference on Causal Learning and Reasoning (CLeaR 2024)

  12. arXiv:2403.08586  [pdf, other]

    cs.CV

    PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections

    Authors: Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue

    Abstract: Robustly estimating camera poses from a set of images is a fundamental task which remains challenging for differentiable methods, especially in the case of small and sparse camera pose graphs. To overcome this challenge, we propose Pose-refined Rotation Averaging Graph Optimization (PRAGO). From a set of objectness detections on unordered images, our method reconstructs the rotational pose, and in…

    Submitted 15 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  13. arXiv:2403.03890  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

    Authors: Xiao Ma, Sumit Patidar, Iain Haughton, Stephen James

    Abstract: This paper introduces Hierarchical Diffusion Policy (HDP), a hierarchical agent for multi-task robotic manipulation. HDP factorises a manipulation policy into a hierarchical structure: a high-level task-planning agent which predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy which generates optimal motion trajectories. The factorised policy repre…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2024). Videos and code: https://yusufma03.github.io/projects/hdp/

  14. arXiv:2312.12891  [pdf, other]

    cs.AI

    MinePlanner: A Benchmark for Long-Horizon Planning in Large Minecraft Worlds

    Authors: William Hill, Ireton Liu, Anita De Mello Koch, Damion Harvey, Nishanth Kumar, George Konidaris, Steven James

    Abstract: We propose a new benchmark for planning tasks based on the Minecraft game. Our benchmark contains 45 tasks overall, but also provides support for creating both propositional and numeric instances of new Minecraft tasks automatically. We benchmark numeric and propositional planning systems on these tasks, with results demonstrating that state-of-the-art planners are currently incapable of dealing w…

    Submitted 28 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Accepted to the 6th ICAPS Workshop on the International Planning Competition (WIPC 2024)

  15. arXiv:2312.11364  [pdf, other]

    cs.AI

    Counting Reward Automata: Sample Efficient Reinforcement Learning Through the Exploitation of Reward Function Structure

    Authors: Tristan Bester, Benjamin Rosman, Steven James, Geraud Nangue Tasse

    Abstract: We present counting reward automata, a finite state machine variant capable of modelling any reward function expressible as a formal language. Unlike previous approaches, which are limited to the expression of tasks as regular languages, our framework allows for tasks described by unrestricted grammars. We prove that an agent equipped with such an abstract machine is able to solve a larger set of t…

    Submitted 16 February, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 14 pages, 11 Figures, Published in AAAI W25: Neuro-Symbolic Learning and Reasoning in the era of Large Language Models (NuCLeaR)

    ACM Class: I.2; F.4

  16. arXiv:2310.16686  [pdf, other]

    cs.AI cs.LG

    Dynamics Generalisation in Reinforcement Learning via Adaptive Context-Aware Policies

    Authors: Michael Beukman, Devon Jarvis, Richard Klein, Steven James, Benjamin Rosman

    Abstract: While reinforcement learning has achieved remarkable successes in several domains, its real-world application is limited due to many methods failing to generalise to unfamiliar conditions. In this work, we consider the problem of generalising to new transition dynamics, corresponding to cases in which the environment's response to the agent's actions differs. For example, the gravitational force e…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  17. arXiv:2309.13942  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

    Authors: Jiangliu Wang, Jianbo Jiao, Yibing Song, Stephen James, Zhan Tong, Chongjian Ge, Pieter Abbeel, Yun-hui Liu

    Abstract: This work aims to improve unsupervised audio-visual pre-training. Inspired by the efficacy of data augmentation in visual contrastive learning, we propose a novel speed co-augmentation method that randomly changes the playback speeds of both audio and video data. Despite its simplicity, the speed co-augmentation method possesses two compelling attributes: (1) it increases the diversity of audio-vi…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Published at the CVPR 2023 Sight and Sound workshop

  18. arXiv:2308.16893  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Language-Conditioned Path Planning

    Authors: Amber Xie, Youngwoon Lee, Pieter Abbeel, Stephen James

    Abstract: Contact is at the core of robotic manipulation. At times, it is desired (e.g. manipulation and grasping), and at times, it is harmful (e.g. when avoiding obstacles). However, traditional path planning algorithms focus solely on collision-free paths, limiting their applicability in contact-rich tasks. To address this limitation, we propose the domain of Language-Conditioned Path Planning, where con…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Conference on Robot Learning, 2023

  19. arXiv:2308.12270  [pdf, other]

    cs.LG cs.AI

    Language Reward Modulation for Pretraining Reinforcement Learning

    Authors: Ademi Adeniji, Amber Xie, Carmelo Sferrazza, Younggyo Seo, Stephen James, Pieter Abbeel

    Abstract: Using learned reward functions (LRFs) as a means to solve sparse-reward reinforcement learning (RL) tasks has yielded some steady progress in task-complexity through the years. In this work, we question whether today's LRFs are best-suited as a direct replacement for task rewards. Instead, we propose leveraging the capabilities of LRFs as a pretraining signal for RL. Concretely, we propose…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: Code available at https://github.com/ademiadeniji/lamp

  20. arXiv:2306.01102  [pdf, other]

    cs.NE cs.AI cs.CL

    LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization

    Authors: Muhammad U. Nasir, Sam Earle, Christopher Cleghorn, Steven James, Julian Togelius

    Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algo…

    Submitted 12 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to The Genetic and Evolutionary Computation Conference 2024

  21. arXiv:2306.00035  [pdf, other]

    cs.LG

    ROSARL: Reward-Only Safe Reinforcement Learning

    Authors: Geraud Nangue Tasse, Tamlin Love, Mark Nemecek, Steven James, Benjamin Rosman

    Abstract: An important problem in reinforcement learning is designing agents that learn to solve tasks safely in an environment. A common solution is for a human expert to define either a penalty in the reward function or a cost to be minimised when reaching unsafe states. However, this is non-trivial, since too small a penalty may lead to agents that reach unsafe states, while too large a penalty increases…

    Submitted 31 May, 2023; originally announced June 2023.

  22. arXiv:2304.06373  [pdf, other]

    cs.CV

    3DoF Localization from a Single Image and an Object Map: the Flatlandia Problem and Dataset

    Authors: Matteo Toso, Matteo Taiana, Stuart James, Alessio Del Bue

    Abstract: Efficient visual localization is crucial to many applications, such as large-scale deployment of autonomous agents and augmented reality. Traditional visual localization, while achieving remarkable accuracy, relies on extensive 3D models of the scene or large collections of geolocalized images, which are often inefficient to store and to scale to novel environments. In contrast, humans orient them…

    Submitted 8 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  23. arXiv:2303.11120  [pdf, other]

    cs.CV

    Positional Diffusion: Ordering Unordered Sets with Diffusion Probabilistic Models

    Authors: Francesco Giuliari, Gianluca Scarpellini, Stuart James, Yiming Wang, Alessio Del Bue

    Abstract: Positional reasoning is the process of ordering unsorted parts contained in a set into a consistent structure. We present Positional Diffusion, a plug-and-play graph formulation with Diffusion Probabilistic Models to address positional reasoning. We use the forward process to map elements' positions in a set to random positions in a continuous space. Positional Diffusion learns to reverse the nois…

    Submitted 20 March, 2023; originally announced March 2023.

  24. Hierarchical clustering with OWA-based linkages, the Lance-Williams formula, and dendrogram inversions

    Authors: Marek Gagolewski, Anna Cena, Simon James, Gleb Beliakov

    Abstract: Agglomerative hierarchical clustering based on Ordered Weighted Averaging (OWA) operators not only generalises the single, complete, and average linkages, but also includes intercluster distances based on a few nearest or farthest neighbours, trimmed and winsorised means of pairwise point similarities, amongst many others. We explore the relationships between the famous Lance-Williams update formu…

    Submitted 25 October, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

    Journal ref: Fuzzy Sets and Systems 473, 108740, 2023

  25. arXiv:2302.02408  [pdf, other]

    cs.RO cs.CV cs.LG

    Multi-View Masked World Models for Visual Robotic Manipulation

    Authors: Younggyo Seo, Junsu Kim, Stephen James, Kimin Lee, Jinwoo Shin, Pieter Abbeel

    Abstract: Visual robotic manipulation research and applications often use multiple cameras, or views, to better perceive the world. How else can we utilize the richness of multi-view data? In this paper, we investigate how to learn good representations with multi-view data and utilize them for visual robotic manipulation. Specifically, we train a multi-view masked autoencoder which reconstructs pixels of ra…

    Submitted 31 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted to ICML 2023. First two authors contributed equally. Project webpage: https://sites.google.com/view/mv-mwm

  26. arXiv:2302.01561  [pdf, other]

    cs.AI

    Hierarchically Composing Level Generators for the Creation of Complex Structures

    Authors: Michael Beukman, Manuel Fokam, Marcel Kruger, Guy Axelrod, Muhammad Nasir, Branden Ingram, Benjamin Rosman, Steven James

    Abstract: Procedural content generation (PCG) is a growing field, with numerous applications in the video game industry and great potential to help create better games at a fraction of the cost of manual creation. However, much of the work in PCG is focused on generating relatively straightforward levels in simple games, as it is challenging to design an optimisable objective function for complex settings.…

    Submitted 19 July, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Code is available at https://github.com/Michael-Beukman/MCHAMR. This work has been accepted to IEEE Transactions on Games, with copyright transferred to the IEEE

  27. arXiv:2211.01644  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS

    Authors: Kai Chen, Stephen James, Congying Sui, Yun-Hui Liu, Pieter Abbeel, Qi Dou

    Abstract: Most existing methods for category-level pose estimation rely on object point clouds. However, when considering transparent objects, depth cameras are usually not able to capture meaningful data, resulting in point clouds with severe artifacts. Without a high-quality point cloud, existing methods are not applicable to challenging transparent objects. To tackle this problem, we present StereoPose,…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 7 pages, 6 figures, Project homepage: https://appsrv.cse.cuhk.edu.hk/~kaichen/stereopose.html

  28. arXiv:2210.14721  [pdf, other]

    cs.LG cs.AI

    Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data

    Authors: John So, Amber Xie, Sunggoo Jung, Jeffrey Edlund, Rohan Thakker, Ali Agha-mohammadi, Pieter Abbeel, Stephen James

    Abstract: Autonomous driving is complex, requiring sophisticated 3D scene understanding, localization, mapping, and control. Rather than explicitly modelling and fusing each of these components, we instead consider an end-to-end approach via reinforcement learning (RL). However, collecting exploration driving data in the real world is impractical and dangerous. While training in simulation and deploying vis…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: CoRL 2022 Paper

  29. arXiv:2210.11442  [pdf, other]

    cs.AI cs.NE

    Augmentative Topology Agents For Open-Ended Learning

    Authors: Muhammad Umair Nasir, Michael Beukman, Steven James, Christopher Wesley Cleghorn

    Abstract: In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments. Unlike previous open-ended approaches that optimize agents using a fixed neural network topology, we hypothesize that generalization can be improved by allowing agents' controllers to become more complex as they encounter more difficult en…

    Submitted 11 October, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted to The Proceedings of Genetic and Evolutionary Computation Conference (GECCO) 2023

  30. arXiv:2210.03109  [pdf, other]

    cs.RO cs.CV cs.LG

    Real-World Robot Learning with Masked Visual Pre-training

    Authors: Ilija Radosavovic, Tete Xiao, Stephen James, Pieter Abbeel, Jitendra Malik, Trevor Darrell

    Abstract: In this work, we explore self-supervised visual pre-training on images from diverse, in-the-wild videos for real-world robotic tasks. Like prior work, our visual representations are pre-trained via a masked autoencoder (MAE), frozen, and then passed into a learnable control module. Unlike prior work, we show that the pre-trained representations are effective across a range of real-world robotic ta…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: CoRL 2022; Project page: https://tetexiao.com/projects/real-mvp

  31. arXiv:2210.02396  [pdf, other]

    cs.CV cs.AI cs.LG

    Temporally Consistent Transformers for Video Generation

    Authors: Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel

    Abstract: To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world. Current algorithms enable accurate predictions over short horizons but tend to suffer from temporal inconsistencies. When generated content goes out of view and is later revisited, the model invents different content instead. Despite this severe limitation, no established benchmarks on co…

    Submitted 31 May, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Project website: https://wilson1yan.github.io/teco

  32. arXiv:2209.07143  [pdf, other]

    cs.CV

    HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator

    Authors: Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel

    Abstract: Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics. Recently, autoregressive latent video models have proved to be a powerful video prediction tool, by separating the video prediction into two sub-problems: pre-training an image generator model, followed by learning an autoregressive prediction model in th…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: Extended draft of the paper accepted to ICIP 2022 conference

  33. arXiv:2209.03638  [pdf, other]

    cs.LG cs.CL cs.SI

    Geolocation of Cultural Heritage using Multi-View Knowledge Graph Embedding

    Authors: Hebatallah A. Mohamed, Sebastiano Vascon, Feliks Hibraj, Stuart James, Diego Pilutti, Alessio Del Bue, Marcello Pelillo

    Abstract: Knowledge Graphs (KGs) have proven to be a reliable way of structuring data. They can provide a rich source of contextual information about cultural heritage collections. However, cultural heritage KGs are far from being complete. They are often missing important attributes such as geographical location, especially for sculptures and mobile or indoor entities such as paintings. In this paper, we f…

    Submitted 8 September, 2022; originally announced September 2022.

  34. Combining Evolutionary Search with Behaviour Cloning for Procedurally Generated Content

    Authors: Nicholas Muir, Steven James

    Abstract: In this work, we consider the problem of procedural content generation for video game levels. Prior approaches have relied on evolutionary search (ES) methods capable of generating diverse levels, but this generation procedure is slow, which is problematic in real-time settings. Reinforcement learning (RL) has also been proposed to tackle the same problem, and while level generation is fast, train…

    Submitted 29 July, 2022; originally announced July 2022.

    Journal ref: Proceedings of 43rd Conference of the South African Institute of Computer Scientists and Information Technologists, July 2022

  35. arXiv:2207.09445  [pdf, other]

    cs.CV

    PoserNet: Refining Relative Camera Poses Exploiting Object Detections

    Authors: Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue

    Abstract: The estimation of the camera poses associated with a set of images commonly relies on feature matches between the images. In contrast, we are the first to address this challenge by using objectness regions to guide the pose estimation problem rather than explicit semantic object detections. We propose Pose Refiner Network (PoserNet), a light-weight Graph Neural Network to refine the approximate pai…

    Submitted 21 July, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  36. arXiv:2207.05634  [pdf, other]

    cs.CV

    GANzzle: Reframing jigsaw puzzle solving as a retrieval task using a generative mental image

    Authors: Davide Talon, Alessio Del Bue, Stuart James

    Abstract: Puzzle solving is a combinatorial challenge due to the difficulty of matching adjacent pieces. Instead, we infer a mental image from all pieces, which a given piece can then be matched against, avoiding the combinatorial explosion. Exploiting advancements in Generative Adversarial methods, we learn how to reconstruct the image given a set of unordered pieces, allowing the model to learn a joint emb…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: Accepted at International Conference of Image Processing (ICIP22)

  37. arXiv:2206.14244  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Masked World Models for Visual Control

    Authors: Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel

    Abstract: Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient robot learning from visual observations. Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects. In this work, we introduce a visual model-based RL fr…

    Submitted 27 May, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: Project website: https://sites.google.com/view/mwm-rl. Accepted to CoRL 2022

  38. arXiv:2206.11940  [pdf, other]

    cs.AI cs.LG

    World Value Functions: Knowledge Representation for Learning and Planning

    Authors: Geraud Nangue Tasse, Benjamin Rosman, Steven James

    Abstract: We propose world value functions (WVFs), a type of goal-oriented general value function that represents how to solve not just a given task, but any other goal-reaching task in an agent's environment. This is achieved by equipping an agent with an internal goal space defined as all the world states where it experiences a terminal transition. The agent can then modify the standard task rewards to de…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted at the Planning and Reinforcement Learning Workshop at ICAPS 2022. arXiv admin note: text overlap with arXiv:2205.08827

  39. arXiv:2206.04003  [pdf, other]

    cs.CV cs.LG

    Patch-based Object-centric Transformers for Efficient Video Generation

    Authors: Wilson Yan, Ryo Okumura, Stephen James, Pieter Abbeel

    Abstract: In this work, we present Patch-based Object-centric Video Transformer (POVT), a novel region-based video generation architecture that leverages object-centric information to efficiently model temporal dynamics in videos. We build upon prior work in video prediction via an autoregressive transformer over the discrete latent space of compressed videos, with an added modification to model object-cent…

    Submitted 18 June, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Project Website: https://sites.google.com/view/povt-public

  40. arXiv:2206.03271  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

    Authors: Zhao Mandi, Pieter Abbeel, Stephen James

    Abstract: Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However, meta-reinforcement learning (meta-RL) algorithms have thus far been restricted to simple environments with narrow task distributions. Moreover, the paradigm of pretrai…

    Submitted 16 February, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

  41. arXiv:2205.12532  [pdf, other]

    cs.LG cs.LO

    Skill Machines: Temporal Logic Skill Composition in Reinforcement Learning

    Authors: Geraud Nangue Tasse, Devon Jarvis, Steven James, Benjamin Rosman

    Abstract: It is desirable for an agent to be able to solve a rich variety of problems that can be specified through language in the same environment. A popular approach towards obtaining such agents is to reuse skills learned in prior tasks to generalise compositionally to new ones. However, this is a challenging problem due to the curse of dimensionality induced by the combinatorially large number of ways…

    Submitted 16 March, 2024; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper at ICLR 2024

  42. arXiv:2205.08827  [pdf, other]

    cs.LG

    World Value Functions: Knowledge Representation for Multitask Reinforcement Learning

    Authors: Geraud Nangue Tasse, Steven James, Benjamin Rosman

    Abstract: An open problem in artificial intelligence is how to learn and represent knowledge that is sufficient for a general agent that needs to solve multiple tasks in a given world. In this work we propose world value functions (WVFs), which are a type of general value function with mastery of the world - they represent not only how to solve a given task, but also how to solve any other goal-reaching tas…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Accepted to the 5th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2022

  43. arXiv:2205.06000  [pdf, other]

    cs.LG cs.CV

    Accounting for the Sequential Nature of States to Learn Features for Reinforcement Learning

    Authors: Nathan Michlo, Devon Jarvis, Richard Klein, Steven James

    Abstract: In this work, we investigate the properties of data that cause popular representation learning approaches to fail. In particular, we find that in environments where states do not significantly overlap, variational autoencoders (VAEs) fail to learn useful features. We demonstrate this failure in a simple gridworld domain, and then provide a solution in the form of metric learning. However, metric l…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2202.13341

    ACM Class: I.2; I.2.6

  44. arXiv:2205.02092  [pdf, other]

    cs.LG cs.AI

    Learning Abstract and Transferable Representations for Planning

    Authors: Steven James, Benjamin Rosman, George Konidaris

    Abstract: We are concerned with the question of how an agent can acquire its own representations from sensory data. We restrict our focus to learning representations for long-term planning, a class of problems that state-of-the-art learning methods are unable to solve. We propose a framework for autonomously learning state abstractions of an agent's environment, given a set of skills. Importantly, these abs…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted to the 5th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2022

  45. arXiv:2204.12471  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Coarse-to-fine Q-attention with Tree Expansion

    Authors: Stephen James, Pieter Abbeel

    Abstract: Coarse-to-fine Q-attention enables sample-efficient robot manipulation by discretizing the translation space in a coarse-to-fine manner, where the resolution gradually increases at each layer in the hierarchy. Although effective, Q-attention suffers from "coarse ambiguity" - when voxelization is significantly coarse, it is not feasible to distinguish similar-looking objects without first inspectin…

    Submitted 2 May, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: Project page and code: https://sites.google.com/view/q-attention-qte

  46. arXiv:2204.11842  [pdf, other]

    cs.LG cs.AI

    Adaptive Online Value Function Approximation with Wavelets

    Authors: Michael Beukman, Michael Mitchley, Dean Wookey, Steven James, George Konidaris

    Abstract: Using function approximation to represent a value function is necessary for continuous and high-dimensional state spaces. Linear function approximation has desirable theoretical guarantees and often requires less compute and samples than neural networks, but most approaches suffer from an exponential growth in the number of functions as the dimensionality of the state space increases. In this work…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted to RLDM 2022. Code is located at https://github.com/Michael-Beukman/WaveletRL

  47. arXiv:2204.08327  [pdf, other]

    cs.RO

    Automatic Encoding and Repair of Reactive High-Level Tasks with Learned Abstract Representations

    Authors: Adam Pacheck, Steven James, George Konidaris, Hadas Kress-Gazit

    Abstract: We present a framework that, given a set of skills a robot can perform, abstracts sensor data into symbols that we use to automatically encode the robot's capabilities in Linear Temporal Logic. We specify reactive high-level tasks based on these capabilities, for which a strategy is automatically synthesized and executed on the robot, if the task is feasible. If a task is not feasible given the ro…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 27 pages, 15 figures, Submitted to The International Journal of Robotics Research (IJRR)

  48. arXiv:2204.07049  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking

    Authors: Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-Hui Liu, Pieter Abbeel, Qi Dou

    Abstract: In this paper, we propose an iterative self-training framework for sim-to-real 6D object pose estimation to facilitate cost-effective robotic grasping. Given a bin-picking scenario, we establish a photo-realistic simulator to synthesize abundant virtual data, and use this to train an initial pose estimation network. This network then takes the role of a teacher model, which generates pose predicti…

    Submitted 21 July, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted to ECCV 2022

  49. Procedural Content Generation using Neuroevolution and Novelty Search for Diverse Video Game Levels

    Authors: Michael Beukman, Christopher W Cleghorn, Steven James

    Abstract: Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios. However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content. We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate divers…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted to the Genetic and Evolutionary Computation Conference (GECCO '22), July 9–13, 2022, Boston, MA, USA. Code is located at https://github.com/Michael-Beukman/PCGNN

  50. arXiv:2204.01571  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Coarse-to-Fine Q-attention with Learned Path Ranking

    Authors: Stephen James, Pieter Abbeel

    Abstract: We propose Learned Path Ranking (LPR), a method that accepts an end-effector goal pose, and learns to rank a set of goal-reaching paths generated from an array of path generating methods, including: path planning, Bezier curve sampling, and a learned policy. The core idea being that each of the path generation modules will be useful in different tasks, or at different stages in a task. When LPR is…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Project page and code: https://sites.google.com/view/q-attention-lpr