Equivariant Offline Reinforcement Learning

Authors: Arsh Tangri, Ondrej Biza, Dian Wang, David Klee, Owen Howell, Robert Platt

Abstract: Sample efficiency is critical when applying learning-based methods to robotic manipulation due to the high cost of collecting expert demonstrations and the challenges of on-robot policy learning through online Reinforcement Learning (RL). Offline RL addresses this issue by enabling policy learning from an offline dataset collected using any behavioral policy, regardless of its quality. However, recent advancements in offline RL have predominantly focused on learning from large datasets. Given that many robotic manipulation tasks can be formulated as rotation-symmetric problems, we investigate the use of $SO(2)$-equivariant neural networks for offline RL with a limited number of demonstrations. Our experimental results show that equivariant versions of Conservative Q-Learning (CQL) and Implicit Q-Learning (IQL) outperform their non-equivariant counterparts. We provide empirical evidence demonstrating how equivariance improves offline learning algorithms in the low-data regime.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2307.03704 [pdf, other]

Equivariant Single View Pose Prediction Via Induced and Restricted Representations

Authors: Owen Howell, David Klee, Ondrej Biza, Linfeng Zhao, Robin Walters

Abstract: Learning about the three-dimensional world from two-dimensional images is a fundamental problem in computer vision. An ideal neural network architecture for such tasks would leverage the fact that objects can be rotated and translated in three dimensions to make predictions about novel images. However, imposing SO(3)-equivariance on two-dimensional inputs is difficult because the group of three-dimensional rotations does not have a natural action on the two-dimensional plane. Specifically, it is possible that an element of SO(3) will rotate an image out of plane. We show that an algorithm that learns a three-dimensional representation of the world from two-dimensional images must satisfy certain geometric consistency properties, which we formulate as SO(2)-equivariance constraints. We use the induced and restricted representations of SO(2) on SO(3) to construct and classify architectures which satisfy these geometric consistency constraints. We prove that any architecture which respects said consistency constraints can be realized as an instance of our construction. We show that three previously proposed neural architectures for 3D pose prediction are special cases of our construction. We propose a new algorithm that is a learnable generalization of previously considered methods. We test our architecture on three pose prediction tasks and achieve SOTA results on both the PASCAL3D+ and SYMSOL pose estimation tasks.

Submitted 7 July, 2023; originally announced July 2023.

arXiv:2302.13926 [pdf, other]

Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction

Authors: David M. Klee, Ondrej Biza, Robert Platt, Robin Walters

Abstract: Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in $\mathrm{SO}(3)$. However, training such models can be computation- and sample-inefficient. Instead, we propose a novel mapping of features from the image domain to the 3D rotation manifold. Our method then leverages $\mathrm{SO}(3)$ equivariant layers, which are more sample efficient, and outputs a distribution over rotations that can be sampled at arbitrary resolution. We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset. Moreover, we show that our method can model complex object symmetries, without any modifications to the parameters or loss function. Code is available at https://dmklee.github.io/image2sphere.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2211.00194 [pdf, other]

SEIL: Simulation-augmented Equivariant Imitation Learning

Authors: Mingxi Jia, Dian Wang, Guanang Su, David Klee, Xupeng Zhu, Robin Walters, Robert Platt

Abstract: In robotic manipulation, acquiring samples is extremely expensive because it often requires interacting with the real world. Traditional image-level data augmentation has shown the potential to improve sample efficiency in various machine learning tasks. However, image-level data augmentation is insufficient for an imitation learning agent to learn good manipulation policies from a reasonable number of demonstrations. We propose Simulation-augmented Equivariant Imitation Learning (SEIL), a method that combines a novel data augmentation strategy of supplementing expert trajectories with simulated transitions and an equivariant model that exploits the $\mathrm{O}(2)$ symmetry in robotic manipulation. Experimental evaluations demonstrate that our method can learn non-trivial manipulation tasks within ten demonstrations and outperforms the baselines by a significant margin.

Submitted 31 October, 2022; originally announced November 2022.

arXiv:2207.11313 [pdf, other]

Graph-Structured Policy Learning for Multi-Goal Manipulation Tasks

Authors: David Klee, Ondrej Biza, Robert Platt

Abstract: Multi-goal policy learning for robotic manipulation is challenging. Prior successes have used state-based representations of the objects or provided demonstration data to facilitate learning. In this paper, by hand-coding a high-level discrete representation of the domain, we show that policies to reach dozens of goals can be learned with a single network using Q-learning from pixels. The agent focuses learning on simpler, local policies which are sequenced together by planning in the abstract space. We compare our method against standard multi-goal RL baselines, as well as other methods that leverage the discrete representation, on a challenging block construction domain. We find that our method can build more than a hundred different block structures, and demonstrate forward transfer to structures with novel objects. Lastly, we deploy the policy learned in simulation on a real robot.

Submitted 22 July, 2022; originally announced July 2022.

arXiv:2207.08925 [pdf, other]

Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images

Authors: David Klee, Ondrej Biza, Robert Platt, Robin Walters

Abstract: Reasoning about 3D objects based on 2D images is challenging due to variations in appearance caused by viewing the object from different orientations. Tasks such as object classification are invariant to 3D rotations, and others such as pose estimation are equivariant. However, imposing equivariance as a model constraint is typically not possible with 2D image input because we do not have an a priori model of how the image changes under out-of-plane object rotations. The only $\mathrm{SO}(3)$-equivariant models that currently exist require point cloud or voxel input rather than 2D images. In this paper, we propose a novel architecture based on icosahedral group convolutions that reasons in $\mathrm{SO}(3)$ by learning a projection of the input image onto an icosahedron. The resulting model is approximately equivariant to rotation in $\mathrm{SO}(3)$. We apply this model to object pose estimation and shape classification tasks and find that it outperforms reasonable baselines. Project website: https://dmklee.github.io/image2icosahedral

Submitted 15 November, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

arXiv:2202.05333 [pdf, other]

Factored World Models for Zero-Shot Generalization in Robotic Manipulation

Authors: Ondrej Biza, Thomas Kipf, David Klee, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

Abstract: World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects. Previous object-factored models were limited either by their inability to model actions, or by their inability to plan for complex manipulation tasks. We build on recent contrastive methods for training object-factored world models, which we extend to model continuous robot actions and to accurately predict the physics of robotic pick-and-place. To do so, we use a residual stack of graph neural networks that receive action information at multiple levels in both their node and edge neural networks. Crucially, our learned model can make predictions about tasks not represented in the training data. That is, we demonstrate successful zero-shot generalization to novel tasks, with only a minor decrease in model performance. Moreover, we show that an ensemble of our models can be used to plan for tasks involving up to 12 pick-and-place actions using heuristic search. We also demonstrate transfer to a physical robot.

Submitted 10 February, 2022; originally announced February 2022.

arXiv:2111.12153 [pdf]

Methodology and feasibility of neurofeedback to improve visual attention to letters in mild Alzheimer's disease

Authors: Deirdre McLaughlin, Daniel Klee, Tab Memmott, Betts Peters, Jack Wiedrick, Melanie Fried-Oken, Barry Oken

Abstract: Brain-computer interface (BCI) systems are controlled by users through neurophysiological input for a variety of applications including communication, environmental control, motor rehabilitation, and cognitive training. Although individuals with severe speech and physical impairment are the primary users of this technology, BCIs have emerged as a potential tool for broader populations, especially with regard to delivering cognitive training or interventions with neurofeedback (NFB). The goal of this study was to investigate the feasibility of using a BCI system with neurofeedback as an intervention for people with mild Alzheimer's disease (AD). The study focused on visual attention and language since AD is often associated with functional impairments in language and reading. The study enrolled five adults with mild AD in a nine- to thirteen-week BCI EEG-based neurofeedback intervention to improve attention and reading skills. Two participants completed the intervention entirely. The remaining three participants could not complete the intervention phase because of restrictions related to COVID. Pre- and post-assessment measures were used to assess reliability of outcome measures and generalization of treatment to functional reading, processing speed, attention, and working memory skills. Participants demonstrated steady improvement in most cognitive measures across experimental phases, although there was not a significant effect of NFB on most measures of attention. One subject demonstrated statistically significant improvement in letter cancellation during NFB. All participants with mild AD learned to operate a BCI system with training. Results have broad implications for the design and use of BCI systems for participants with cognitive impairment. Preliminary evidence justifies implementing NFB-based cognitive measures in AD.

Submitted 23 November, 2021; originally announced November 2021.

Comments: 50 pages including 6 figures and 4 tables

arXiv:2002.06642 [pdf]

BciPy: Brain-Computer Interface Software in Python

Authors: Tab Memmott, Aziz Koçanaoğulları, Matthew Lawhead, Daniel Klee, Shiran Dudy, Melanie Fried-Oken, Barry Oken

Abstract: There are high technological and software demands associated with conducting brain-computer interface (BCI) research. In order to accelerate the development and accessibility of BCI, it is worthwhile to focus on open-source and desired tooling. Python, a prominent computer language, has emerged as a language of choice for many research and engineering purposes. In this manuscript, we present BciPy, an open-source, Python-based software for conducting BCI research. It was developed with a focus on restoring communication using event-related potential (ERP) spelling interfaces; however, it may be used for other non-spelling and non-ERP BCI paradigms. Major modules in this system include support for data acquisition, data queries, stimuli presentation, signal processing, signal viewing and modeling, language modeling, task building, and a simple Graphical User Interface (GUI).

Submitted 16 February, 2020; originally announced February 2020.

Comments: 24 pages, 15 Figures, 1 Table

Showing 1–9 of 9 results for author: Klee, D