Inverse reinforcement learning for decentralized non-cooperative multiagent systems

TS Reddy, V Gopikrishna, G Zaruba… - 2012 IEEE international …, 2012 - ieeexplore.ieee.org
… Here we present an IRL algorithm that considers the case … in nature and the decision process
is decentralized such that … We briefly summarize the work done by Ng and Russell [2000]…

Inverse reinforcement learning with simultaneous estimation of rewards and dynamics

M Herman, T Gindele, J Wagner… - Artificial intelligence …, 2016 - proceedings.mlr.press
Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward
function of a Markov Decision Process (MDP) from observed behavior of an agent. Since …

Evaluation of inverse reinforcement learning

A Schmitt - 2019 - minds.wisconsin.edu
… [8] improve upon the limitations of Ng and Russell’s [5] … will is formulated using a Markov
Decision Process (MDP). The goal … Q-learning is a reinforcement learning algorithm that takes a …

Discriminatively learning inverse optimal control models for predicting human intentions

S Gaurav, BD Ziebart - International Conference on Autonomous Agents …, 2019 - par.nsf.gov
… maximum entropy inverse reinforcement learning models [39] … Next, we describe in
detail our algorithm for obtaining goal … R(s_t) = θ · ϕ(s_t), Abbeel & Ng [1] propose the apprenticeship …

Semi-supervised apprenticeship learning

M Valko, M Ghavamzadeh… - … on reinforcement learning, 2013 - proceedings.mlr.press
… inverse reinforcement learning proposed by Abbeel and Ng [… -world domains showing that
the semi-supervised algorithm … 2000] is to learn a good behavior by observing the behavior …

Teaching AI agents ethical values using reinforcement learning and policy orchestration

R Noothigattu, D Bouneffouf, N Mattei… - IBM Journal of …, 2019 - ieeexplore.ieee.org
… We detail a novel approach that uses inverse reinforcement learning to learn a set … , we use
the linear IRL algorithm as described in Section . For Pac-Man, observe that … Ng and Stuart J. …

Toward robust policy summarization

I Lage, D Lifschitz, F Doshi-Velez… - Autonomous agents and …, 2019 - ncbi.nlm.nih.gov
… assumption that people do inverse reinforcement learning to infer an … Involving a human
user in the evaluation process can help … We modified the algorithm to extract a fixed budget by …

Evolving rewards to automate reinforcement learning

A Faust, A Francis, D Mehta - arXiv preprint arXiv:1905.07628, 2019 - arxiv.org
… control tasks over two RL algorithms, shows improvements over … This human-intensive
process raises questions: a) Can we … We train up to n_g = 1000 agents parallelized across n_mc = …

On pathologies in KL-regularized reinforcement learning from expert demonstrations

…, C Lu, MA Osborne, Y Gal, Y Teh - Advances in Neural …, 2021 - proceedings.neurips.cc
… shown that KL-regularized reinforcement learning from expert … unsolved by standard deep
reinforcement learning algorithms. … model classes for deep reinforcement learning algorithms, …

Cross-domain imitation learning via optimal transport

A Fickinger, S Cohen, S Russell, B Amos - arXiv preprint arXiv:2110.03684, 2021 - arxiv.org
… work aims at improving this algorithm relative to the training … 2018) to find the optimal policy
for the Markov decision process … Deepmimic: Example-guided deep reinforcement learning of …