Google Scholar

Inverse reinforcement learning for decentralized non-cooperative multiagent systems

TS Reddy, V Gopikrishna, G Zaruba… - 2012 IEEE international …, 2012 - ieeexplore.ieee.org

… Here we present an IRL algorithm that considers the case … in nature and the decision process
is decentralized such that … We briefly summarize the work done by Ng and Russell [2000]…

Cited by 50

Inverse reinforcement learning with simultaneous estimation of rewards and dynamics

M Herman, T Gindele, J Wagner… - Artificial intelligence …, 2016 - proceedings.mlr.press

Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward
function of a Markov Decision Process (MDP) from observed behavior of an agent. Since …

Cited by 81

Evaluation of inverse reinforcement learning

A Schmitt - 2019 - minds.wisconsin.edu

… [8] improve upon the limitations of Ng and Russell’s[5] … will is formulated using a Markov
Decision Process (MDP). The goal … Q-learning is a reinforcement learning algorithm that takes a …

Discriminatively learning inverse optimal control models for predicting human intentions

S Gaurav, BD Ziebart - International Conference on Autonomous Agents …, 2019 - par.nsf.gov

… maximum entropy inverse reinforcement learning models [39] … Next, we describe in
detail our algorithm for obtaining goal … R(s_t) = θ · ϕ(s_t), Abbeel & Ng [1] propose the apprenticeship …

Cited by 15

Semi-supervised apprenticeship learning

M Valko, M Ghavamzadeh… - … on reinforcement learning, 2013 - proceedings.mlr.press

… inverse reinforcement learning proposed by Abbeel and Ng [… -world domains showing that
the semi-supervised algorithm … 2000] is to learn a good behavior by observing the behavior …

Cited by 23

Teaching AI agents ethical values using reinforcement learning and policy orchestration

R Noothigattu, D Bouneffouf, N Mattei… - IBM Journal of …, 2019 - ieeexplore.ieee.org

… We detail a novel approach that uses inverse reinforcement learning to learn a set … , we use
the linear IRL algorithm as described in Section . For Pac-Man, observe that … Ng and Stuart J. …

Cited by 92

Toward robust policy summarization

I Lage, D Lifschitz, F Doshi-Velez… - Autonomous agents and …, 2019 - ncbi.nlm.nih.gov

… assumption that people do inverse reinforcement learning to infer an … Involving a human
user in the evaluation process can help … We modified the algorithm to extract a fixed budget by …

Cited by 12

Evolving rewards to automate reinforcement learning

A Faust, A Francis, D Mehta - arXiv preprint arXiv:1905.07628, 2019 - arxiv.org

… control tasks over two RL algorithms, shows improvements over … This human-intensive
process raises questions: a) Can we … We train up to ng = 1000 agents parallelized across nmc = …

Cited by 58

On pathologies in KL-regularized reinforcement learning from expert demonstrations

…, C Lu, MA Osborne, Y Gal, Y Teh - Advances in Neural …, 2021 - proceedings.neurips.cc

… shown that KL-regularized reinforcement learning from expert … unsolved by standard deep
reinforcement learning algorithms. … model classes for deep reinforcement learning algorithms, …

Cited by 26

Cross-domain imitation learning via optimal transport

A Fickinger, S Cohen, S Russell, B Amos - arXiv preprint arXiv:2110.03684, 2021 - arxiv.org

… work aims at improving this algorithm relative to the training … 2018) to find the optimal policy
for the Markov decision process … DeepMimic: Example-guided deep reinforcement learning of …

Cited by 42
