Investigating the extent to which inverse reinforcement learning can learn rewards from noisy demonstrations

Perdikis, Charalampos

Title

Investigating the extent to which inverse reinforcement learning can learn rewards from noisy demonstrations

Author

Perdikis, Charalampos (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Cavalcante Siebert, L. (mentor)
Caregnato Neto, A. (mentor)
Weber, J.M. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science and Engineering

Project

CSE3000 Research Project

Date

2023-06-29

Abstract

Inverse Reinforcement Learning (IRL) aims to recover a reward function from expert demonstrations in a Markov Decision Process (MDP). The objective is to understand the intentions behind expert behavior and derive a reward function from the experts' reasoning rather than from their exact actions. However, expert demonstrations can be affected by various types of noise (e.g., random behavior), which can make them less accurate and less effective at solving the MDP. This research investigates how well IRL can recover reward functions from noisy demonstrations. Three types of noise, namely Random Action Noise, Random Bias Noise, and Sparse Noise, are introduced and modeled. Demonstrations are generated under each type of noise, and the corresponding reward functions are recovered. The rewards recovered from noisy demonstrations are compared with those recovered from optimal demonstrations using several metrics. The results indicate that IRL shows some tolerance to Random Action Noise and Sparse Noise, while being more vulnerable to Random Bias Noise.
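
As an illustration of what generating demonstrations under noise can look like, the following minimal sketch corrupts an expert policy with random actions. The abstract does not define the noise models, so everything here is an assumption made for illustration: the noise is modeled as replacing the expert's action with a uniformly random one with probability `epsilon`, and the chain MDP, `noisy_action`, `generate_demonstration`, and `chain_transition` are hypothetical names that do not come from the thesis.

```python
import random

def noisy_action(expert_action, n_actions, epsilon, rng):
    """Assumed noise model: with probability epsilon, replace the
    expert's action with a uniformly random action."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return expert_action

def generate_demonstration(policy, transition, start_state, horizon,
                           n_actions, epsilon, seed=0):
    """Roll out the policy for `horizon` steps and record (state, action) pairs."""
    rng = random.Random(seed)
    state = start_state
    trajectory = []
    for _ in range(horizon):
        action = noisy_action(policy[state], n_actions, epsilon, rng)
        trajectory.append((state, action))
        state = transition(state, action)
    return trajectory

# Hypothetical 1-D chain MDP with 5 states: action 0 moves left, action 1 moves right.
N_STATES = 5

def chain_transition(state, action):
    step = 1 if action == 1 else -1
    return min(max(state + step, 0), N_STATES - 1)

expert_policy = [1] * N_STATES  # the expert always moves right

# epsilon = 0.0 reproduces the expert exactly; epsilon > 0 injects random actions.
clean = generate_demonstration(expert_policy, chain_transition, 0, 6, 2, epsilon=0.0)
noisy = generate_demonstration(expert_policy, chain_transition, 0, 6, 2, epsilon=0.3)
```

An IRL method would then be run once on the clean trajectories and once on the noisy ones, and the two recovered reward functions compared.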

Subject

Inverse Reinforcement Learning
Noisy Demonstrations
Maximum Entropy

To reference this document use:

http://resolver.tudelft.nl/uuid:edcf1a42-416c-4969-a625-898f2c28418b

Part of collection

Student theses

Document type

bachelor thesis

Files

PDF

CSE3000_Final_Paper.pdf

512.73 KB
