Title: A fast hybrid reinforcement learning framework with human corrective feedback

Author:
Celemin, Carlos (TU Delft Learning & Autonomous Control; Universidad de Santiago de Chile)
Ruiz-del-Solar, Javier (Universidad de Santiago de Chile)
Kober, J. (TU Delft Learning & Autonomous Control)

Date: 2018

Abstract: Reinforcement Learning agents can be supported by feedback from human teachers in the learning loop that guides the learning process. In this work we propose two hybrid strategies of Policy Search Reinforcement Learning and Interactive Machine Learning that benefit from both sources of information, the cost function and the human corrective feedback, for accelerating the convergence and improving the final performance of the learning process. Experiments with simulated and real systems of balancing tasks and a 3 DoF robot arm validate the advantages of the proposed learning strategies: (i) they speed up the convergence of the learning process between 3 and 30 times, saving considerable time during the agent adaptation, and (ii) they allow including non-expert feedback because they have low sensitivity to erroneous human advice.

Subject: Interactive machine learning; Learning from demonstration; Policy search; Reinforcement learning

To reference this document use: http://resolver.tudelft.nl/uuid:753b8fad-e98f-4c2d-b959-67ec56fe4bc1

DOI: https://doi.org/10.1007/s10514-018-9786-6

ISSN: 0929-5593

Source: Autonomous Robots, 43 (2019) (5), 1173-1186

Part of collection: Institutional Repository

Document type: journal article

Rights: © 2018 Carlos Celemin, Javier Ruiz-del-Solar, J. Kober

Files: PDF Celemin2018_Article_AFast ... Learni.pdf (2.41 MB)