Title: The Effects of Large Disturbances on On-Line Reinforcement Learning for a Walking Robot
Author: Schuitema, E.; Caarls, W.; Wisse, M.; Jonker, P.P.; Babuska, R.
Faculty: Mechanical, Maritime and Materials Engineering
Department: Biomechanical Engineering
Date: 2010-10-25

Abstract: Reinforcement Learning is a promising paradigm for adding learning capabilities to humanoid robots. One of the difficulties of the real world is the presence of disturbances. In Reinforcement Learning, disturbances are typically dealt with stochastically. However, large and infrequent disturbances do not fit well in this framework; essentially, they are outliers and not part of the underlying (stochastic) Markov Decision Process. Therefore, they can negatively influence learning. For a humanoid robot, the main sources of such disturbances are sudden changes in the dynamics (such as a sudden push), sensor noise, and sampling time irregularities. We investigate the effects of these types of outliers on the on-line learning process of a simple walking robot simulation. We propose excluding the outliers from the learning process, with the aim of improving convergence and the final solution. While infrequent sensor and timing outliers had a negligible influence, infrequent pushes heavily disrupted the learning process. Excluding the outliers from the learning process restored performance.

To reference this document use: http://resolver.tudelft.nl/uuid:b40171fc-a7dc-42dd-8f4c-cd4d4aa2c36f
Source: BNAIC 2010: 22nd Benelux Conference on Artificial Intelligence, Luxembourg, 25-26 October 2010
Part of collection: Institutional Repository
Document type: conference paper
Rights: (c) 2010 The Author(s)
Files: Schuitema.pdf (PDF, 912.95 KB)