Impact-Aware Learning from Demonstration

de Zwart, Sjouke

Title

Impact-Aware Learning from Demonstration

Author

de Zwart, Sjouke (TU Delft Mechanical, Maritime and Materials Engineering)

Contributor

Kober, J. (mentor)
Saccon, Alessandro (mentor)

Degree granting institution

Delft University of Technology

Programme

Mechanical Engineering | Systems and Control

Date

2019-11-27

Abstract

We often establish contact with our environment at non-zero speed. Grabbing and pushing objects without stopping our hands at the moment of impact is an example of this. Although humans learn and execute such tasks with relative ease, robots cannot. The difficulty in executing such tasks lies in the complexity of control at the moment of impact. Traditional control approaches avoid contact at non-zero speed by means of a so-called transition phase, in which the relative velocity is reduced to zero near contact. Learning from demonstration refers to the process of transferring new skills to a machine through human demonstrations instead of traditional, time-consuming robot programming. The goal of this research is to develop a learning strategy that can learn and execute tasks in which contact is made at non-zero speed.

The new learning strategy is an adaptation of the state-of-the-art learning from demonstration method, probabilistic movement primitives, combined with the impact-aware robot control strategy, reference spreading. Probabilistic movement primitives translate demonstration data into a trajectory distribution. Reference spreading tackles the problem of an impact occurring at a different time than expected by defining a new error that compares the current state to an extended reference trajectory, switching to the extended trajectory of another mode upon impact. In this work, these methods are combined by extending the demonstration data and subsequently fitting the probabilistic movement primitives, resulting in an extended trajectory distribution for multiple modes. This trajectory distribution, in conjunction with the reference spreading error, can be used for control. The proposed method is numerically validated by simulating two end effectors dynamically picking up a box and then placing it on top of a shelf. The task is successfully learned and executed, showing the effectiveness of the impact-aware learning strategy.
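As a rough illustration of the two ingredients named in the abstract, the sketch below fits a one-dimensional probabilistic movement primitive to a few synthetic demonstrations and evaluates a tracking error against the reference of the currently active mode. It is not the thesis implementation: the function names (`rbf_features`, `fit_promp`, `reference_spreading_error`), the Gaussian basis, the ridge term, the sign convention of the error, the mode labels, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def rbf_features(n_steps, n_basis=10, width=0.02):
    """Normalized Gaussian basis functions over a phase variable in [0, 1]."""
    z = np.linspace(0.0, 1.0, n_steps)[:, None]
    centers = np.linspace(0.0, 1.0, n_basis)[None, :]
    phi = np.exp(-0.5 * (z - centers) ** 2 / width)
    return phi / phi.sum(axis=1, keepdims=True)

def fit_promp(demos, n_basis=10, ridge=1e-6):
    """Fit one weight vector per demonstration, then a Gaussian over the weights."""
    demos = np.asarray(demos, dtype=float)            # shape (n_demos, n_steps)
    phi = rbf_features(demos.shape[1], n_basis)       # shape (n_steps, n_basis)
    gram = phi.T @ phi + ridge * np.eye(n_basis)      # ridge-regularized normal equations
    weights = np.linalg.solve(gram, phi.T @ demos.T).T  # shape (n_demos, n_basis)
    return phi, weights.mean(axis=0), np.cov(weights, rowvar=False)

def reference_spreading_error(x, step, mode, extended_refs):
    """Error between the state and the extended reference of the active mode.

    Assumed convention: state minus reference. `extended_refs` maps a
    hypothetical mode label to a reference defined over the whole time axis,
    so the error stays well defined if the impact comes early or late.
    """
    return x - extended_refs[mode][step]

# Toy demonstrations: noisy copies of a half sine wave (synthetic data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
demos = [np.sin(np.pi * t) + 0.05 * rng.standard_normal(t.size) for _ in range(5)]

phi, mean_w, cov_w = fit_promp(demos)
mean_traj = phi @ mean_w                              # mean of the trajectory distribution
var_traj = np.einsum("tb,bc,tc->t", phi, cov_w, phi)  # pointwise variance
std_traj = np.sqrt(np.maximum(var_traj, 0.0))         # clip tiny negative round-off

# Two hypothetical modes, each with a reference extended over the full horizon.
refs = {"free": np.zeros(t.size), "contact": np.ones(t.size)}
err = reference_spreading_error(0.2, 10, "contact", refs)
```

In this sketch the mode switch is a plain dictionary lookup; in a controller the `mode` argument would change when an impact is detected, so the error is always computed against the reference that matches the current contact state.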

To reference this document use:

http://resolver.tudelft.nl/uuid:c6f91fb2-2544-4802-bcda-4ee70ab0e2be

Part of collection

Student theses

Document type

master thesis

Files

PDF

IA_LfD_final_SWdeZwart.pdf

5.68 MB