Learning cycling styles using experimental trajectory data with Inverse Reinforcement Learning

Title: Learning cycling styles using experimental trajectory data with Inverse Reinforcement Learning
Author: Andretta, Francesca (TU Delft Mechanical, Maritime and Materials Engineering; TU Delft Delft Center for Systems and Control)
Contributors: Dabiri, A. (mentor); Moore, J.K. (mentor); Mohajerin Esfahani, P. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering | Systems and Control
Date: 2022-04-29

Abstract: Cycling is an increasingly attractive transportation mode thanks to its health and environmental benefits. Personalized travel assistance services can make cycling more appealing by providing speed or route advice that reduces travel time and increases safety while taking the personal preferences of cyclists into account. Because it can learn an agent's reward function, Inverse Reinforcement Learning is a suitable algorithm for learning cycling preferences from data. This thesis describes cycling styles as a set of cycling preferences encoded as a reward function composed of a weighted sum of features. The weights associated with the features represent the importance given to each cycling preference and express the trade-off between different goals of a cyclist. Continuous-time Inverse Reinforcement Learning extracts the weights from empirical cyclists' trajectories collected during an experiment performed in Delft, in which cyclists were asked to cycle according to three different cycling styles: cautious, normal and aggressive.
Differences between the weight sets extracted for each cycling style were analyzed by means of the Kruskal-Wallis statistical test and the K-Means clustering algorithm, and the averaged weights for each cycling style were used to simulate a set of test trajectories. Simulations show that the reward function identified for a specific cycling style yields trajectories more similar to test trajectories of the same cycling style than do the reward functions corresponding to the other cycling styles. The statistical analysis shows that the weights of the cautious and aggressive cycling styles differ statistically and define separate clusters.

Subject: Inverse Reinforcement Learning; Cycling style; Reward function
To reference this document use: http://resolver.tudelft.nl/uuid:41ffc288-91ce-40bc-adfc-ea6e5ba9e3dc
Part of collection: Student theses
Document type: master thesis
Rights: © 2022 Francesca Andretta
Files: Thesis_FrascescaAndretta.pdf
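The analysis pipeline the abstract describes — per-cyclist weight vectors parameterizing a weighted-sum reward, compared across styles with a Kruskal-Wallis test and K-Means clustering — can be sketched as follows. This is a minimal illustration, not the thesis implementation: the feature dimension, weight values, and group sizes are invented assumptions, and the real weights would come from the continuous-time IRL step, which is omitted here.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical weight vectors (10 cyclists x 3 features) for two styles,
# as if recovered by IRL; the means and spread are purely illustrative.
cautious = rng.normal(loc=[0.8, 0.1, 0.1], scale=0.05, size=(10, 3))
aggressive = rng.normal(loc=[0.2, 0.6, 0.2], scale=0.05, size=(10, 3))

def reward(features, weights):
    """Reward of a trajectory as a weighted sum of its features."""
    return float(np.dot(weights, features))

# Kruskal-Wallis test on the first weight: a small p-value suggests the
# two styles assign that feature a different importance.
stat, p = kruskal(cautious[:, 0], aggressive[:, 0])

# K-Means with k=2 to check whether the weight sets form separate
# clusters, one per cycling style.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    np.vstack([cautious, aggressive])
)
```

With well-separated synthetic weights like these, the test rejects equal distributions and the two styles fall into distinct clusters, mirroring the cautious-versus-aggressive separation reported in the thesis.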