Title: Safe Navigation in Dense Traffic Scenarios using Reinforcement Learning as Global Guidance for a Model Predictive Controller
Author: Agarwal, Achin (TU Delft Mechanical, Maritime and Materials Engineering)
Contributors: Alonso Mora, Javier (mentor); Ferreira de Brito, B.F. (mentor)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering | Vehicle Engineering
Date: 2020-12-14

Abstract:
The successful integration of autonomous vehicles (AVs) in human environments depends strongly on their ability to navigate safely and in a timely manner through dense traffic. Such conditions involve a diverse range of human behaviors, from cooperative drivers (willing to yield) to non-cooperative drivers (unwilling to yield), which must be identified without any explicit inter-vehicle communication. To maneuver through such conditions, an AV must not only compute a collision-free trajectory but also account for the effects of its actions on the surrounding agents in order to negotiate the maneuver safely. Existing motion-planning techniques fail in these environments because they suffer from one or more of the following drawbacks: the "curse of dimensionality" due to the high number of agents (e.g., optimization-based methods); no modeling of the interaction effects among the agents; no collision-avoidance or trajectory-feasibility guarantees (e.g., learning-based methods). In this paper, we propose a novel navigation framework that combines the strengths of learning-based and optimization-based algorithms. More specifically, we employ a Soft Actor-Critic agent to learn a continuous policy that provides global guidance to an optimization-based planner, which generates feasible and collision-free trajectories.
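The two-level architecture the abstract describes can be sketched in miniature as follows. This is a toy 1-D illustration under assumed dynamics and parameters: `guidance_policy`, `mpc_step`, the candidate-acceleration search, and every constant here are hypothetical stand-ins, not the thesis' implementation. In the actual framework the guidance is a trained Soft Actor-Critic network and the low-level planner solves a constrained nonlinear program over a full trajectory.

```python
def guidance_policy(ego_pos, ego_vel, lead_pos, lead_vel):
    """Stand-in for the learned guidance policy: returns a reference
    velocity. A trained SAC network would map the full observation to
    this value; the hand-tuned rule below is purely illustrative."""
    gap = lead_pos - ego_pos
    # Large gap -> drive faster than the lead; small gap -> slow down.
    return min(15.0, lead_vel + 0.2 * (gap - 20.0))


def mpc_step(ego_pos, ego_vel, lead_pos, lead_vel, v_ref,
             horizon=10, dt=0.2, d_safe=5.0):
    """Toy MPC-style planner: enumerate candidate constant accelerations,
    roll each one out over the horizon with a constant-velocity prediction
    for the lead vehicle, reject rollouts that violate the hard safety
    distance, and return the safest candidate that best tracks v_ref."""
    best_a, best_cost = None, float("inf")
    for a in [x * 0.5 for x in range(-8, 9)]:   # candidates: -4.0 ... 4.0 m/s^2
        p, v, lp, cost, safe = ego_pos, ego_vel, lead_pos, 0.0, True
        for _ in range(horizon):
            v = max(0.0, v + a * dt)
            p += v * dt
            lp += lead_vel * dt
            if lp - p < d_safe:                  # hard collision constraint
                safe = False
                break
            cost += (v - v_ref) ** 2 + 0.1 * a ** 2
        if safe and cost < best_cost:
            best_a, best_cost = a, cost
    return best_a                                # None if no safe rollout exists


# Example: ego at 0 m doing 10 m/s, lead at 30 m doing 8 m/s.
v_ref = guidance_policy(0.0, 10.0, 30.0, 8.0)
a_cmd = mpc_step(0.0, 10.0, 30.0, 8.0, v_ref)
print(v_ref, a_cmd)
```

The key design point the sketch preserves is the division of labor: the learned policy encodes the interaction-aware, global decision (how assertively to proceed) as a low-dimensional reference, while the optimizer alone is responsible for feasibility and collision avoidance, so safety does not rest on the network.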
We evaluate our method in a highly interactive simulation environment against two baseline approaches, a learning-based method and an optimization-based method, and present performance results demonstrating that our method significantly reduces the number of collisions and increases the success rate while producing fewer deadlocks. We also show that our method generalizes to other traffic scenarios (e.g., an unprotected left turn).

Subject: Safe Navigation; Motion Planning; Deep Reinforcement Learning; Optimal Control
To reference this document use: http://resolver.tudelft.nl/uuid:a18ae4f2-788e-4b49-af0a-4c4d3a964c97
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 Achin Agarwal
Files: Thesis_Achin_final.pdf (PDF, 2.77 MB)