Title: Safe Navigation in Dense Traffic Scenarios using Reinforcement Learning as Global Guidance for a Model Predictive Controller
Author: Agarwal, Achin (TU Delft Mechanical, Maritime and Materials Engineering)
Contributors: Alonso Mora, Javier (mentor); Ferreira de Brito, B.F. (mentor)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering | Vehicle Engineering
Date: 2020-12-14

Abstract:
The successful integration of autonomous vehicles (AVs) in human environments depends strongly on their ability to navigate safely and in a timely manner through dense traffic. Such conditions involve a diverse range of human behaviors, from cooperative drivers (willing to yield) to non-cooperative drivers (unwilling to yield), which must be identified without any explicit inter-vehicle communication. To maneuver through such conditions, an AV must not only compute a collision-free trajectory but also account for the effects of its actions on the surrounding agents in order to negotiate the maneuver safely. Existing motion-planning techniques fail in these environments because they suffer from one or more of the following drawbacks: the "curse of dimensionality" due to the high number of agents (e.g., optimization-based methods); no modeling of the interaction effects among the agents; no collision-avoidance or trajectory-feasibility guarantees (e.g., learning-based methods). In this paper, we propose a novel navigation framework that combines the strengths of learning-based and optimization-based algorithms. More specifically, we employ a Soft Actor-Critic agent to learn a continuous policy that provides global guidance to an optimization-based planner, which generates feasible and collision-free trajectories.
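The two-level architecture the abstract describes can be sketched in miniature as follows. This is a toy 1-D illustration under assumed dynamics and parameters: `guidance_policy`, `mpc_step`, the candidate-acceleration search, and every constant here are hypothetical stand-ins, not the thesis' implementation. In the actual framework the guidance is a trained Soft Actor-Critic network and the low-level planner solves a constrained nonlinear program over a full trajectory.

```python
def guidance_policy(ego_pos, ego_vel, lead_pos, lead_vel):
    """Stand-in for the learned guidance policy: returns a reference
    velocity. A trained SAC network would map the full observation to
    this value; the hand-tuned rule below is purely illustrative."""
    gap = lead_pos - ego_pos
    # Large gap -> drive faster than the lead; small gap -> slow down.
    return min(15.0, lead_vel + 0.2 * (gap - 20.0))


def mpc_step(ego_pos, ego_vel, lead_pos, lead_vel, v_ref,
             horizon=10, dt=0.2, d_safe=5.0):
    """Toy MPC-style planner: enumerate candidate constant accelerations,
    roll each one out over the horizon with a constant-velocity prediction
    for the lead vehicle, reject rollouts that violate the hard safety
    distance, and return the safest candidate that best tracks v_ref."""
    best_a, best_cost = None, float("inf")
    for a in [x * 0.5 for x in range(-8, 9)]:   # candidates: -4.0 ... 4.0 m/s^2
        p, v, lp, cost, safe = ego_pos, ego_vel, lead_pos, 0.0, True
        for _ in range(horizon):
            v = max(0.0, v + a * dt)
            p += v * dt
            lp += lead_vel * dt
            if lp - p < d_safe:                  # hard collision constraint
                safe = False
                break
            cost += (v - v_ref) ** 2 + 0.1 * a ** 2
        if safe and cost < best_cost:
            best_a, best_cost = a, cost
    return best_a                                # None if no safe rollout exists


# Example: ego at 0 m doing 10 m/s, lead at 30 m doing 8 m/s.
v_ref = guidance_policy(0.0, 10.0, 30.0, 8.0)
a_cmd = mpc_step(0.0, 10.0, 30.0, 8.0, v_ref)
print(v_ref, a_cmd)
```

The key design point the sketch preserves is the division of labor: the learned policy encodes the interaction-aware, global decision (how assertively to proceed) as a low-dimensional reference, while the optimizer alone is responsible for feasibility and collision avoidance, so safety does not rest on the network.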
We evaluate our method in a highly interactive simulation environment against two baseline approaches, a learning-based method and an optimization-based method, and present performance results demonstrating that our method significantly reduces the number of collisions and increases the success rate while producing fewer deadlocks. We also show that our method generalizes to other traffic scenarios (e.g., an unprotected left turn).

Subject: Safe Navigation; Motion Planning; Deep Reinforcement Learning; Optimal Control
To reference this document use: http://resolver.tudelft.nl/uuid:a18ae4f2-788e-4b49-af0a-4c4d3a964c97
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 Achin Agarwal
Files: Thesis_Achin_final.pdf (PDF, 2.77 MB)