Learning-based resilience guarantee for multi-UAV collaborative QoS management

Bai, C.; Yan, Peng; Yu, Xiaoqiang; Guo, Jifeng

doi:10.1016/j.patcog.2021.108166

Title

Learning-based resilience guarantee for multi-UAV collaborative QoS management

Author

Bai, C. (TU Delft Robot Dynamics)
Yan, Peng (Harbin Institute of Technology)
Yu, Xiaoqiang (Harbin Institute of Technology)
Guo, Jifeng (Harbin Institute of Technology)

Date

2022

Abstract

Unmanned and intelligent technologies are the future development trend in the business field. They are of great significance for the connotation analysis and application characterization of massive interactive data. In particular, during major epidemics or disasters, providing business services safely and securely is crucial. Specifically, providing users with resilient and guaranteed communication services is a challenging business task when communication facilities are damaged. Unmanned aerial vehicles (UAVs), with flexible deployment and high maneuverability, can serve as aerial base stations (BSs) to establish emergency networks. However, because each UAV has limited communication service capability, it is challenging to control multiple UAVs so that they provide efficient and fair communication quality of service (QoS) to users. In this paper, we propose a learning-based resilience guarantee framework for multi-UAV collaborative QoS management. We formulate this problem as a partially observable Markov decision process and solve it with proximal policy optimization (PPO), a policy-based deep reinforcement learning method. A centralized training and decentralized execution paradigm is used: the experience collected by all UAVs is used to train a shared control policy, and each UAV takes actions based on the partial environment information it observes. In addition, the reward function takes into account both the average and the variance of the communication QoS of all users. Extensive simulations are conducted for performance evaluation. The simulation results indicate that (1) the trained policies can adapt to different scenarios and provide resilient and guaranteed communication QoS to users, (2) increasing the number of UAVs can compensate for the limited service capabilities of individual UAVs, and (3) when UAVs have local communication service capabilities, the policies trained with PPO perform better than the policies trained with other algorithms.
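The abstract states only that the reward considers the average and the variance of the users' QoS; it does not give the formula. The following is a minimal sketch of one way such a reward could be shaped, assuming a simple "mean minus weighted variance" form. The function name `fairness_aware_reward`, the `variance_weight` parameter, and the functional form itself are illustrative assumptions, not the authors' definition.

```python
from statistics import fmean, pvariance


def fairness_aware_reward(qos_per_user, variance_weight=1.0):
    """Hypothetical reward: high average QoS is rewarded, uneven QoS is penalized.

    qos_per_user    -- iterable of per-user QoS values (assumed scalar scores)
    variance_weight -- assumed trade-off factor between efficiency and fairness
    """
    qos = list(qos_per_user)
    if not qos:
        raise ValueError("at least one user QoS value is required")
    mean_qos = fmean(qos)          # efficiency term: average QoS over all users
    var_qos = pvariance(qos)       # fairness term: population variance of QoS
    return mean_qos - variance_weight * var_qos


# Two allocations with the same average QoS: the even one scores higher.
even = fairness_aware_reward([1.0, 1.0, 1.0])
uneven = fairness_aware_reward([0.0, 2.0])
print(even, uneven)
```

Under this assumed form, two allocations with the same average QoS are ranked by how evenly the QoS is spread across users, which matches the abstract's stated goal of efficient and fair service.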

Subject

Communication service
Deep reinforcement learning
Multi-UAV
QoS-aware
System resilience
Unmanned business

To reference this document use:

http://resolver.tudelft.nl/uuid:76f38a0f-437d-491f-937b-eaa8a1966a74

DOI

https://doi.org/10.1016/j.patcog.2021.108166

Embargo date

2023-07-01

ISSN

0031-3203

Source

Pattern Recognition, 122

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Part of collection

Institutional Repository

Document type

journal article
