
Research on reinforcement learning-based safe decision-making methodology for multiple unmanned aerial vehicles

Yue, Longfei ; Yang, Rennong ; Zhang, Ying ; Zuo, Jialiang

Frontiers in neurorobotics, 2023-01, Vol.16, p.1105480-1105480 [Peer-reviewed journal]

Switzerland: Frontiers Research Foundation

Full text available

  • Title:
    Research on reinforcement learning-based safe decision-making methodology for multiple unmanned aerial vehicles
  • Authors: Yue, Longfei ; Yang, Rennong ; Zhang, Ying ; Zuo, Jialiang
  • Subjects: Algorithms ; constrained Markov decision process ; Cooperation ; Decision making ; Efficiency ; Entropy ; Knowledge ; Lagrange multiplier ; Machine learning ; Markov analysis ; Methods ; multi-UAV ; Neural networks ; Neuroscience ; Optimization ; Reinforcement ; reinforcement learning ; SAC-Lagrangian ; Safety ; Transfer learning ; Unmanned aerial vehicles
  • Is part of: Frontiers in neurorobotics, 2023-01, Vol.16, p.1105480-1105480
  • Notes:
    Reviewed by: Fuqiang Gu, Chongqing University, China; Haoyu Tian, Harbin Institute of Technology, China
    Edited by: Weiran Yao, Harbin Institute of Technology, China
  • Description: A system of multiple cooperating unmanned aerial vehicles (multi-UAV) can exploit its collective advantages to accomplish complicated tasks. Recent developments in deep reinforcement learning (DRL) offer good prospects for decision-making in multi-UAV systems. However, the safety and training efficiency of DRL still need to be improved before practical use. This study presents a transfer-safe soft actor-critic (TSSAC) for multi-UAV decision-making. Decision-making by each UAV is modeled as a constrained Markov decision process (CMDP), in which the return is maximized subject to safety constraints. The soft actor-critic-Lagrangian (SAC-Lagrangian) algorithm is combined with a modified Lagrangian multiplier in the CMDP model. Moreover, parameter-based transfer learning is used to enable cooperative and efficient training of the multi-UAVs on the tasks. Simulation experiments indicate that the proposed method can improve safety and training efficiency and allow the UAVs to adapt to a dynamic scenario.
  • Publisher: Switzerland: Frontiers Research Foundation
  • Language: English
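
The abstract describes a CMDP solved with a Lagrangian multiplier. As a rough illustration only, the sketch below shows the standard projected dual-ascent update used in Lagrangian-based constrained reinforcement learning in general. It is not the paper's "modified" multiplier, whose details this record does not give, and it leaves out the entropy term of soft actor-critic. The function names, the learning rate, and the cost limit are all hypothetical.

```python
def update_multiplier(lam, avg_cost, cost_limit, lr=0.05):
    """One projected dual-ascent step on the Lagrange multiplier (assumed form)."""
    # Raise lam when the average safety cost exceeds the limit and lower it
    # otherwise; max(0, .) keeps the multiplier non-negative.
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_objective(avg_return, avg_cost, cost_limit, lam):
    """Lagrangian of the CMDP: return minus the weighted constraint violation."""
    return avg_return - lam * (avg_cost - cost_limit)


# Hypothetical per-iteration average safety costs from policy rollouts.
lam = 0.0
for avg_cost in [0.9, 0.7, 0.4, 0.2]:
    lam = update_multiplier(lam, avg_cost, cost_limit=0.5)
```

In this form, the policy is trained to maximize `penalized_objective` while the multiplier grows whenever the safety cost stays above the limit, so unsafe behavior becomes progressively more expensive.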
