
An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning

Watanabe, Hirohisa ; Tsukada, Mineto ; Matsutani, Hiroki

arXiv.org, 2023-03

Ithaca: Cornell University Library, arXiv.org

Full text available

  • Title:
    An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
  • Author: Watanabe, Hirohisa ; Tsukada, Mineto ; Matsutani, Hiroki
  • Subjects: Algorithms ; Artificial neural networks ; Back propagation ; Computer Science - Learning ; Field programmable gate arrays ; Low cost ; Machine learning ; Markov analysis ; Neural networks ; Optimization ; Regularization ; Statistics - Machine Learning
  • Is part of: arXiv.org, 2023-03
  • Description: DQN (Deep Q-Network) is a method that performs Q-learning for reinforcement learning using deep neural networks. DQNs require a large buffer and batch processing for experience replay and rely on backpropagation-based iterative optimization, which makes them difficult to implement on resource-limited edge devices. In this paper, we propose a lightweight on-device reinforcement learning approach for low-cost FPGA devices. It exploits a recently proposed neural-network-based on-device learning approach that does not rely on backpropagation but instead uses a training algorithm based on OS-ELM (Online Sequential Extreme Learning Machine). In addition, we propose a combination of L2 regularization and spectral normalization for on-device reinforcement learning so that the output values of the neural network stay within a certain range and the reinforcement learning becomes stable. The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a low-cost FPGA platform. Evaluation results using OpenAI Gym demonstrate that the proposed algorithm and its FPGA implementation complete a CartPole-v0 task 29.77x and 89.40x faster, respectively, than a conventional DQN-based approach when the number of hidden-layer nodes is 64.
  • Publisher: Ithaca: Cornell University Library, arXiv.org
  • Language: English
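
The description above names three ingredients: OS-ELM training without backpropagation, L2 regularization, and spectral normalization. The following is a minimal NumPy sketch of how these pieces might fit together in a plain regression setting. It is not the authors' implementation and does not include the Q-learning loop or any FPGA-specific design. The class name `OSELM`, the ReLU activation, the uniform weight initialization, the choice to apply spectral normalization to the random input weights, and the toy data are all assumptions made for illustration.

```python
import numpy as np


class OSELM:
    """Hypothetical minimal OS-ELM regressor: a fixed random hidden layer
    whose output weights are trained by recursive least squares."""

    def __init__(self, n_in, n_hidden, n_out, l2=1.0, seed=0):
        rng = np.random.default_rng(seed)
        alpha = rng.uniform(-1.0, 1.0, size=(n_in, n_hidden))
        # Spectral normalization (assumed placement): divide the random input
        # weights by their largest singular value so their spectral norm is 1.
        self.alpha = alpha / np.linalg.norm(alpha, 2)
        self.bias = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.l2 = l2  # L2 regularization strength
        self.P = None
        self.beta = np.zeros((n_hidden, n_out))

    def hidden(self, x):
        # ReLU hidden layer (assumed activation); these weights are never trained.
        return np.maximum(0.0, np.atleast_2d(x) @ self.alpha + self.bias)

    def init_train(self, x0, t0):
        # Initial batch: L2-regularized (ridge) least squares for beta.
        # Targets must have shape (n_samples, n_out).
        h = self.hidden(x0)
        self.P = np.linalg.inv(h.T @ h + self.l2 * np.eye(h.shape[1]))
        self.beta = self.P @ h.T @ np.atleast_2d(t0)

    def seq_train(self, x, t):
        # Sequential update for one sample or a small chunk; no gradient
        # descent and no replay buffer are involved.
        h = self.hidden(x)
        k = np.eye(h.shape[0]) + h @ self.P @ h.T
        self.P = self.P - self.P @ h.T @ np.linalg.solve(k, h @ self.P)
        self.beta = self.beta + self.P @ h.T @ (np.atleast_2d(t) - h @ self.beta)

    def predict(self, x):
        return self.hidden(x) @ self.beta


# Toy usage: fit a smooth function one sample at a time after a small initial batch.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
T = np.sin(X.sum(axis=1, keepdims=True))

model = OSELM(n_in=4, n_hidden=64, n_out=1, l2=1.0)
model.init_train(X[:64], T[:64])
for x, t in zip(X[64:], T[64:]):
    model.seq_train(x, t)
```

With this formulation, the sequentially trained output weights should agree, up to floating-point error, with the ridge-regression solution computed over all samples at once. That equivalence is one reason an online update of this kind can stand in for batch training on devices with little memory.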
