  • Title:
    An online reinforcement learning approach to charging and order-dispatching optimization for an e-hailing electric vehicle fleet
  • Author: Yan, Pengyu ; Yu, Kaize ; Chao, Xiuli ; Chen, Zhibin
  • Subjects: Charging and dispatching decision ; Electric vehicle ; Markov decision process ; Reinforcement learning ; Transportation
  • Is part of: European journal of operational research, 2023-11, Vol.310 (3), p.1218-1233 [Peer-reviewed journal]
  • Description:
    Highlights:
    • The electrification of e-hailing services contributes to carbon reduction.
    • Considers the uncertainty of orders and the dynamically changing workload of chargers.
    • Optimizes the charging and order-dispatching schemes for e-hailing electric vehicles.
    • Develops a model-based online reinforcement learning approach.
    Abstract: Given the uncertainty of orders and the dynamically changing workload of charging stations, dispatching and charging an electric vehicle (EV) fleet is a significant challenge for e-hailing platforms. The common practice is to dispatch EVs to serve orders via heuristic matching methods while letting drivers make charging decisions independently based on their own experience, which may compromise the platform’s performance. This study proposes a Markov decision process to jointly optimize the charging and order-dispatching schemes for an e-hailing EV fleet that provides pick-up services for passengers only from a designated transportation hub (i.e., no pick-ups from other locations). The objective is to maximize the total revenue of the fleet over a finite horizon. The complete state transition equations of the EV fleet are formulated to track the state of charge of the batteries. To learn the charging and order-dispatching policy in a dynamic stochastic environment, an online approximation algorithm is developed that integrates the model-based reinforcement learning (RL) framework with a novel SARSA(Δ)-sample average approximation (SAA) architecture. Compared with model-free RL algorithms and approximate dynamic programming (ADP), the proposed algorithm explores high-quality decisions through an SAA model with empirical state transitions and exploits the best decisions found so far through SARSA(Δ) sample-trajectory updating (a toy SARSA-style sketch follows this record). Computational results based on a real case show that, compared with an existing heuristic method and the ADP in the literature, the proposed approach increases daily revenue by an average of 31.76% and 14.22%, respectively.
  • Publisher: Elsevier B.V
  • Language: English
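
The record above only summarizes the approach; the paper's actual SARSA(Δ)-SAA architecture is not reproduced here. As a minimal illustrative sketch, the Python below applies a classic one-step tabular SARSA update to a toy joint charging/dispatch problem for a single EV over a finite horizon. Every element of it (the state given as a period and a state-of-charge level, the three actions, the order-arrival probability, and all reward and charging numbers) is an assumption made for illustration, not the authors' model.

```python
import random

# Toy single-EV environment (illustrative assumptions only; not the paper's model).
# State: (period t, battery state-of-charge level soc in 0..MAX_SOC).
# Actions: "dispatch" serves an order if charge suffices, "charge" raises SoC, "idle" waits.
MAX_SOC = 5       # discrete SoC levels (assumption)
HORIZON = 12      # decision periods in the finite horizon (assumption)
ACTIONS = ["dispatch", "charge", "idle"]

def step(t, soc, action):
    """Return (reward, next_soc). Rewards and dynamics are made-up illustrative numbers."""
    if action == "dispatch":
        if soc >= 2 and random.random() < 0.7:   # an order arrives with prob. 0.7 (assumption)
            return 10.0, soc - 2                 # revenue 10; the trip consumes 2 SoC levels
        return 0.0, soc                          # no order matched or not enough charge
    if action == "charge":
        return -1.0, min(MAX_SOC, soc + 2)       # charging costs 1 and gains 2 SoC levels
    return 0.0, soc                              # idle

def epsilon_greedy(Q, t, soc, eps=0.1):
    """Pick a random action with prob. eps, otherwise the current best-valued action."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(t, soc, a)])

def train(episodes=5000, alpha=0.1, gamma=1.0):
    """One-step tabular on-policy SARSA over the finite horizon."""
    Q = {(t, s, a): 0.0
         for t in range(HORIZON + 1) for s in range(MAX_SOC + 1) for a in ACTIONS}
    for _ in range(episodes):
        t, soc = 0, MAX_SOC
        a = epsilon_greedy(Q, t, soc)
        while t < HORIZON:
            r, soc2 = step(t, soc, a)
            a2 = epsilon_greedy(Q, t + 1, soc2)
            # SARSA update: the TD target uses the action actually taken next (on-policy).
            target = r + gamma * (Q[(t + 1, soc2, a2)] if t + 1 < HORIZON else 0.0)
            Q[(t, soc, a)] += alpha * (target - Q[(t, soc, a)])
            t, soc, a = t + 1, soc2, a2
    return Q

if __name__ == "__main__":
    Q = train()
    # Greedy first-period policy per SoC level: typically charge when low, dispatch when high.
    for s in range(MAX_SOC + 1):
        best = max(ACTIONS, key=lambda a: Q[(0, s, a)])
        print(f"t=0, soc={s}: {best}")
```

Per the abstract, the paper combines a SARSA(Δ) sample-trajectory update with an SAA model built from empirical state transitions for exploration; the sketch omits both and falls back on plain epsilon-greedy exploration with a one-step TD target.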
