Charging Schedule Optimization of Electric Buses Based on Reinforcement Learning

  • Author / Creator
    Chen, Wenzhuo
  • Abstract
    In recent years, environmental concerns about emissions from public transit services that rely on fossil fuels have drawn attention from both the automobile industry and academia to the electrification of the public transit sector. In particular, electric buses (EBs), potentially powered by decarbonized electricity, can reduce air pollution while saving energy through regenerative braking. Yet large-scale deployment of EBs in public transit services still requires addressing the technical challenges of optimizing charging schedules to reduce electricity cost and battery degradation cost. These challenges are further complicated by uncertainties in EB operation arising from random road and traffic conditions, passenger counts, and arrival and departure times of EBs at bus stations. In this thesis, we address these challenges by developing model-free reinforcement learning (RL) approaches to optimize the charging schedules of EBs. Unlike traditional model-based approaches, the proposed RL approaches do not rely on specific models of these uncertainties, so they can be implemented flexibly in real-world public transit services. Three research topics related to EB charging schedule optimization are investigated. First, a Markov decision process (MDP) is developed to model the operation of EBs with in-station charging capabilities, in which the EBs are charged only at specific bus stations, such as terminals and/or transit centers, with pre-determined charging durations. Then, a double Q-learning algorithm is used to optimize the charging power for each EB at each charging station. By using the battery degradation cost as the RL reward, the optimal charging strategy for reducing EB operation cost is obtained through an iterative process.
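As a rough illustration of the double Q-learning step described above, the sketch below runs tabular double Q-learning on a toy charging problem. It is not the thesis's model: the five discrete SoC levels, the three charging-power levels, the quadratic degradation cost, the depletion penalty, and the names `toy_step`, `double_q_learning`, and `greedy_action` are all assumptions made for this example.

```python
import random


def double_q_learning(n_states, n_actions, step, episodes=500, horizon=20,
                      alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular double Q-learning.

    `step(state, action, rng)` must return (next_state, reward).
    Returns the two Q-tables (qa, qb).
    """
    rng = random.Random(seed)
    qa = [[0.0] * n_actions for _ in range(n_states)]
    qb = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy behaviour policy on the sum of both tables.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = greedy_action(qa, qb, s)
            s2, r = step(s, a, rng)
            # Update one table at random: it selects the greedy next action,
            # while the other table evaluates that action.
            if rng.random() < 0.5:
                best = max(range(n_actions), key=lambda i: qa[s2][i])
                qa[s][a] += alpha * (r + gamma * qb[s2][best] - qa[s][a])
            else:
                best = max(range(n_actions), key=lambda i: qb[s2][i])
                qb[s][a] += alpha * (r + gamma * qa[s2][best] - qb[s][a])
            s = s2
    return qa, qb


def greedy_action(qa, qb, s):
    """Greedy action with respect to the sum of the two Q-tables."""
    return max(range(len(qa[s])), key=lambda i: qa[s][i] + qb[s][i])


def toy_step(soc, power, rng):
    """Hypothetical dynamics: SoC level in 0..4, charging power level in 0..2."""
    trip_use = rng.choice([1, 2])             # random energy use on the next trip
    balance = soc + power - trip_use
    next_soc = max(0, min(4, balance))
    degradation = 0.5 * power ** 2            # made-up convex degradation cost
    penalty = 10.0 if balance < 0 else 0.0    # made-up cost of running out of energy
    return next_soc, -(degradation + penalty)


qa, qb = double_q_learning(5, 3, toy_step)
policy = [greedy_action(qa, qb, s) for s in range(5)]
print(policy)  # learned charging-power level for each SoC level
```

Keeping two tables and letting one pick the action while the other scores it is what separates double Q-learning from plain Q-learning; it reduces the overestimation that a single max over noisy estimates tends to produce.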
In the case study, the performance of the proposed RL approach is evaluated on real-world EB operation data obtained from St. Albert Transit, AB, Canada. The results indicate that our approach reduces the battery degradation cost in comparison with other existing approaches. Our second work extends the above MDP and RL approach to en-route charging applications, in which EBs are charged briefly while they pick up and/or drop off passengers. Specifically, a physical EB model and a battery degradation model are built to calculate the EB energy consumption and battery degradation cost, respectively. Then, a semi-Markov decision process (SMDP) is developed to characterize the operation process of the EB. The main difference between the SMDP and the MDP is that, in the SMDP, the duration between two adjacent charging decision-making epochs can be random. This better characterizes real-world en-route charging conditions, given the randomness in road and traffic conditions and the uncertainties in passenger pick-up and drop-off times. Accordingly, an average reward reinforcement learning (ARRL) approach is proposed to optimize the en-route charging strategy of EBs. The efficiency of the proposed approach is demonstrated on real-world EB operation data provided by St. Albert Transit, and the results are compared with those of traditional charging approaches. To further improve the efficiency of EB en-route charging, a relative value iteration reinforcement learning (RVIRL) approach is proposed in our third work. Based on the energy consumption and battery degradation models of EB operation, an extended SMDP problem is formulated to determine the charging schedule of the EB along the route by considering the SoC changes, the number of charging stations, the maximum sojourn time at each charging station, and real-time electricity pricing.
Then, the RVIRL approach is used to obtain the optimal EB charging strategy for each en-route charging station. The convergence of the RVIRL approach is proved mathematically, which is critical to ensuring the reliable operation of public transit services with EBs. The performance of the proposed approach is evaluated on real-world data obtained from St. Albert Transit. The results indicate that the proposed approach can significantly reduce the electricity cost and battery lifetime degradation in comparison with other existing en-route EB charging approaches.
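The abstract does not spell out the RVIRL update itself. To give a feel for the idea its name refers to, the sketch below shows classical, model-based relative value iteration for an average-reward MDP with unit time steps. This is a simplification: it is not the thesis's model-free algorithm, and it ignores the random sojourn times of the SMDP. The two-state model (low or high SoC), its transition probabilities and rewards, and the function name `relative_value_iteration` are assumptions made for this example.

```python
def relative_value_iteration(P, R, ref=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration for an average-reward MDP.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the expected one-step reward.
    Returns (gain, h, policy): the average reward per step, the relative
    values with h[ref] == 0, and a greedy policy.
    """
    n = len(P)
    h = [0.0] * n
    gain, q = 0.0, []
    for _ in range(max_iter):
        # One Bellman backup for every state-action pair.
        q = [[R[s][a] + sum(p * h[s2] for p, s2 in P[s][a])
              for a in range(len(P[s]))] for s in range(n)]
        backed_up = [max(row) for row in q]
        # Subtracting the reference state's value keeps h bounded;
        # the subtracted amount converges to the optimal gain.
        gain = backed_up[ref]
        new_h = [v - gain for v in backed_up]
        diff = max(abs(x - y) for x, y in zip(new_h, h))
        h = new_h
        if diff < tol:
            break
    policy = [max(range(len(row)), key=row.__getitem__) for row in q]
    return gain, h, policy


# Hypothetical model. States: 0 = low SoC, 1 = high SoC.
# Actions: 0 = skip charging, 1 = charge.
P = [
    [[(1.0, 0)], [(1.0, 1)]],             # low: skip stays low, charge goes high
    [[(0.5, 0), (0.5, 1)], [(1.0, 1)]],   # high: skip may drain, charge stays high
]
R = [
    [-5.0, -2.0],   # low: penalty for skipping, electricity cost for charging
    [0.0, -3.0],    # high: skipping is free, charging degrades the battery more
]

gain, h, policy = relative_value_iteration(P, R)
print(round(gain, 3), policy)  # -0.667 [1, 0]
```

In this toy model the best policy charges at low SoC and skips at high SoC, which yields an average reward of -2/3 per step. The skip action at high SoC is deliberately stochastic: with purely deterministic, periodic transitions, plain relative value iteration can oscillate instead of converging.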

  • Subjects / Keywords
  • Graduation date
    Fall 2021
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.