Optimal Real-Time Battery Scheduling with Reinforcement Learning and Neural Networks

  • Author / Creator
    Quiroz Juarez, Carolina
  • Climate change concerns have raised awareness of the importance of decarbonizing the power sector. Energy storage is critical to that goal, yet today it relies mostly on fossil fuels as chemical energy storage. The only viable alternative is battery energy storage systems (BESS), given their portability, scalability, and ease of installation compared with other storage technologies. BESS have been an important subject of research for decades. However, they have not been widely adopted because of their cost and operational complexity.
    The battery scheduling problem has been extensively analyzed, and a wide variety of algorithms has been proposed to solve it. Nevertheless, because BESS operation is highly dependent on electrolyte chemistry, not all scheduling and control algorithms are useful for every real-time condition and every battery. Moreover, sophisticated, high-performing BESS control algorithms demand substantial computational resources, which prevents them from being deployed in distributed energy systems. For instance, behind-the-meter (BTM) applications for residential buildings require real-time BESS control with high-time-resolution data.
    In this work, we propose a real-time BESS control method based on reinforcement learning and neural networks. It is designed to run with limited computational resources and independently of the battery chemistry, which makes it amenable to embedded-system applications.
    On the one hand, the popularity of neural network (NN) algorithms stems from their capability to solve high-dimensional, complex problems with minimal computational resources once the model has been trained. On the other hand, the NN training process requires large amounts of good-quality labeled data. During this project, we used 1-min resolution datasets containing photovoltaic (PV) generation, residential demand, and price signals. However, these datasets did not contain BESS charge and discharge information. We therefore generated charge and discharge data with a reinforcement learning (RL)-based Q-learning algorithm that took into account the system characteristics of a real vanadium redox flow battery experimental setup as well as the technical features of a lithium-ion battery available on the market.
    The RL-agent training process uses large amounts of data and takes considerable processing time to obtain an optimal policy for a daily operational period. Therefore, the RL-agent’s main function is to generate labels to train different NN models, not to be deployed on a real-time controller. The RL reward function favored charge and discharge sequences that minimize end-user costs compared to a PV system with no BESS. A positive reward was awarded every time the total electricity cost of a PV system was higher than the cost obtained with a PV-BESS system. All electricity costs were finally compared with the baseline PV system. However, the battery-agent was not always able to decrease electricity costs below the baseline, as its performance is dependent on battery size and efficiency. In turn, the scheduling labels resulting from the RL-agent operation allowed us to train our NN models accurately enough to reduce PV system electricity costs.
    Finally, we explore an application of our workflow to the BTM problem by comparing electricity costs calculated with both the Q-learning and NN algorithms to a residential flat tariff offered by a local electricity provider in Edmonton. Our simulation suggests that, in the current scenario, it is still not economically viable to adopt BESS technologies at a large scale in Alberta, Canada.
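    The label-generation idea described in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the thesis implementation: the action set, the reward magnitude (1.0 or 0.0), the state encoding, the hyperparameters `alpha` and `gamma`, and all function names are assumptions chosen for illustration. Only the rule that a positive reward is given when the PV-only cost exceeds the PV-BESS cost comes from the abstract.

```python
from collections import defaultdict

# Hypothetical discrete action set for the battery agent.
ACTIONS = ("charge", "idle", "discharge")


def reward(cost_pv_only, cost_pv_bess):
    """Positive reward when the PV-only cost exceeds the PV-BESS cost.

    The magnitudes 1.0 and 0.0 are assumptions; the abstract only states
    that the reward is positive in the favorable case.
    """
    return 1.0 if cost_pv_only > cost_pv_bess else 0.0


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.95):
    """Standard tabular Q-learning update (alpha and gamma are assumed)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


def greedy_label(q, state):
    """Action with the highest Q-value, usable as a training label for an NN."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Usage: one update on an empty table, then read back the greedy label.
q_table = defaultdict(float)
r = reward(cost_pv_only=2.50, cost_pv_bess=2.10)
q_update(q_table, state=0, action="charge", r=r, next_state=1)
label = greedy_label(q_table, state=0)
```

    In such a setup the slow Q-learning loop would run offline, and only the resulting state and label pairs would be used to train a lightweight NN for real-time control, which matches the division of roles the abstract describes.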

  • Subjects / Keywords
  • Graduation date
    Fall 2021
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.