Goal Space Planning with Reward Shaping

  • Author / Creator
    Roice, Kevin
  • Planning and goal-conditioned reinforcement learning aim to create more efficient and scalable methods for complex, long-horizon tasks. These approaches break tasks into manageable subgoals and leverage prior knowledge to guide learning. However, learned models may predict inaccurate next states and accumulate compounding errors over long-horizon rollouts. This often makes background planning with learned models worse than model-free alternatives, even though the former uses significantly more memory and computation. Methods that plan in an abstract space, such as Goal-Space Planning, avoid these problems by background planning with models that are abstract in both state and time. This thesis shows how potential-based reward shaping can propagate value and speed up learning with local, subgoal-conditioned models. We demonstrate the effectiveness of this approach in tabular, linear, and deep value-based learners, and study its sensitivity to changes in environment dynamics and in the chosen subgoals.

  • Subjects / Keywords
  • Graduation date
    Fall 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-dkhk-6b72
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
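The potential-based reward shaping mentioned in the abstract has a well-known generic form: the agent learns from the shaped reward r + γΦ(s') − Φ(s), where Φ is a potential function over states. Below is a minimal sketch of that generic form on a hypothetical 1-D chain task with a subgoal-distance potential; the chain, the subgoal, and the potential are illustrative assumptions, not the thesis's actual construction.

```python
def shaped_reward(r, s, s_next, phi, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s).

    Because the shaping term telescopes along any trajectory, it changes
    how quickly value propagates without changing the optimal policy.
    """
    return r + gamma * phi(s_next) - phi(s)

# Hypothetical potential: negative distance to a subgoal on a 1-D chain.
SUBGOAL = 10

def phi(s):
    return -abs(SUBGOAL - s)

# With gamma = 1 and zero environment reward, a step toward the subgoal
# earns a +1 shaping bonus, and a step away earns -1.
print(shaped_reward(0.0, s=3, s_next=4, phi=phi, gamma=1.0))  # → 1.0
print(shaped_reward(0.0, s=4, s_next=3, phi=phi, gamma=1.0))  # → -1.0
```

In a subgoal-conditioned setting like the one the abstract describes, Φ would typically be derived from learned subgoal values rather than a hand-coded distance, but the shaping rule itself is unchanged.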