
Improving Deep Deterministic Policy Gradient for Sparse Reward and Goal-Conditioned Continuous Control

  • Author / Creator
    Futuhi, Ehsan
  • We propose an improved version of deep deterministic policy gradient (DDPG) for sparse-reward and goal-conditioned reinforcement learning. To enhance exploration, we introduce εt-greedy, which uses search to generate exploratory options focused on less-visited states. We prove that εt-greedy has polynomial sample complexity under mild MDP assumptions. To use the information provided by rewarded transitions more efficiently, we design a new goal-conditioned dual experience replay buffer framework (GDRB) and use longest n-step returns. The resulting algorithm, ETGL-DDPG, combines εt-greedy, GDRB, and Longest n-step returns with DDPG. We evaluate ETGL-DDPG on standard sparse-reward continuous-control tasks: a maze and two robotics tasks. ETGL-DDPG significantly outperforms DDPG, as well as other state-of-the-art methods, in all environments. Further experiments show how each strategy individually improves the performance of DDPG. (An illustrative sketch of two of these components follows this record.)

  • Subjects / Keywords
  • Graduation date
    Spring 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-9dey-8478
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
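The abstract names GDRB and the longest n-step return without spelling out how they work. The Python sketch below is only a rough, hypothetical reading of those two ideas, not the thesis's implementation: the class and function names, the success_ratio parameter, the capacity default, and the "sum all remaining rewards, bootstrap once at the episode's end" interpretation of the longest n-step return are all assumptions made here for illustration.

import random

class DualReplayBuffer:
    """Assumed reading of GDRB: episodes that reached the goal are stored
    separately from the rest, so rewarded experience can be sampled more often."""
    def __init__(self, capacity=100_000):
        self.success, self.regular, self.capacity = [], [], capacity

    def add_episode(self, episode, reached_goal):
        buf = self.success if reached_goal else self.regular
        buf.append(episode)
        del buf[:-self.capacity]  # drop the oldest episodes beyond capacity

    def sample(self, batch_size, success_ratio=0.5):
        # Draw a fixed fraction of the batch from goal-reaching episodes when available.
        n_s = min(int(batch_size * success_ratio), len(self.success))
        n_r = min(batch_size - n_s, len(self.regular))
        return random.sample(self.success, n_s) + random.sample(self.regular, n_r)

def longest_n_step_targets(rewards, bootstrap_value, gamma=0.99):
    """Assumed reading of the 'longest n-step return': each state's target
    accumulates all remaining rewards of the episode and bootstraps once at the end."""
    targets, g = [], bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
        targets.append(g)
    return targets[::-1]

For example, with a sparse terminal reward, longest_n_step_targets([0, 0, 1], bootstrap_value=0.0) returns approximately [0.98, 0.99, 1.0], i.e., every state in the episode receives a discounted share of the final success signal.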