- Off-policy learning (2)
- Incremental learning (1)
- Online learning (1)
- Prediction learning (1)
- Reinforcement learning (1)
- Step-size Ratchet (1)
- Fall 2017
Model-free off-policy temporal-difference (TD) algorithms form a powerful component of scalable predictive knowledge representation due to their ability to learn numerous counterfactual predictions in a computationally scalable manner. In this dissertation, we address and overcome two...
- Spring 2022
In this dissertation, we study online off-policy temporal-difference learning algorithms, a class of reinforcement learning algorithms that can learn predictions in an efficient and scalable manner. The contributions of this dissertation are of two kinds: (1) empirically studying existing...