  • Fall 2017

    Mahmood, Ashique

    Model-free off-policy temporal-difference (TD) algorithms form a powerful component of scalable predictive knowledge representation due to their ability to learn numerous counterfactual predictions in a computationally scalable manner. In this dissertation, we address and overcome two...

  • Spring 2022

    Sina Ghiassian

    In this dissertation, we study online off-policy temporal-difference learning algorithms, a class of reinforcement learning algorithms that can learn predictions in an efficient and scalable manner. The contributions of this dissertation are of two kinds: (1) empirically studying existing...
