  • Fall 2011

    Farahmand, Amir-massoud

    This thesis studies reinforcement learning and planning problems that are modeled by a discounted Markov Decision Process (MDP) with a large state space and a finite action space. We follow the value-based approach, in which a function approximator is used to estimate the optimal value function....
