Usage
  • 359 views
  • 943 downloads

Single-Agent Optimization with Monte-Carlo Tree Search and Deep Reinforcement Learning

  • Author / Creator
    Seify, Arta
  • Single-agent optimization tasks, also referred to as single-player games, include any domain with an agent whose goal is to maximize an objective function(s), without interference from any other agents. Such tasks have been studied for decades. For example, in 2006, NASA automated the design of antennas by framing the problem as a single-agent optimization task.

    The combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in zero-sum two-player perfect-information games. In this thesis, my goal is to bring the success of these algorithms to single-player games. I begin by introducing a variant of MCTS that is suitable for games where the bounds on rewards is not known, which is the case in many optimization problems. My enhancements include using a general action-value normalization technique, as well as a virtual loss function, which enables effective search parallelization. I then introduce Policy-MCTS, which uses a deep policy network trained by generations of self-play to guide the search. Lastly, I provide initial work on using value estimations to entirely replace the rollouts of MCTS.

    I gauge the effectiveness of my methods in SameGame, an NP-hard single-player test domain. I demonstrate that Policy-MCTS is competitive with state-of-the-art search-based methods on a benchmark set of positions.

  • Subjects / Keywords
  • Graduation date
    Spring 2020
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-ydsv-6d91
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.