This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

Search

Filter

Author / Creator / Contributor

1Trudeau, Alexandre

Subject / Keyword

1AlphaZero
1Planning
1Reinforcement Learning
1Search Control

Year

Collections

1Graduate and Postdoctoral Studies (GPS), Faculty of
1Graduate and Postdoctoral Studies (GPS), Faculty of/Theses and Dissertations

Languages

1English

Item type

1Thesis

Departments

1Department of Computing Science

Supervisors

1Bowling, Michael (Computing Science)

Targeted Search Control in AlphaZero for Effective Policy Improvement
Download

Spring 2023

Trudeau, Alexandre

AlphaZero is a self-play reinforcement learning algorithm that achieves superhuman play in the games of chess, shogi, and Go via policy iteration. To be an effective policy improvement operator, AlphaZero’s search needs to have accurate value estimates for the states that appear in its search...

1 - 1 of 1

Search

Items (1)

Collections

Communities

Targeted Search Control in AlphaZero for Effective Policy Improvement