Author / Creator / Contributor

Chan, Alan

Departments

Department of Computing Science

Languages

English

Supervisors

White, Martha (Computing Science)

Item type

Thesis

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences

Fall 2020

Chan, Alan

Policy gradient methods typically estimate both explicit policy and value functions. The long-extant view of policy gradient methods as approximate policy iteration---alternating between policy evaluation and policy improvement by greedification---is a helpful framework to elucidate algorithmic...
