Language: English
Item type: Thesis
Department: Department of Computing Science

Analysis of an Alternate Policy Gradient Estimator for Softmax Policies

Spring 2022

Garg, Shivam

Policy gradient (PG) estimators are ineffective for softmax policies that are sub-optimally saturated, meaning that the policy concentrates its probability mass on sub-optimal actions. Sub-optimal policy saturation may arise from a bad policy initialization or a...
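
The saturation problem described in the abstract can be illustrated with a small numerical sketch. It is not taken from the thesis and does not implement its alternate estimator. It assumes a hypothetical three-armed bandit with fixed rewards and computes the exact expected policy gradient with respect to the softmax logits, which for this setting is pi(a) * (r(a) - sum_b pi(b) r(b)). The function names `softmax` and `expected_pg`, the reward values, and the logit values are all illustrative assumptions.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_pg(logits, rewards):
    # Exact expected policy gradient w.r.t. each logit in a bandit:
    # dJ/dtheta_a = pi(a) * (r(a) - expected reward under pi).
    pi = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - expected_reward) for p, r in zip(pi, rewards)]

# Hypothetical bandit: action 0 is optimal, actions 1 and 2 are not.
rewards = [1.0, 0.0, 0.0]

# Uniform initialization: the gradient clearly favors action 0.
uniform_grad = expected_pg([0.0, 0.0, 0.0], rewards)

# Policy saturated on sub-optimal action 1: the gradient nearly vanishes.
saturated_grad = expected_pg([0.0, 10.0, 0.0], rewards)

print("uniform:  ", uniform_grad)
print("saturated:", saturated_grad)
```

Under these assumptions, the gradient component for the optimal action is several orders of magnitude smaller in the saturated case than in the uniform case, because it is scaled by the small probability the policy assigns to that action. A sampled estimator would face the same scaling, in addition to rarely sampling the optimal action at all.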
