# Improving Different Aspects in RL - Accelerating Convergence Rate & Enhancing Safety and Robustness

• Author / Creator
Gao, Yue
• Reinforcement learning (RL) has moved from toy domains to real-world applications, each of which exposes long-standing challenges in RL, such as getting stuck at plateaus, limited training time, costly exploration, and safety considerations. My collaborators and I proposed several RL algorithms that improve different aspects of performance, including \textbf{geometry-aware normalized gradient descent (GNGD)}, a policy gradient method (also applicable to other non-convex optimization problems) with strong theoretical convergence guarantees, and \textbf{a family of Q-learning algorithms} that empirically enhance risk-aversion and robustness in trading markets.

Beyond RL, \textbf{geometry-aware descent methods} can also be applied to any first-order non-uniform optimization problem and can converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bound.

For example, when applied to policy gradient (PG) methods and generalized linear models (GLM), normalizing the gradient ascent method can be shown to accelerate convergence to $O(e^{-t})$ while incurring less overhead than existing algorithms, significantly improving the best known results. The proposed geometry-aware descent methods can also be shown to escape landscape plateaus faster than standard gradient descent. Experimental results illustrate and complement the theoretical findings.
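As a rough illustration of the plateau-escaping behaviour described above, the sketch below compares plain gradient descent with a normalized-step variant on the toy objective $f(x)=x^4$, whose gradient vanishes near the optimum. The objective, step size, and stopping tolerance are illustrative assumptions, not the thesis's exact GNGD formulation.

```python
# Hedged sketch: normalizing the gradient makes the step size independent of
# the vanishing gradient magnitude, so the iterate crosses the flat region
# around the optimum much faster than plain gradient descent.

def grad_f(x):
    # Gradient of f(x) = x**4; it decays cubically near the optimum x = 0.
    return 4 * x**3

def normalized_gd(x0, eta=0.1, steps=200):
    x = x0
    for _ in range(steps):
        g = grad_f(x)
        norm = abs(g)
        if norm < 1e-12:          # stop at a (numerically) stationary point
            break
        x = x - eta * g / norm    # unit-length step, regardless of |g|
    return x

def vanilla_gd(x0, eta=0.1, steps=200):
    x = x0
    for _ in range(steps):
        x = x - eta * grad_f(x)   # step shrinks as the gradient vanishes
    return x

x_ngd = normalized_gd(2.0)
x_gd = vanilla_gd(2.0)
print(abs(x_ngd), abs(x_gd))  # the normalized iterate ends much closer to 0
```

In practice a normalized step of fixed length cannot converge exactly, so analyses of such methods use geometry-dependent step sizes or decaying schedules; the constant step here is only for illustration.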

On the empirical side of RL, to enhance robustness and reduce risk, a family of Q-learning algorithms was proposed that takes characteristics such as \emph{risk-awareness}, \emph{robustness to perturbations}, and \emph{low learning variance} as building blocks; these algorithms perform well in trading markets and balance theoretical guarantees with practical use.
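To make the \emph{risk-awareness} building block concrete, the sketch below uses one classical risk-sensitive Q-learning rule (asymmetrically weighted TD errors, in the style of Mihatsch and Neuneier) on a two-armed bandit: a safe arm with a certain payoff and a risky arm with a higher mean but high variance. The bandit, the weighting parameter kappa, and the step-size schedule are illustrative assumptions, not the thesis's exact trading setup or algorithm.

```python
import random

def risk_sensitive_q(kappa, pulls=20000, seed=0):
    # Tabular single-step Q-learning on a two-armed bandit.
    # kappa = 0 recovers ordinary Q-learning; kappa > 0 downweights
    # positive TD errors (gains) and upweights negative ones (losses).
    rng = random.Random(seed)
    q = [0.0, 0.0]   # action 0: safe arm, action 1: risky arm
    n = [0, 0]       # per-arm pull counts for the 1/n step size
    for _ in range(pulls):
        a = rng.randrange(2)                  # uniform exploration
        if a == 0:
            r = 1.0                           # safe arm: certain payoff 1
        else:
            # risky arm: mean 2.3 but high variance
            r = 10.0 if rng.random() < 0.3 else -1.0
        n[a] += 1
        delta = r - q[a]                      # TD error (no bootstrap term)
        weight = (1 - kappa) if delta > 0 else (1 + kappa)
        q[a] += weight * delta / n[a]         # Robbins-Monro step size 1/n

    return q

q_neutral = risk_sensitive_q(kappa=0.0)   # risk-neutral: prefers the risky arm
q_averse = risk_sensitive_q(kappa=0.5)    # risk-averse: prefers the safe arm
print(q_neutral, q_averse)
```

The asymmetric weighting shifts the fixed point of the risky arm's value downward (losses count more than gains), so the greedy policy switches to the safe arm, which is one simple way a Q-learning agent can be made risk-averse.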

• Graduation Date
Fall 2021
• Type of Item
Thesis
• Degree
Master of Science
• DOI
https://doi.org/10.7939/r3-1sxj-1148