Vector Step-size Adaptation for Continual, Online Prediction

  • Author / Creator
    Jacobsen, Andrew
  • Abstract
    In this thesis, we investigate vector step-size adaptation approaches for continual, online prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad, RMSProp, and AMSGrad, keep statistics about the learning process to approximate a second-order update, that is, a vector approximation of the inverse Hessian. Another family of approaches uses meta-gradient descent to adapt the step-size parameters so as to minimize prediction error. These meta-descent strategies are promising for non-stationary problems, but have not been explored as extensively as quasi-second-order methods. We derive a general, incremental meta-descent algorithm, called AdaGain, designed to be applicable to a broader range of algorithms, including those with semi-gradient updates or even those with accelerations, such as RMSProp. We introduce an instance of AdaGain that combines meta-descent with RMSProp, a method we call RMSGain, which is particularly robust across several prediction problems and is competitive with the state-of-the-art method on a large-scale time-series prediction problem on real data from a mobile robot.
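
To make the two families named in the abstract concrete, the following minimal sketch contrasts them on a toy online linear prediction task. It is not AdaGain or RMSGain, whose update rules are not given in this record. The first update is a standard RMSProp step. The second is IDBD (Sutton, 1992), a well-known meta-descent method, swapped in here only as an illustration of adapting a vector of step-sizes by meta-gradient descent. The task, the hyperparameter values, and the helper names (`predict`, `run`, `rmsprop_state`, `idbd_state`) are all assumptions made for this example.

```python
import math
import random


def predict(w, x):
    """Linear prediction w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rmsprop_step(w, v, x, y, alpha=0.01, decay=0.9, eps=1e-8):
    """One online RMSProp update; returns the prediction error."""
    delta = y - predict(w, x)
    for i, xi in enumerate(x):
        g = -delta * xi  # gradient of 0.5 * delta**2 with respect to w[i]
        v[i] = decay * v[i] + (1.0 - decay) * g * g  # running mean of g**2
        w[i] -= alpha * g / (math.sqrt(v[i]) + eps)  # per-weight scaled step
    return delta


def idbd_step(w, beta, h, x, y, meta=0.01):
    """One IDBD update: meta-gradient descent on the log step-sizes beta."""
    delta = y - predict(w, x)
    for i, xi in enumerate(x):
        beta[i] += meta * delta * xi * h[i]  # meta-gradient step
        a = math.exp(beta[i])                # per-weight step-size
        w[i] += a * delta * xi               # ordinary LMS step
        # h[i] is a decaying trace of recent updates to w[i]
        h[i] = h[i] * max(0.0, 1.0 - a * xi * xi) + a * delta * xi
    return delta


def rmsprop_state(n):
    return ([0.0] * n,)  # v


def idbd_state(n, alpha0=0.05):
    return ([math.log(alpha0)] * n, [0.0] * n)  # beta, h


def run(step, make_state, n_steps=2000, seed=0):
    """Run one learner on a noise-free stream (hypothetical toy task)."""
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.0, 0.0]  # only the first two inputs matter
    w = [0.0] * len(target)
    state = make_state(len(target))
    sq_errors = []
    for _ in range(n_steps):
        x = [rng.choice((-1.0, 1.0)) for _ in target]
        y = predict(target, x)
        delta = step(w, *state, x, y)
        sq_errors.append(delta * delta)
    return w, state, sq_errors


if __name__ == "__main__":
    learners = [("RMSProp", rmsprop_step, rmsprop_state),
                ("IDBD", idbd_step, idbd_state)]
    for name, step, make_state in learners:
        w, state, errs = run(step, make_state)
        print(name, "mean squared error over last 100 steps:",
              sum(errs[-100:]) / 100)
```

In both updates every weight gets its own effective step-size. RMSProp derives it from a running average of squared gradients, whereas IDBD learns it directly by descending the prediction error with respect to the step-size parameters.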

  • Graduation date
    Fall 2019
  • Degree
    Master of Science
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.