On Applications of Multi-Armed Bandit and Bayesian Optimization Approaches for AutoML Problems

  • Author / Creator
    Lu, Shan
  • Automated Machine Learning (AutoML) aims to reduce human effort by automating the time-consuming, iterative processes involved in developing machine learning methods. In this thesis, we study two AutoML tasks: automated Data Augmentation (DA) and Hyperparameter Optimization (HPO). Data augmentation is a widely used regularization technique for training deep neural networks, and automated DA searches for optimal augmentation policies. However, since early approaches such as AutoAugment cost thousands of GPU hours, recent work has turned to low-cost search methods that still yield effective augmentation policies. In this thesis, we propose a novel multi-armed bandit algorithm, named Bandit Data Augment (BDA), to efficiently search for optimal and transferable augmentation policies. We design a reward signal based on each batch training step of the neural network to reduce the cost of evaluating augmentation policies. Moreover, we propose the Evolutionary Pruning algorithm to allocate more search resources to potentially optimal operation pairs, leading to a sparse selection of operation pairs and generalizable policies. Extensive experiments demonstrate that BDA achieves comparable or better performance than previous auto-augmentation methods across a wide range of models on the CIFAR-10/100, SVHN, and ImageNet benchmarks.
    Automated HPO algorithms search for optimal hyperparameter configurations for machine learning models and are key to boosting model performance in practice. Many previous HPO methods are based on Bayesian Optimization. However, Bayesian Optimization requires exploring many samples sequentially to find promising configurations. In this thesis, we propose an efficient batch HPO algorithm that combines Bayesian Optimization with the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in a hybrid sampler, which samples configurations while balancing exploitation and exploration. We also propose an ensemble prediction model consisting of various surrogate models to approximate the objective function more accurately. We conduct our experiments on 20 HPO datasets from recommendation system scenarios. Our approach ranked 4th and 7th in the training and tournament stages, respectively, of the Automated HPO Contest in the QQ Browser 2021 AI Algorithm Competition at the CIKM 2021 AnalytiCup.
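
    To make the bandit framing in the first part of the abstract concrete, the sketch below treats each pair of augmentation operations as one arm and uses the standard UCB1 rule to decide which pair to try next. It is only an illustration under stated assumptions, not the BDA algorithm from the thesis: UCB1 is swapped in as a generic bandit strategy, the operation names and the toy_reward function are hypothetical, and the reward is simulated rather than computed from a batch training step. Evolutionary Pruning is not modeled.

```python
import math
import random

def ucb1_select(counts, values, t, c=2.0):
    """Pick an arm index with the standard UCB1 rule."""
    # Play every arm once before using the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        values[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)

def run_bandit(arms, reward_fn, steps, seed=0):
    """Run UCB1 for `steps` rounds; return pull counts and mean rewards."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    values = [0.0] * len(arms)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, values, t)
        reward = reward_fn(arms[arm], rng)
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

# Hypothetical augmentation operations; each unordered pair is one arm.
OPS = ["rotate", "shear_x", "color", "cutout"]
ARMS = [(a, b) for i, a in enumerate(OPS) for b in OPS[i + 1:]]

def toy_reward(pair, rng):
    """Simulated 0/1 reward (assumption): one pair is better than the rest."""
    mean = 0.8 if pair == ("rotate", "cutout") else 0.4
    return 1.0 if rng.random() < mean else 0.0

counts, values = run_bandit(ARMS, toy_reward, steps=2000)
best = ARMS[max(range(len(ARMS)), key=counts.__getitem__)]
print("most-pulled pair:", best)
```

    In a real search, the simulated reward would be replaced by a signal measured during training, and pull counts would show where the search budget was concentrated.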

  • Subjects / Keywords
  • Graduation date
    Spring 2022
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-rnyd-pq10
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.