Toward Generate-and-Test Algorithms for Continual Feature Discovery

  • Author / Creator
    Parash Rahman
  • The backpropagation algorithm is fundamental to training modern artificial neural networks (ANNs). However, it is known to perform poorly on changing problems. We demonstrate that the backpropagation algorithm can perform poorly on a clear, generic, changing task. The task is online, meaning the agent learns from a stream of samples one sample at a time. The task is nonstationary, since the sample distribution regularly changes. We call it the generic continual feature discovery (GCFD) task, as it is sufficiently difficult that the backpropagation algorithm must regularly discover new features to perform well.

    We propose an explanation for the poor performance of the backpropagation algorithm on the GCFD task. The backpropagation algorithm consists of two phases: initializing an ANN with small, random weights, and then using stochastic gradient descent to update the weights from data. The initialization step is known to be crucial to the fast discovery of useful features. We corroborate that small, random weight initialization leads to conditions that speed up the discovery of useful features with the backpropagation algorithm. Then, we show that these conditions are not maintained during the GCFD task. Without these conditions, there is little reason to expect the backpropagation algorithm to quickly discover useful features for new sample distributions.

    We demonstrate that the backpropagation algorithm's performance on the GCFD task can be significantly improved with generate-and-test algorithms. The generate-and-test algorithms replace the least useful features of the ANN with features that have small, random weights. By regularly introducing features with small random weights, we restore conditions the backpropagation algorithm can use to quickly discover useful features for new data distributions.
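The replacement step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the single tanh hidden layer, the squared-error loss, the utility measure (a running average of each feature's absolute contribution to the output), and all names and default values (make_net, sgd_step, replace_least_useful, scale, decay) are assumptions chosen only for illustration.

```python
import numpy as np

def make_net(rng, n_inputs, n_features, scale=0.1):
    """Hypothetical one-hidden-layer net with small, random input weights."""
    return {
        "w_in": rng.normal(0.0, scale, size=(n_features, n_inputs)),
        "w_out": np.zeros(n_features),    # outgoing weights start at zero (assumption)
        "utility": np.zeros(n_features),  # one running utility estimate per feature
    }

def sgd_step(net, x, y, lr=0.01, decay=0.99):
    """One online SGD update on a single sample (x, y) with loss 0.5 * err**2."""
    h = np.tanh(net["w_in"] @ x)          # hidden features
    err = y - net["w_out"] @ h            # prediction error
    # Compute both gradients before changing any weights.
    grad_out = -err * h
    grad_in = np.outer(-err * net["w_out"] * (1.0 - h**2), x)
    net["w_out"] -= lr * grad_out
    net["w_in"] -= lr * grad_in
    # Assumed utility: running average of |outgoing weight * activation|.
    contribution = np.abs(net["w_out"] * h)
    net["utility"] = decay * net["utility"] + (1.0 - decay) * contribution
    return err

def replace_least_useful(net, rng, n_replace=1, scale=0.1):
    """Tester picks the lowest-utility features; generator re-draws their weights."""
    worst = np.argsort(net["utility"])[:n_replace]
    n_inputs = net["w_in"].shape[1]
    net["w_in"][worst] = rng.normal(0.0, scale, size=(n_replace, n_inputs))
    net["w_out"][worst] = 0.0             # a new feature does not disturb the output
    # Give new features a median utility so they are not replaced immediately
    # (an assumption of this sketch).
    net["utility"][worst] = np.median(net["utility"])
    return worst
```

In an online loop, sgd_step would be called once per incoming sample and replace_least_useful every fixed number of steps; the replacement rate and the utility measure are the main design choices and are left open here.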

  • Graduation date
    Spring 2021
  • Degree
    Master of Science
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.