On the Application of Continuous Deterministic Reinforcement Learning in Neural Architecture Search

Mills, Keith G.

doi:10.7939/r3-brt8-eb15


Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations


Author / Creator

Mills, Keith G.

Architecture evaluation is a major bottleneck of Neural Architecture Search (NAS). Recent work has shifted toward weight-sharing networks (supernets) capable of superimposing all possible candidate architectures in a search space. Nevertheless, this technique is not beyond reproach and has already drawn significant criticism. Among these criticisms is doubt about the ability of weight-sharing supernets to accurately represent the characteristics of a single discrete architecture when they are purposefully designed to mimic the behaviour of many.

As the cost of NAS evaluation has decreased, the complexity of search algorithms has grown. In this thesis, we explore the application of Reinforcement Learning (RL) to weight-sharing NAS. Specifically, we focus on deterministic agents operating in a continuous action space. First, analogous to gradient-based optimization, we train the supernet and the agent simultaneously and interface them accordingly. Our agent uses an actor-critic framework, in which the actor generates architectures based on the teachings of the critic. Rewards are calculated to encourage the selection and further improvement of high-performance architectures.
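
As an illustration only, one plausible way to turn a continuous action into a discrete architecture is to treat the action as a vector of per-edge operation scores and keep the highest-scoring operation on each edge. The abstract does not specify this mapping; the operation list, the edge count, and the function name `discretize` below are assumptions chosen for the sketch.

```python
# Hypothetical search space: 6 edges, 5 candidate operations per edge.
# These names and sizes are illustrative assumptions, not taken from the thesis.
OPS = ["none", "skip_connect", "nor_conv_1x1", "nor_conv_3x3", "avg_pool_3x3"]
NUM_EDGES = 6


def discretize(action, ops=OPS, num_edges=NUM_EDGES):
    """Map a flat continuous action vector to one operation name per edge.

    The action is read as num_edges consecutive groups of len(ops) scores;
    the operation with the highest score in each group is selected.
    """
    if len(action) != num_edges * len(ops):
        raise ValueError("action length must equal num_edges * len(ops)")
    architecture = []
    for edge in range(num_edges):
        start = edge * len(ops)
        scores = action[start:start + len(ops)]
        # Index of the largest score; ties resolve to the first maximum.
        best = max(range(len(ops)), key=lambda i: scores[i])
        architecture.append(ops[best])
    return architecture
```

A deterministic actor would emit the same action, and therefore the same architecture, for a given state; exploration then has to come from noise added to the action before discretization.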

Next, we refine the efficiency of our weight-sharing supernet while decoupling its optimization from that of the RL agent. These reforms lower the resource cost of architecture search and remove unhelpful biases the supernet may have imposed on the agent. We adapt the RL agent to these changes by redefining the state as a statistical representation of the best architectures observed. Finally, in order to focus on only the highest-performing architectures, we incorporate the check loss into the critic.
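
For readers unfamiliar with the term, the check loss (also called the pinball or quantile loss) penalizes under- and over-prediction asymmetrically. The sketch below shows the standard definition in plain Python. How the thesis wires it into the critic is not stated in the abstract, so the function names and the default `tau=0.9` are assumptions for illustration.

```python
def check_loss(target, prediction, tau=0.9):
    """Standard check (pinball) loss for one target/prediction pair.

    With residual u = target - prediction, the loss is u * (tau - 1[u < 0]).
    tau is the quantile level and must lie strictly between 0 and 1.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    residual = target - prediction
    # Under-prediction (residual >= 0) costs tau per unit of error;
    # over-prediction (residual < 0) costs (1 - tau) per unit of error.
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def mean_check_loss(targets, predictions, tau=0.9):
    """Average check loss over paired sequences of equal, non-zero length."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(check_loss(t, p, tau) for t, p in zip(targets, predictions))
    return total / len(targets)
```

With `tau` above 0.5, a critic trained on this loss is penalized more for underestimating a reward than for overestimating it, so its estimates move toward an upper quantile of the observed rewards rather than their mean.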

Experimental results on the DARTS search space show that our first scheme is capable of generating architectures that achieve over 97% test accuracy on CIFAR-10 and 81% test accuracy on CIFAR-100. Findings indicate that the agent of our second approach is capable of state-of-the-art test performance on NAS-Bench-201. Additionally, architectures generated by our second approach achieve over 97.4% test accuracy on CIFAR-10 and 75% top-1 accuracy on ImageNet.
Subjects / Keywords
Graduation date

Spring 2021
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-brt8-eb15
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Electrical and Computer Engineering
Specialization
- Computer Engineering
Supervisor / co-supervisor and their department(s)
- Niu, Di (Electrical and Computer Engineering)