Abstract

This paper addresses the scalability challenge of automatic deep neural architecture search by combining parameter sharing with a regularized genetic algorithm (RGE). The key idea is to apply RGE on a predetermined template and discover a high-performance architecture by searching for the optimal chromosome. During evolution, the model corresponding to each discovered chromosome is trained for a fixed number of epochs to minimize a canonical cross-entropy loss on a given training dataset, and its performance on the validation dataset serves as the fitness value driving the evolution. Because of parameter sharing, the weights trained in each generation are carried over to the next, reducing the GPU hours required to maximize validation accuracy. On the CIFAR-10 dataset, the approach finds a novel architecture that outperforms the best human-invented deep architecture (DenseNet): the discovered model achieves a test error of 4.22% with only 0.96M parameters, compared with DenseNet's 4.51% with 0.8M parameters. On the CIFAR-100 dataset, the approach composes a novel architecture that achieves a 20.53% test error with 3.7M parameters, on par with the 20.50% test error of Wide ResNet with 36.5M parameters.
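
The abstract describes the search loop only at a high level; the sketch below illustrates one plausible reading of it: fixed-length chromosomes over a predetermined template, regularized (aging) evolution with tournament selection, and a shared weight bank standing in for parameter sharing so that training effort carries across generations. All names here (the GENES vocabulary, train_and_evaluate, the toy fitness) are hypothetical stand-ins, not the paper's actual implementation.

```python
import random
from collections import deque

# Assumed op vocabulary and search hyperparameters (illustrative only).
GENES = ["conv3x3", "conv5x5", "maxpool", "identity"]
CHROMOSOME_LEN = 8
POPULATION_SIZE = 20
SAMPLE_SIZE = 5
GENERATIONS = 100

# Shared weight bank: (gene position, op) -> weights. Models decoded from
# different chromosomes reuse entries, which is the parameter-sharing idea.
shared_weights = {}

def train_and_evaluate(chromosome):
    """Stand-in for training the decoded model for a fixed number of
    epochs and returning validation accuracy as the fitness value.
    Weights are read from and written back to the shared bank, so work
    done in one generation is inherited by the next."""
    for pos, op in enumerate(chromosome):
        shared_weights.setdefault((pos, op), random.random())
    # Toy fitness; in the real system this would be validation accuracy.
    return sum(shared_weights[(pos, op)] for pos, op in enumerate(chromosome))

def mutate(chromosome):
    """Point mutation: resample one gene of the parent chromosome."""
    child = list(chromosome)
    child[random.randrange(CHROMOSOME_LEN)] = random.choice(GENES)
    return tuple(child)

# Initialize the population with random chromosomes.
population = deque()
for _ in range(POPULATION_SIZE):
    chrom = tuple(random.choice(GENES) for _ in range(CHROMOSOME_LEN))
    population.append((chrom, train_and_evaluate(chrom)))

# Regularized evolution: sample a tournament, mutate the fittest member,
# and retire the oldest individual each step (age-based removal).
for _ in range(GENERATIONS):
    tournament = random.sample(list(population), SAMPLE_SIZE)
    parent = max(tournament, key=lambda ind: ind[1])[0]
    child = mutate(parent)
    population.append((child, train_and_evaluate(child)))
    population.popleft()  # aging regularizes the search

best = max(population, key=lambda ind: ind[1])
print("best chromosome:", best[0], "fitness:", round(best[1], 3))
```

The age-based removal (rather than killing the worst individual) is what makes the evolution "regularized": every architecture must periodically re-prove its fitness, which in the parameter-sharing setting also keeps the shared weights from overfitting to early survivors.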

Meeting Name

Complex Adaptive Systems Conference (2019: Nov. 13-15, Malvern, PA)

Department(s)

Engineering Management and Systems Engineering

Keywords and Phrases

Architecture search; Deep neural networks

International Standard Serial Number (ISSN)

1877-0509

Document Type

Article - Conference proceedings

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2020 The Authors. All rights reserved.

Creative Commons Licensing

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publication Date

13 May 2020
