Abstract
This paper addresses the scalability challenge of automatic deep neural architecture search by combining parameter sharing with a regularized genetic algorithm (RGE). The key idea is to apply the RGE to a pre-determined template and discover a high-performance architecture by searching for the optimal chromosome. During evolution, each model corresponding to a discovered chromosome is trained for a fixed number of epochs to minimize a canonical cross-entropy loss on a given training dataset. The performance of the trained model on the validation dataset then serves as the fitness value that drives the evolution. Because of parameter sharing, the trained weights in each generation are carried over to the next, reducing the GPU hours required to maximize validation accuracy. On the CIFAR-10 dataset, the approach finds a novel architecture that outperforms the best human-invented deep architecture (DenseNet): it achieves a test error of 4.22% with only 0.96M parameters, compared with DenseNet's 4.51% with 0.8M parameters. On the CIFAR-100 dataset, the approach composes a novel architecture that achieves 20.53% test error with 3.7M parameters, on par with the 20.50% test error of a wide ResNet with 36.5M parameters.
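The search loop described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation. It assumes that "regularized" means age-based removal (the oldest individual is discarded each cycle), which the abstract does not specify. The template size, the operation count, the hidden `TARGET` chromosome, and the scalar stand-in for training and validation are all hypothetical placeholders for a real network and dataset.

```python
import random
from collections import deque

# Hypothetical template: each of NUM_SLOTS genes selects one of NUM_OPS operations.
NUM_SLOTS = 6
NUM_OPS = 4
# Hypothetical "ideal" chromosome, used only to give the toy fitness a shape.
TARGET = (0, 1, 2, 3, 0, 1)


def random_chromosome(rng):
    return tuple(rng.randrange(NUM_OPS) for _ in range(NUM_SLOTS))


def mutate(chromosome, rng):
    """Return a copy with exactly one gene changed to a different operation."""
    genes = list(chromosome)
    slot = rng.randrange(NUM_SLOTS)
    genes[slot] = rng.choice([op for op in range(NUM_OPS) if op != genes[slot]])
    return tuple(genes)


def train_and_evaluate(chromosome, shared_weights):
    """Toy stand-in for 'train a fixed number of epochs, return validation accuracy'.

    shared_weights maps (slot, op) to a training-progress counter that persists
    across models, mimicking weights carried from one generation to the next.
    """
    score = 0.0
    for slot, op in enumerate(chromosome):
        key = (slot, op)
        shared_weights[key] = shared_weights.get(key, 0) + 1  # one more training round
        if op == TARGET[slot]:
            # A well-chosen operation scores higher the longer it has been trained.
            score += 1.0 - 0.5 ** shared_weights[key]
    return score / NUM_SLOTS


def regularized_evolution(cycles=200, population_size=20, sample_size=5, seed=0):
    rng = random.Random(seed)
    shared_weights = {}
    population = deque()
    history = []

    # Seed the population with random chromosomes.
    while len(population) < population_size:
        chromosome = random_chromosome(rng)
        individual = (chromosome, train_and_evaluate(chromosome, shared_weights))
        population.append(individual)
        history.append(individual)

    for _ in range(cycles):
        # Tournament selection: the fittest member of a random sample is the parent.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda ind: ind[1])
        child = mutate(parent[0], rng)
        individual = (child, train_and_evaluate(child, shared_weights))
        population.append(individual)
        population.popleft()  # assumed regularization: drop the oldest, not the worst
        history.append(individual)

    best = max(history, key=lambda ind: ind[1])
    return best, shared_weights, population


if __name__ == "__main__":
    (best_chromosome, best_fitness), weights, _ = regularized_evolution()
    print("best chromosome:", best_chromosome)
    print("best fitness:", round(best_fitness, 3))
    print("shared weight entries:", len(weights))
```

In this sketch, a child that reuses an operation its ancestors already trained starts from that accumulated progress instead of from scratch, which is the mechanism the abstract credits with reducing GPU hours.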
Recommended Citation
R. D. Gottapu and C. H. Dagli, "Efficient Architecture Search for Deep Neural Networks," Procedia Computer Science, vol. 168, pp. 19-25, Elsevier B.V., May 2020.
The definitive version is available at https://doi.org/10.1016/j.procs.2020.02.246
Meeting Name
Complex Adaptive Systems Conference (2019: Nov. 13-15, Malvern, PA)
Department(s)
Engineering Management and Systems Engineering
Keywords and Phrases
Architecture search; Deep neural networks
International Standard Serial Number (ISSN)
1877-0509
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2020 The Authors, All rights reserved.
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Publication Date
13 May 2020