

Less is More: Proxy Datasets in NAS approaches

Brian Moser; Federico Raue; Jörn Hees; Andreas Dengel
In: Brian Moser; Federico Raue; Jörn Hees; Andreas Dengel (Eds.). The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR-2022), IEEE, March 2022.


Neural Architecture Search (NAS) frames the design of neural networks as a search problem. Unfortunately, NAS is computationally intensive because the search space grows with the number of architectural elements and the possible connections between them. In this work, we extensively analyze the role of dataset size, using several sampling approaches for reducing it (in both unsupervised and supervised settings) as a search-method-agnostic way to reduce search time. We compared these techniques with four common NAS approaches on NAS-Bench-201 in roughly 1,400 experiments on CIFAR-100. One of our surprising findings is that in most cases the training data can be reduced to 25%, consequently also reducing search time to 25%, while maintaining the same accuracy as training on the full dataset. In addition, some designs derived from subsets outperform designs derived from the full dataset by up to 22 p.p. in accuracy.
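The abstract distinguishes unsupervised and supervised sampling for building proxy datasets. As a hedged illustration only (the function names and the choice of plain uniform vs. class-stratified sampling are assumptions, not the authors' exact methods), a 25% proxy subset could be drawn like this:

```python
import random
from collections import defaultdict

def uniform_subset(num_samples, fraction, seed=0):
    """Unsupervised-style proxy: uniform random sample of dataset indices.

    Ignores labels entirely; suitable when label information is unavailable.
    """
    rng = random.Random(seed)
    k = int(num_samples * fraction)
    return sorted(rng.sample(range(num_samples), k))

def stratified_subset(labels, fraction, seed=0):
    """Supervised-style proxy: sample the same fraction from every class.

    Preserves the class balance of the full dataset (e.g. CIFAR-100),
    which matters when classes would otherwise be underrepresented.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    subset = []
    for idxs in by_class.values():
        k = max(1, int(len(idxs) * fraction))  # keep at least one sample per class
        subset.extend(rng.sample(idxs, k))
    return sorted(subset)

# Example: a toy two-class dataset reduced to 25% of its size.
labels = [0] * 100 + [1] * 100
proxy = stratified_subset(labels, 0.25)
```

The returned index lists would then select the proxy training set fed to the NAS method, leaving the search algorithm itself unchanged.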