In: The Learning Workshop, March 6–April 9, 2010, Cliff Lodge, Snowbird, Utah, United States.
Here we report on the evaluation of a simple algorithm, AutoMLP, that adjusts both the learning rate and the size of a neural network during training. The algorithm combines ideas from genetic algorithms and stochastic optimization. It maintains a small ensemble of networks that are trained in parallel with different learning rates and different numbers of hidden units. After a small, fixed number of epochs, the error rate of each network is determined on a validation set, and the worst performers are replaced with copies of the best networks, modified to have different numbers of hidden units and learning rates. The new hidden-unit counts and learning rates are drawn from probability distributions derived from the successful rates and sizes.

In our experiments, we compared AutoMLP against a standard MLP tuned by a full grid search and against libsvm, on 90 data sets from the UCI repository. Training time was 120 hours with grid search versus 3 hours with AutoMLP. Grid search and libsvm performed very similarly (with some outliers in favor of grid search), while AutoMLP generally performed close to both (Figure 1) at roughly 1/40th of the computational cost. The remaining differences could be reduced further by continuing AutoMLP training: additional training time only improves performance, so AutoMLP can simply be kept running for as long as CPU time is available. Of course, for problems of the size of these benchmarks there is little reason in practice not to perform the full grid search or to use libsvm. But these results give us confidence that AutoMLP is a reasonable procedure for problem instances that are so large that grid search and libsvm are no longer feasible choices.
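The search loop described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: actual MLP training is replaced by a caller-supplied `eval_error(lr, n_hidden)` function standing in for "train for a fixed number of epochs and measure validation error", and the new rates and sizes are drawn with simple log-normal jitter around the surviving values, a stand-in for the paper's distributions derived from successful configurations. All function and parameter names here are hypothetical.

```python
import random


def automlp_search(eval_error, n_candidates=4, n_rounds=10, seed=0):
    """Sketch of an AutoMLP-style search.

    eval_error(lr, n_hidden) -> validation error after a fixed epoch
    budget (stands in for actually training an MLP candidate).
    """
    rng = random.Random(seed)
    # Initial ensemble: random learning rates and hidden-layer sizes.
    pop = [(10 ** rng.uniform(-4, -1), rng.randint(4, 256))
           for _ in range(n_candidates)]
    for _ in range(n_rounds):
        # Train every candidate for the fixed epoch budget, then rank
        # by validation error (lower is better).
        ranked = sorted(pop, key=lambda c: eval_error(*c))
        survivors = ranked[: n_candidates // 2]
        # Replace the worst performers with perturbed copies of the
        # best: draw new rates/sizes near the successful values
        # (log-normal jitter; a simple stand-in for the paper's
        # success-derived distributions).
        children = []
        for lr, nh in survivors:
            new_lr = lr * 10 ** rng.gauss(0, 0.3)
            new_nh = max(2, int(nh * 2 ** rng.gauss(0, 0.3)))
            children.append((new_lr, new_nh))
        pop = survivors + children
    # Return the best (learning rate, hidden units) pair found.
    return min(pop, key=lambda c: eval_error(*c))
```

In use, `eval_error` would wrap a real training run; here any toy error surface over `(lr, n_hidden)` suffices to exercise the loop.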