Abstract by Michael Nelson
Predicting Optimal Learning Rates
Recent developments in the field of Deep Learning have produced enormous gains across the STEM economy, driven largely by techniques and hardware that enable the training of much larger artificial neural networks. Building on the work of recent PhD graduate Chris Hettinger, we aim to make training large networks more tractable by predicting valid ranges for critical hyperparameters in advance. In particular, by identifying the range of stable initial learning rates from a network's size and architecture before any costly training attempts, we hope to reduce the overall cost of training particularly large networks. Current work centers on numerical validation of Hettinger's results in linear networks; future research will extend this work to convolutional architectures.
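The abstract does not specify Hettinger's prediction method, but the kind of numerical validation described can be illustrated on the simplest linear case. The sketch below is a hypothetical toy example, not the project's actual procedure: for a least-squares loss, gradient descent is known to be stable exactly when the learning rate is below 2 divided by the largest Hessian eigenvalue, so we can predict that threshold from the data alone and then confirm it empirically by sweeping learning rates on either side.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: loss(w) = ||X w - y||^2 / (2 n).
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

# Predict the stability threshold before training: for this quadratic
# loss, gradient descent converges iff lr < 2 / lambda_max(Hessian).
H = X.T @ X / n
lam_max = np.linalg.eigvalsh(H)[-1]
threshold = 2.0 / lam_max

def diverges(lr, steps=200):
    """Run plain gradient descent and report whether iterates blow up."""
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = w - lr * grad
        if not np.all(np.isfinite(w)) or np.linalg.norm(w) > 1e6:
            return True
    return False

print(f"predicted threshold: {threshold:.4f}")
print("diverges at 0.9 * threshold:", diverges(0.9 * threshold))  # stable
print("diverges at 1.1 * threshold:", diverges(1.1 * threshold))  # unstable
```

The same idea, predicting a stable learning-rate interval from quantities computable before training, is what makes this kind of analysis attractive for large networks, where each failed training run is expensive.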