
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
AdaBatch adaptively increases the batch size during training, aiming to preserve the convergence behavior of small batches while gaining the computational efficiency of large ones. The method is evaluated with AlexNet, ResNet, and VGG on CIFAR-10, CIFAR-100, and ImageNet. It improves training performance by up to 6.25x on 4 NVIDIA Tesla P100 GPUs while changing accuracy by less than 1% relative to training with fixed batch sizes.
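
To make the idea of growing the batch size during training concrete, here is a minimal sketch of one possible schedule. It is an illustration only: the function name `adaptive_batch_size`, the starting size of 128, the doubling factor, the 20-epoch interval, and the cap of 2048 are all assumptions chosen for the example, not values taken from the summary above.

```python
def adaptive_batch_size(epoch, base_batch_size=128, growth_factor=2,
                        interval=20, max_batch_size=2048):
    """Return the batch size to use at a given (0-indexed) epoch.

    Hypothetical schedule: start small, multiply the batch size by
    `growth_factor` every `interval` epochs, and never exceed
    `max_batch_size`. All default values are illustrative assumptions.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Number of times the batch size has been increased so far.
    increases = epoch // interval
    return min(base_batch_size * growth_factor ** increases, max_batch_size)


# Batch sizes at a few sample epochs under the assumed defaults.
sample_epochs = (0, 19, 20, 40, 100)
schedule = [adaptive_batch_size(e) for e in sample_epochs]
print(schedule)  # [128, 128, 256, 512, 2048]
```

In a training loop, the returned value would be used to rebuild the data loader (or re-batch the dataset) whenever it changes between epochs. Early epochs then take many small, noisy steps, while later epochs process more samples per step and can make fuller use of the hardware.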