Stable rank normalization for improved generalization in neural networks and GANs

Sanyal A, Torr P, Dokania PK

Exciting new work on generalization bounds for neural networks (NN) by Neyshabur et al. and Bartlett et al. closely depends on two parameter-dependent quantities: the Lipschitz constant upper bound and the stable rank (a softer version of the rank operator). This leads to an interesting question of whether controlling these quantities might improve the generalization behaviour of NNs. To this end, we propose stable rank normalization (SRN), a novel, optimal, and computationally efficient weight-normalization scheme which minimizes the stable rank of a linear operator. Surprisingly, we find that SRN, despite being a non-convex problem, can be shown to have a unique optimal solution. Moreover, we show that SRN allows control of the data-dependent empirical Lipschitz constant, which, in contrast to the Lipschitz upper bound, reflects the true behaviour of a model on a given dataset. We provide thorough analyses to show that SRN, when applied to the linear layers of a NN for classification, provides striking improvements (11.3% on the generalization gap compared to the standard NN) along with a significant reduction in memorization. When applied to the discriminator of GANs (called SRN-GAN), it improves Inception, FID, and Neural divergence scores on the CIFAR-10/100 and CelebA datasets, while learning mappings with low empirical Lipschitz constants.
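For intuition, the stable rank mentioned in the abstract is the squared Frobenius norm divided by the squared spectral norm of a weight matrix. Below is a minimal NumPy sketch of that quantity and of one plausible way to cap it by rescaling the tail singular values while keeping the spectral norm fixed; the function names and the `target_srank` parameter are illustrative assumptions, not the paper's exact SRN update rule.

```python
import numpy as np

def stable_rank(W):
    # Stable rank: ||W||_F^2 / ||W||_2^2, a softer version of the rank.
    fro_sq = np.sum(W ** 2)
    spec = np.linalg.norm(W, ord=2)  # largest singular value
    return fro_sq / (spec ** 2)

def stable_rank_cap(W, target_srank):
    # Illustrative sketch only (not the paper's algorithm): fix the spectral
    # norm to 1, then uniformly shrink the remaining singular values so the
    # stable rank does not exceed `target_srank` (assumed >= 1).
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = s / s[0]                      # spectral norm becomes 1
    current = np.sum(s ** 2)          # equals the stable rank when ||W||_2 = 1
    if current > target_srank:
        tail = s[1:]
        scale = np.sqrt((target_srank - 1.0) / np.sum(tail ** 2))
        s = np.concatenate(([1.0], tail * scale))
    return U @ np.diag(s) @ Vt
```

In a training loop, such a projection would be applied to each linear layer's weights after a gradient step, similar in spirit to spectral normalization, so that both the spectral norm and the stable rank stay controlled.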