AFFINE INVARIANCE IN MULTILAYER PERCEPTRON TRAINING
Abstract
Training methods for both shallow and deep neural networks are dominated by first-order algorithms related to backpropagation and conjugate gradient. However, these methods lack affine invariance, so their performance is degraded by nonzero input means, dependent inputs, dependent hidden units, and the use of a single learning factor. This dissertation reviews affine invariance and shows how multilayer perceptron (MLP) training can be made partially affine invariant when Newton's method is used to train small numbers of MLP parameters. Several novel methods are proposed for scalable, partially affine invariant MLP training, and the potential application of the algorithm to deep learning is discussed. Ten-fold cross-validation testing errors on several datasets show that the proposed algorithm outperforms backpropagation and conjugate gradient, and that it scales far better than Levenberg-Marquardt.