Improved initialization for the multi layer perceptron
Date
2018-05-08
Author
Mainkar, Abhishek Vinay
0000-0001-7566-5902
Abstract
A multilayer perceptron (MLP) neural network is used to solve nonlinear problems such as function approximation, classification, and data processing. MLPs are usually trained with backpropagation, and for most loss functions the resulting optimization problem is non-convex. Because a non-convex error surface has multiple local minima, training generally converges to different solutions from different initial conditions, so initialization affects not only the speed of convergence but also the quality of the solution reached. The initial parameters of a neural network are therefore as important as its architecture, and initialization has been studied thoroughly in the past. This report discusses two methods for network initialization, the fusion method and the modified sigmoid method. Both are based on the regular Hidden Weight Optimization – Multiple Optimal Learning Factors (HWO-MOLF) MLP. Because the optimization is non-convex, training a large MLP may find a local minimum instead of the global minimum, and the network may also get stuck at saddle points while minimizing the error function. Both initialization procedures in this report try to reduce the likelihood of converging to a local minimum. The training experiments and their results are presented in this report. The results indicate that using both initialization methods to train an HWO-MOLF network helps mitigate the local minima problem, hidden unit saturation, and dependent hidden units.
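The dependence on initial conditions described in the abstract can be illustrated with a toy sketch. This is not the report's fusion or modified sigmoid method, and it is not HWO-MOLF training: the one-dimensional loss function, the learning rate, and the two starting points are all assumptions chosen purely for illustration. It shows plain gradient descent reaching two different minima of the same non-convex function, depending only on where it starts.

```python
# Toy illustration (assumed example, not from the report): gradient descent
# on a non-convex 1-D function with two minima of different depth.

def loss(x):
    # Hypothetical non-convex loss with a deeper minimum on the left
    # and a shallower one on the right.
    return x**4 - 3 * x**2 + x

def grad(x):
    # Analytic derivative of loss(x).
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=500):
    # Plain fixed-step gradient descent starting from x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Same function, same optimizer, two different initial points.
left = gradient_descent(-2.0)
right = gradient_descent(2.0)

print(f"start -2.0 -> x = {left:.4f}, loss = {loss(left):.4f}")
print(f"start  2.0 -> x = {right:.4f}, loss = {loss(right):.4f}")
```

In this sketch the run started at -2.0 settles near x ≈ -1.30, the deeper minimum, while the run started at 2.0 settles near x ≈ 1.13, a shallower local minimum. Only the starting point differs between the two runs.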