Page 196 - The-5th-MCAIT2021-eProceeding
Principal Component Analysis Variant Initialization in
Convolutional Neural Network
Nor Sakinah Md Othman a,*, Azizi Abdullah b
a,b Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan
Malaysia, 43600 Bangi, Selangor, Malaysia
*Email: p108639@siswa.ukm.edu.my
Abstract
In convolutional neural networks (CNNs), various weight initialization strategies have been proposed to handle overfitting and slow convergence. This paper proposes an alternative weight initialization technique that applies Gaussian Principal Component Analysis (GPCA) initialization and Generalized Gaussian Principal Component Analysis (G-GPCA) initialization to LeNet-5 and AlexNet. A set of overlapping Gaussian windows is used to generate the GPCA filters, which simulate the orientation and texture characteristics of receptive fields and thereby improve the extraction of low-level features. The proposed method is tested on five datasets, namely MNIST, CIFAR-10, SVHN, GTSRB and COVID-19. The results show that the PCA variant initializations (PCA, GPCA, G-GPCA) obtain consistent accuracy across a variety of datasets.
Keywords: Weight initialization; principal component analysis; convolutional neural network; image classification.
1. Introduction
Applications of convolutional neural networks (CNNs) in domains such as object recognition have increased significantly due to greater computing power and larger training datasets. Even though this is a well-known research topic and various methods have been introduced, the issue of obtaining consistent accuracy and faster convergence still persists. Several methods have been proposed to optimize the training process of CNNs, and one approach taken by other researchers is to focus on the weight initialization strategy (Koturwar & Merchant, 2017).
Popular methods for setting the weights in convolutional layers are Xavier initialization (Glorot & Bengio, 2010) and He initialization (He et al., 2015). According to Koturwar and Merchant (2017), both methods are believed to handle the gradient diffusion problem and the dying neuron problem. Gradient diffusion can be defined as the amplification or attenuation of gradient values throughout the backpropagation process, which results in exploding or vanishing gradients (Sun, 2020). However, these standard initializations have several downsides: they are independent of the data statistics (Koturwar & Merchant, 2017), prone to the dying neuron problem (Lu et al., 2019) and often produce redundant filters (Luan et al., 2018). This has motivated other types of weight initialization, such as PCA initialization and Linear Discriminant Analysis (LDA) initialization (Alberti et al., 2017), which have shown promising results in image classification.
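The two standard initializations above can be sketched as follows. This is a minimal illustration, not code from the paper: the function names and the use of NumPy are assumptions, and the fan-in/fan-out convention follows the common practice for a convolution kernel of shape (out_channels, in_channels, height, width).

```python
import numpy as np

def xavier_init(shape, rng=None):
    """Glorot/Xavier uniform initialization for a conv kernel (illustrative)."""
    rng = np.random.default_rng(0) if rng is None else rng
    out_ch, in_ch, kh, kw = shape
    fan_in = in_ch * kh * kw          # inputs feeding each output unit
    fan_out = out_ch * kh * kw        # outputs fed by each input unit
    limit = np.sqrt(6.0 / (fan_in + fan_out))  # uniform bound from Glorot & Bengio (2010)
    return rng.uniform(-limit, limit, size=shape)

def he_init(shape, rng=None):
    """He normal initialization, intended for ReLU layers (illustrative)."""
    rng = np.random.default_rng(0) if rng is None else rng
    out_ch, in_ch, kh, kw = shape
    fan_in = in_ch * kh * kw
    std = np.sqrt(2.0 / fan_in)       # variance 2/fan_in from He et al. (2015)
    return rng.normal(0.0, std, size=shape)
```

Note that neither scheme looks at the training data; the scale depends only on the kernel shape, which is precisely the data-independence criticized above.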
In this paper, an initialization technique using Principal Component Analysis (PCA) variants, namely PCA, Gaussian PCA (GPCA) and Generalized GPCA (G-GPCA), is introduced. The LeNet-5 and AlexNet models are then used to investigate these variants in CNNs for image classification tasks. Several papers have introduced the use of PCA filters in CNNs, but research on GPCA and G-GPCA filters remains insufficient (Brause et al., 1999). In this method, the PCA filters are generated from the statistics of the image data (Koturwar & Merchant, 2017). This approach has advantages such as the ability to handle the gradient diffusion problem (Ren et al., 2016) and robustness against image transformations (Soon et al., 2020). The contribution of this work is to provide an
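The basic PCA filter generation described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name, patch-sampling strategy and parameters (patch size k, number of filters, number of patches) are all hypothetical, and the GPCA/G-GPCA Gaussian-window weighting is not shown.

```python
import numpy as np

def pca_filters(images, k=5, n_filters=6, n_patches=10000, rng=None):
    """Derive conv kernels from the leading principal components of image patches."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, h, w = images.shape  # assumes a stack of grayscale images
    # Sample random k x k patches from random images and flatten them
    ys = rng.integers(0, h - k + 1, n_patches)
    xs = rng.integers(0, w - k + 1, n_patches)
    idx = rng.integers(0, n, n_patches)
    patches = np.stack([images[i, y:y + k, x:x + k].ravel()
                        for i, y, x in zip(idx, ys, xs)])
    patches -= patches.mean(axis=0)                 # center the patch data
    # Eigendecomposition of the patch covariance matrix
    cov = patches.T @ patches / (n_patches - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_filters]]
    return top.T.reshape(n_filters, k, k)           # leading components as kernels
```

Because the kernels are eigenvectors of the patch covariance, they form an orthonormal set adapted to the dataset's statistics, in contrast to the data-independent Xavier and He schemes.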