Structured Deep Learning: Theory and Applications

Zhu, Fangqi

View/Open

ZHU-DISSERTATION-2020.pdf (4.810Mb)

Date

2020-08-11

Author

Zhu, Fangqi

Abstract

The growing volume of generated data has driven a broad range of research in big data and artificial intelligence. Despite the success of deep learning across a wide range of research areas, it runs into pitfalls in the following three problems:

* Data hungry: current models often need to be fed gigabytes, terabytes, or even petabytes of data, and they overfit easily.
* Hard to generalize: deep learning constructs representations that memorize the training data rather than generalize to unseen scenarios.
* Missing critical information: off-the-shelf deep learning frameworks may not fully utilize the underlying information in the data.

We investigate the above problems to find the underlying structural information in the data and in the framework of the neural network, and try to understand how the two interact with each other. In this study, two typical structured learning settings are mainly investigated: time series and graph data.

The long-standing vanishing and exploding gradient problems create a convergence bottleneck that impedes memory retention in sequential models. We formulate this problem by considering the variation of the weights in the recurrent weight matrix and use the constraints of the Stiefel manifold to design decomposition methods. Two orthogonally constrained recurrent neural networks (OCRNNs) and their corresponding training algorithms are proposed and shown to outperform previous structures on a synthetic memory-keeping dataset.

To verify the performance of the OCRNNs on real problems, we investigate throat polyp detection based on acoustic sampling data. After the acoustic data are preprocessed using a standard time-frequency expansion, the processed features are fed to the OCRNNs. The OCRNN can save 1/10 of the total number of parameters compared to the standard RNN while achieving competitive performance, which demonstrates its computational efficiency.
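The motivation for constraining the recurrent weight matrix to be orthogonal can be illustrated with a small sketch. It is not the dissertation's OCRNN or its training algorithm. It assumes a purely linear recurrence h_t = W h_{t-1} with no nonlinearity, uses a 2x2 rotation matrix as the simplest example of an orthogonal matrix, and all function names are hypothetical. It only shows that a gradient pushed back through many time steps keeps its norm when W is orthogonal, and shrinks or grows geometrically when W is scaled away from orthogonality.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def rotation(theta):
    """2x2 rotation matrix: a simple orthogonal matrix (W^T W = I)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def scaled(m, k):
    """Scale every entry of m by k, breaking orthogonality when k != 1."""
    return [[k * a for a in row] for row in m]


def backprop_norm(w, steps, grad=(1.0, 0.0)):
    """Norm of a gradient after `steps` backward steps through h_t = W h_{t-1}.

    Assumption: linear recurrence, so each backward step multiplies by W^T.
    """
    wt = transpose(w)
    g = list(grad)
    for _ in range(steps):
        g = matvec(wt, g)
    return norm(g)


q = rotation(0.3)
orthogonal_norm = backprop_norm(q, 100)             # stays close to 1
shrinking_norm = backprop_norm(scaled(q, 0.9), 100)  # vanishes like 0.9**100
growing_norm = backprop_norm(scaled(q, 1.1), 100)    # explodes like 1.1**100
print(orthogonal_norm, shrinking_norm, growing_norm)
```

In a real recurrent network the nonlinearity and the input weights also affect gradient flow, so this sketch isolates only the contribution of the recurrent matrix.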
A mixed model combining a one-dimensional convolutional neural network (1dCNN) with the OCRNN is deployed on the sleep stage scoring problem based on EEG signals. The 1dCNN is leveraged to extract fine and coarse time-frequency feature maps for the downstream OCRNN. We demonstrate that the 1dCNN-OCRNN reduces both time complexity and space complexity and reaches comparatively better performance than previous benchmark results.

A graph neural network based model for image anomaly detection is introduced in the final part. Superpixels serve as the mapping from the pixel-wise image to graph features and topology information. A hierarchical variational graph autoencoder with pooling and upsampling operations is adopted for semi-supervised anomaly detection. The feasibility of this model is demonstrated on road surface anomaly detection, where it shows competitive ability and reduced storage overhead for the neural network.
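The superpixel-to-graph mapping mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's pipeline: the superpixel label map is given by hand rather than produced by a segmentation algorithm, the node feature is simply the mean intensity of each superpixel, and the function name `superpixels_to_graph` is hypothetical. Each superpixel becomes a node, and two nodes are connected when their superpixels share a pixel boundary.

```python
def superpixels_to_graph(image, labels):
    """Turn a grayscale image and a superpixel label map into a graph.

    image:  2D list of pixel intensities
    labels: 2D list of superpixel ids, same shape as image
    Returns (features, edges): mean intensity per superpixel id, and a
    sorted list of undirected edges between touching superpixels.
    """
    sums, counts, edges = {}, {}, set()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            sums[a] = sums.get(a, 0.0) + image[r][c]
            counts[a] = counts.get(a, 0) + 1
            # Compare with the right and lower neighbours (4-connectivity).
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    features = {k: sums[k] / counts[k] for k in sums}
    return features, sorted(edges)


# Toy 3x4 image with three hand-made superpixels (ids 0, 1, 2).
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
]
features, edges = superpixels_to_graph(image, labels)
print(features)
print(edges)
```

The resulting node features and edge list are the kind of input a graph autoencoder would consume. Richer node features (color statistics, position) are a natural extension but are not taken from the source.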

URI

http://hdl.handle.net/10106/29443