Show simple item record

dc.contributor.advisor: Huber, Manfred
dc.creator: Vadhera, Raghav
dc.date.accessioned: 2023-06-14T17:06:46Z
dc.date.available: 2023-06-14T17:06:46Z
dc.date.created: 2023-05
dc.date.issued: 2023-05-19
dc.date.submitted: May 2023
dc.identifier.uri: http://hdl.handle.net/10106/31257
dc.description.abstract: Deep learning has emerged as an increasingly valuable tool, employed across a myriad of applications. However, the sensitivity of deep learning systems to the choice of network architecture makes them challenging for non-experts to harness, highlighting the need for automatic network architecture optimization. Prior research predominantly optimizes a network for a single problem through architecture search, which requires extensive training of many candidate architectures during optimization.

To address this issue and unlock transferability across tasks, this dissertation presents an approach that employs Reinforcement Learning to learn a network optimization policy over an abstract problem and architecture embedding. Because the policy operates on this abstract embedding, it can be transferred across problems, allowing networks to be optimized for new problems with minimal additional training.

Initial evaluations were conducted on a standard classification problem, demonstrating the method's effectiveness in optimizing architectures for a specific problem within a given range of fully connected networks, and indicating its potential to learn efficient policies for automatic network architecture improvement. Subsequent experiments on a variety of more complex problems further showcased the approach's capacity to optimize architectures effectively.

Siamese networks were employed to establish a coherent embedding of the network architecture space. In conjunction with a problem-specific feature vector that captures the characteristics of the problem, the Reinforcement Learning agent was able to acquire a transferable policy for deriving high-performing network architectures across a broad spectrum of problems.

This dissertation ultimately proposes a novel method for optimizing architectures for diverse problems within a range of more complex networks. Experiments reveal that the proposed system learns an embedding space and policy that can derive and optimize network architectures approaching optimality, even for previously unencountered problems. Multiple datasets, each with its own feature vector representing a distinct entity or problem, were used, optimizing one problem at a time. A random initial policy was employed to construct trajectories in the embedding space during training. To assess the performance and functionality of the various network components, a series of pre-training steps was undertaken, each focusing on a distinct component and examining the outcome before training subsequent components.

Building on these foundations, the dissertation examines the scalability of the method to larger and more intricate network architectures, such as convolutional and recurrent neural networks, with the intent of broadening its applicability across a diverse array of problem domains. It further explores integrating the learned optimization policy with other optimization techniques, such as gradient-based methods and evolutionary algorithms, to bolster the efficiency and effectiveness of network architecture optimization.

To validate the generalizability of the learned policies, the dissertation examines their performance on real-world problems spanning various industries and domains, including healthcare, finance, sports, human psychology, and automotive applications. These case studies demonstrate the practical utility of the proposed approach in addressing real-world challenges and uncover areas for further refinement and improvement.

In addition to these empirical investigations, the dissertation discusses the theoretical underpinnings of the method, examining the convergence properties, stability, and robustness of the learned policies. These investigations provide insight into the factors that influence policy transferability and optimization performance across diverse problem domains, offering guidance for future research in deep learning and network architecture optimization.

In conclusion, this dissertation presents a novel and effective approach for optimizing deep learning network architectures through Reinforcement Learning, abstract problem embeddings, and transferable policies. Through rigorous experimentation and evaluation, the research demonstrates the potential of this method to significantly improve the optimization process for deep learning systems, making them more accessible and efficient for experts and non-experts alike. By bridging the gap between complex network architectures and real-world applications, this approach paves the way for advancements in deep learning and AI-driven solutions across various industries and domains.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.subject: Reinforcement learning
dc.subject: Deep learning
dc.subject: NAO
dc.subject: TD3
dc.subject: Agent
dc.subject: Actor
dc.subject: Critic
dc.subject: Siamese
dc.title: Neural Network Architecture Optimization Using Reinforcement Learning
dc.type: Thesis
dc.date.updated: 2023-06-14T17:06:46Z
thesis.degree.department: Computer Science and Engineering
thesis.degree.grantor: The University of Texas at Arlington
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy in Computer Science
dc.type.material: text
dc.creator.orcid: 0009-0006-0679-3904