Learning and control for complex multiagent systems
Date: 2023-12-06
Author: Donge, Vrushabh S (ORCID: 0000-0003-0606-2803)

Abstract
**Please note that the full text is embargoed until 02/01/2025.**

Complex multiagent systems (MASs) are pervasive in various fields, from power system networks and autonomous robotics to traffic management, where groups of agents
interact to achieve collective objectives. Effective coordination and control of such systems
pose significant challenges due to their inherent complexity and the need for adaptive, efficient strategies. This dissertation explores combining data-driven approaches, reinforcement
learning (RL), and control theory to address these challenges. We present novel methodologies for learning and controlling complex MASs, emphasizing the development
of algorithms that can autonomously adapt to dynamic environments, collaborate
with other agents, and optimize system-wide performance. Our findings offer promising insights into creating intelligent MASs that can operate efficiently and effectively in diverse
applications.
This thesis navigates the intricate realm of large-scale systems, focusing on MASs and complex nonlinear structures. It introduces innovative methodologies rooted in inverse
RL to tackle challenges ranging from uncovering unknown cost functions to enabling data-efficient optimal control within MAS frameworks.
The research begins with an inverse RL algorithm designed for graphical apprentice games in MASs. This algorithm alternates between an inner-loop optimal control update and an outer-loop inverse optimal control (IOC) update as subproblems; the reward functions that the learner MAS recovers are proven to be both stabilizing and non-unique. A simulation study of a DC microgrid validates the effectiveness of this approach.

Expanding the scope, the thesis applies decomposition principles to discrete-time RL for optimal control of networked subsystems. Here, a model-free algorithm based on online behaviors is enhanced with dynamic mode decomposition (DMD) to handle larger networks, validated on consensus and power system networks.
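To make the DMD step concrete, the following is a minimal illustrative sketch (not the thesis's implementation): given snapshot pairs from a dynamical system, DMD fits a low-rank linear operator via a truncated SVD and extracts its eigenvalues and modes. The toy 2-state system and rank here are assumptions for demonstration.

```python
import numpy as np

def dmd(X, Xp, r):
    """Rank-r DMD of snapshot pairs: columns of Xp follow columns of X one step later."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # Projected operator: A_tilde = U* Xp V S^{-1}, a small r x r matrix.
    A_tilde = U.conj().T @ Xp @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(A_tilde)
    Phi = Xp @ Vh.conj().T @ np.diag(1.0 / s) @ W  # exact DMD modes
    return eigvals, Phi

# Toy check: snapshots from a known linear system x_{k+1} = A x_k;
# DMD should recover the eigenvalues of A (0.9 and 0.8).
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
x = rng.standard_normal(2)
snaps = [x]
for _ in range(20):
    x = A @ x
    snaps.append(x)
S = np.array(snaps).T
eigvals, Phi = dmd(S[:, :-1], S[:, 1:], r=2)
print(np.sort(eigvals.real))  # ≈ [0.8, 0.9]
```

The data-reduction benefit in the thesis comes from working with the small projected operator rather than the full networked state, which is what allows the approach to scale to larger networks.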
Additionally, the work advances a data-efficient model-free RL algorithm using Koopman operators for complex nonlinear systems. This methodology lifts the nonlinear system into a linear model, deriving an off-policy Bellman equation that reduces data requirements for optimal control learning. Validation within power system excitation control demonstrates its efficacy.

Furthermore, the thesis addresses reward-shaping challenges in large-scale MASs using inverse RL, proposing a scalable model-free algorithm. Leveraging DMD, this approach significantly diminishes data requirements while ensuring algorithm convergence, stability, and the non-uniqueness of state-reward weights. Validation in a large-scale consensus network confirms the method's efficacy through comparisons of data sizes and computational time for reward shaping.
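The Koopman lifting idea can be sketched as follows. This is an illustrative EDMD-style example, not the thesis's algorithm: the toy dynamics, the dictionary z = [x1, x2, x1²], and the parameter values are all assumptions chosen so the nonlinear system becomes exactly linear in the lifted coordinates, where the lifted operator K is then fit by least squares from data.

```python
import numpy as np

mu, lam = 0.9, 0.5

def step(x):
    # Nonlinear discrete dynamics with a known finite-dimensional
    # Koopman-invariant subspace spanned by x1, x2, x1^2.
    return np.array([mu * x[0], lam * x[1] + (mu**2 - lam) * x[0]**2])

def lift(x):
    # Dictionary of observables: the lifted state z = [x1, x2, x1^2].
    return np.array([x[0], x[1], x[0]**2])

# Collect lifted snapshot pairs and fit the linear operator K: z_{k+1} ≈ K z_k.
rng = np.random.default_rng(1)
Z, Zp = [], []
for _ in range(50):
    x = rng.uniform(-1, 1, 2)
    Z.append(lift(x))
    Zp.append(lift(step(x)))
Z, Zp = np.array(Z).T, np.array(Zp).T
K = Zp @ np.linalg.pinv(Z)
print(np.round(K, 3))
```

Once the dynamics are (approximately) linear in the lifted space, linear optimal control machinery such as an off-policy Bellman equation can be applied there, which is the source of the data-efficiency the thesis reports.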
Through these diverse methodologies and validations across various complex systems, this thesis not only contributes theoretical advancements but also offers practical solutions for managing, controlling, and shaping behaviors within intricate large-scale networked systems.