Learning and control for complex multiagent systems
Date: 2023-12-06
Author: Donge, Vrushabh S (ORCID: 0000-0003-0606-2803)

Abstract
**Please note that the full text is embargoed until 02/01/2025.**

Complex multiagent systems (MASs) are pervasive in various fields, from power system networks and autonomous robotics to traffic management, where groups of agents
interact to achieve collective objectives. Effective coordination and control of such systems
pose significant challenges due to their inherent complexity and the need for adaptive, efficient strategies. This dissertation explores combining data-driven approaches, reinforcement
learning (RL), and control theory to address these challenges. We present novel methodologies for learning and controlling complex MASs, emphasizing the development
of algorithms that can autonomously adapt to dynamic environments, collaborate
with other agents, and optimize system-wide performance. Our findings offer promising insights into creating intelligent MASs that can operate efficiently and effectively in diverse
applications.
This thesis navigates the intricate realm of large-scale systems, focusing on MASs and complex nonlinear structures. It introduces innovative methodologies rooted in inverse
RL to tackle challenges ranging from uncovering unknown cost functions to enabling data-efficient optimal control within MAS frameworks.
The research begins with an inverse RL algorithm designed for graphical apprentice games in MASs. This algorithm alternates between an inner-loop optimal control update and an outer-loop inverse optimal control (IOC) update as subproblems; the reward functions that the learner MAS recovers are proven to be both stabilizing and non-unique. A simulation study of a DC microgrid validates the effectiveness of this approach.

Expanding the scope, the thesis applies decomposition principles to discrete-time RL for optimal control of networked subsystems. Here, a model-free algorithm based on online behaviors is enhanced with dynamic mode decomposition (DMD) to handle larger networks, validated on consensus and power system networks.
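To make the DMD step concrete, the following is a minimal illustrative sketch (not the thesis's implementation): given snapshot pairs from a dynamical system, DMD fits a low-rank linear operator via a truncated SVD and extracts its eigenvalues and modes. The toy 2-state system and rank here are assumptions for demonstration.

```python
import numpy as np

def dmd(X, Xp, r):
    """Rank-r DMD of snapshot pairs: columns of Xp follow columns of X one step later."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # Projected operator: A_tilde = U* Xp V S^{-1}, a small r x r matrix.
    A_tilde = U.conj().T @ Xp @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(A_tilde)
    Phi = Xp @ Vh.conj().T @ np.diag(1.0 / s) @ W  # exact DMD modes
    return eigvals, Phi

# Toy check: snapshots from a known linear system x_{k+1} = A x_k;
# DMD should recover the eigenvalues of A (0.9 and 0.8).
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
x = rng.standard_normal(2)
snaps = [x]
for _ in range(20):
    x = A @ x
    snaps.append(x)
S = np.array(snaps).T
eigvals, Phi = dmd(S[:, :-1], S[:, 1:], r=2)
print(np.sort(eigvals.real))  # ≈ [0.8, 0.9]
```

The data-reduction benefit in the thesis comes from working with the small projected operator rather than the full networked state, which is what allows the approach to scale to larger networks.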
Additionally, the work advances a data-efficient model-free RL algorithm using Koopman operators for complex nonlinear systems. This methodology lifts the nonlinear system into a linear model, deriving an off-policy Bellman equation that reduces data requirements for optimal control learning. Validation within power system excitation control demonstrates its efficacy.

Furthermore, the thesis addresses reward-shaping challenges in large-scale MASs using inverse RL, proposing a scalable model-free algorithm. Leveraging DMD, this approach significantly diminishes data requirements while ensuring algorithm convergence, stability, and the non-uniqueness of state-reward weights. Validation in a large-scale consensus network confirms the method's efficacy through comparisons of data sizes and computational time for reward shaping.
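The Koopman lifting idea can be sketched as follows. This is an illustrative EDMD-style example, not the thesis's algorithm: the toy dynamics, the dictionary z = [x1, x2, x1²], and the parameter values are all assumptions chosen so the nonlinear system becomes exactly linear in the lifted coordinates, where the lifted operator K is then fit by least squares from data.

```python
import numpy as np

mu, lam = 0.9, 0.5

def step(x):
    # Nonlinear discrete dynamics with a known finite-dimensional
    # Koopman-invariant subspace spanned by x1, x2, x1^2.
    return np.array([mu * x[0], lam * x[1] + (mu**2 - lam) * x[0]**2])

def lift(x):
    # Dictionary of observables: the lifted state z = [x1, x2, x1^2].
    return np.array([x[0], x[1], x[0]**2])

# Collect lifted snapshot pairs and fit the linear operator K: z_{k+1} ≈ K z_k.
rng = np.random.default_rng(1)
Z, Zp = [], []
for _ in range(50):
    x = rng.uniform(-1, 1, 2)
    Z.append(lift(x))
    Zp.append(lift(step(x)))
Z, Zp = np.array(Z).T, np.array(Zp).T
K = Zp @ np.linalg.pinv(Z)
print(np.round(K, 3))
```

Once the dynamics are (approximately) linear in the lifted space, linear optimal control machinery such as an off-policy Bellman equation can be applied there, which is the source of the data-efficiency the thesis reports.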
Through these diverse methodologies and validations across various complex systems, this thesis not only contributes theoretical advancements but also offers practical solutions for managing, controlling, and shaping behaviors within intricate large-scale networked systems.