Title
Reinforcement Learning of Structured Control for Linear Systems with Unknown State Matrix
Authors
Abstract
This paper delves into designing stabilizing feedback control gains for continuous-time linear systems with an unknown state matrix, in which the control is subject to a general structural constraint. We combine ideas from reinforcement learning (RL) with sufficient stability and performance guarantees in order to design these structured gains using trajectory measurements of states and controls. We first formulate a model-based framework using dynamic programming (DP) to embed the structural constraint into the Linear Quadratic Regulator (LQR) gain computation in the continuous-time setting. Subsequently, we transform this LQR formulation into a policy iteration RL algorithm that alleviates the requirement of a known state matrix while maintaining the feedback gain structure. Theoretical guarantees are provided for the stability and convergence of the structured RL (SRL) algorithm. The introduced RL framework is general and can be applied to any control structure. A special control structure enabled by this RL framework is distributed learning control, which is necessary for many large-scale cyber-physical systems. As such, we validate our theoretical results with numerical simulations on a multi-agent networked linear time-invariant (LTI) dynamic system.
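The model-based DP step described in the abstract can be illustrated with a minimal sketch of continuous-time LQR policy iteration (Kleinman's method) with a structural constraint. This is an assumption-laden illustration, not the paper's SRL algorithm: the paper's method is data-driven and does not use the state matrix A, and here a hypothetical hard masking projection stands in for the paper's structure-preserving gain update.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov


def structured_policy_iteration(A, B, Q, R, K0, mask, iters=50):
    """Model-based policy iteration for LQR with a sparsity mask on the gain.

    NOTE: illustrative sketch only; the hard projection `* mask` is a
    hypothetical stand-in for a structure-preserving update, and K0 must
    be stabilizing and conform to the mask.
    """
    K = K0.copy()
    for _ in range(iters):
        Acl = A - B @ K
        # Policy evaluation: solve Acl^T P + P Acl + Q + K^T R K = 0
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        # Policy improvement, then project onto the desired structure
        K = np.linalg.solve(R, B.T @ P) * mask
    return K


# Example: two decoupled unstable agents with a diagonal (distributed) gain
A = np.diag([0.1, 0.2])      # unstable open loop
B = np.eye(2)
Q, R = np.eye(2), np.eye(2)
mask = np.eye(2)             # each agent uses only its own state
K0 = 2.0 * np.eye(2)         # initial stabilizing structured gain
K = structured_policy_iteration(A, B, Q, R, K0, mask)
```

For this decoupled example the masked iteration recovers the per-agent scalar Riccati gains, and the closed-loop matrix A - BK is Hurwitz; for coupled dynamics a naive projection like this is not guaranteed to converge, which is part of what the paper's formulation addresses.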