Paper Title
Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations
Authors
Abstract
Competitive, non-cooperative online decision-making agents whose actions increase the congestion of scarce resources serve as a model for many modern large-scale applications. To ensure sustainable resource behavior, we introduce a novel method that steers the agents toward a stable population state satisfying the given coupled resource constraints. The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian. Assuming that the online learning agents have access only to noisy first-order utility feedback, we show that, for polynomially decaying step sizes (learning rates) of the agents, the population dynamics converge almost surely to a generalized Nash equilibrium. A particular consequence is that the resource constraints are fulfilled in the asymptotic limit. Moreover, we investigate the finite-time quality of the proposed algorithm by giving a nonasymptotic, time-decaying bound on the expected amount of resource constraint violation.
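
The pricing idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: it assumes a single shared resource, a hypothetical quadratic utility u_i(x) = a_i x_i - 0.5 x_i^2 - c x_i * sum_j x_j, Gaussian gradient noise, and a plain dual-ascent price update in place of the augmented-Lagrangian scheme. The function name `simulate` and all of its parameters are illustrative assumptions. Each agent takes a projected gradient step on its own priced utility using only a noisy gradient, while the price rises when the total load exceeds the capacity, with a step size that decays polynomially in time.

```python
import numpy as np

def simulate(a, capacity, congestion=0.1, noise_std=0.1,
             steps=20000, decay=0.6, base_step=0.5, seed=0):
    """Toy priced congestion game (illustrative assumptions throughout)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)   # each agent's action, kept in [0, 1]
    price = 0.0            # nonnegative price of the shared resource

    for t in range(steps):
        # Polynomially decaying step size / learning rate.
        gamma = base_step / (t + 1) ** decay

        load = x.sum()
        # Exact partial derivative of the assumed utility w.r.t. x_i.
        grad = a - x - congestion * (load + x)
        # Agents only observe a noisy version of that derivative.
        noisy_grad = grad + noise_std * rng.standard_normal(x.shape)

        # Agents: projected gradient ascent on utility minus price.
        x = np.clip(x + gamma * (noisy_grad - price), 0.0, 1.0)
        # Pricing: projected ascent on the constraint violation.
        price = max(0.0, price + gamma * (load - capacity))

    return x, price
```

For instance, `simulate([1.0, 0.8, 0.6], capacity=1.0)` returns the final actions and price; in this toy setting the total load `x.sum()` should settle near the capacity, mirroring the asymptotic constraint fulfillment claimed in the abstract.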