Paper Title

A game-theoretic analysis of networked system control for common-pool resource management using multi-agent reinforcement learning

Authors

Arnu Pretorius, Scott Cameron, Elan van Biljon, Tom Makkink, Shahil Mawjee, Jeremy du Plessis, Jonathan Shock, Alexandre Laterre, Karim Beguir

Abstract

Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control. Arguably, one of the most difficult and important tasks to which large-scale networked system control applies is common-pool resource management. Crucial common-pool resources include arable land, fresh water, wetlands, wildlife, fish stock, forests and the atmosphere, whose proper management is tied to some of society's greatest challenges, such as food security, inequality and climate change. Here we take inspiration from a recent research program investigating the game-theoretic incentives of humans in social dilemma situations such as the well-known tragedy of the commons. However, instead of focusing on biologically evolved human-like agents, our concern is to better understand the learning and operating behaviour of engineered networked systems comprising general-purpose reinforcement learning agents, subject only to nonbiological constraints such as memory, computation and communication bandwidth. Harnessing tools from empirical game-theoretic analysis, we analyse the differences in resulting solution concepts that stem from employing different information structures in the design of networked multi-agent systems. These information structures pertain to the type of information shared between agents as well as the communication protocol and network topology employed. Our analysis contributes new insights into the consequences of certain design choices and provides an additional dimension of comparison between systems beyond efficiency, robustness, scalability and mean control performance.
