DecisionHoldem: Safe Depth-Limited Solving With Diverse Opponents for Imperfect-Information Games

Paper title

DecisionHoldem: Safe Depth-Limited Solving With Diverse Opponents for Imperfect-Information Games

Authors

Qibin Zhou, Dongdong Bai, Junge Zhang, Fuqing Duan, Kaiqi Huang

Abstract

An imperfect-information game is a game in which players hold asymmetric information. Such games are more common in real life than perfect-information games. Artificial intelligence (AI) for imperfect-information games such as poker has made considerable progress in recent years. The success of superhuman poker AIs such as Libratus and DeepStack has drawn researchers' attention to poker. However, the lack of open-source code has limited the development of Texas hold'em AI to some extent. This article introduces DecisionHoldem, a high-level AI for heads-up no-limit Texas hold'em that uses safe depth-limited subgame solving and considers the possible ranges of the opponent's private hands to reduce the exploitability of its strategy. Experimental results show that DecisionHoldem defeats Slumbot, the strongest openly available agent in heads-up no-limit Texas hold'em, and OpenStack, a high-level reproduction of DeepStack, by more than 730 mbb/h and 700 mbb/h, respectively (mbb/h: thousandths of a big blind per hand). Moreover, we release the source code and tools of DecisionHoldem to promote AI development in imperfect-information games.
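
For readers unfamiliar with the mbb/h unit used in the abstract, the following is a minimal sketch of how such a win rate can be computed from match totals. The function name `win_rate_mbb_per_hand` and the match figures (chip total, big blind size, hand count) are hypothetical and chosen only to illustrate the arithmetic; they are not data from the paper.

```python
def win_rate_mbb_per_hand(total_chips_won: float, big_blind: float, hands: int) -> float:
    """Return the average win rate in milli-big-blinds per hand (mbb/h).

    total_chips_won: net chips won over the whole match (negative if losing)
    big_blind:       size of the big blind, in chips
    hands:           number of hands played
    """
    if hands <= 0:
        raise ValueError("hands must be positive")
    if big_blind <= 0:
        raise ValueError("big_blind must be positive")
    # Convert chips to big blinds, average per hand, then scale to thousandths.
    return 1000.0 * total_chips_won / (big_blind * hands)


# Hypothetical match (illustrative numbers only, not taken from the paper):
# big blind of 100 chips, 20,000 hands, net result of +1,460,000 chips.
rate = win_rate_mbb_per_hand(1_460_000, 100, 20_000)
print(rate)  # 730.0
```

Under these assumed numbers, winning 0.73 big blinds per hand on average corresponds to 730 mbb/h.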

Paper title