Paper title
Human-Agent Cooperation in Bridge Bidding
Paper authors
Abstract
We introduce a human-compatible reinforcement-learning approach to a cooperative game, making use of a third-party hand-coded human-compatible bot to generate initial training data and to perform initial evaluation. Our learning approach consists of imitation learning, search, and policy iteration. Our trained agents achieve a new state of the art for bridge bidding in three settings: an agent playing in partnership with a copy of itself; an agent partnering a pre-existing bot; and an agent partnering a human player.