Paper Title

Efficient Wasserstein Natural Gradients for Reinforcement Learning

Authors

Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

Abstract

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL). The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that takes advantage of the geometry induced by a Wasserstein penalty to speed optimization. This method follows the recent theme in RL of including a divergence penalty in the objective to establish a trust region. Experiments on challenging tasks demonstrate improvements in both computational cost and performance over advanced baselines.
