A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

Paper Title

A Reinforcement Learning Approach to Estimating Long-term Treatment Effects

Authors

Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li

Abstract

Randomized experiments (a.k.a. A/B tests) are a powerful tool for estimating treatment effects and informing decision making in business, healthcare, and other applications. In many problems, the treatment has a lasting effect that evolves over time. A limitation of randomized experiments is that they do not easily extend to measuring long-term effects, since running long experiments is time-consuming and expensive. In this paper, we take a reinforcement learning (RL) approach that estimates the average reward in a Markov process. Motivated by real-world scenarios where the observed state transition is nonstationary, we develop a new algorithm for a class of nonstationary problems, and demonstrate promising results on two synthetic datasets and one online store dataset.
