论文标题
有效的内核UCB用于上下文匪徒
Efficient Kernel UCB for Contextual Bandits
论文作者
论文摘要
在本文中,我们解决了上下文匪徒中内核化UCB算法的计算效率。尽管标准方法需要一个O(CT^3)的复杂性,其中T是地平线,而常数C与优化UCB规则有关,但我们提出了一种有效的大规模问题上下文算法。具体而言,我们的方法依赖于上下文和动作的关节内核嵌入的增量nystrom近似值。这使我们能够达到O(CTM^2)的复杂性,其中M是Nystrom点的数量。要恢复与标准内核化UCB算法相同的遗憾,M需要处于问题的有效维度的顺序,最多是O(\ sqrt(t)),在某些情况下几乎是恒定的。
In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a O(CT^3) complexity where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nystrom approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2) where m is the number of Nystrom points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of order of the effective dimension of the problem, which is at most O(\sqrt(T)) and nearly constant in some cases.