Paper Title

Reinforcement Learning for Efficient and Tuning-Free Link Adaptation

Paper Authors

Vidit Saxena, Hugo Tullberg, Joakim Jaldén

Paper Abstract

Wireless links adapt the data transmission parameters to the dynamic channel state -- this is called link adaptation. Classical link adaptation relies on tuning parameters that are challenging to configure for optimal link performance. Recently, reinforcement learning has been proposed to automate link adaptation, where the transmission parameters are modeled as discrete arms of a multi-armed bandit. In this context, we propose a latent learning model for link adaptation that exploits the correlation between data transmission parameters. Further, motivated by the recent success of Thompson sampling for multi-armed bandit problems, we propose a latent Thompson sampling (LTS) algorithm that quickly learns the optimal parameters for a given channel state. We extend LTS to fading wireless channels through a tuning-free mechanism that automatically tracks the channel dynamics. In numerical evaluations with fading wireless channels, LTS improves the link throughput by up to 100% compared to the state-of-the-art link adaptation algorithms.
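The abstract frames link adaptation as a multi-armed bandit problem solved with Thompson sampling. As a rough illustration of the standard (non-latent) Thompson sampling baseline that the paper builds on, the sketch below picks a modulation-and-coding scheme (MCS) arm by Beta-Bernoulli posterior sampling over ACK/NACK feedback. The arm count, spectral efficiencies, and simulated channel are hypothetical placeholders; this is not the paper's LTS algorithm, which additionally exploits correlation between arms and tracks channel fading.

```python
import numpy as np

# Minimal sketch of classical Beta-Bernoulli Thompson sampling for MCS selection.
# NOT the paper's latent Thompson sampling (LTS): no latent channel model and no
# tuning-free tracking of channel dynamics. All constants below are assumptions.

rng = np.random.default_rng(0)

num_mcs = 8                                      # hypothetical number of MCS arms
spectral_eff = np.linspace(0.5, 6.0, num_mcs)    # assumed bits/s/Hz per arm
alpha = np.ones(num_mcs)                         # Beta prior: observed ACKs + 1
beta = np.ones(num_mcs)                          # Beta prior: observed NACKs + 1

def select_mcs():
    """Sample a success probability per arm, then pick the arm that maximizes
    expected throughput = sampled success probability * spectral efficiency."""
    theta = rng.beta(alpha, beta)
    return int(np.argmax(theta * spectral_eff))

def update(mcs, ack):
    """Update the chosen arm's posterior with the observed ACK/NACK feedback."""
    if ack:
        alpha[mcs] += 1
    else:
        beta[mcs] += 1

# Toy interaction loop against an assumed static channel (higher MCS -> lower
# success probability), just to show how selection and updates interleave.
true_success_prob = np.clip(1.2 - 0.15 * np.arange(num_mcs), 0.05, 0.95)
for t in range(1000):
    m = select_mcs()
    ack = rng.random() < true_success_prob[m]
    update(m, ack)
```

In a fading channel the reward distributions drift over time, which is why a fixed posterior like the one above eventually mistracks the channel; the paper's tuning-free mechanism is aimed at exactly that issue.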
