Sim2real for Reinforcement Learning Driven Next Generation Networks

Paper Title

Sim2real for Reinforcement Learning Driven Next Generation Networks

Authors

Li, Peizheng; Thomas, Jonathan; Wang, Xiaoyang; Erdol, Hakan; Ahmad, Abdelrahim; Inacio, Rui; Kapoor, Shipra; Parekh, Arjun; Doufexi, Angela; Shojaeifard, Arman; Piechocki, Robert

Abstract

The next generation of networks will actively embrace artificial intelligence (AI) and machine learning (ML) technologies for network automation and optimal network operation strategies. The emerging network architecture represented by Open RAN (O-RAN) conforms to this trend, and the RAN Intelligent Controller (RIC) at the centre of its specification serves as a host for ML applications. Various ML models, especially Reinforcement Learning (RL) models, are regarded as the key to solving RAN-related multi-objective optimization problems. However, most current RL successes are confined to abstract and simplified simulation environments, and these may not translate directly into high performance in complex real environments. One of the main reasons is the modelling gap between the simulation and the real environment, which can leave an RL agent trained in simulation ill-equipped for the real environment. This issue is termed the sim2real gap. This article brings to the fore the sim2real challenge within the context of O-RAN. Specifically, it emphasizes the characteristics and benefits that digital twins (DTs) could offer as a place for model development and verification. Several use cases are presented to exemplify and demonstrate the failure modes of simulation-trained RL models in real environments. The effectiveness of DTs in assisting the development of RL algorithms is discussed. Then the current state-of-the-art learning-based methods commonly used to overcome the sim2real challenge are presented. Finally, the development and deployment concerns for realising RL applications in O-RAN are discussed in terms of potential issues such as data interaction, environment bottlenecks, and algorithm design.
