Paper Title
Tele-Knowledge Pre-training for Fault Analysis
Paper Authors
Paper Abstract
In this work, we share our experience of tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents. To organize this knowledge from experts uniformly, we propose to create a Tele-KG (tele-knowledge graph). Using this valuable data, we further propose a tele-domain language pre-training model, TeleBERT, and its knowledge-enhanced version, the tele-knowledge re-training model KTeleBERT, which includes effective prompt hints, adaptive numerical data encoding, and two knowledge injection paradigms. Concretely, our proposal includes two stages: first, pre-training TeleBERT on 20 million tele-related corpora, and then re-training it on 1 million causal and machine-related corpora to obtain KTeleBERT. Our evaluation on multiple tasks related to fault analysis in tele-applications, including root-cause analysis, event association prediction, and fault chain tracing, shows that pre-training a language model with tele-domain data is beneficial for downstream tasks. Moreover, the KTeleBERT re-training further improves the performance of task models, highlighting the effectiveness of incorporating diverse tele-knowledge into the model.
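To make the two-stage pipeline concrete, below is a minimal sketch of how such a pre-train-then-re-train workflow could look with the Hugging Face transformers library. This is not the authors' released code: the masked-language-model objective, the bert-base-chinese backbone, and the corpus file names (tele_corpora.txt, causal_machine_corpora.txt) are all illustrative assumptions, and the abstract's knowledge-injection components (prompt hints, numerical encoding) are omitted here.

# Minimal sketch of the two-stage pipeline described in the abstract.
# Assumptions (not from the paper): a BERT-style backbone, a masked-LM
# objective for both stages, and hypothetical line-per-sample corpus files.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

def run_stage(corpus_path: str, output_dir: str) -> None:
    """One masked-LM training pass over a line-per-sample text corpus."""
    dataset = LineByLineTextDataset(
        tokenizer=tokenizer, file_path=corpus_path, block_size=128
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1),
        data_collator=collator,
        train_dataset=dataset,
    )
    trainer.train()
    trainer.save_model(output_dir)

# Stage 1: pre-train on the ~20M general tele-related corpora -> "TeleBERT".
run_stage("tele_corpora.txt", "telebert")
# Stage 2: continue training on the ~1M causal/machine-related corpora,
# where the paper additionally injects tele-knowledge -> "KTeleBERT".
run_stage("causal_machine_corpora.txt", "ktelebert")

Because the same model object is passed to both stages, the second call continues from the stage-1 weights, mirroring the abstract's "re-training" of TeleBERT into KTeleBERT.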