Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

Paper Title

Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

Authors

Jeiyoon Park, Yoonna Jang, Chanhee Lee, Heuiseok Lim

Abstract

The focus of this work is to investigate unsupervised approaches to overcome quintessential challenges in designing task-oriented dialog schemas: assigning intent labels to each dialog turn (intent clustering) and generating a set of intents based on the intent clustering methods (intent induction). We postulate that there are two salient factors for automatic induction of intents: (1) the clustering algorithm for intent labeling and (2) the user utterance embedding space. We compare existing off-the-shelf clustering models and embeddings based on the DSTC11 evaluation. Our extensive experiments demonstrate that the combined selection of utterance embedding and clustering method in the intent induction task should be carefully considered. We also show that pretrained MiniLM with Agglomerative clustering yields significant improvement in NMI, ARI, F1, accuracy and example coverage in intent induction tasks. The source code is available at https://github.com/Jeiyoon/dstc11-track2.
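
To make the pipeline described in the abstract concrete, the following is a minimal, standard-library-only sketch of agglomerative clustering over utterance embeddings, scored with the Adjusted Rand Index (ARI). It is not the authors' implementation: the toy 3-dimensional vectors stand in for real MiniLM embeddings, and the cosine distance, average linkage, fixed cluster count, and the helper names `cosine_distance`, `agglomerative`, and `adjusted_rand_index` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage choice).

    Returns one integer cluster label per input vector.
    """
    clusters = [[i] for i in range(len(vectors))]  # start from singletons

    def linkage(a, b):
        # Mean pairwise distance between the members of clusters a and b.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters (x < y always holds).
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index between two labelings of the same items."""
    sum_cells = sum(comb(n, 2) for n in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(n, 2) for n in Counter(truth).values())
    sum_cols = sum(comb(n, 2) for n in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(len(truth), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case: a single cluster everywhere
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy stand-ins for utterance embeddings (hypothetical values, two intents).
embeddings = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [1.0, 0.0, 0.1],  # intent 0
    [0.1, 0.9, 0.2], [0.0, 1.0, 0.1], [0.2, 0.8, 0.0],  # intent 1
]
gold = [0, 0, 0, 1, 1, 1]

pred = agglomerative(embeddings, n_clusters=2)
ari = adjusted_rand_index(gold, pred)
print(pred, ari)
```

In a realistic setting the embeddings would come from a pretrained sentence encoder and the number of intents would not be known in advance, so the cluster count (or a distance threshold) would itself need to be selected.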

Paper Title