Open Vocabulary Extreme Classification Using Generative Models

Paper Title

Open Vocabulary Extreme Classification Using Generative Models

Authors

Simig, Daniel; Petroni, Fabio; Yanki, Pouya; Popat, Kashyap; Du, Christina; Riedel, Sebastian; Yazdani, Majid

Abstract

The extreme multi-label classification (XMC) task aims to tag content with a subset of labels from an extremely large label set. The label vocabulary is typically defined in advance by domain experts and assumed to capture all necessary tags. However, in real-world scenarios this label set, although large, is often incomplete, and experts frequently need to refine it. To develop systems that simplify this process, we introduce the task of open vocabulary XMC (OXMC): given a piece of content, predict a set of labels, some of which may be outside the known tag set. Hence, in addition to not having training data for some labels (as is the case in zero-shot classification), models need to invent some labels on the fly. We propose GROOV, a fine-tuned seq2seq model for OXMC that generates the set of labels as a flat sequence and is trained using a novel loss independent of predicted label order. We show the efficacy of the approach by experimenting with popular XMC datasets, for which GROOV is able to predict meaningful labels outside the given vocabulary while performing on par with state-of-the-art solutions for known labels.
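
To make the OXMC setting concrete, the following is a minimal Python sketch of how a flat generated label sequence might be post-processed: it is split into a label set, the labels are separated into known and newly invented ones, and the prediction is scored in a way that ignores label order. This is an illustration only, not GROOV's implementation. The `;` separator, the normalization rule, and the helper names `parse_labels`, `split_by_vocabulary`, and `set_f1` are all assumptions introduced here, and the order-invariant score is an evaluation metric, not the training loss mentioned in the abstract.

```python
SEP = ";"  # hypothetical separator; the abstract does not specify one


def normalize(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace (assumed normalization)."""
    return " ".join(label.lower().split())


def parse_labels(flat: str, sep: str = SEP) -> list[str]:
    """Split a flat generated sequence into distinct labels, keeping first-seen order."""
    seen: dict[str, None] = {}
    for raw in flat.split(sep):
        label = normalize(raw)
        if label:  # skip empty segments such as ";;"
            seen.setdefault(label, None)
    return list(seen)


def split_by_vocabulary(labels: list[str], known: set[str]) -> tuple[list[str], list[str]]:
    """Separate predicted labels into those in the known tag set and novel ones."""
    known_norm = {normalize(k) for k in known}
    in_vocab = [label for label in labels if label in known_norm]
    novel = [label for label in labels if label not in known_norm]
    return in_vocab, novel


def set_f1(predicted: list[str], gold: list[str]) -> float:
    """F1 between two label sets; the order of labels has no effect on the score."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

For example, parsing the hypothetical output `"machine learning; NLP ;text classification"` against a known tag set containing only `"NLP"` and `"Machine Learning"` would place `text classification` in the novel group, which is the kind of out-of-vocabulary label the task asks models to propose.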
