Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context

Paper Title

Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context

Authors

Hankyol Lee, Youngjae Yu, Gunhee Kim

Abstract

We present a novel data augmentation technique, CRA (Contextual Response Augmentation), which uses conversational context to generate meaningful training samples. We also mitigate the problem of unbalanced context lengths by changing the model's input-output format so that it can handle varying context lengths effectively. Our proposed model, trained with this data augmentation technique, participated in the sarcasm detection task of FigLang2020 and achieved the best performance on both the Reddit and Twitter datasets.

