Accuracy on In-Domain Samples Matters When Building Out-of-Domain detectors: A Reply to Marek et al. (2021)

Paper title

Accuracy on In-Domain Samples Matters When Building Out-of-Domain detectors: A Reply to Marek et al. (2021)

Authors

Yinhe Zheng, Guanyi Chen

Abstract

We have noticed that Marek et al. (2021) try to re-implement our paper, Zheng et al. (2020a), in their work "OodGAN: Generative Adversarial Network for Out-of-Domain Data Generation". Our paper proposes a model that generates pseudo-OOD samples resembling in-domain (IND) input utterances. These pseudo-OOD samples can be used to improve OOD detection performance by optimizing an entropy regularization term when building the IND classifier. Marek et al. (2021) report a large gap between their re-implemented results and ours on the CLINC150 dataset (Larson et al., 2019). This paper discusses some key observations that may have led to such a large gap. Most of these observations originate from our own experiments, because Marek et al. (2021) have not released their code. One of the most important observations is that stronger IND classifiers usually exhibit a more robust ability to detect OOD samples. We hope these observations help other researchers, including Marek et al. (2021), to develop better OOD detectors in their applications.
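
The abstract does not spell out the entropy regularization term, so the following is only a minimal sketch of the general idea, not the loss used by Zheng et al. (2020a) or Marek et al. (2021). It assumes a classifier that outputs logits over the IND intents, a cross-entropy loss on labeled IND utterances, and an entropy bonus on pseudo-OOD utterances that pushes their predictions toward uniform. The function names, the `weight` parameter, and the max-probability confidence score are all hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_loss(ind_logits, ind_labels, ood_logits, weight=1.0):
    """Cross-entropy on IND samples minus weighted entropy on pseudo-OOD samples.

    Minimizing this value fits the IND labels while rewarding
    high-entropy (uncertain) predictions on the pseudo-OOD samples.
    `weight` is an assumed hyperparameter, not a value from the paper.
    """
    ce = sum(
        -math.log(softmax(logits)[label])
        for logits, label in zip(ind_logits, ind_labels)
    ) / len(ind_logits)
    ent = sum(entropy(softmax(logits)) for logits in ood_logits) / len(ood_logits)
    return ce - weight * ent

def confidence(logits):
    """Maximum softmax probability; a low value suggests an OOD input."""
    return max(softmax(logits))
```

Under these assumptions, a classifier trained this way would flag an utterance as OOD when `confidence` falls below a threshold chosen on held-out data. The score depends on the same logits that determine IND accuracy, which fits the abstract's observation that stronger IND classifiers tend to detect OOD samples better.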
