Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting

Paper title

Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting

Authors

Byeonggeun Kim, Seunghan Yang, Inseop Chung, Simyung Chang

Abstract

Keyword spotting is the task of detecting a keyword in streaming audio. Conventional keyword spotting targets classification over predefined keywords, but there is growing attention to few-shot (query-by-example) keyword spotting, e.g., N-way classification given M-shot support samples. Moreover, in real-world scenarios, there can be utterances from unexpected categories (open-set), which need to be rejected rather than classified as one of the N classes. Combining the two needs, we tackle few-shot open-set keyword spotting with a new benchmark setting, named splitGSC. We propose episode-known dummy prototypes based on metric learning to better detect open-set samples and introduce a simple and powerful approach, Dummy Prototypical Networks (D-ProtoNets). Our D-ProtoNets show clear margins over recent few-shot open-set recognition (FSOSR) approaches on the suggested splitGSC. We also verify our method on a standard benchmark, miniImageNet, where D-ProtoNets show the state-of-the-art open-set detection rate in FSOSR.
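
The N-way, M-shot setting with open-set rejection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the dummy prototype is produced, so here it is simply passed in by the caller, and the function names, the 2-D toy embeddings, and the rule "reject when the dummy is the nearest prototype" are all assumptions made for illustration.

```python
from statistics import fmean


def prototype(shots):
    """Class prototype: the per-dimension mean of the M support embeddings."""
    return [fmean(dim) for dim in zip(*shots)]


def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_open_set(query, support, dummy):
    """Return the label of the nearest class prototype, or None when the
    dummy prototype is nearer than every class prototype (open-set reject).

    `support` maps each of the N labels to its M support embeddings.
    `dummy` is a hypothetical stand-in for a dummy prototype; how the paper
    obtains it is not stated in the abstract.
    """
    protos = {label: prototype(shots) for label, shots in support.items()}
    best = min(protos, key=lambda label: sq_dist(query, protos[label]))
    if sq_dist(query, dummy) < sq_dist(query, protos[best]):
        return None
    return best


# Toy 2-way, 2-shot episode with made-up 2-D embeddings.
support = {
    "yes": [[0.0, 0.0], [0.2, -0.2]],
    "no": [[4.0, 0.0], [3.8, 0.2]],
}
dummy = [2.0, 0.0]  # placeholder value chosen by hand, not learned

print(classify_open_set([0.0, 0.1], support, dummy))  # near the "yes" prototype
print(classify_open_set([2.0, 3.0], support, dummy))  # far from both classes
```

In this toy episode the first query lies next to the "yes" prototype and is assigned to it, while the second query is closer to the dummy than to either class prototype and is rejected. In a real system the embeddings would come from a trained audio encoder rather than hand-written vectors.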
