Paper Title
Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting
Paper Authors
Paper Abstract
Keyword spotting is the task of detecting a keyword in streaming audio. Conventional keyword spotting targets the classification of predefined keywords, but there is growing attention to few-shot (query-by-example) keyword spotting, e.g., N-way classification given M-shot support samples. Moreover, in real-world scenarios, there can be utterances from unexpected categories (open-set) that need to be rejected rather than classified as one of the N classes. Combining the two needs, we tackle few-shot open-set keyword spotting with a new benchmark setting named splitGSC. We propose episode-known dummy prototypes based on metric learning to better detect open-set samples, and introduce a simple and powerful approach, Dummy Prototypical Networks (D-ProtoNets). D-ProtoNets show clear margins over recent few-shot open-set recognition (FSOSR) approaches on the suggested splitGSC. We also verify our method on a standard benchmark, miniImageNet, where D-ProtoNets achieve the state-of-the-art open-set detection rate in FSOSR.
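The N-way, M-shot open-set setting described in the abstract can be illustrated with a minimal sketch. It assumes embeddings are already computed, uses squared Euclidean distance as in standard prototypical networks, and rejects a query when an extra "dummy" prototype is its nearest neighbor. The abstract does not say how the dummy prototype is obtained, so the centroid of the class prototypes is used here purely as a hypothetical stand-in; the function names `class_prototypes` and `classify_open_set` are likewise illustrative and not taken from the paper.

```python
import numpy as np

def class_prototypes(support, labels, n_way):
    """Mean support embedding per class, shape (n_way, dim)."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def classify_open_set(queries, prototypes, dummy):
    """Nearest-prototype class per query, or -1 when the dummy is nearest."""
    n_way = prototypes.shape[0]
    all_protos = np.vstack([prototypes, dummy[None, :]])  # (n_way + 1, dim)
    # Squared Euclidean distance from every query to every prototype.
    dists = ((queries[:, None, :] - all_protos[None, :, :]) ** 2).sum(axis=-1)
    nearest = dists.argmin(axis=1)
    return np.where(nearest == n_way, -1, nearest)

# Toy 2-way, 2-shot episode with 2-D embeddings (made-up numbers).
support = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
labels = np.array([0, 0, 1, 1])
protos = class_prototypes(support, labels, n_way=2)

# ASSUMPTION: placeholder dummy prototype (centroid of the class prototypes).
# The paper's actual dummy prototypes are not specified in the abstract.
dummy = protos.mean(axis=0)

queries = np.array([[0.2, 1.0], [3.9, 1.1], [2.0, 1.5]])
preds = classify_open_set(queries, protos, dummy)
print(preds)
```

In this toy episode the first two queries lie next to a class prototype and receive that class index, while the third lies closest to the dummy prototype and is rejected with -1.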