多类原型算法在度量空间中的普遍一致性和收敛速率

论文标题

多类原型算法在度量空间中的普遍一致性和收敛速率

Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces

论文作者

Györfi, László, Weiss, Roi

论文摘要

我们研究了简单的最近邻居原型规则的普遍一致性和收敛速率，这些规则是指标速度中多类分类问题的问题。我们首先表明，在任何承认普遍一致的规则的度量空间中，一种名为Proto-NN的新颖的数据依赖性分区规则都是普遍一致的。 Proto-NN是OptInet的重要简化，这是一种最近提出的基于压缩的算法，迄今为止，它是唯一已知在这种一般环境中普遍一致的算法。 Practically, Proto-NN is simpler to implement and enjoys reduced computational complexity. We then proceed to study convergence rates of the excess error probability. We first obtain rates for the standard $k$-NN rule under a margin condition and a new generalized-Lipschitz condition. The latter is an extension of a recently proposed modified-Lipschitz condition from $\mathbb R^d$ to metric spaces. Similarly to the modified-Lipschitz condition, the new condition avoids any boundness assumptions on the data distribution.在获得原始NN的价格时，我们表明，第二个原型规则在$ k $ -nn和Proto-NN之间杂交的速率与$ K $ -NN相同，同时享有与原始NN相同的计算优势。 However, as $k$-NN, this hybrid rule is not consistent in general.

We study universal consistency and convergence rates of simple nearest-neighbor prototype rules for the problem of multiclass classification in metric paces. We first show that a novel data-dependent partitioning rule, named Proto-NN, is universally consistent in any metric space that admits a universally consistent rule. Proto-NN is a significant simplification of OptiNet, a recently proposed compression-based algorithm that, to date, was the only algorithm known to be universally consistent in such a general setting. Practically, Proto-NN is simpler to implement and enjoys reduced computational complexity. We then proceed to study convergence rates of the excess error probability. We first obtain rates for the standard $k$-NN rule under a margin condition and a new generalized-Lipschitz condition. The latter is an extension of a recently proposed modified-Lipschitz condition from $\mathbb R^d$ to metric spaces. Similarly to the modified-Lipschitz condition, the new condition avoids any boundness assumptions on the data distribution. While obtaining rates for Proto-NN is left open, we show that a second prototype rule that hybridizes between $k$-NN and Proto-NN achieves the same rates as $k$-NN while enjoying similar computational advantages as Proto-NN. However, as $k$-NN, this hybrid rule is not consistent in general.

下载PDF全文

下载文献需遵守相关版权规定

论文标题