Paper title
RODIAN: Robustified Median
Paper authors
Paper abstract
We propose a robust method for averaging numbers contaminated by a large proportion of outliers. Our method, dubbed RODIAN, is inspired by the key idea of MINPRAN [1]: We assume that the outliers are uniformly distributed within the range of the data and we search for the region that is least likely to contain outliers only. The median of the data within this region is then taken as RODIAN. Our approach can accurately estimate the true mean of data with more than 50% outliers and runs in time $O(n\log n)$. Unlike other robust techniques, it is completely deterministic and does not rely on a known inlier error bound. Our extensive evaluation shows that RODIAN is much more robust than the median and the least-median-of-squares. This result also holds in the case of non-uniform outlier distributions.
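The idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the scoring function or the search procedure, so both are assumptions here. The hypothetical `rodian_sketch` treats all n points as if they were outliers drawn uniformly over the data range. It scores every window of sorted points by the binomial tail probability that a window of that relative width would receive at least as many points by chance, and returns the median of the lowest-scoring window. The helper `log_binom_tail` and the `min_points` parameter are likewise illustrative names. The brute-force search below takes roughly cubic time, not the $O(n\log n)$ reported for RODIAN.

```python
import math
import statistics


def log_binom_tail(n, k, p):
    """Return log P(X >= k) for X ~ Binomial(n, p), using log-sum-exp."""
    if k <= 0 or p >= 1.0:
        return 0.0            # the event is certain
    if p <= 0.0:
        return -math.inf      # a zero-width window cannot catch k >= 1 points
    terms = [
        math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
        + m * math.log(p) + (n - m) * math.log1p(-p)
        for m in range(k, n + 1)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def rodian_sketch(values, min_points=2):
    """Median of the window least likely to arise from uniform outliers alone.

    Assumption (not from the paper): a window is scored by the binomial tail
    probability above, and every contiguous window of sorted points is tried.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    span = xs[-1] - xs[0]
    if n < min_points or span == 0:
        return statistics.median(xs)

    best_score, best_window = math.inf, (0, n - 1)
    for i in range(n):
        for j in range(i + min_points - 1, n):
            k = j - i + 1                      # points inside the window
            width = (xs[j] - xs[i]) / span     # relative width in [0, 1]
            score = log_binom_tail(n, k, width)
            if score < best_score:
                best_score, best_window = score, (i, j)

    i, j = best_window
    return statistics.median(xs[i:j + 1])


# 21 inliers tightly clustered around 10, plus 39 outliers spread over the range.
inliers = [10.0 + 0.01 * i for i in range(-10, 11)]
outliers = [(i * 37 % 100) + 0.5 for i in range(39)]
data = inliers + outliers

print("plain median :", statistics.median(data))
print("sketch result:", rodian_sketch(data))
```

In this toy data set 65% of the points are outliers, so the plain median is pulled away from the inlier cluster, while the window score favors the dense cluster. Exact duplicates produce a zero-width window with score negative infinity, which this sketch selects at once. A real implementation would need a more careful rule for that case.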