Paper Title
Learning-to-Rank with Partitioned Preference: Fast Estimation for the Plackett-Luce Model
Paper Authors
Abstract
We investigate Plackett-Luce (PL) model-based listwise learning-to-rank (LTR) on data with partitioned preference, where a set of items is sliced into ordered and disjoint partitions, but the ranking of items within a partition is unknown. Given $N$ items with $M$ partitions, calculating the likelihood of data with partitioned preference under the PL model has a time complexity of $O(N+S!)$, where $S$ is the maximum size of the top $M-1$ partitions. This computational challenge restricts most existing PL-based listwise LTR methods to a special case of partitioned preference, top-$K$ ranking, where the exact order of the top $K$ items is known. In this paper, we exploit a random utility model formulation of the PL model and propose an efficient numerical integration approach for calculating the likelihood and its gradients with a time complexity of $O(N+S^3)$. Through both simulation experiments and applications to real-world eXtreme Multi-Label classification tasks, we demonstrate that the proposed method outperforms well-known LTR baselines and remains scalable.
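To make the contrast between the $O(S!)$ enumeration and a one-dimensional integral concrete, here is a minimal sketch. It is not the authors' algorithm and does not reach their $O(N+S^3)$ bound; it only illustrates the random-utility identity that makes integration possible. Assume each item $i$ has a positive PL weight $w_i$ and an independent exponential arrival time with rate $w_i$, so that sorting by arrival time gives a PL ranking. Then the probability that every item in a top partition $A$ precedes every remaining item in $B$ is $\int_0^1 \prod_{i \in A} (1 - v^{w_i / W_B})\, dv$ with $W_B = \sum_{j \in B} w_j$. The function names, the change of variable $v = e^{-W_B t}$, and the midpoint rule are all choices made for this illustration.

```python
import itertools
import math


def prob_top_bruteforce(w_top, w_rest):
    """P(all items in the top partition outrank all remaining items) under PL,
    summing over every one of the S! orderings of the top partition."""
    total = sum(w_top) + sum(w_rest)
    prob = 0.0
    for perm in itertools.permutations(w_top):
        remaining = total
        p = 1.0
        for w in perm:
            p *= w / remaining  # PL choice probability among items still unranked
            remaining -= w
        prob += p
    return prob


def prob_top_integral(w_top, w_rest, n_points=20000):
    """Same probability as a 1-D integral over v in (0, 1).

    Uses a plain midpoint rule (an illustrative choice, not the paper's
    quadrature); cost is O(S * n_points) instead of O(S!).
    """
    w_rest_sum = sum(w_rest)
    exponents = [w / w_rest_sum for w in w_top]
    h = 1.0 / n_points
    acc = 0.0
    for k in range(n_points):
        v = (k + 0.5) * h
        term = 1.0
        for a in exponents:
            term *= 1.0 - v ** a
        acc += term
    return acc * h


def partitioned_log_likelihood(partitions, prob_fn=prob_top_integral):
    """Log-likelihood of a partitioned preference.

    `partitions` is a list of lists of positive PL weights, best partition
    first. The likelihood factorizes over the top M-1 partitions, each
    compared against all items in later partitions.
    """
    ll = 0.0
    for m in range(len(partitions) - 1):
        rest = [w for part in partitions[m + 1:] for w in part]
        ll += math.log(prob_fn(partitions[m], rest))
    return ll


if __name__ == "__main__":
    top, rest = [2.0, 1.0, 0.5], [1.0, 1.5, 0.5]
    print(prob_top_bruteforce(top, rest), prob_top_integral(top, rest))
```

The two functions should agree up to the quadrature error, while the brute-force version becomes impractical once a partition holds more than a dozen or so items.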