Paper title
Comment Ranking Diversification in Forum Discussions
Paper authors
Paper abstract
How discussion forums with hundreds or more comments are read depends on ranking, because most users view only the top-ranked comments. When comments are ranked by an ordered score (e.g., number of replies or up-votes) without adjusting for the semantic similarity of comments at nearby ranks, the top-ranked comments are more likely to emphasize the majority opinion and to be redundant. In this paper, we propose a top-K comment diversification re-ranking model using Maximal Marginal Relevance (MMR) and evaluate its impact in three categories: (1) semantic diversity, (2) inclusion of the semantics of lower-ranked comments, and (3) redundancy, within the context of a HarvardX course discussion forum. We conducted a double-blind, small-scale evaluation experiment in which subjects chose between the top 5 comments of a diversified ranking and those of a baseline ranking ordered by score. Across 100 trials with three subjects, subjects selected the diversified ranking (75% score, 25% diversification) as significantly (1) more diverse, (2) more inclusive, and (3) less redundant. Within each category, inter-rater reliability showed moderate consistency, with typical Cohen's kappa scores near 0.2. Our findings suggest that our model improves (1) diversification and (2) inclusion, and reduces (3) redundancy, among the top-K ranked comments in online discussion forums.
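To make the re-ranking idea concrete, the following is a minimal sketch of greedy MMR selection, not the authors' implementation. It assumes each comment has a score already normalized to [0, 1] and a vector representation compared by cosine similarity; the abstract does not specify either. The names `mmr_rerank`, `cosine`, and `lam` are hypothetical, and `lam=0.75` mirrors the "75% score, 25% diversification" weighting mentioned above.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_rerank(scores, vectors, k=5, lam=0.75):
    """Return the indices of up to k comments chosen greedily by MMR.

    scores  : per-comment scores, assumed normalized to [0, 1]
    vectors : one vector per comment (assumed representation)
    lam     : weight on the score; (1 - lam) weights the redundancy penalty
    """
    remaining = list(range(len(scores)))
    selected = []
    while remaining and len(selected) < k:

        def mmr(i):
            # Redundancy is the highest similarity to any already selected comment.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * scores[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: comments 0 and 1 are semantically identical, comment 2 differs.
scores = [1.0, 0.9, 0.8]
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
diversified = mmr_rerank(scores, vectors, k=2, lam=0.75)
baseline = mmr_rerank(scores, vectors, k=2, lam=1.0)
```

With `lam=1.0` the redundancy term vanishes and the function reduces to the score-ordered baseline; lowering `lam` lets a lower-scored but semantically distinct comment displace a near-duplicate of one already selected.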