Detecting Media Bias in News Articles using Gaussian Bias Distributions

Paper Title

Detecting Media Bias in News Articles using Gaussian Bias Distributions

Authors

Wei-Fan Chen, Khalid Al-Khatib, Benno Stein, Henning Wachsmuth

Abstract

Media plays an important role in shaping public opinion. Biased media can influence people in undesirable directions and hence should be unmasked as such. We observe that feature-based and neural text classification approaches which rely only on the distribution of low-level lexical information fail to detect media bias. This weakness becomes most noticeable for articles on new events, where words appear in new contexts and hence their "bias predictiveness" is unclear. In this paper, we therefore study how second-order information about biased statements in an article helps to improve detection effectiveness. In particular, we utilize the probability distributions of the frequency, positions, and sequential order of lexical and informational sentence-level bias in a Gaussian Mixture Model. On an existing media bias dataset, we find that the frequency and positions of biased statements strongly impact article-level bias, whereas their exact sequential order is secondary. Using a standard model for sentence-level bias detection, we provide empirical evidence that article-level bias detectors that use second-order information clearly outperform those without.
