Paper Title
Multi-Interactive Attention Network for Fine-grained Feature Learning in CTR Prediction
Authors
Abstract
In Click-Through Rate (CTR) prediction, recent work relies heavily on users' sequential behaviors to capture user interest. Despite extensive study, these sequential methods still suffer from three limitations. First, existing methods mostly apply attention over user behaviors, which is not always suitable for CTR prediction because users often click on new products that are unrelated to any of their historical behaviors. Second, in real scenarios many users were active long ago but have recently become relatively inactive, so it is hard to capture their current preferences precisely from early behaviors. Third, multiple representations of a user's historical behaviors in different feature subspaces are largely ignored. To remedy these issues, we propose a Multi-Interactive Attention Network (MIAN) that comprehensively extracts the latent relationships among all kinds of fine-grained features (e.g., gender, age, and occupation in the user profile). Specifically, MIAN contains a Multi-Interactive Layer (MIL) that integrates three local interaction modules to capture multiple representations of user preference through sequential behaviors while simultaneously utilizing fine-grained user-specific and context information. In addition, we design a Global Interaction Module (GIM) to learn high-order interactions and balance the different impacts of multiple features. Finally, offline experiments on three datasets, together with an online A/B test in a large-scale recommendation system, demonstrate the effectiveness of the proposed approach.
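For readers unfamiliar with attention over user behaviors, which the abstract names as the common baseline, the following is a minimal sketch. It is not MIAN's Multi-Interactive Layer or Global Interaction Module, whose details the abstract does not give. It assumes generic scaled dot-product attention in which a candidate item embedding acts as the query over a user's behavior embeddings. The function names, embedding values, and dimensions are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(query, behaviors):
    """Pool behavior embeddings, weighting each by its scaled
    dot-product similarity to the query (assumed: candidate item)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, b) / scale for b in behaviors])
    dim = len(query)
    pooled = [sum(w * b[i] for w, b in zip(weights, behaviors))
              for i in range(dim)]
    return pooled, weights


# Hypothetical 2-dimensional embeddings, chosen only for illustration.
candidate_item = [1.0, 0.0]
behaviors = [
    [1.0, 0.0],   # behavior similar to the candidate
    [0.0, 1.0],   # unrelated behavior
    [-1.0, 0.0],  # dissimilar behavior
]

pooled, weights = attention_pool(candidate_item, behaviors)
print("attention weights:", weights)
print("pooled interest vector:", pooled)
```

In this sketch the behavior most similar to the candidate receives the largest weight. When no past behavior resembles the candidate, every weight sits near uniform and the pooled vector carries little signal, which matches the first limitation the abstract describes.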