Paper Title
Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty
Paper Authors
Paper Abstract
To reduce doctors' workload, deep-learning-based automatic medical report generation has recently attracted more and more research efforts, where deep convolutional neural networks (CNNs) are employed to encode the input images, and recurrent neural networks (RNNs) are used to decode the visual features into medical reports automatically. However, these state-of-the-art methods mainly suffer from three shortcomings: (i) incomprehensive optimization, (ii) low-order and unidimensional attention mechanisms, and (iii) repeated generation. In this article, we propose a hybrid reinforced medical report generation method with m-linear attention and a repetition penalty mechanism (HReMRG-MR) to overcome these problems. Specifically, a hybrid reward with different weights is employed to remedy the limitations of single-metric-based rewards. We also propose a search algorithm with linear complexity to approximate the best weight combination. Furthermore, we use m-linear attention modules to explore high-order feature interactions and to achieve multi-modal reasoning, while a repetition penalty penalizes repeated terms during the model's training process. Extensive experimental studies on two public datasets show that HReMRG-MR greatly outperforms the state-of-the-art baselines on all metrics. A series of ablation experiments demonstrates the effectiveness of each proposed component, and a reward-search toy experiment gives evidence that our proposed search approach can significantly reduce the search time while approximating the best performance.
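The abstract mentions two concrete mechanisms: a hybrid reward formed as a weighted combination of per-metric rewards, and a repetition penalty applied to already-generated terms. The paper's exact formulations are not given here; the following is a minimal illustrative sketch, assuming the hybrid reward is a weighted sum of metric scores and the repetition penalty follows the common logit-scaling scheme (function names and the `penalty` parameter are hypothetical, not taken from the paper):

```python
def hybrid_reward(metric_scores, weights):
    """Hypothetical hybrid reward: weighted sum of per-metric scores
    (e.g. BLEU, ROUGE-L, CIDEr), with weights found by a search algorithm."""
    return sum(w * s for w, s in zip(weights, metric_scores))


def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Down-weight tokens that were already generated.

    This uses the widely used scheme of dividing positive logits by
    `penalty` and multiplying negative logits by it, so repeated tokens
    become less likely regardless of sign."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out
```

For example, with metric scores `[0.4, 0.3, 0.5]` and weights `[0.5, 0.25, 0.25]`, the hybrid reward is `0.4`; applying the penalty with `penalty=2.0` to logits `[2.0, -1.0, 0.5]` for previously generated tokens `0` and `1` yields `[1.0, -2.0, 0.5]`.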