Paper title
Towards Trustworthy Automatic Diagnosis Systems by Emulating Doctors' Reasoning with Deep Reinforcement Learning
Paper authors
Paper abstract
The automation of the medical evidence acquisition and diagnosis process has recently attracted increasing attention as a way to reduce the workload of doctors and democratize access to medical care. However, most works proposed in the machine learning literature focus solely on improving the prediction accuracy of a patient's pathology. We argue that this objective is insufficient to ensure that doctors accept such systems. In their initial interaction with patients, doctors do not focus only on identifying the pathology a patient is suffering from; they instead generate a differential diagnosis (in the form of a short list of plausible diseases) because the medical evidence collected from patients is often insufficient to establish a final diagnosis. Moreover, doctors explicitly explore severe pathologies before potentially ruling them out from the differential, especially in acute care settings. Finally, for doctors to trust a system's recommendations, they need to understand how the gathered evidence led to the predicted diseases. In particular, interactions between a system and a patient need to emulate the reasoning of doctors. We therefore propose to model the evidence acquisition and automatic diagnosis tasks using a deep reinforcement learning framework that considers three essential aspects of a doctor's reasoning, namely generating a differential diagnosis using an exploration-confirmation approach while prioritizing severe pathologies. We propose metrics for evaluating interaction quality based on these three aspects. We show that our approach performs better than existing models while maintaining competitive pathology prediction accuracy.