Paper Title

On Merging Feature Engineering and Deep Learning for Diagnosis, Risk-Prediction and Age Estimation Based on the 12-Lead ECG

Authors

Eran Zvuloni, Jesse Read, Antônio H. Ribeiro, Antonio Luiz P. Ribeiro, Joachim A. Behar

Abstract

Objective: Machine learning techniques have been used extensively for 12-lead electrocardiogram (ECG) analysis. For physiological time series, whether deep learning (DL) is superior to feature engineering (FE) approaches based on domain knowledge is still an open question. Moreover, it remains unclear whether combining DL with FE may improve performance. Methods: We considered three tasks intended to address these research gaps: cardiac arrhythmia diagnosis (multiclass-multilabel classification), atrial fibrillation risk prediction (binary classification), and age estimation (regression). We used an overall dataset of 2.3M 12-lead ECG recordings to train the following models for each task: i) a random forest taking the FE as input, trained as a classical machine learning approach; ii) an end-to-end DL model; and iii) a merged model of FE+DL. Results: FE yielded results comparable to DL while requiring significantly less data for the two classification tasks, and it was outperformed by DL for the regression task. For all tasks, merging FE with DL did not improve performance over DL alone. Conclusion: We found that for traditional 12-lead ECG-based diagnosis tasks, DL did not yield a meaningful improvement over FE, while it significantly improved performance on the nontraditional regression task. We also found that combining FE with DL did not improve over DL alone, which suggests that the FE were redundant with the features learned by DL. Significance: Our findings provide important recommendations on which machine learning strategy and data regime to choose with respect to the task at hand for the development of new machine learning models based on the 12-lead ECG.
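
The abstract does not say how the FE+DL merge was implemented. As a rough illustration only, the sketch below assumes one common choice: concatenating a learned embedding with the engineered features before a final prediction head. The function names, the toy features (per-lead mean and peak-to-peak range), and the single linear layer standing in for a DL encoder are all hypothetical and are not taken from the paper.

```python
import random


def engineered_features(ecg):
    """Hypothetical stand-in for FE: per-lead mean and peak-to-peak range."""
    feats = []
    for lead in ecg:
        feats.append(sum(lead) / len(lead))
        feats.append(max(lead) - min(lead))
    return feats


def learned_embedding(ecg, weights):
    """Hypothetical stand-in for a DL encoder: one linear layer plus ReLU
    applied to the flattened 12-lead signal."""
    flat = [x for lead in ecg for x in lead]
    return [max(0.0, sum(w * x for w, x in zip(row, flat))) for row in weights]


def merged_representation(ecg, weights):
    """Assumed merge strategy: concatenate learned and engineered features."""
    return learned_embedding(ecg, weights) + engineered_features(ecg)


# Synthetic 12-lead recording (random noise, not real ECG data).
rng = random.Random(0)
n_leads, n_samples, emb_dim = 12, 50, 8
ecg = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_leads)]
weights = [
    [rng.gauss(0.0, 0.1) for _ in range(n_leads * n_samples)]
    for _ in range(emb_dim)
]

fe = engineered_features(ecg)                 # input to model (i), e.g. a random forest
dl = learned_embedding(ecg, weights)          # internal representation of model (ii)
merged = merged_representation(ecg, weights)  # input to the head of model (iii)
print(len(fe), len(dl), len(merged))
```

Under this assumed design, the merged head sees both feature sets side by side. If the learned embedding already encodes what the engineered features capture, the extra inputs add little, which is one way to read the redundancy reported in the abstract.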
