Paper title
Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology
Paper authors
Paper abstract
Genome-wide association studies (GWAS) require accurate cohort phenotyping, but expert labeling can be costly, time-intensive, and variable. Here we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; $P\leq5\times10^{-8}$) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 92 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR, with select loci near genes involved in neuronal and synaptic biology or known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort.