Title
Symbolic Regression in Materials Science: Discovering Interatomic Potentials from Data
Authors
Abstract
Particle-based modeling of materials at the atomic scale plays an important role in the development of new materials and the understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which give the potential energy of an atomic system as a function of atomic coordinates and potentially other properties. Ab initio (first-principles) potentials can reach arbitrary levels of accuracy, but their applicability is limited by their high computational cost. Machine learning (ML) has recently emerged as an effective way to offset this cost by replacing expensive models with highly efficient surrogates trained on electronic structure data. Among the many current methods, symbolic regression (SR) is gaining traction as a powerful "white-box" approach for discovering functional forms of interatomic potentials. This contribution discusses the role of symbolic regression in materials science (MS) and offers a comprehensive overview of current methodological challenges and state-of-the-art results. A genetic programming-based approach for modeling atomic potentials from raw data (snapshots of atomic positions and their associated potential energies) is presented and empirically validated on ab initio electronic structure data.
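To make the setting concrete, the following minimal Python sketch shows what "potential energy as a function of atomic coordinates" and "snapshots with associated energies" can look like in code. It is an illustration under stated assumptions, not the method of the paper: the pairwise form, the function names (`pair_energy`, `lennard_jones`, `dispersion_only`, `rmse`), and the toy snapshots are all hypothetical, and the textbook Lennard-Jones term merely stands in for an expression that symbolic regression might propose. The reference energies are synthetic, generated from that same term, so they carry no physical meaning.

```python
import math
from itertools import combinations


def pair_energy(positions, pair_term):
    """Sum a pairwise energy term over all unique atom pairs in one snapshot."""
    total = 0.0
    for a, b in combinations(positions, 2):
        r = math.dist(a, b)  # interatomic distance (Python 3.8+)
        total += pair_term(r)
    return total


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Textbook Lennard-Jones pair term, used here as a stand-in candidate."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def dispersion_only(r):
    """A second, deliberately incomplete candidate: attraction without repulsion."""
    return -4.0 / r ** 6


def rmse(snapshots, energies, pair_term):
    """Root-mean-square error of a candidate term against reference energies."""
    errors = [pair_energy(p, pair_term) - e for p, e in zip(snapshots, energies)]
    return math.sqrt(sum(err * err for err in errors) / len(errors))


# Hypothetical toy data: two three-atom snapshots in reduced units.
snapshots = [
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.9, 0.0)],
]
# Synthetic reference energies (assumption: generated from the LJ term itself).
reference = [pair_energy(p, lennard_jones) for p in snapshots]

for name, candidate in [("lennard_jones", lennard_jones),
                        ("dispersion_only", dispersion_only)]:
    print(name, rmse(snapshots, reference, candidate))
```

In a symbolic regression setting, an error measure of this kind could serve as the fitness that a genetic programming search tries to minimize while it varies the expression inside `pair_term`. Real potentials are not restricted to pairwise terms, which is one reason the functional form is searched for rather than fixed in advance.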