Paper Title

ChASE -- A Distributed Hybrid CPU-GPU Eigensolver for Large-scale Hermitian Eigenvalue Problems

Paper Authors

Xinzhe Wu, Davor Davidovic, Sebastian Achilles, Edoardo Di Napoli

Paper Abstract

As modern massively parallel clusters grow larger and their compute nodes more powerful, traditional parallel eigensolvers, such as direct solvers, struggle to keep pace with hardware evolution and to scale efficiently because of additional layers of communication and synchronization. This difficulty is especially pronounced when porting traditional libraries to heterogeneous computing architectures equipped with accelerators such as Graphics Processing Units (GPUs). Recently, there have been significant scientific contributions to the development of filter-based subspace eigensolvers for computing partial eigenspectra. The simpler structure of this type of algorithm makes it easier for them to avoid the communication and synchronization bottlenecks typical of direct solvers. The Chebyshev Accelerated Subspace Eigensolver (ChASE) is a modern subspace eigensolver that computes partial extremal eigenpairs of large-scale Hermitian eigenproblems, accelerated by a filter based on Chebyshev polynomials. In this work, we extend our previous work on ChASE by adding support for distributed hybrid CPU-multi-GPU computing architectures. Our tests show that ChASE achieves very good scaling performance up to 144 nodes with 526 NVIDIA A100 GPUs in total on dense eigenproblems of size up to $360$k.
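
To make the idea of a Chebyshev-filtered subspace eigensolver concrete, the following is a minimal serial NumPy sketch of the general technique. It is not the ChASE implementation or its API: the function names `chebyshev_filter` and `filtered_subspace_iteration`, all parameter defaults, and the use of the matrix 1-norm as an upper bound on the spectrum are assumptions made for illustration only, and nothing here is distributed or GPU-accelerated.

```python
import numpy as np


def chebyshev_filter(A, V, degree, lower, upper):
    """Apply a degree-`degree` Chebyshev polynomial in A to the block V.

    Eigencomponents inside [lower, upper] stay bounded by 1 in magnitude,
    while components with eigenvalues below `lower` are amplified.
    """
    center = (upper + lower) / 2.0
    half_width = (upper - lower) / 2.0
    prev = V
    curr = (A @ V - center * V) / half_width  # T_1 of the shifted, scaled matrix
    for _ in range(2, degree + 1):
        # Three-term recurrence: T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
        nxt = 2.0 * (A @ curr - center * curr) / half_width - prev
        prev, curr = curr, nxt
    return curr


def filtered_subspace_iteration(A, nev, extra=5, degree=20, iters=40, seed=0):
    """Approximate the `nev` smallest eigenpairs of a Hermitian matrix A."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n, nev + extra)))
    # Assumption of this sketch: for Hermitian A the 1-norm bounds the
    # largest eigenvalue from above, so it is a safe (if loose) upper bound.
    upper = np.linalg.norm(A, 1)
    for _ in range(iters):
        # Rayleigh-Ritz step: solve the small projected eigenproblem.
        theta, Y = np.linalg.eigh(V.conj().T @ A @ V)
        V = V @ Y
        # Damp everything above the largest Ritz value in the search space.
        V = chebyshev_filter(A, V, degree, theta[-1], upper)
        V, _ = np.linalg.qr(V)  # re-orthonormalize the filtered block
    theta, Y = np.linalg.eigh(V.conj().T @ A @ V)
    return theta[:nev], (V @ Y)[:, :nev]


# Small dense symmetric test problem with known eigenvalues 1, 2, ..., 120.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((120, 120)))
A = Q @ np.diag(np.arange(1.0, 121.0)) @ Q.T
A = (A + A.T) / 2.0  # remove rounding asymmetry

vals, vecs = filtered_subspace_iteration(A, nev=3)
residual = np.linalg.norm(A @ vecs - vecs * vals)
print(vals, residual)
```

The only operations applied to the full matrix are block products `A @ V`, plus a QR factorization and a small dense eigenproblem on the search subspace. This is the structural simplicity the abstract refers to: the dominant cost is matrix-matrix multiplication, which generally maps well onto accelerators.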
