Chinese Journal of Computational Physics ›› 2015, Vol. 32 ›› Issue (6): 631-638.

• Articles •

Parallel Computing of First-principles Based Quantum Transport Simulations

ZHANG Ruoxing1, HOU Shimin2, CHOU Qiang1

  1. National Energy Key Laboratory of Nuclear Power Software, Software Development Center, State Nuclear Power Technology Corporation, Beijing 100029, China;
  2. Key Laboratory for the Physics and Chemistry of Nanodevices, Department of Electronics, Peking University, Beijing 100871, China
  • Received: 2014-11-03 Revised: 2015-02-09 Online: 2015-11-25 Published: 2015-11-25
  • About the author: ZHANG Ruoxing (b. 1980), male, from Xingtai, Hebei; Ph.D., senior engineer. Main research areas: high-performance grid computing and cloud computing, big-data analysis techniques, and development of probabilistic safety analysis software. E-mail: zhangruoxing@snptc.com.cn
  • Supported by:
    Staff Independent Innovation Project of the State Nuclear Power Technology Corporation (SNP-KJ-CX-2015-27); research project on independent development of key nuclear power design software under the National Science and Technology Major Project for Large Advanced Pressurized Water Reactor Nuclear Power Plants (2011ZX06004-024)

Abstract: First-principles calculations of large-scale quantum transport systems are very time-consuming. To address this, we analyze the self-consistent iteration of the method that combines density functional theory with the non-equilibrium Green's function (the DFT+NEGF method) and identify its computational hot spots: the energy-point integration used to compute the electron density matrix, and the matrix inversion/multiplication used to compute the Green's function. We propose MPI/OpenMP parallel schemes for both. The energy-point integration is parallelized across MPI processes; during data initialization, the sparse matrices and the integration energy points are distributed to the processes by a round-robin scheduling algorithm. Matrix inversion/multiplication can be parallelized either by calling MPI-based ScaLAPACK subroutines or by calling the OpenMP multithreaded functions of the Intel MKL math library. Because the calculations at different energy points are mutually independent, the MPI-parallel energy-point integration achieves a nearly linear speedup curve. Because OpenMP multithreading exchanges data through shared memory and incurs little thread-switching and communication overhead, the OpenMP implementation of matrix inversion/multiplication is more efficient than the MPI multi-process implementation but less scalable.

Key words: first-principles calculations, density functional theory, Green's function, MPI, parallel computing
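
The round-robin distribution of energy points described in the abstract can be illustrated with a small sketch. This is not the authors' code: it is a toy model under stated assumptions, in which the MPI processes are simulated by a plain Python loop, the system is a single level (so the "matrix inversion" reduces to a scalar reciprocal), and the integration runs along the real energy axis with a midpoint rule. The names `round_robin`, `greens_function`, `partial_occupation`, and `energy_grid`, as well as the level position and broadening, are hypothetical and chosen only for illustration.

```python
import math


def round_robin(items, n_procs):
    """Assign item k to simulated process k % n_procs (round-robin)."""
    return [items[rank::n_procs] for rank in range(n_procs)]


def greens_function(energy, level=0.0, gamma=0.1):
    """Retarded Green's function of one level with broadening gamma.

    Assumption: a 1x1 stand-in for the matrix inversion
    G(E) = [E*S - H - Sigma(E)]^(-1) of a real DFT+NEGF code.
    """
    return 1.0 / (energy - level + 0.5j * gamma)


def partial_occupation(points):
    """Sum weight * (-Im G / pi) over the energy points owned by one process."""
    return sum(w * (-greens_function(e).imag / math.pi) for e, w in points)


def energy_grid(e_min, e_max, n):
    """Midpoint-rule energy points as (energy, weight) pairs."""
    de = (e_max - e_min) / n
    return [(e_min + (k + 0.5) * de, de) for k in range(n)]


grid = energy_grid(-5.0, 0.0, 2000)
n_procs = 4

# Each chunk would live on its own MPI process; no data is exchanged
# while the energy points are being evaluated.
chunks = round_robin(grid, n_procs)
partials = [partial_occupation(chunk) for chunk in chunks]

# Stand-in for the single reduction step (e.g. an MPI reduce) at the end.
parallel_total = sum(partials)
serial_total = partial_occupation(grid)

print(f"serial   = {serial_total:.6f}")
print(f"parallel = {parallel_total:.6f}")
```

Because each energy point is evaluated independently and the only combining step is the final sum, the work divides evenly among processes, which is consistent with the nearly linear speedup the abstract reports for this part of the calculation.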
