Performance Optimization of 3D Pseudopotential Multi-Relaxation-Time Lattice Boltzmann Model on GPU

doi:10.19596/j.cnki.1001-246x.7698

Chinese Journal of Computational Physics, 2018, Vol. 35, Issue 5: 554-562. DOI: 10.19596/j.cnki.1001-246x.7698


PENG Hao¹, SHAN Minglei¹,², ZHU Changping¹,², YAO Cheng¹,²

1. Changzhou Key Laboratory of Sensor Networks and Environmental Sensing, Jiangsu Key Laboratory of Power Transmission and Distribution Equipment Technology, Hohai University, Changzhou 213022, China;
2. Jiangsu Provincial Collaborative Innovation Center of World Water Valley and Water Ecological Civilization, Nanjing 211100, China

Received: 2017-05-17  Revised: 2017-07-21  Online: 2018-09-25  Published: 2018-09-25
Corresponding author: SHAN Minglei (born 1977), male, lecturer, Ph.D., mainly engaged in multiphase flow modeling with the lattice Boltzmann method. E-mail: shanming2003@126.com
About the author: PENG Hao (born 1992), male, from Jingmen, Hubei, master's student, mainly engaged in GPU parallel computing research.
Funding: Supported by the National Key Research and Development Program of China (2016YFC0401606), the Key Research and Development Program of Jiangsu Province (BE2016056), and the Natural Science Foundation of Jiangsu Province (SBK2014043338).

Abstract

In the pseudopotential model of the lattice Boltzmann method (LBM), the interaction between lattice nodes is not fully local. A parallel implementation therefore needs more global memory reads and writes, more registers, and more thread synchronization, all of which reduce the efficiency of GPU parallel computing. To address these limitations, this paper adopts a three-dimensional multi-relaxation-time (MRT) pseudopotential model on the D3Q15 lattice, with gas-liquid phase separation as the test case. The efficiency of global memory reads and writes is improved through coalesced access. A "directional transfer" algorithm is proposed to improve the efficiency with which nodes on the lattice boundary obtain data from their neighboring nodes. Finally, the influence of different resource allocations, including block size, on computational efficiency is investigated, and a method for choosing the optimal resource configuration is summarized.

Key words: LBM, pseudopotential model, GPU, parallel computing, performance optimization
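
The coalesced access mentioned in the abstract depends on how the 15 distribution functions of each node are laid out in global memory. This page does not give the paper's actual layout, so the following is only a minimal sketch under stated assumptions: a hypothetical array-of-structures index (`aos_index`) is compared with a hypothetical structure-of-arrays index (`soa_index`), and each thread of a 32-thread warp is assumed to handle one node along the x axis. All function names and the grid size are illustrative and not taken from the paper.

```python
Q = 15  # number of discrete velocities in a D3Q15 lattice


def aos_index(x, y, z, q, nx, ny, nz):
    """Array-of-structures: the 15 distributions of one node are adjacent."""
    node = x + nx * (y + ny * z)
    return node * Q + q


def soa_index(x, y, z, q, nx, ny, nz):
    """Structure-of-arrays: all nodes of one direction q are adjacent."""
    node = x + nx * (y + ny * z)
    return q * (nx * ny * nz) + node


def warp_strides(index_fn, q, nx, ny, nz, warp_size=32):
    """Set of element strides between the addresses read by consecutive
    threads, assuming thread t handles node (x=t, y=0, z=0)."""
    addrs = [index_fn(t, 0, 0, q, nx, ny, nz) for t in range(warp_size)]
    return {b - a for a, b in zip(addrs, addrs[1:])}


# Hypothetical 64 x 64 x 64 grid, direction q = 3.
aos_strides = warp_strides(aos_index, 3, 64, 64, 64)
soa_strides = warp_strides(soa_index, 3, 64, 64, 64)
print("AoS strides:", aos_strides)
print("SoA strides:", soa_strides)
```

In this sketch the structure-of-arrays layout gives a stride of one element between neighboring threads, which is the access pattern a GPU can merge into a small number of memory transactions, while the array-of-structures layout leaves a stride of 15 elements.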

CLC Number: TQ021.1

PENG Hao, SHAN Minglei, ZHU Changping, YAO Cheng. Performance Optimization of 3D Pseudopotential Multi-Relaxation-Time Lattice Boltzmann Model on GPU[J]. Chinese Journal of Computational Physics, 2018, 35(5): 554-562.


