Chinese Journal of Computational Physics

Words from the chief editor

Song JIANG

2024, 41(1): 1-1.

Preface

Zeyao MO

2024, 41(1): 2-2.

Meeting minutes of panel session of HPCMid22

Zeyao MO, Long WANG, Jie LIU, Guangming TAN, Weifeng LIU, Zhibin YU, Jidong ZHAI, Hailong YANG, Xiaowen XU

2024, 41(1): 3-8. DOI: 10.19596/j.cnki.1001-246x.8818

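On December 12, 2022, the 8th Workshop on High Performance Computing Middleware Technology (HPCMid22) was successfully held. HPCMid (website: http://www.caep-scns.ac.cn/HPCMid.php) is held once a year. It addresses the challenges that numerical simulation applications in scientific and engineering computing face on current and next-generation supercomputers, centers on key technologies of high performance computing middleware, and invites researchers to report their latest progress and discuss future trends. The theme of the 8th workshop was "Performance optimization techniques adapted to new architectures". It focused on the opportunities and challenges that the new architectures of the post-Moore era bring to scientific and engineering computing, and explored the development trends of portable performance optimization techniques on new architectures. The panel session of this workshop was co-chaired by researchers Zeyao MO and Xiaowen XU. Five experts from universities, research institutes, and industry (Long WANG, Jie LIU, Guangming TAN, Weifeng LIU, and Zhibin YU) were invited to hold an in-depth discussion on the theme "Performance optimization: specificity vs. generality", and several other experts, including Jidong ZHAI and Hailong YANG, also took part. The experts offered many insightful views on the research status and development trends of performance optimization techniques, the problems and challenges ahead, and talent cultivation. The editorial office of the Chinese Journal of Computational Physics has compiled this discussion and publishes it here for readers. Owing to space limitations, it has been slightly abridged.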

Parallel Algorithm Libraries for Tianhe Supercomputers

Jie LIU, Yongzhen SHI, Bo YANG, Xiang ZHANG, Xinhai CHEN, Huajian ZHANG, Xiaowei GUO, Shengguo LI, Runhua LI, Jintao PENG, Tiaojie XIAO, Xuguang CHEN, Qingyang ZHANG, Biao LI, Can LENG, Yushui LI, Qinglin WANG

2024, 41(1): 9-21. DOI: 10.19596/j.cnki.1001-246x.8784

Tianhe supercomputers, developed by the National University of Defense Technology, have ranked first on the TOP500 list of the world's supercomputers seven times. To exploit these systems efficiently, the Tianhe team extracted the key computing methods common to large-scale scientific and engineering computing, designed and implemented scalable parallel algorithms for those methods according to the characteristics of the Tianhe supercomputers, and developed the Tianhe parallel algorithm libraries, which are an important part of the Tianhe application-support environment. This paper first reviews the development history and system structures of the Tianhe supercomputing systems. It then highlights the architecture, functions, and performance of common parallel libraries, including grid processing libraries, libraries for the discrete solution of partial differential equations, matrix computing libraries, particle transport libraries, collective communication libraries, and deep learning libraries. Finally, a summary of typical application software on Tianhe supercomputers shows that the parallel algorithm libraries can effectively support the rapid development and performance optimization of such software.

Sparse General Matrix-matrix Multiplication for Sunway Manycore Architecture

Kan LIU, Lei YANG, Wei XUE, Wenguang CHEN

2024, 41(1): 22-32. DOI: 10.19596/j.cnki.1001-246x.8766

A parallel algorithm for sparse general matrix-matrix multiplication (SpGEMM), swSpGEMM, is proposed for the new-generation Sunway many-core architecture. The algorithm addresses the load imbalance caused by the distribution of nonzeros in the input matrices through lightweight parallel task partitioning. To handle the irregular memory access and inefficient instruction pipelining that arise when accumulating the products, a hierarchical sparse accumulator is proposed. It maximizes the utilization of local memory for input matrices with different features and relieves the instruction dependency in integer searching, resulting in more efficient use of the hardware's computing capability. On large matrices from the SuiteSparse sparse matrix collection, the algorithm outperforms MKL on two Intel Xeon GOLD 6132 processors by 21.1% and cuSPARSE on NVIDIA A100 by 95.3%.
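
For readers unfamiliar with the role of a sparse accumulator in SpGEMM, the following is a minimal serial sketch of the textbook row-wise (Gustavson) formulation in Python. It is not swSpGEMM: the function name `spgemm_csr` is hypothetical, a plain dictionary stands in for the accumulator, and neither the task partitioning nor the hierarchical accumulator described in the abstract is reproduced.

```python
def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Row-wise (Gustavson) SpGEMM: C = A @ B, with A, B, C in CSR form.

    Each matrix is given as (row pointers, column indices, values).
    Illustrative only; the name and interface are assumptions.
    """
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(len(a_ptr) - 1):
        # Sparse accumulator for row i of C: column index -> partial sum.
        acc = {}
        for k in range(a_ptr[i], a_ptr[i + 1]):
            j, a_ij = a_idx[k], a_val[k]
            # Scale row j of B by A[i, j] and merge it into the accumulator.
            for t in range(b_ptr[j], b_ptr[j + 1]):
                col = b_idx[t]
                acc[col] = acc.get(col, 0.0) + a_ij * b_val[t]
        # Flush the accumulator into C with sorted column indices.
        for col in sorted(acc):
            c_idx.append(col)
            c_val.append(acc[col])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

The work per output row depends on how many nonzeros the contributing rows of B hold, which is one way to see why the nonzero distribution drives load balance, and the irregular updates to `acc` are the memory-access pattern that a hardware-aware accumulator design has to manage.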

Auto-tuning for Sparse Matrix-vector Multiplication

Zhen DU, Guangming TAN

2024, 41(1): 33-39. DOI: 10.19596/j.cnki.1001-246x.8763

SpMV (sparse matrix-vector multiplication) is a widely used kernel in scientific computing. Since the performance of a specific SpMV program is closely related to the distribution of non-zero elements in the sparse matrix, no universal SpMV program design achieves high performance on all matrices. Auto-tuning has therefore become a popular way to obtain high SpMV performance. This paper analyzes the difficulties in optimizing SpMV and introduces two representative auto-tuning works: SMAT, which is based on pre-implemented templates, and AlphaSparse, which designs SpMV programs from scratch. Their designs, implementations, test results, advantages, and disadvantages are presented. Finally, trends in SpMV auto-tuning are analyzed and predicted.
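
As a point of reference for the kernel being tuned, here is a minimal serial SpMV sketch in Python for the common CSR storage format. The function name `spmv_csr` is hypothetical, and the code is a baseline illustration rather than anything produced by SMAT or AlphaSparse.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i.
    Illustrative only; the name and interface are assumptions.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        total = 0.0
        # The trip count of this loop is the number of nonzeros in row i,
        # so rows of very different lengths give uneven work per row.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]  # indirect access into x
        y[i] = total
    return y
```

The inner loop length and the indirect reads of `x` both follow the nonzero pattern, which illustrates why a storage format and loop schedule that suit one matrix can perform poorly on another.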

Efficient Asynchronous Performance Prediction for Heterogeneous Systems

Yuyang JIN, Zixuan MA, Jidong ZHAI

2024, 41(1): 40-51. DOI: 10.19596/j.cnki.1001-246x.8759

An efficient asynchronous performance prediction method is proposed to guide the design of asynchronous strategies. This method decomposes the performance behavior of synchronous and asynchronous execution and achieves fast and accurate prediction through hierarchical modeling, graph-based simulation and other techniques. Based on this method, the performance of HPL on the Sunway TaihuLight supercomputer is predicted. The experimental results show that the method achieves an accuracy of 96.61% on average for 4 million cores, with a prediction cost as low as milliseconds.

SEMD: A Cross-platform Automatic Performance Optimization Programming Tool for Real Numerical Simulation Software

Peng ZHANG, Aiqing ZHANG, Zeyao MO, Jingtao WANG

2024, 41(1): 52-63. DOI: 10.19596/j.cnki.1001-246x.8777

To address the lack of reusability and portability in manual software optimization, we propose and implement SEMD, a cross-platform automatic performance optimization programming tool for numerical simulation software. It uses high-level semantics to abstract the loop-based numerical computations that are prevalent in numerical simulation, completely shielding the underlying hardware features and performance optimization implementations. As a result, any numerical subroutine written with SEMD can attain automatic cross-platform performance portability. Our tests demonstrate that SEMD's performance optimization effects exceed those of comparable products on three different processor architectures: X86, ARM, and GPU. Furthermore, SEMD has been successfully applied in the development of four real numerical simulation software programs in the fields of structures, fluids, and electromagnetics, resulting in an average performance improvement of 164% on hotspot subroutines.

Feature-modified Algorithm Framework for Parallel Preconditioning in Sparse Linear Solvers

Xiaowen XU, Zeyao MO, Shaoliang HU, Hengbin AN

2024, 41(1): 64-74. DOI: 10.19596/j.cnki.1001-246x.8787

To address the high computational complexity of sparse linear solvers caused by complex physical characteristics in practical applications, this paper presents a unified framework for feature-modified preconditioning algorithms. By extracting from the physical characteristics the algebraic features that affect solver efficiency, and combining them with multilevel feature analysis, we construct feature-modified components. The effectiveness of this framework is demonstrated through several typical feature-modified preconditioning algorithms and their application results.

Nonlinear Iterative Methods for Radiation Diffusion Equations

Hengbin AN, Zeyao MO

2024, 41(1): 75-86. DOI: 10.19596/j.cnki.1001-246x.8765

To improve the robustness and convergence speed of the Newton and Picard methods for solving radiation diffusion equations, several techniques are introduced for their application to the three-temperature radiation diffusion equation system, including the selection of the initial iterate, the treatment of physical constraints during the iteration, the combination of the Picard iterative method with Anderson acceleration, and an improvement of the Anderson acceleration method. With these application-driven treatments and improvements, the two methods can be used to solve the nonlinear radiation diffusion equations.
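
To make the combination of Picard iteration and Anderson acceleration concrete, the sketch below applies depth-1 Anderson mixing to a generic fixed-point problem x = g(x) in Python. It is a toy illustration under stated assumptions: the function name `fixed_point`, the depth of 1, and the scalar test map cos(x) are all choices made here, and none of the paper's treatments for the three-temperature equations (initial iterate selection, physical constraints, improved Anderson variant) are reproduced.

```python
import math


def fixed_point(g, x0, accelerate=True, tol=1e-10, max_iter=500):
    """Solve x = g(x) by Picard iteration, optionally with depth-1
    Anderson acceleration. Returns (solution, iterations used).

    Illustrative only; the name and interface are assumptions.
    """
    x = list(x0)
    g_prev = f_prev = None
    for it in range(max_iter):
        gx = g(x)
        f = [gi - xi for gi, xi in zip(gx, x)]  # residual f = g(x) - x
        if math.sqrt(sum(v * v for v in f)) < tol:
            return x, it
        x_new = gx  # plain Picard update
        if accelerate and f_prev is not None:
            # Choose gamma to minimize ||f - gamma * (f - f_prev)||_2,
            # then mix the two most recent evaluations of g.
            df = [a - b for a, b in zip(f, f_prev)]
            denom = sum(d * d for d in df)
            if denom > 0.0:
                gamma = sum(a * d for a, d in zip(f, df)) / denom
                x_new = [gi - gamma * (gi - gp)
                         for gi, gp in zip(gx, g_prev)]
        g_prev, f_prev = gx, f
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


# Usage on a toy contraction: x = cos(x).
picard_x, picard_its = fixed_point(lambda v: [math.cos(v[0])], [1.0],
                                   accelerate=False)
anderson_x, anderson_its = fixed_point(lambda v: [math.cos(v[0])], [1.0])
```

Setting `accelerate=False` recovers the unaccelerated Picard iteration, so the two iteration counts can be compared directly on the same problem.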

Feature-driven Parallel Algebraic Multigrid Methods for Multi-group Radiation Diffusion Problems

Shi SHU, Xiaoqiang YUE, Jianmeng HE, Xiaowen XU, Zeyao MO

2024, 41(1): 87-97. DOI: 10.19596/j.cnki.1001-246x.8768

First, existing fast algorithms for solving the large-scale discrete linear systems arising from the Multi-Group Radiation Diffusion (MGRD) equations are reviewed and classified. Second, based on our recent work on parallel algebraic multigrid (AMG), two types of preconditioning algorithms and their theoretical frameworks are developed at a higher level: an approximate Schur complement type based on physical quantities, and a combined type based on physical and algebraic features. The relevant components of these works are described within these frameworks. Based on the above frameworks, an approximate Schur complement preconditioner with a fundamental approximation property and low computational complexity is designed, and the corresponding spectral equivalence theory is established. Numerical experiments show that the new preconditioner has better robustness and computational efficiency. Finally, several issues that need to be further addressed are presented.

Application-oriented Preconditioning of Seepage Mechanics

Chunsheng FENG, Shizhe LI, Shenghao LIU, Chensong ZHANG, Li ZHAO

2024, 41(1): 98-109. DOI: 10.19596/j.cnki.1001-246x.8791

The seepage mechanics model comprises multiple nonlinearly coupled partial differential equations. In various applications, seepage mechanics problems exhibit distinct characteristics and the corresponding solution methods are also very different. This paper focuses on the representative mathematical models used in oil and gas reservoir development. It introduces the mathematical formulation and application characteristics of multiphase multicomponent seepage mechanics equations within porous media, along with efficient techniques for solving their discretized linear equation systems, including commonly employed preconditioning methods. Additionally, this study appropriately modifies standard test cases and evaluates the shared-memory parallel efficiency of these preconditioning methods.

JPSOL: A Parallel Numerical Algebraic Solver Driven by Application Features

Shaoliang HU, Xiaowen XU, Hengbin AN, Ran XU, Ronghong FAN

2024, 41(1): 110-121. DOI: 10.19596/j.cnki.1001-246x.8771

JPSOL (J Parallel Solver Library for Numerical Algebra Problems) is introduced, including its software architecture, matrix and vector data structures, three kinds of algorithm libraries (linear, nonlinear, and eigenvalue), and domain-specific solvers. The high parallel scalability of JPSOL is then demonstrated by test results for basic iterative methods. Finally, the effectiveness and robustness of JPSOL are demonstrated by several typical practical applications.

Convergence Estimation and Characteristic Analysis of a Two-level Iterative Algorithm for Discretized Three-temperature Energy Linear Systems

Yue HAO, Silu HUANG, Xiaowen XU

2024, 41(1): 122-130. DOI: 10.19596/j.cnki.1001-246x.8767

In this paper, we study in detail the specific convergence property of the physical-variable-based coarsening two-level iterative method (PCTL) algorithm based on the theory of algebraic multigrid method (AMG), and give a reasonable upper bound on the convergence factor, which provides a theoretical guarantee for the PCTL algorithm. Moreover, we also analyze the algebraic features that affect the convergence of the PCTL algorithm, such as diagonal dominance and coupling strength, hoping to provide theoretical guidance for the applications and algorithm optimization of the PCTL algorithm.

A Review of Algorithms and Applications of Solvers with Quantum Computing Acceleration

Kang XU, Zeyang LI, Zhufeng GUO, Yingtong SHEN, Wei WANG, Minhui GOU, Zizheng WANG, Yukun WANG, Weifeng LIU

2024, 41(1): 131-150. DOI: 10.19596/j.cnki.1001-246x.8778

Quantum computing is a new computing model based on the principles of quantum mechanics. Because its parallelism is far superior to that of classical computing, quantum computing is considered a computational method that may have a disruptive impact in the future, providing a new way to solve some complex problems. This paper reviews the algorithms and applications of quantum solvers for numerical computation problems in large-scale science and engineering. In particular, it covers systems of linear equations, eigenvalue problems, differential equations, Hamiltonian and graph computation, quantum machine learning, quantum solver platforms, and practical numerical simulation. For each class of numerical computing problem, the current mainstream quantum computing algorithms are introduced in detail, and recent research progress on the relevant algorithms in China and abroad is comprehensively summarized. Finally, future development trends of quantum computing in numerical algebra solving are discussed.
