We propose an efficient performance prediction method for asynchronous execution, intended to guide the design of asynchronous strategies. The method models the performance behavior of synchronous and asynchronous execution separately, and achieves fast, accurate prediction through hierarchical modeling, graph-based simulation, and other techniques. We apply the method to predict the performance of HPL on the Sunway TaihuLight supercomputer. Experimental results show an average prediction accuracy of 96.61% at a scale of 4 million cores, with a prediction cost as low as milliseconds.