
In theory, the Smagorinsky constant Cs can be derived by assuming that the production and dissipation of subgrid-scale turbulent kinetic energy are in equilibrium [25]. In practice, it has been found that the best results are obtained when the Smagorinsky constant is set to 0.1 [26].

The other unclosed term, the SGS heat flux ~q_j in the energy equation, can be expressed by Eq. (13) based on the gradient hypothesis [27]. It has been found that, for the air-like forced flow considered in this paper, the simulation results are not sensitive to the value of the turbulent Prandtl number, Pr_t [27]. Following the suggestions of existing research [26], the turbulent Prandtl number Pr_t is set to 0.90 in this paper.

$$\tilde{q}_j = -\lambda_t \,\frac{\partial T}{\partial x_j} \quad (13)$$

$$\lambda_t = \frac{\rho\,\mu_t}{Pr_t} \quad (14)$$
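As a rough numerical sketch of Eqs. (13) and (14) (not GASFLOW-MPI code: the grid, velocity, and temperature fields below are invented for illustration, with the Smagorinsky eddy viscosity taken as kinematic):

```python
import numpy as np

# Illustrative sketch only: Smagorinsky eddy viscosity and the
# gradient-hypothesis SGS heat flux of Eqs. (13)-(14) on a hypothetical
# 1-D profile.

Cs, Pr_t = 0.1, 0.90            # constants recommended in the text
rho, dx = 1.2, 0.01             # density [kg/m^3]; filter width ~ grid size [m]

x = np.arange(0.0, 1.0, dx)
u = np.sin(2.0 * np.pi * x)     # toy resolved velocity field
T = 300.0 + 10.0 * x**2         # toy resolved temperature field

dudx = np.gradient(u, dx)       # resolved velocity gradient (1-D stand-in for |S|)
mu_t = (Cs * dx) ** 2 * np.abs(dudx)   # Smagorinsky (kinematic) eddy viscosity
lam_t = rho * mu_t / Pr_t              # Eq. (14)
q_sgs = -lam_t * np.gradient(T, dx)    # Eq. (13): flux down the temperature gradient

# The modeled SGS flux always opposes the resolved temperature gradient.
assert np.all(q_sgs * np.gradient(T, dx) <= 0.0)
```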

H. Zhang et al. / Nuclear Engineering and Technology 49 (2017) 1310–1317

2.3. Implicit continuous-fluid Eulerian arbitrary Lagrangian–Eulerian algorithm

GASFLOW-MPI uses a robust all-speed numerical methodology, the implicit continuous-fluid Eulerian arbitrary Lagrangian–Eulerian method, to solve the 3-D transient compressible Navier–Stokes equations in Cartesian or cylindrical coordinates [11]. The solution of the Navier–Stokes equations is time-split into three steps via an operator-splitting technique [11]. In the first and third steps, the diffusion, convection, and source terms are treated in an explicit scheme to reduce the computational cost. In the second step, an implicit Lagrangian scheme is used to prevent the time step from being limited by the acoustic wave speed, as shown by Eqs. (15–18). By defining the pressure increment δp = p^II − p^n, the symmetric elliptic pressure equation, Eq. (19), can be derived from Eqs. (15–18) [11].

$$\frac{V^{II}-V^{I}}{\Delta t} = \nabla^{n}\cdot\left[(u\cdot A)^{II}-(u\cdot A)^{n}\right] \quad (15)$$

$$\frac{(\rho V)^{II}-(\rho V)^{I}}{\Delta t} = 0 \quad (16)$$

$$\frac{(\rho V u)^{II}-(\rho V u)^{I}}{\Delta t} = -\nabla^{n}\left[V\left(p^{II}-p^{n}\right)\right] \quad (17)$$

$$\frac{(\rho V I)^{II}-(\rho V I)^{I}}{\Delta t} = -p^{n}\,\nabla^{n}\cdot\left[(u\cdot A)^{II}-(u\cdot A)^{n}\right] \quad (18)$$

$$\nabla\cdot\left[\frac{\Delta t^{2}\,A\,V^{n}\,\nabla\delta p}{(\rho V)^{I}}\right] - \frac{V^{I}\,\delta p}{V^{n}\left(p^{I}+C\right)} = \frac{V^{I}\left(p^{n}-p^{I}\right)}{V^{n}\left(p^{I}+C\right)} + \Delta t\,\nabla\cdot\left[(u\cdot A)^{I}-(u\cdot A)^{n}\right], \quad (19)$$

where the coefficient, for an ideal gas, is

$$C = p^{n}\,\frac{R}{M\,c_{v}(T)}. \quad (20)$$
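To make the structure of the elliptic pressure equation, Eq. (19), concrete, the following sketch assembles and solves a 1-D analogue: a symmetric system combining a diffusion-like operator on δp with a diagonal compressibility term. All coefficients are hypothetical placeholders; a real GASFLOW-MPI step assembles them from A, V, ρ, p, and C according to Eqs. (15)–(20).

```python
import numpy as np

# 1-D structural analogue of Eq. (19):
# (diffusion-like operator on delta_p) + (diagonal term) = known RHS.
# All coefficients below are hypothetical placeholders.

n, dx = 50, 0.02
a = np.full(n + 1, 0.8)   # face coefficient, stand-in for dt^2*A*V^n/(rho*V)^I
b = np.full(n, 0.3)       # diagonal term, stand-in for V^I/(V^n*(p^I + C))
rhs = np.random.default_rng(0).standard_normal(n)

# Assemble the symmetric tridiagonal matrix of the 1-D analogue.
M = np.zeros((n, n))
for i in range(n):
    M[i, i] = (a[i] + a[i + 1]) / dx**2 + b[i]
    if i > 0:
        M[i, i - 1] = M[i - 1, i] = -a[i] / dx**2

delta_p = np.linalg.solve(M, rhs)

# Symmetric positive definiteness is what licenses the CG-type Krylov
# solvers discussed in Section 3.
assert np.allclose(M, M.T)
assert np.all(np.linalg.eigvalsh(M) > 0.0)
```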

3. Pressure equation parallel solution

In order to provide more accurate and detailed solutions for containment simulations, the advanced parallel CFD software GASFLOW-MPI has been developed from the original serial version, GASFLOW, and has been validated against well-known test cases [18,19].

An efficient, scalable linear solver plays an important role in GASFLOW-MPI because a large-scale symmetric sparse linear system, derived from the elliptic pressure equation, Eq. (19), must be solved at every time step. First, we evaluate the computational performance of different Krylov subspace methods preconditioned by the block Jacobi method. Then, the effect of scalable preconditioning methods is analyzed and compared. Finally, a parallel scalability test is carried out on the KIT/IKET cluster.

A hypothetical 3-D H2 cloud in an air–steam mixture is calculated in this section to evaluate the computational performance of the linear solvers. The characteristic feature of this case is that the gas velocity is relatively small, whereas the pressure field changes drastically at each time step. Therefore, a considerable number of iterations is required to achieve convergence. In this case, the total number of computational nodes is 200 × 200 × 200.

3.1. Computational performance of Krylov subspace methods

The preconditioned Krylov subspace iteration methods are a family of advanced, efficient linear solvers. The basic idea of a Krylov subspace iteration method is that, at iteration step k, the method extracts a "good" approximate solution of the linear system A·x = b from the Krylov subspace K_k by imposing the Petrov–Galerkin condition [28,29].

Krylov subspace:

$$\mathcal{K}_k = \mathcal{K}_k(A, r_0) = \operatorname{span}\left\{ r_0,\, A r_0,\, \ldots,\, A^{k-1} r_0 \right\}, \quad (21)$$

where x_0 is the initial guess and r_0 = b − A·x_0 is the initial residual.

PetroveGalerkin condition:

$$r_k = b - A\,x_k \;\perp\; \mathcal{L}_k(A, r_0), \quad (22)$$

where K_k is the approximation space and L_k is the constraint space. Different versions of Krylov subspace methods stem from different choices of the constraint space L_k and the approximation space K_k. In this paper, three Krylov subspace iteration methods, the CG, MINRES, and SYMMLQ methods, are used to solve the large-scale symmetric sparse pressure equation. The choice of constraint space and approximation space for each Krylov subspace method is as follows.

CG method [20]:

$$\mathcal{K}_k = \mathcal{K}_k(A, r_0) \quad \text{and} \quad \mathcal{L}_k = \mathcal{K}_k(A, r_0) \quad (23)$$

MINRES method [21]:

$$\mathcal{K}_k = \mathcal{K}_k(A, r_0) \quad \text{and} \quad \mathcal{L}_k = A\,\mathcal{K}_k(A, r_0) \quad (24)$$

SYMMLQ method [21]:

$$\mathcal{K}_k = A^{T}\mathcal{K}_k\!\left(A^{T}, r_0\right) \quad \text{and} \quad \mathcal{L}_k = \mathcal{K}_k\!\left(A^{T}, r_0\right) \quad (25)$$
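A minimal unpreconditioned CG iteration makes Eqs. (21)–(23) concrete (the small SPD test matrix is invented for illustration; the production solver applies a preconditioner and runs in parallel):

```python
import numpy as np

# Minimal (unpreconditioned) conjugate gradient iteration: the iterate is
# drawn from the Krylov subspace K_k(A, r0) of Eq. (21) under the Galerkin
# condition of Eqs. (22)-(23).

def cg(A, b, x0, tol=1e-10, maxiter=500):
    x = x0.copy()
    r = b - A @ x                  # residual r_k = b - A x_k, Eq. (22)
    p = r.copy()
    rs = r @ r
    for k in range(maxiter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:  # tight convergence criterion
            return x, k + 1
        p = r + (rs_new / rs) * p  # next A-conjugate search direction
        rs = rs_new
    return x, maxiter

rng = np.random.default_rng(1)
Q = rng.standard_normal((30, 30))
A = Q @ Q.T + 30.0 * np.eye(30)    # hypothetical SPD test matrix
b = rng.standard_normal(30)

x, iters = cg(A, b, np.zeros(30))
assert np.linalg.norm(b - A @ x) < 1e-8
```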

The computational performance test is carried out on the KIT/IKET server, which provides eight nodes in total, with 16 cores per node, using Intel Xeon E5-2667 v2 CPUs (Intel Corporation, Santa Clara, CA, USA) [18]. The convergence rates of the three Krylov methods preconditioned by the block Jacobi method are shown in Fig. 1. The result shows that, for a medium convergence criterion, the performance of all three Krylov methods is almost the same. However, for a tight convergence criterion, CG outperforms MINRES and SYMMLQ because of its strong stability. Because the gas velocity is sensitive to the pressure gradient, the CG iteration method is recommended for evaluating the pressure field with sufficient accuracy.

3.2. Parallel computing performance of preconditioner

[Figure: normalized residual versus number of Krylov iterations for CG, MINRES, and SYMMLQ.]
Fig. 1. Convergence rate of preconditioned Krylov methods. CG, conjugate gradient; MINRES, minimal residual; SYMMLQ, symmetric LQ.

Because there is only a limited amount of parallelism in standard preconditioners such as the symmetric successive overrelaxation method and the incomplete LU factorization method, alternative preconditioning techniques that are specially designed for parallel environments should be used. Compared with additive Schwarz preconditioning, the parallel block Jacobi preconditioning method requires less communication. Thus, the parallel block Jacobi preconditioning method is used in this paper, with each processor owning one block. Here, we focus on the effect of the block solver on the total computational efficiency. Several block solvers, including the Cholesky factorization and the incomplete Cholesky factorization with fill level k [ICC(k); k = 0, 1, …, 4] [30], are used to solve each individual block. On the one hand, with increasing fill level k, the iteration count of ICC(k) decreases and approaches that of the exact Cholesky factorization, as shown in Fig. 2 and Table 1. On the other hand, the computational cost of the ICC(k) block solver also increases with k. For our symmetric sparse pressure equation system, the best balance is achieved at ICC(2), as shown in Table 1. Therefore, the CG solver preconditioned by the ICC(2) block Jacobi method is the most efficient combination in our test.
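The block Jacobi idea can be sketched as independent per-block solves (the matrix, block size, and process count below are hypothetical; GASFLOW-MPI assembles its blocks from the pressure equation and distributes them across MPI ranks):

```python
import numpy as np

# Sketch of block Jacobi preconditioning: the matrix is split into diagonal
# blocks (one per "processor"), and applying the preconditioner solves each
# block independently, with no inter-block communication.  Exact Cholesky is
# used per block here; ICC(k) is another block-solver choice.

def block_jacobi_factor(A, block_size):
    """Cholesky-factor each diagonal block of A."""
    n = A.shape[0]
    return [np.linalg.cholesky(A[i:i + block_size, i:i + block_size])
            for i in range(0, n, block_size)]

def block_jacobi_apply(factors, r, block_size):
    """Solve M z = r where M = blockdiag(A): embarrassingly parallel."""
    z = np.empty_like(r)
    for idx, L in enumerate(factors):
        lo = idx * block_size
        y = np.linalg.solve(L, r[lo:lo + block_size])    # forward substitution
        z[lo:lo + block_size] = np.linalg.solve(L.T, y)  # back substitution
    return z

rng = np.random.default_rng(2)
n, bs = 32, 8                   # 4 blocks, i.e. 4 hypothetical processes
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)     # SPD test matrix
factors = block_jacobi_factor(A, bs)
r = rng.standard_normal(n)
z = block_jacobi_apply(factors, r, bs)

# z solves the block-diagonal part M of A exactly.
M = np.zeros_like(A)
for i in range(0, n, bs):
    M[i:i + bs, i:i + bs] = A[i:i + bs, i:i + bs]
assert np.allclose(M @ z, r)
```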

[Figure: normalized residual versus number of Krylov iterations for ICC(0)–ICC(4) and exact Cholesky block solvers.]
Fig. 2. Convergence rate of CG preconditioned by PBJ with ICC(k). CG, conjugate gradient; ICC(k), incomplete Cholesky factorization with fill level k; PBJ, parallel block Jacobi.
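A minimal sketch of the no-fill variant, ICC(0): an ordinary Cholesky sweep restricted to the nonzero pattern of A, so no fill-in is ever created. A dense array is used for readability; a production implementation works on a sparse pattern, and higher fill levels ICC(k) admit progressively more fill.

```python
import numpy as np

# No-fill incomplete Cholesky, ICC(0): compute a lower-triangular factor L
# that is nonzero only where the lower triangle of A is nonzero.

def icc0(A):
    n = A.shape[0]
    L = np.zeros_like(A)
    for i in range(n):
        for j in range(i + 1):
            if A[i, j] == 0.0 and i != j:
                continue                      # fill level 0: drop all fill-in
            s = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

# For a tridiagonal SPD matrix there is no fill-in at all, so ICC(0)
# coincides with the exact Cholesky factor.
n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L = icc0(A)
assert np.allclose(L @ L.T, A)
```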

Table 1
Computational performance.

Averaged iteration # | Cost per iteration (sec) | Total cost (sec)