NOTE #021 · 2026.04.22 (Wed) · CFD Techniques · 5 min read · 958 words · #Algorithms #Linear-Solver #Krylov #Numerical Analysis

Choosing a CFD Linear Solver: Comparing Convergence Speeds from Jacobi to BiCGSTAB

Comparing the characteristics and convergence speeds of the four major Krylov subspace solvers with code and simulation.

In 1952, Cornelius Lanczos, and the pair Magnus Hestenes and Eduard Stiefel, independently presented the same result: an algorithm that solves symmetric positive definite (SPD) systems in at most $N$ iterations. This is the method we now call the Conjugate Gradient (CG) method. What is interesting is that even 70 years later, this idea lives on almost unchanged at the heart of CFD solvers: searching for the solution in a Krylov subspace. This post compares the convergence speeds of four solvers frequently used in CFD (Jacobi, Gauss-Seidel, CG, and BiCGSTAB) through Python and interactive simulations. By the end, you should be able to judge which solver is right for a given problem in practice.

Why CFD Solves Linear Systems Iteratively

When the Navier-Stokes equations are discretized, a large linear system of the form $A\mathbf{x} = \mathbf{b}$ appears at every step. Here, $A$ is the coefficient matrix, $\mathbf{x}$ is the vector of cell-centered variables, and $\mathbf{b}$ is the right-hand side source. For a 3D grid with 1 million cells, $A$ becomes a sparse matrix of size $10^6 \times 10^6$. LU decomposition (a direct method that splits a matrix into the product of lower and upper triangular matrices) consumes prohibitive memory and CPU at this scale, because fill-in destroys the sparsity. This is why open-source CFD codes like OpenFOAM, SU2, and Gerris use iterative methods without exception.

The basic idea of iterative methods is simple. Starting from an initial guess $\mathbf{x}_0$, $\mathbf{x}$ is modified little by little until the residual $\mathbf{r} = \mathbf{b} - A\mathbf{x}$ falls below a tolerance. The question is how to define "little by little." That definition determines the name of the solver.
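
Every solver in this post follows the same residual-checking skeleton; only the correction step differs. A minimal sketch (the `iterate` helper and the Richardson update below are illustrative, not from any particular library):

```python
import numpy as np

def iterate(A, b, update, tol=1e-10, max_iter=2000):
    """Generic loop: apply `update` until the relative residual drops below tol."""
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        x = update(A, b, x)                       # scheme-specific correction
        res = np.linalg.norm(b - A @ x) / b_norm  # relative residual
        if res < tol:
            return x, k + 1
    return x, max_iter

# Simplest possible scheme: Richardson iteration x <- x + omega * (b - A x)
richardson = lambda A, b, x, omega=0.4: x + omega * (b - A @ x)

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([1.0, 1.0])
x, iters = iterate(A, b, richardson)
print(f"converged to {x} in {iters} iterations")
```

Jacobi, Gauss-Seidel, CG, and the rest all slot into this loop as different `update` rules.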

The Idea of the Krylov Subspace

There is a space spanned by the vectors obtained by repeatedly multiplying the initial residual $\mathbf{r}_0$ by the matrix $A$:

$$\mathcal{K}_m(A, \mathbf{r}_0) = \mathrm{span}\{\mathbf{r}_0,\ A\mathbf{r}_0,\ A^2\mathbf{r}_0,\ \dots,\ A^{m-1}\mathbf{r}_0\}$$

$\mathcal{K}_m$ is the Krylov subspace, $m$ is its dimension, and $\mathrm{span}$ is the set of all linear combinations. Krylov-type solvers find a solution $\mathbf{x}_m \in \mathbf{x}_0 + \mathcal{K}_m$ such that the remaining residual $\mathbf{r}_m$ satisfies a certain optimality condition.

Confining the solution to a subspace has two advantages. First, the only operation involving $A$ is the matrix-vector product, which preserves the sparse structure. Second, an optimal solution can be extracted within that subspace: CG exactly minimizes the $A$-norm of the error, and GMRES minimizes the 2-norm of the residual over the Krylov space.
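
This containment can be checked numerically: build the raw power basis of $\mathcal{K}_m$ (ill-conditioned, demonstration-only; real solvers orthogonalize) and measure how far the exact solution of a small SPD system is from it. A sketch:

```python
import numpy as np

def krylov_basis(A, r0, m):
    """Return the n x m matrix whose columns are r0, A r0, ..., A^{m-1} r0."""
    K = np.zeros((len(r0), m))
    v = r0.copy()
    for j in range(m):
        K[:, j] = v
        v = A @ v          # next power applied to r0
    return K

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # small SPD tridiagonal
rng = np.random.default_rng(0)
b = rng.standard_normal(n)
x_exact = np.linalg.solve(A, b)

# Least-squares distance of x_exact from K_m: shrinks as m grows,
# and is (numerically) zero by m = n, since x = A^{-1} b is a
# polynomial in A applied to b (Cayley-Hamilton).
gaps = {}
for m in (2, 4, n):
    K = krylov_basis(A, b, m)      # with x0 = 0, r0 = b
    coef, *_ = np.linalg.lstsq(K, x_exact, rcond=None)
    gaps[m] = np.linalg.norm(K @ coef - x_exact)
    print(f"m={m}: distance of x_exact from K_m = {gaps[m]:.2e}")
```

This is exactly why CG's finite-termination guarantee holds: by dimension $n$, the subspace already contains the answer.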

Differences in Solver Characteristics

| Solver | Required Matrix | Memory | Advantages | Disadvantages |
|---|---|---|---|---|
| Jacobi | Diagonal dominance recommended | Very low | Easy to parallelize, ~10-line implementation | Slow (spectral radius $1-O(1/N^2)$) |
| Gauss-Seidel | Arbitrary | Very low | About twice as fast as Jacobi | Inherently sequential, poor fit for GPUs |
| CG | SPD required | 4 vectors | Optimal for SPD, lightweight 3-term recurrence | Cannot handle non-symmetric matrices |
| BiCGSTAB | Arbitrary | 7 vectors | Handles non-symmetric matrices, no transpose needed | Fluctuating convergence, risk of breakdown |

For the pressure Poisson equation $-\nabla^2 p = f$, CG is the standard because the discretized operator is SPD. The momentum equation, which mixes advection and diffusion, is non-symmetric, so BiCGSTAB or GMRES is chosen. This division of roles is exactly what you see in OpenFOAM's PCG/PBiCGStab and SU2's BCGSTAB.

Convergence Speed with Python

Let's take a 1D Poisson problem as an example. Discretizing $-u'' = 1$ with boundary conditions $u(0) = u(1) = 0$ using central finite differences yields a tridiagonal matrix. Let's compare Jacobi and CG directly.

```python
import numpy as np

def build_poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix from the second-order central difference."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 2.0
        if i > 0:     A[i, i - 1] = -1.0
        if i < n - 1: A[i, i + 1] = -1.0
    return A, np.ones(n)

def jacobi_sweep(A, b, tol=1e-10, max_iter=2000):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D  = np.diag(A)          # diagonal entries
    LU = A - np.diag(D)      # off-diagonal part L + U
    x  = np.zeros_like(b)
    b0 = np.linalg.norm(b)
    history = []
    for _ in range(max_iter):
        x = (b - LU @ x) / D
        res = np.linalg.norm(b - A @ x) / b0   # relative residual
        history.append(res)
        if res < tol:
            break
    return x, history

def cg_sweep(A, b, tol=1e-10, max_iter=2000):
    """Conjugate Gradient for SPD A (Hestenes-Stiefel recurrence)."""
    x  = np.zeros_like(b)
    r  = b - A @ x
    p  = r.copy()            # first search direction
    rr = r @ r
    b0 = np.linalg.norm(b)
    history = [np.sqrt(rr) / b0]
    for _ in range(max_iter):
        Ap    = A @ p
        alpha = rr / (p @ Ap)          # optimal step length along p
        x    += alpha * p
        r    -= alpha * Ap
        rr_new = r @ r
        history.append(np.sqrt(rr_new) / b0)
        if history[-1] < tol:
            break
        p  = r + (rr_new / rr) * p     # next A-conjugate direction
        rr = rr_new
    return x, history

A, b = build_poisson_1d(n=60)
_, hist_j  = jacobi_sweep(A, b)
_, hist_cg = cg_sweep(A, b)
print(f"Jacobi: {len(hist_j):4d} iterations, final {hist_j[-1]:.2e}")
print(f"CG:     {len(hist_cg):4d} iterations, final {hist_cg[-1]:.2e}")
```

The results are stark.

```
Jacobi: 2000 iterations, final 2.11e-02
CG:       60 iterations, final 1.87e-13
```

While Jacobi cannot get below $10^{-2}$ even after 2000 iterations, CG reaches machine precision in $N = 60$ iterations. Theoretically, CG reaches the exact solution in at most $N$ iterations (ignoring floating-point errors). For SPD matrices, CG is almost a superpower.
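
The non-symmetric workhorse from the comparison table can be written in the same style. Here is a minimal, unpreconditioned BiCGSTAB sketch following the standard van der Vorst recurrence:

```python
import numpy as np

def bicgstab_sweep(A, b, tol=1e-10, max_iter=2000):
    """Unpreconditioned BiCGSTAB (van der Vorst, 1992)."""
    x = np.zeros_like(b)
    r = b - A @ x
    r_hat = r.copy()                 # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    b0 = np.linalg.norm(b)
    history = [np.linalg.norm(r) / b0]
    for _ in range(max_iter):
        rho_new = r_hat @ r
        if abs(rho_new) < 1e-30:     # breakdown: a restart would be needed
            break
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v            # intermediate "half-step" residual
        t = A @ s
        omega = (t @ s) / (t @ t)    # stabilizing minimal-residual step
        x += alpha * p + omega * s
        r = s - omega * t
        rho = rho_new
        history.append(np.linalg.norm(r) / b0)
        if history[-1] < tol:
            break
    return x, history
```

On the same 60-cell Poisson matrix this converges quickly too, but with the characteristic non-monotone residual; its real value shows on non-symmetric matrices where CG is not applicable.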

Watching the Convergence Race

Try it directly in the simulation below. Changing the grid size $N$ and the maximum number of iterations with the sliders redraws the residual curves of the four solvers on a log scale.

If you increase $N$ to 80, the curves for Jacobi and Gauss-Seidel become almost horizontal. CG, on the other hand, drops stepwise and converges rapidly around iteration $N$. BiCGSTAB comes down with fluctuations, and this jaggedness is a visual signal of the risk of breakdown (an inner product in the recurrence approaching zero, so a division blows up and the iteration fails). In practice, if BiCGSTAB suddenly diverges, the standard response is to strengthen the preconditioner or switch to GMRES.

Practical Solver Selection Guide

Pressure Poisson, diffusion-only heat transfer: CG + an ILU(0) preconditioner. If symmetry is guaranteed, anything else is a waste.

Non-symmetric momentum, mixed advection-diffusion: BiCGSTAB is the default. If convergence stalls or oscillates, switch to GMRES($m$). $m$ is the restart length and sets the memory-speed trade-off; a common industry default is around $m = 30$.
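
With SciPy, the restart length is just a keyword (assuming SciPy is available; the tridiagonal test matrix below is a stand-in for a real advection-diffusion operator):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

n = 200
# Illustrative non-symmetric stencil: upwind-biased advection plus diffusion
A = diags([-1.5, 2.0, -0.5], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

for m in (10, 30):
    # `restart` is the m in GMRES(m): the Krylov basis is rebuilt every m steps,
    # capping memory at m stored vectors.
    x, info = gmres(A, b, restart=m, maxiter=500)
    print(f"GMRES({m}): info = {info} (0 means converged)")
```

Smaller $m$ saves memory but can slow or stall convergence, since each restart throws away the accumulated Krylov information.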

Parallel CFD: Gauss-Seidel cannot perform well on GPUs due to its sequential nature. The standard for the last decade is a combination of Jacobi-based smoothers and algebraic multigrid (AMG).
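
The smoother role is worth seeing in code: damped (weighted) Jacobi wipes out high-frequency error in a few sweeps even though its overall convergence is slow, which is exactly why multigrid pairs it with coarse grids. A minimal sketch ($\omega = 2/3$ is the classic choice for the 1D Poisson stencil):

```python
import numpy as np

def weighted_jacobi(A, b, x, omega=2.0 / 3.0, sweeps=3):
    """A few damped-Jacobi sweeps: the smoother in a multigrid cycle."""
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / D
    return x

n = 64
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)

# Pollute the solution with a high-frequency error mode, then smooth.
noise = np.sin(np.arange(n) * np.pi * 0.9)
x = weighted_jacobi(A, b, x_exact + noise)
print("high-frequency error before:", np.linalg.norm(noise))
print("error after 3 sweeps:       ", np.linalg.norm(x - x_exact))
```

Each sweep damps the high-frequency modes by roughly a factor of 3, while low-frequency error barely moves; the coarse grids of AMG exist to handle that leftover smooth part.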

SU2's LU-SGS: Besides Krylov, SU2 also features a stationary iteration method called LU-SGS (Lower-Upper Symmetric Gauss-Seidel). When combined with implicit time stepping in compressible flows, it has a much smaller memory footprint, although it is not as robust as Krylov.

Tip: In practice, the choice of preconditioner changes convergence speed far more than the choice of solver. It is common for PCG (preconditioned CG) + AMG to be 100 times faster than plain CG.
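
The size of that effect can be measured with SciPy by passing a preconditioner as `M` and counting iterations in a callback (assuming SciPy is available; the ILU of a tridiagonal matrix is nearly exact, so the gap shown here is an extreme case):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, spilu, LinearOperator

n = 500
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

def run(M=None):
    """Solve with CG, counting iterations via the callback."""
    counts = []
    x, info = cg(A, b, M=M, maxiter=5000, callback=lambda xk: counts.append(1))
    return len(counts), info

# ILU factorization wrapped as a LinearOperator preconditioner
ilu = spilu(A)
M = LinearOperator((n, n), matvec=ilu.solve)

plain, info0 = run()
prec, info1 = run(M)
print(f"CG: {plain} iterations  ->  PCG(ILU): {prec} iterations")
```

On realistic 3D matrices the preconditioner is never this close to exact, but iteration-count reductions of one to two orders of magnitude are routine, which is where the "100 times faster" rule of thumb comes from.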

3-Line Summary

  • Jacobi and Gauss-Seidel are simple, but on real CFD problems their convergence factor is a poor $1-O(1/N^2)$: as the grid grows, the iteration count grows quadratically.
  • Krylov methods (CG, BiCGSTAB, GMRES) use only matrix-vector products and, in exact arithmetic, reach the solution within $N$ iterations. CG for SPD matrices, and BiCGSTAB or GMRES for non-symmetric ones, are the practical defaults.
  • The preconditioner is more important than the solver. The combination of ILU(0), AMG, and Jacobi smoothers determines speed differences of 10 to 100 times.

Slide κ in the demo below to see each solver hit its scaling wall.

As the condition number $\kappa$ grows (moving right in the figure), Jacobi nearly stalls, while CG/GMRES follow the $\sqrt{\kappa}$ bound. BiCGSTAB is robust for non-symmetric problems, but its residual fluctuates.

Share if you found it helpful.