NOTE #021 · 2026.04.22 (Wed) · CFD Techniques · 5 min read · 958 words · #Algorithms #Linear-Solver #Krylov #Numerical Analysis

Choosing a CFD Linear Solver: Comparing Convergence Speeds from Jacobi to BiCGSTAB

Comparing the characteristics and convergence speeds of the four major Krylov subspace solvers with code and simulation.

In 1952, Cornelius Lanczos, and the pair Magnus Hestenes and Eduard Stiefel, independently presented the same result: an algorithm that solves symmetric positive definite (SPD) systems in at most $N$ iterations. This is the method we now call the Conjugate Gradient (CG) method. What is interesting is that even 70 years later, this idea lives on almost unchanged at the heart of CFD solvers: searching for the solution in a Krylov subspace. This post compares the convergence speeds of four solvers frequently used in CFD (Jacobi, Gauss-Seidel, CG, and BiCGSTAB) through Python and interactive simulations. By the end, you should be able to judge which solver is right for a given problem in practice.

Why CFD Solves Linear Systems Iteratively

When the Navier-Stokes equations are discretized, a large linear system of the form $A\mathbf{x} = \mathbf{b}$ appears at every step. Here, $A$ is the coefficient matrix, $\mathbf{x}$ is the vector of cell-centered variables, and $\mathbf{b}$ is the right-hand side source. For a 3D grid with 1 million cells, $A$ becomes a sparse matrix of size $10^6 \times 10^6$. LU decomposition (a direct method that splits a matrix into the product of lower and upper triangular matrices) consumes prohibitive memory and CPU at this scale, because fill-in destroys the sparsity. This is why open-source CFD codes like OpenFOAM, SU2, and Gerris use iterative methods without exception.

The basic idea of iterative methods is simple. Starting from an initial guess $\mathbf{x}_0$, $\mathbf{x}$ is modified little by little until the residual $\mathbf{r} = \mathbf{b} - A\mathbf{x}$ falls below a tolerance. The question is how to define "little by little." That definition determines the name of the solver.
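
Every solver in this post follows the same residual-checking skeleton; only the correction step differs. A minimal sketch (the `iterate` helper and the Richardson update below are illustrative, not from any particular library):

```python
import numpy as np

def iterate(A, b, update, tol=1e-10, max_iter=2000):
    """Generic loop: apply `update` until the relative residual drops below tol."""
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        x = update(A, b, x)                       # scheme-specific correction
        res = np.linalg.norm(b - A @ x) / b_norm  # relative residual
        if res < tol:
            return x, k + 1
    return x, max_iter

# Simplest possible scheme: Richardson iteration x <- x + omega * (b - A x)
richardson = lambda A, b, x, omega=0.4: x + omega * (b - A @ x)

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([1.0, 1.0])
x, iters = iterate(A, b, richardson)
print(f"converged to {x} in {iters} iterations")
```

Jacobi, Gauss-Seidel, CG, and the rest all slot into this loop as different `update` rules.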

The Idea of the Krylov Subspace

There is a space spanned by the vectors obtained by repeatedly multiplying the initial residual $\mathbf{r}_0$ by the matrix $A$:

$$\mathcal{K}_m(A, \mathbf{r}_0) = \mathrm{span}\{\mathbf{r}_0,\ A\mathbf{r}_0,\ A^2\mathbf{r}_0,\ \dots,\ A^{m-1}\mathbf{r}_0\}$$

$\mathcal{K}_m$ is the Krylov subspace, $m$ is its dimension, and $\mathrm{span}$ is the set of all linear combinations. Krylov-type solvers find a solution $\mathbf{x}_m \in \mathbf{x}_0 + \mathcal{K}_m$ such that the remaining residual $\mathbf{r}_m$ satisfies a certain optimality condition.

Confining the solution to a subspace has two advantages. First, the only operation involving $A$ is the matrix-vector product, which preserves the sparse structure. Second, an optimal solution can be extracted within that subspace: CG exactly minimizes the $A$-norm of the error, and GMRES minimizes the 2-norm of the residual over the Krylov space.
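
This containment can be checked numerically: build the raw power basis of $\mathcal{K}_m$ (ill-conditioned, demonstration-only; real solvers orthogonalize) and measure how far the exact solution of a small SPD system is from it. A sketch:

```python
import numpy as np

def krylov_basis(A, r0, m):
    """Return the n x m matrix whose columns are r0, A r0, ..., A^{m-1} r0."""
    K = np.zeros((len(r0), m))
    v = r0.copy()
    for j in range(m):
        K[:, j] = v
        v = A @ v          # next power applied to r0
    return K

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # small SPD tridiagonal
rng = np.random.default_rng(0)
b = rng.standard_normal(n)
x_exact = np.linalg.solve(A, b)

# Least-squares distance of x_exact from K_m: shrinks as m grows,
# and is (numerically) zero by m = n, since x = A^{-1} b is a
# polynomial in A applied to b (Cayley-Hamilton).
gaps = {}
for m in (2, 4, n):
    K = krylov_basis(A, b, m)      # with x0 = 0, r0 = b
    coef, *_ = np.linalg.lstsq(K, x_exact, rcond=None)
    gaps[m] = np.linalg.norm(K @ coef - x_exact)
    print(f"m={m}: distance of x_exact from K_m = {gaps[m]:.2e}")
```

This is exactly why CG's finite-termination guarantee holds: by dimension $n$, the subspace already contains the answer.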

Differences in Solver Characteristics

| Solver | Required Matrix | Memory | Advantages | Disadvantages |
|---|---|---|---|---|
| Jacobi | Diagonal dominance recommended | Very low | Easy to parallelize, ~10-line implementation | Slow (spectral radius $1-O(1/N^2)$) |
| Gauss-Seidel | Arbitrary | Very low | About twice as fast as Jacobi | Inherently sequential, poor fit for GPUs |
| CG | SPD required | 4 vectors | Optimal for SPD, lightweight 3-term recurrence | Cannot handle non-symmetric matrices |
| BiCGSTAB | Arbitrary | 7 vectors | Handles non-symmetric matrices, no transpose needed | Fluctuating convergence, risk of breakdown |

For the pressure Poisson equation $-\nabla^2 p = f$, CG is the standard because the discretized operator is SPD. The momentum equation, which mixes advection and diffusion, is non-symmetric, so BiCGSTAB or GMRES is chosen. This division of roles is exactly what you see in OpenFOAM's PCG/PBiCGStab and SU2's BCGSTAB.

Convergence Speed with Python

Let's take a 1D Poisson problem as an example. Discretizing $-u'' = 1$ with boundary conditions $u(0) = u(1) = 0$ using central finite differences yields a tridiagonal matrix. Let's compare Jacobi and CG directly.

```python
import numpy as np

def build_poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix from the second-order central difference."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 2.0
        if i > 0:     A[i, i - 1] = -1.0
        if i < n - 1: A[i, i + 1] = -1.0
    return A, np.ones(n)

def jacobi_sweep(A, b, tol=1e-10, max_iter=2000):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D  = np.diag(A)          # diagonal entries
    LU = A - np.diag(D)      # off-diagonal part L + U
    x  = np.zeros_like(b)
    b0 = np.linalg.norm(b)
    history = []
    for _ in range(max_iter):
        x = (b - LU @ x) / D
        res = np.linalg.norm(b - A @ x) / b0   # relative residual
        history.append(res)
        if res < tol:
            break
    return x, history

def cg_sweep(A, b, tol=1e-10, max_iter=2000):
    """Conjugate Gradient for SPD A (Hestenes-Stiefel recurrence)."""
    x  = np.zeros_like(b)
    r  = b - A @ x
    p  = r.copy()            # first search direction
    rr = r @ r
    b0 = np.linalg.norm(b)
    history = [np.sqrt(rr) / b0]
    for _ in range(max_iter):
        Ap    = A @ p
        alpha = rr / (p @ Ap)          # optimal step length along p
        x    += alpha * p
        r    -= alpha * Ap
        rr_new = r @ r
        history.append(np.sqrt(rr_new) / b0)
        if history[-1] < tol:
            break
        p  = r + (rr_new / rr) * p     # next A-conjugate direction
        rr = rr_new
    return x, history

A, b = build_poisson_1d(n=60)
_, hist_j  = jacobi_sweep(A, b)
_, hist_cg = cg_sweep(A, b)
print(f"Jacobi: {len(hist_j):4d} iterations, final {hist_j[-1]:.2e}")
print(f"CG:     {len(hist_cg):4d} iterations, final {hist_cg[-1]:.2e}")
```

The results are stark.

```
Jacobi: 2000 iterations, final 2.11e-02
CG:       60 iterations, final 1.87e-13
```

While Jacobi cannot get below $10^{-2}$ even after 2000 iterations, CG reaches machine precision in $N = 60$ iterations. Theoretically, CG reaches the exact solution in at most $N$ iterations (ignoring floating-point errors). For SPD matrices, CG is almost a superpower.
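
The non-symmetric workhorse from the comparison table can be written in the same style. Here is a minimal, unpreconditioned BiCGSTAB sketch following the standard van der Vorst recurrence:

```python
import numpy as np

def bicgstab_sweep(A, b, tol=1e-10, max_iter=2000):
    """Unpreconditioned BiCGSTAB (van der Vorst, 1992)."""
    x = np.zeros_like(b)
    r = b - A @ x
    r_hat = r.copy()                 # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    b0 = np.linalg.norm(b)
    history = [np.linalg.norm(r) / b0]
    for _ in range(max_iter):
        rho_new = r_hat @ r
        if abs(rho_new) < 1e-30:     # breakdown: a restart would be needed
            break
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v            # intermediate "half-step" residual
        t = A @ s
        omega = (t @ s) / (t @ t)    # stabilizing minimal-residual step
        x += alpha * p + omega * s
        r = s - omega * t
        rho = rho_new
        history.append(np.linalg.norm(r) / b0)
        if history[-1] < tol:
            break
    return x, history
```

On the same 60-cell Poisson matrix this converges quickly too, but with the characteristic non-monotone residual; its real value shows on non-symmetric matrices where CG is not applicable.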

Watching the Convergence Race

Try it directly in the simulation below. Changing the grid size $N$ and the maximum number of iterations with the sliders redraws the residual curves of the four solvers on a log scale.

If you increase $N$ to 80, the curves for Jacobi and Gauss-Seidel become almost horizontal. CG, on the other hand, drops stepwise and converges rapidly around iteration $N$. BiCGSTAB comes down with fluctuations, and this jaggedness is a visual signal of the risk of breakdown (an inner product in the recurrence approaching zero, so a division blows up and the iteration fails). In practice, if BiCGSTAB suddenly diverges, the standard response is to strengthen the preconditioner or switch to GMRES.

Practical Solver Selection Guide

Pressure Poisson, diffusion-only heat transfer: CG + an ILU(0) preconditioner. If symmetry is guaranteed, anything else is a waste.

Non-symmetric momentum, mixed advection-diffusion: BiCGSTAB is the default. If convergence stalls or oscillates, switch to GMRES($m$). $m$ is the restart length and sets the memory-speed trade-off; a common industry default is around $m = 30$.
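
With SciPy, the restart length is just a keyword (assuming SciPy is available; the tridiagonal test matrix below is a stand-in for a real advection-diffusion operator):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

n = 200
# Illustrative non-symmetric stencil: upwind-biased advection plus diffusion
A = diags([-1.5, 2.0, -0.5], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

for m in (10, 30):
    # `restart` is the m in GMRES(m): the Krylov basis is rebuilt every m steps,
    # capping memory at m stored vectors.
    x, info = gmres(A, b, restart=m, maxiter=500)
    print(f"GMRES({m}): info = {info} (0 means converged)")
```

Smaller $m$ saves memory but can slow or stall convergence, since each restart throws away the accumulated Krylov information.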

Parallel CFD: Gauss-Seidel cannot perform well on GPUs due to its sequential nature. The standard for the last decade is a combination of Jacobi-based smoothers and algebraic multigrid (AMG).
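
The smoother role is worth seeing in code: damped (weighted) Jacobi wipes out high-frequency error in a few sweeps even though its overall convergence is slow, which is exactly why multigrid pairs it with coarse grids. A minimal sketch ($\omega = 2/3$ is the classic choice for the 1D Poisson stencil):

```python
import numpy as np

def weighted_jacobi(A, b, x, omega=2.0 / 3.0, sweeps=3):
    """A few damped-Jacobi sweeps: the smoother in a multigrid cycle."""
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / D
    return x

n = 64
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x_exact = np.linalg.solve(A, b)

# Pollute the solution with a high-frequency error mode, then smooth.
noise = np.sin(np.arange(n) * np.pi * 0.9)
x = weighted_jacobi(A, b, x_exact + noise)
print("high-frequency error before:", np.linalg.norm(noise))
print("error after 3 sweeps:       ", np.linalg.norm(x - x_exact))
```

Each sweep damps the high-frequency modes by roughly a factor of 3, while low-frequency error barely moves; the coarse grids of AMG exist to handle that leftover smooth part.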

SU2's LU-SGS: Besides Krylov, SU2 also features a stationary iteration method called LU-SGS (Lower-Upper Symmetric Gauss-Seidel). When combined with implicit time stepping in compressible flows, it has a much smaller memory footprint, although it is not as robust as Krylov.

Tip: In practice, the choice of preconditioner changes convergence speed far more than the choice of solver. It is common for PCG (preconditioned CG) + AMG to be 100 times faster than plain CG.
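
The size of that effect can be measured with SciPy by passing a preconditioner as `M` and counting iterations in a callback (assuming SciPy is available; the ILU of a tridiagonal matrix is nearly exact, so the gap shown here is an extreme case):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, spilu, LinearOperator

n = 500
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

def run(M=None):
    """Solve with CG, counting iterations via the callback."""
    counts = []
    x, info = cg(A, b, M=M, maxiter=5000, callback=lambda xk: counts.append(1))
    return len(counts), info

# ILU factorization wrapped as a LinearOperator preconditioner
ilu = spilu(A)
M = LinearOperator((n, n), matvec=ilu.solve)

plain, info0 = run()
prec, info1 = run(M)
print(f"CG: {plain} iterations  ->  PCG(ILU): {prec} iterations")
```

On realistic 3D matrices the preconditioner is never this close to exact, but iteration-count reductions of one to two orders of magnitude are routine, which is where the "100 times faster" rule of thumb comes from.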

3-Line Summary

  • Jacobi and Gauss-Seidel are simple, but on real CFD problems their convergence factor is a poor $1-O(1/N^2)$: as the grid grows, the iteration count grows quadratically.
  • Krylov methods (CG, BiCGSTAB, GMRES) use only matrix-vector products and, in exact arithmetic, reach the solution within $N$ iterations. CG for SPD matrices, and BiCGSTAB or GMRES for non-symmetric ones, are the practical defaults.
  • The preconditioner is more important than the solver. The combination of ILU(0), AMG, and Jacobi smoothers determines speed differences of 10 to 100 times.

Slide κ in the demo below to see each solver hit its scaling wall.

As the condition number $\kappa$ grows (moving right in the figure), Jacobi nearly stalls, while CG/GMRES follow the $\sqrt{\kappa}$ bound. BiCGSTAB is robust for non-symmetric problems, but its residual fluctuates.

Share if you found it helpful.