Sovling Linear Systems II: Iterative Methods

In this part, we discuss iterative methods for solving linear systems. We use the discrete Poisson Equation as an example and study the following methods:

Jocobi's method

Successive Overrelaxation (SOR) method

Conjugate Gradient method

Fast Fourier Transform (FFT)

Multigrid method

The first three methods are general. They can be used to solve general, but under some conditions for convergence, systems. Usually, they are used to sparse systems such as the discrete Poisson equation. The last two, FFT and multigrid, are special methods for descrete Poisson equation type of the systems.

An overview of performance

  Algorithm   Type  Serial time    PRAM time    Storage   # of Proc.
 --------------------------------------------------------------------
  Dense LU     D        N^3           N           N^2        N^2
  Band LU      D        N^2           N         N^(3/2)       N
  Jacobi       I        N^2           N            N          N
  Sparse LU    D      N^(3/2)       sqrt(N)     N*logN        N
  CG           I      N^(3/2)     sqrt(N)logN      N          N
  SOR          I      N^(3/2)       sqrt(N)        N          N
  FFT          D      N*logN         logN          N          N
  Multigrid    I        N          (logN)^2        N          N
 --------------------------------------------------------------------
  lower bound           N            logN          N

Issues in iterative methods

Convergence

Termination criteria

Initial guess

The 2D Poisson equation

        d^2 u(x,y)     d^2 u(x,y)
       ------------ + ------------ = f(x,y)
           dx^2           dy^2

Discrete form

  u(i-1,j)-2u(i,j)+u(i+1,j)     u(i,j-1)-2u(i,j)+u(i,j+1)
 --------------------------- + --------------------------- = f(i,j)
            h^2                          h^2

Boundary condition:

       u(0,y) = u(1,y) = u(x,0) = u(x,1) = 0

For example, n=3, rwo ordering

        [-4  1  0  1  0  0  0  0  0 ]
        [ 1 -4  1  0  1  0  0  0  0 ]
        [ 0  1 -4  0  0  1  0  0  0 ]
        [ 1  0  0 -4  1  0  1  0  0 ]
        [ 0  1  0  1 -4  1  0  1  0 ]
        [ 0  0  1  0  1 -4  0  0  1 ]
        [ 0  0  0  1  0  0 -4  1  0 ]
        [ 0  0  0  0  1  0  1 -4  1 ]
        [ 0  0  0  0  0  1  0  1 -4 ]

Jacobi Method (including boundary values)

  u(i,j,m+1) = (1/4)*(u(i-1,j,m)+u(i+1,j,m)+u(i,j-1,m)+u(i,j+1,m)+b(i,j))

Initial guess: any

Convergence: m ~ ((n+1)/pi)^2

Termination criteria: assuming discretization error 10^(-6) (20 bits), 20*((n+1)/pi)^2

Performance

Serial time: O(n)*O(N) = O(N^2)

PRAM: O(N)*O(1) = O(N), N processors

Parallel: 2D block, N/p grid points per processor, computation O(N/p), communication O(Ts + 4n/sqrt(p)*Tw) each step

Isoefficiency: N = O(p) or n = O(sqrt(p)) (n^2 = N)

SOR

Improvements on Jacobi. It is based on Guass-Seidel method. The basic idea is to use the latest results. It is easy on serial but hard on parallel. Usually, red-black (chess board) ordering is used in parallel.

The idea of SOR is

    u(i,j,m+1) = u(i,j,m) + correction

The correction is a scaled error of previous method.

Initial guess: any

Convergence: m ~ (n+1)/(2*pi)

Termination criteria: assuming discretization error 10^(-6) (20 bits), 20*(n+1)/(2*pi)

Performance

Serial time: O(N^(3/2))

Parallel: computation O(N/p), communication O(Ts = 4n/sqrt(p)*Tw) each step

Isoefficiency: N = O(p) or n = O(sqrt(p))

Fast Fourier Transform (FFT)

Define

            [ 2 -1         ]
            [-1  2 -1   0  ]
        T = [    .....     ]     U = [u(i,j)]
            [  0   -1  2 -1]
            [         -1  2]

then the (i,j)-entry of TU

  TU(i,j) = -(u(i-1,j)-2u(i,j)+u(i+1,j))

and the (i,j)-entry of UT

  UT(i,j) = -(u(i,j-1)-2(i,j)+u(i,j+1))

Thus the discrete Poisson equation becomes

-(1/h^2)*(TU+UT) = F  or TU+UT = B

How can FFT help? Since T has very special form, we can easily compute its eigenvalue decomposition

          T = Q*D*Q^(-1)

where D=diag(d(i)). In fact, we can find a Q such that Q=Q^(-1). Thus

      D*(Q*U*Q) + (Q*U*Q)*D = Q*B*Q

Algorithm
  0. Find Q and D: T=Q*D*Q;
  1. Compute C=Q*B*Q;                % O(n^2*log(n)) using FFT
  2. For each i,k
        V(j,k)=C(j,k)/(d(j)+d(k));   % 2n^2=2N, embarrassingly parallel
  3. Compute U=Q*V*Q.                % O(n^2*log(n)) using FFT

How to compute the eigenvalues and eigenvectors of T? We can verify that d(i) and the jth column of Q are respectively

  d(i)=2(1-cos(j*pi/(n+1)))
  q(j)=sqrt(2/(n+1))*[ sin(j*pi/(n+1))]
                     [sin(2j*pi/(n+1))]
                     [      ...       ]
                     [sin(nj*pi/(n+1))]