
1 Minimisation Problem & Steepest Descents

Suppose $ {\rm\bf A}$ is a real symmetric positive definite matrix (so that $ {\rm\bf x}^{\rm T}{\rm\bf A}{\rm\bf x}>0$ for all non-zero $ {\rm\bf x}$) and consider the function

$\displaystyle \phi ({\rm\bf x})={1\over 2}{\rm\bf x}^{\rm T}{\rm\bf A}{\rm\bf x}-{\rm\bf x}^{\rm T}{\rm\bf b},$ (170)

then

$\displaystyle \nabla\phi = {\rm\bf A}{\rm\bf x}-{\rm\bf b},$ (171)

so that $ \phi$ has a stationary point where $ {\rm\bf A}{\rm\bf x}-{\rm\bf b}={\rm\bf0}$, and since $ {\rm\bf A}$ is positive definite this stationary point is the unique minimum of $ \phi$. So to solve the matrix equation $ {\rm\bf A}{\rm\bf x}={\rm\bf b}$ we can instead look for the minimum of the function $ \phi$. We can use this idea to generate an iterative method for determining an approximate solution of the original matrix equation.
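
One way to see this directly (a short check, writing $ {\rm\bf s}={\rm\bf A}^{-1}{\rm\bf b}$ for the exact solution, which exists because $ {\rm\bf A}$ is positive definite) is to complete the square,

$\displaystyle \phi ({\rm\bf x})={1\over 2}({\rm\bf x}-{\rm\bf s})^{\rm T}{\rm\bf A}({\rm\bf x}-{\rm\bf s})-{1\over 2}{\rm\bf s}^{\rm T}{\rm\bf A}{\rm\bf s},$

so the first term is non-negative and vanishes only at $ {\rm\bf x}={\rm\bf s}$.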

If we have an approximation $ {\rm\bf x}_0$ then the residual,

$\displaystyle {\rm\bf r}_0={\rm\bf b}-{\rm\bf A}{\rm\bf x}_0 =-\nabla \phi ({\rm\bf x}_0),$ (172)

gives the direction of steepest descent of $ \phi$ at the point $ {\rm\bf x}_0$ (it is normal to the level surface $ \phi ({\rm\bf x}) =\phi ({\rm\bf x}_0)$ there). So we look for the value $ \alpha =\alpha_1$ for which $ \phi ({\rm\bf x}_0 +\alpha{\rm\bf r}_0 )$ is smallest. Setting $ {{{\rm d}\phi}\over{{\rm d}\alpha}}=0,$ we obtain

$\displaystyle \alpha_1={{{\rm\bf r}_0^{\rm T}{\rm\bf r}_0}\over{{\rm\bf r}_0^{\rm T}{\rm\bf A}{\rm\bf r}_0}},$ (173)
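
The intermediate algebra (a short check, using $ {\rm\bf r}_0={\rm\bf b}-{\rm\bf A}{\rm\bf x}_0$ and the symmetry of $ {\rm\bf A}$) is

$\displaystyle {{{\rm d}}\over{{\rm d}\alpha}}\,\phi ({\rm\bf x}_0+\alpha{\rm\bf r}_0) ={\rm\bf r}_0^{\rm T}\left({\rm\bf A}({\rm\bf x}_0+\alpha{\rm\bf r}_0)-{\rm\bf b}\right) =\alpha\,{\rm\bf r}_0^{\rm T}{\rm\bf A}{\rm\bf r}_0-{\rm\bf r}_0^{\rm T}{\rm\bf r}_0,$

which vanishes at $ \alpha =\alpha_1$.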

so that an iteration algorithm is:

Steepest descents

Choose an initial guess $ {\rm\bf x}_0$ (e.g. $ {\rm\bf x}_0={\rm\bf b}$) and a tolerance $ \epsilon$, and calculate $ {\rm\bf r}_0={\rm\bf b}-{\rm\bf A}{\rm\bf x}_0$. Then, in a loop for $ s=1,2,\ldots$:

$ \alpha_s = \displaystyle{{{{\rm\bf r}_{s-1}^{\rm T}{\rm\bf r}_{s-1}}\over{{\rm\bf r}_{s-1}^{\rm T}{\rm\bf A}{\rm\bf r}_{s-1}}}},$
$ {\rm\bf x}_s={\rm\bf x}_{s-1}+\alpha_s{\rm\bf r}_{s-1},$
$ {\rm\bf r}_s={\rm\bf b}-{\rm\bf A}{\rm\bf x}_s,$

until $ \vert\vert{\rm\bf r}_s\vert\vert_2<\epsilon.$ A sketch of this loop in code is given below.
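
As an illustration only (not part of the original notes), a minimal NumPy sketch of this loop might look as follows; the function name steepest_descent and the iteration cap max_iter are illustrative choices.

    import numpy as np

    def steepest_descent(A, b, eps=1e-8, max_iter=10000):
        # Solve A x = b for symmetric positive definite A by steepest descent.
        x = b.copy()                          # initial guess x_0 = b, as suggested above
        r = b - A @ x                         # residual r_0 = b - A x_0
        for _ in range(max_iter):
            if np.linalg.norm(r) < eps:       # stop when ||r_s||_2 < eps
                break
            Ar = A @ r
            alpha = (r @ r) / (r @ Ar)        # alpha_s = r^T r / (r^T A r)
            x = x + alpha * r                 # x_s = x_{s-1} + alpha_s r_{s-1}
            r = b - A @ x                     # r_s = b - A x_s
        return x

For example, steepest_descent(np.array([[3.0, 1.0], [1.0, 2.0]]), np.array([5.0, 5.0])) converges to the exact solution $ (1,2)^{\rm T}$.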

This algorithm can converge slowly because successive steps often search along nearly the same few directions, so the components of the error in the remaining directions are reduced only slowly.

Convergence

If the true solution is $ {\rm\bf s}$ and the error is $ {\rm\bf e}_s={\rm\bf s}-{\rm\bf x}_s$, the iteration formula gives

$\displaystyle {\rm\bf e}_s={\rm\bf e}_{s-1}-\alpha_s{\rm\bf A}{\rm\bf e}_{s-1}.$ (174)
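
To check this, note that $ {\rm\bf r}_{s-1}={\rm\bf b}-{\rm\bf A}{\rm\bf x}_{s-1}={\rm\bf A}({\rm\bf s}-{\rm\bf x}_{s-1})={\rm\bf A}{\rm\bf e}_{s-1}$, so subtracting the update $ {\rm\bf x}_s={\rm\bf x}_{s-1}+\alpha_s{\rm\bf r}_{s-1}$ from $ {\rm\bf s}$ gives

$\displaystyle {\rm\bf e}_s={\rm\bf s}-{\rm\bf x}_{s-1}-\alpha_s{\rm\bf r}_{s-1}={\rm\bf e}_{s-1}-\alpha_s{\rm\bf A}{\rm\bf e}_{s-1}.$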

If the eigenvalues and (orthonormal) eigenvectors of $ {\rm\bf A}$ are $ \lambda_j$, $ {\rm\bf z}_j$ and the error is written as $ {\rm\bf e}_{s-1}=\sum_j\xi_j{\rm\bf z}_j$ (for some numbers $ \xi_j$, leaving out the $ s-1$ index), then a little algebra gives

$\displaystyle \alpha_s={{\sum\lambda_j^2\xi_j^2}\over{\sum\lambda_j^3\xi_j^2}},$ (175)
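
The algebra referred to is short (using $ {\rm\bf r}_{s-1}={\rm\bf A}{\rm\bf e}_{s-1}=\sum_j\lambda_j\xi_j{\rm\bf z}_j$ and the orthonormality of the $ {\rm\bf z}_j$):

$\displaystyle \alpha_s={{{\rm\bf r}_{s-1}^{\rm T}{\rm\bf r}_{s-1}}\over{{\rm\bf r}_{s-1}^{\rm T}{\rm\bf A}{\rm\bf r}_{s-1}}} ={{\sum_j\lambda_j^2\xi_j^2}\over{\sum_j\lambda_j^3\xi_j^2}}.$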

from which we can find the error at the next step. Using the norm $ {\Vert {\rm\bf e}\Vert}_A={\rm\bf e}^{\rm T}{\rm\bf A}{\rm\bf e}$,

$\displaystyle {\Vert {\rm\bf e}_s\Vert}_A=\beta^2{\Vert {\rm\bf e}_{s-1}\Vert}_A,$ (176)

where

$\displaystyle \beta^2= 1-{{(\sum\lambda_j^2\xi_j^2)^2}\over{(\sum\lambda_j^3\xi_j^2)(\sum\lambda_j\xi_j^2)}}.$ (177)
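
The derivation of (176) and (177) is a short calculation (expanding $ {\rm\bf e}_s={\rm\bf e}_{s-1}-\alpha_s{\rm\bf A}{\rm\bf e}_{s-1}$ in the eigenvector basis):

$\displaystyle {\Vert {\rm\bf e}_s\Vert}_A={\rm\bf e}_s^{\rm T}{\rm\bf A}{\rm\bf e}_s =\sum_j\lambda_j\xi_j^2-2\alpha_s\sum_j\lambda_j^2\xi_j^2+\alpha_s^2\sum_j\lambda_j^3\xi_j^2,$

and substituting $ \alpha_s$ from (175), with $ {\Vert {\rm\bf e}_{s-1}\Vert}_A=\sum_j\lambda_j\xi_j^2$, gives (176) and (177).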

This formula is not much use on its own, since the values of $ \xi_j$ are not known, but we can ask how large $ \beta$ can be over all possible combinations of $ \xi_j$. The argument needed to answer this question (the Kantorovich inequality) is beyond this course, but if the condition number of the matrix $ {\rm\bf A}$ is $ \kappa=\lambda_{\rm max}/\lambda_{\rm min}$ then

$\displaystyle \beta\le {{\kappa -1}\over{\kappa +1}}.$ (178)

So convergence becomes slower as the condition number increases, but is reasonably fast if $ \kappa$ is less than around 10; for example, $ \kappa=10$ gives $ \beta\le 9/11\approx 0.82$.
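
As a rough numerical illustration (not part of the original notes; the diagonal matrix, the chosen solution and the starting guess below are arbitrary choices), one steepest-descent step on a matrix with $ \kappa=10$ can be compared with the worst-case factor:

    import numpy as np

    # One steepest-descent step on a diagonal SPD matrix with kappa = 10,
    # comparing the observed reduction of ||e||_A = e^T A e with the
    # worst-case factor beta^2 <= ((kappa - 1)/(kappa + 1))^2.
    lam = np.array([1.0, 2.0, 5.0, 10.0])      # eigenvalues, so kappa = 10
    A = np.diag(lam)
    s = np.array([1.0, -1.0, 1.0, -1.0])       # chosen "true" solution
    b = A @ s

    x = np.zeros(4)                            # initial guess x_0 = 0
    e = s - x
    r = b - A @ x                              # equals A e here
    alpha = (r @ r) / (r @ (A @ r))
    e_new = s - (x + alpha * r)

    kappa = lam.max() / lam.min()
    print((e_new @ A @ e_new) / (e @ A @ e))   # observed beta^2
    print(((kappa - 1.0) / (kappa + 1.0))**2)  # bound on beta^2

For this particular error the observed factor is well below the bound, which is only approached for unfavourable combinations of the $ \xi_j$.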

