## Franklin's Notes

Let $A\in\mathbb R^{m\times m}$ be a symmetric positive definite matrix, so that $x^\ast Ax>0$ for all $x\in\mathbb R^{m\times 1}\backslash{\vec 0}$. The matrix $A$ induces a matrix norm defined by $\lVert x \rVert_A = \sqrt{x^\ast A x}$, sometimes called the operator norm or the energy norm. The method of conjugate gradients build the Krylov subspaces $\mathcal K_n$ as GMRES and Arnoldi iteration do, but it doesn't minimize the residual at each step, instead minimizing the "energy norm" of the error $\lVert x-x_n\rVert_A$ when computing a solution to the system $Ax=b$.

The CG algorithm is as follows:

1. $x_0=0, r_0=b, p_0=r_0$
2. For $n=1,2,3,\cdots$
1. $\alpha_n = r_{n-1}^\ast r_{n-1}/(p_{n-1}^\ast Ap_{n-1})$
2. $x_n=x_{n-1}+\alpha_n p_{n-1}$
3. $r_n=r_{n-1}-\alpha_n A p_{n-1}$
4. if $\lVert r_n\rVert_2<\text{tolerance}$: halt
5. $\beta_n = r_n^\ast r_n/(r_{n-1}^\ast r_{n-1})$
6. $p_n=r_n+\beta_n p_{n-1}$

Here is a guide of what these steps intuitively represent:

1. Initializes variables
2. Loop...
1. Computes the "step length" $\alpha$
2. Update the solution estimate by stepping length $\alpha$ in the "search direction" $p$
3. Calculates the residual
4. Halts when the residual is sufficiently small
5. Computes a coefficient $\beta$, in order to...
6. ...calculate a new search direction $p$ orthogonal to the previous direction

If $A$ is dense, then one iteration is $\mathcal O(m^2)$, whereas if $A$ is sparse, then one iteration can be as cheap as $\mathcal O(m)$.