### Least squares problem

Consider a linear system $Ax=b$ in which $A\in\mathbb C^{m\times n}$ with $m>n$. In general, $\text{range}(A)\ne\mathbb C^m$, so $b$ may very well not be in the range of $A$. Such a system is called **overdetermined** or **overconstrained**, and in general it has no exact solution. However, we may instead attempt to "solve" it by seeking $x$ that *minimizes* the magnitude of the error $r=b-Ax$, called the **residual**. That is, a **least squares solution** to the system $Ax=b$ is a value of $x$ that minimizes the 2-norm of the residual, $\lVert b-Ax \rVert_2$.

To accomplish this, we must first find the point in $\text{range}(A)$ that is closest to $b$, so that $\lVert b-Ax \rVert_2$ is minimized, where $x$ is the preimage of that point under $A$. If we find an orthogonal projector $P$ onto the range of $A$, then we may instead solve the linear system $Ax=Pb$, which *does* have an exact solution. Since every column of $A$ is in the range of $A$, we have $PA=A$. Therefore, we need $PAx=Pb$, or $P(Ax-b)=-Pr=0$. Thus, $r$ must be orthogonal to the range of $A$, meaning that $A^\ast r = 0$, and therefore $A^\ast Ax = A^\ast b$. These are called the **normal equations**. Since $A^\ast A$ is invertible (assuming $A$ has full rank), we see that $x=(A^\ast A)^{-1}A^\ast b$ is the minimizer, and $Ax=A(A^\ast A)^{-1}A^\ast b$, meaning that the projector is given by $P=A(A^\ast A)^{-1}A^\ast$.
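The derivation above can be sketched numerically. The following is a minimal example (the random matrix, dimensions, and tolerance are illustrative assumptions, not from the text) that solves the normal equations $A^\ast Ax = A^\ast b$ for a real overdetermined system and checks that the residual is orthogonal to the range of $A$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined system: m > n, so b generally lies outside range(A).
m, n = 6, 3
A = rng.standard_normal((m, n))   # full rank with probability 1
b = rng.standard_normal(m)

# Normal equations: A* A x = A* b (A is real here, so A* = A^T).
x = np.linalg.solve(A.T @ A, A.T @ b)

# The residual r = b - Ax must be orthogonal to range(A): A^T r = 0.
r = b - A @ x
assert np.allclose(A.T @ r, 0)

# Sanity check against NumPy's built-in least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x, x_ref)
```

Forming $A^\ast A$ explicitly, as done here, squares the condition number of the problem; in practice a QR or SVD based solver such as `np.linalg.lstsq` is preferred for ill-conditioned $A$.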

This can be used for least-squares data fitting, such as fitting a low-degree polynomial to a large number of data points by solving for the polynomial coefficients using a least-squares system involving a Vandermonde matrix.
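As an illustration of the Vandermonde approach, the sketch below (the particular polynomial, noise level, and sample count are assumptions chosen for the example) fits a quadratic to noisy samples by solving the least-squares system $Vc \approx y$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples of the quadratic y = 1 + 2t - t^2.
t = np.linspace(0.0, 1.0, 50)
y = 1 + 2 * t - t**2 + 0.01 * rng.standard_normal(t.size)

# Vandermonde matrix with columns 1, t, t^2 (increasing powers),
# so the unknowns c are the polynomial coefficients.
V = np.vander(t, 3, increasing=True)

# Least-squares solution of V c = y recovers the coefficients.
c, *_ = np.linalg.lstsq(V, y, rcond=None)
```

With 50 samples and small noise, `c` lands close to the true coefficients $(1, 2, -1)$, even though the system has far more equations than unknowns.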

# linear-algebra

# least-squares

back to home page