Franklin's Notes

Singular value decomposition

The singular value decomposition or SVD of an $m\times n$ matrix $A$ is a factorization of the form where $U,V$ are unitary matrices in $\mathbb{C}^{m\times m}$ and $\mathbb{C}^{n\times n}$ respectively and $\Sigma$ is a diagonal matrix in $\mathbb R^{m\times n}$. Geometrically, the SVD is motivated by the fact that the image of a unit sphere under a matrix transformation is always a hyperellipse. Thus, every matrix transformation can be represented by composing a rotation with a stretching along some orthonormal set of axes. The factors by which these axes are stretched are called the singular values of $A$ and are denoted $\sigma_1,...,\sigma_n$. They are conventionally listed in descending order $\sigma_1\geq ...\geq \sigma_n\geq 0$. The left singular vectors $u_1,...,u_n$ are the direction vectors of the largest principal semiaxes of the image of the unit sphere under $A$, so that $\sigma_i u_i$ is the ith largest principle semiaxis of this hyperellipse. The right singular vectors are the preimages of these semiaxes, so that $Av_i = \sigma u_i$ for each $i$. These are the columns of the matrices $U$ and $V$ respectively, so that

Theorem 1. Every matrix $A\in\mathbb C^{m\times n}$ has an SVD. The singular values $\sigma_i$ are uniquely determined, and if $A$ is a square matrix, then the left and right singular vectors are uniquely determined up to multiplication by a complex unit.

This can be proven by induction on the dimensions of $A$, starting by isolating the largest semiaxis of the image of the unit hypersphere transformed by $A$, and thereby recursively constructing an SVD that depends on the existence of the SVD for a space with one dimension fewer. It happens that if $A$ is a real-valued matrix, it also has an SVD decomposition into real-valued matrices.

The following theorem gives some examples of useful information provided by the SVD:

Theorem 2. For some $A\in\mathbb C^{m\times n}$, let $r$ be the number of nonzero singular values of $A$. Then the following statements hold:
1. The rank of $A$ is equal to $r$.
2. The range of $A$ is $\text{span}{u_1,...,u_r}$ and the null space of $A$ is $\text{null}{v_{r+1},...,v_n}$.
3. If $A$ is square, then $|\det(A)|=\sigma_1...\sigma_m$.
4. $\lVert A\rVert_2 = \sigma_1$ and $\lVert A\rVert_F = \sqrt{\sigma_1^2 + ... + \sigma_r^2}$.
5. The nonzero singular values of $A$ are the square roots of the nonzero eigenvalues of $A^\ast A$ or $AA^\ast$.

Proof. If $x\in\mathbb C^n$, then we may decompose it in the form so that we have Thus, we have that the range of $A$ is spanned by $u_1,...,u_r$. From this we can also see that $Ax=0$ iff $\alpha_1=...=\alpha_r=0$, so that the null space of $A$ is spanned by $v_{r+1},...,v_n$, as claimed. Thus, we have claim $(2)$, from which claim $(1)$ follows.

If $A$ is a square matrix, then $A=U\Sigma V^\ast$ where $U,\Sigma,V$ are also square matrices of the same dimension. Recall that unitary matrices have a unit determinant, meaning that and since $\Sigma$ is diagonal with the singular values along its diagonal, we have as claimed.

The equality $\lVert A\rVert_2=\sigma_1$ follows from the definition of the induced matrix 2-norm , since $\sigma_1$ is defined as the length of the longest principal semiaxis of the image of the unit hypersphere under $A$. Since multiplication by unitary matrices does not alter the Frobenius norm, we have that $\lVert A\rVert_\mathrm{F} = \lVert \Sigma\rVert_\mathrm{F}$. The Frobenius norm of a diagonal matrix equals the square root of the sum of the (nonzero) diagonal entries, so we have $\det A = \sqrt{\sigma_1^2 + ... + \sigma_r^2}$.

If $A=U\Sigma V^\ast$, then $AA^\ast = U\Sigma \Sigma^\ast U^\ast$, which is an eigenvalue decomposition of $AA^\ast$, with eigenvalues given by $\sigma_i^2$ along the diagonal of $\Sigma^2$. Similar reasoning applies for $A^\ast A=V\Sigma^\ast \Sigma V^\ast$, proving claim $(5)$. $\blacksquare$

Exercise 1. (Trefethen and Bau, 5.4) Suppose $A\in \mathbb C^{m\times m}$ has an SVD $A=U\Sigma V^\ast$. Find an eigenvalue decomposition of the $2m\times 2m$ block hermitian matrix

Solution. We may take and so that and as desired.




back to home page