Singular value decomposition.

A powerful mathematical tool that arises from linear algebra is the “dimension reduction” method using the so-called singular value decomposition (SVD). This example is written in a tutorial style.

Matlab/Octave

In this set of exercises you will be using either Matlab or its libre counterpart Octave. It is advised to open a script file (File -> open -> create a script) and write your code there. You can run the code by pressing the “run” button (play icon) and inspect the declared variables in the command window. This document involves some coding, but I hope you will be able to follow the code snippets; you may copy/paste and extend them.

Data in a square matrix

We can store all kinds of information in columns and concatenate columns to form a data matrix. We start by considering an artificial example: A researcher in biomedical engineering analyses data from $n$ patients. It so happens that he/she is interested in $n$ quantities regarding these patients. We illustrate the first patient’s column vector $\mathbf{p}_1$.

$$\mathbf{p}_1 = \begin{pmatrix} \text{Age: 45 years} \\ \text{Length: 180 cm} \\ \text{Aortic diameter: 2.2 cm} \end{pmatrix}$$

For 3 patients, we can place all the data in a matrix $A$; each row implies the meaning and units of one quantity.

$$A = \begin{pmatrix} 45 & 40 & 50 \\ 180 & 172 & 173 \\ 2.2 & 2.3 & 2.1 \end{pmatrix}$$

This matrix may be diagonalizable (why?). We can use Octave/Matlab to quickly find its eigen values and eigen vectors:

% Initialize matrix
A = [45, 40, 50;
     180, 172, 173;
     2.2, 2.3, 2.1];
% Compute eigen vectors and eigen values
[V, l] = eig(A);

1) Using the code below, verify that V and l are the “eigen decomposition” of $A = V l V^{-1}$. Hint: use the inv() function to compute the inverse of a matrix.

% Compute the reconstructed matrix `Ar` using the eigen decomposition
Ar = V*l*inv(V); 
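
A quick way to check this, as a sketch (with A, V, l and Ar from the code above), is to look at the largest absolute difference between the original matrix and its reconstruction:

% Largest absolute difference between A and its reconstruction Ar;
% this should be very close to zero (up to round-off error)
max(max(abs(A - Ar)))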

2) Using the code below, verify that the eigen vectors are normalized

% Let's have a look at the first eigen vector (the first column of V)
eig1 = V(:,1); 
% We can square the components and sum them to get the squared length
% Mind the dot in the element-wise square operator `.^2`. 
leig1 = sum(eig1.^2); 
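
To check all three eigen vectors at once, a minimal sketch (using the matrix V returned by eig above) sums the squared components of every column:

% Squared lengths of all eigen vectors in one go; each entry should be 1
sum(V.^2, 1)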

From inspection of the matrix l, the matrix $A$ has three distinct (real!) eigen values, from large to small:

$$\lambda_1 = 216.85\ldots, \qquad \lambda_2 = 1.55\ldots, \qquad \lambda_3 = 0.689\ldots$$

It appears that $\lambda_1$ is much larger than the others. Imagine that we set all values in the diagonal matrix (l) to zero, except for $\lambda_1$. We can call this a “low rank approximation” of the original matrix.

% Low-rank approximate diagonal matrix
l_lr = zeros(3,3);
l_lr(1,1) = l(1,1);

Instead of using the original decomposition ($A = V l V^{-1}$), we can attempt a low-rank approximation of the data set.

% Low rank approximation of A
A_lr = ...%fill in

Using the rules of matrix multiplication, we can see that the second and third columns of V do not contribute to this product. Similarly, the second and third rows of $V^{-1}$ play no role in this approximation. As such, we can approximate the matrix $A$ using fewer data entries: the approximation requires 3 + 1 + 3 = 7 non-zero entries (one column of V, one eigen value, and one row of $V^{-1}$), instead of the original 9 (3 times 3). This is a “lossy” data compression method, as we are not able to reconstruct the original data matrix exactly.

3) Rewrite the low-rank approximation as a column vector - row vector multiplication.

4) Have a look at the component values in the first eigen vector (associated with $\lambda_1$). What do these numbers tell you about an “Eigen person”?

Notice that we could make a better approximation if we also included the second largest eigen value, and its associated column and row in $V$ and $V^{-1}$, respectively.
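
As a sketch of such a rank-2 approximation (assuming, as the code above does, that the two largest eigen values happen to occupy the first two diagonal entries of l; inspect l to confirm):

% Rank-2 approximate diagonal matrix: keep the two largest eigen values
% (assumes they sit at l(1,1) and l(2,2))
l_lr2 = zeros(3,3);
l_lr2(1,1) = l(1,1);
l_lr2(2,2) = l(2,2);
% Rank-2 approximation of the data matrix
A_lr2 = V*l_lr2*inv(V);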

Data compression in a rectangular matrix

Grey-scale image data consist of a grey-scale value for each pixel in the image. We can import an image and convert it to a grey-scale matrix using the following code in Octave/Matlab. (replace the link with a non-square (moderately sized) image/photo that you like).

% Load an image via a path or the internet
url = ['https://assets.w3.tue.nl/w/fileadmin/_processed_/', ...
       '3/f/csm_Gemini_zuidzijde_396-fr2-a0002_web_0b845bfc46.jpg'];
[Image, b] = imread (url);
% The color Image is now represented as a `m x n x 3` matrix.
% We should average over the 3rd dimension (RGB-values) to obtain a grey-scale image
grey = mean(Image, 3); 
% Inspect the grey-scale image using,
size (grey)
imshow (grey, b);
An example image in grey-scale data, showing an impression of the entrance of a future renovated Gemini building

Since the image is not square, we cannot readily apply the eigen-decomposition method above. However, $A^TA$ and $AA^T$ are square, and these will appear to encode (some of) the data of our grey-scale image.

We will strive to find the matrices $U$, $S$ and $V$ that decompose a matrix $A$ according to

$$A = U S V^{T}.$$

Matrix layout of a singular-value decomposition. (Image by Mercurysheet, CC BY-SA 4.0)

Similar to the eigen-value decomposition, $S$ will be a diagonal matrix. For $A$ an $m \times n$ matrix, the matrix $U$ will be of size $m \times m$, the diagonal matrix $S$ will be of size $m \times n$ and $V$ will be an $n \times n$ matrix. For an SVD ($A = U S V^T$), we dictate that the columns of $U$ and $V$ must be normalized; this ensures that the diagonal entries of $S$ are the “weights” associated with the corresponding columns of $U$ and $V$. Furthermore, the ansatz is that the columns in $U$ and $V$ are orthogonal. An auxiliary theorem (not part of this course) states that for an orthonormal matrix ($N$), its inverse is equal to its transpose:

$$N^{T} = N^{-1}.$$
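
As a quick numerical illustration (a sketch using a small random symmetric matrix, not part of the exercises), the eigen vectors returned by eig() for a symmetric matrix indeed form such an orthonormal matrix:

% Build an arbitrary symmetric 4 x 4 matrix
M = rand(4);
M = (M + M')/2;
% Its eigen vectors are the columns of N
[N, D] = eig(M);
% Check: N'*N should be (numerically close to) the identity matrix
max(max(abs(N'*N - eye(4))))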

Let's derive an expression that relates $A$, $S$ and $U$.

$$A = U S V^T,$$
$$A A^{T} = U S V^T (U S V^T)^T,$$
$$A A^{T} = U S V^T V S^T U^T,$$
$$A A^{T} = U S S^T U^T,$$
$$A A^{T} U = U S S^T.$$

And analogously,

5) Prove that $A^TA V = V S^T S$.

We can see that the columns of $U$ and $V$ are the eigen vectors of $AA^T$ and $A^TA$, respectively, with the eigen values the associated entries of the diagonal square matrices $SS^T$ and $S^TS$, respectively.

Our ansatz that the columns of $U$ and $V$ are orthogonal is hereby justified. This is rooted in an auxiliary theorem (not part of this course) that states that eigen vectors of a symmetric matrix ($M$) belonging to distinct eigen values are orthogonal.

6) Show that $A^TA$ and $AA^T$ are symmetric matrices.

A failing attempt

The next piece of code finds the eigen values and eigen vectors of $A^TA$ and $AA^T$. We can use Octave's eig function.

% Compute the two square matrices
AAT = grey*transpose(grey);
ATA = transpose(grey)*grey;
% Eigen decomposition of both: AA^T pairs with U and SS^T, A^TA with V and S^TS
[U, SST] = eig(AAT);
[V, STS] = eig(ATA);

7) Inspect the sizes of the matrices U, V, STS and SST

8) Inspect the eigenvalues using,

plot (abs(diag(STS)))
hold on
plot (abs(diag(SST)))
set(gca, 'YScale', 'log')
hold off; 

We should ensure that the eigen values appear in descending order and also reorder the eigen vectors accordingly.

% Sort eigen values and vectors
[sst, idx] = sort(diag(SST), "descend");
Us = U(:,idx);
[sts, idx] = sort(diag(STS), "descend");
Vs = V(:,idx);

9) Compare the eigen values sst and sts
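
Note that sst and sts generally have different lengths (one entry per row and one per column of the image, respectively). A minimal sketch for putting their leading entries side by side, assuming sst and sts from the sorting step above:

% Show the largest eigen values of AA^T and A^TA next to each other
n_cmp = min(numel(sst), numel(sts));
[sst(1:n_cmp), sts(1:n_cmp)]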

Now we should compute $S$.

10) Show that the diagonal entries of $S$ (i.e. the singular values) are the square roots of the eigen values stored in sts and/or sst.

We can now compute $S$,

% Compute the singular-value diagonal m x n matrix (S)
% (abs() guards against tiny negative eigen values caused by round-off)
s = sqrt(abs(sst));
sg = size(grey);
% Note: the three-argument form of diag below is Octave-specific
S = diag(s, sg(1), sg(2));
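
If you are using MATLAB, where the three-argument form of diag is not available, an equivalent construction (a sketch, assuming s and sg as defined above) places the singular values on the main diagonal of an all-zero m x n matrix:

% MATLAB-compatible alternative: zero m x n matrix with s on its main diagonal
S = zeros(sg(1), sg(2));
k = min(sg);
S(1:k, 1:k) = diag(s(1:k));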

We are now ready to test our singular value decomposition

11) Inspect the sizes of U, S, V, the singular values, and, using the code below, verify the decomposition.

gr = Us*S*transpose(Vs); 
imshow (gr, []);
It seems our reconstructed image (gr) is not identical to the original (grey). Although it looks similar, something has gone wrong…

A successful decomposition

Indeed, when we discovered that both $AA^{T}U = U S S^T$ and $A^TA V = V S^T S$, we wrongly assumed that these equations ensure that the original decomposition $A = U S V^T$ is automatically satisfied. This would have worked if the singular value decomposition were unique. But there may exist many matrices ($B$) that are similar to $A$ in the sense that $BB^{T} = AA^T$ etc. (for instance, $B = -A$ satisfies both $BB^T = AA^T$ and $B^TB = A^TA$).

Instead, we should consider the more intimate connection between $U$, $V$ and the original matrix ($A$) given by,

$$AV = US.$$

12) Derive this expression.

Given that $S$ is a diagonal matrix, we see that the linear transformation $T(V) = AV$ results in a matrix whose columns are scaled columns of $U$. At this point we can choose to first compute $U$ and $S_U$ using the eig() function (as we already did), and then solve the relation above for the associated $V_U$. Alternatively, we could compute $V$ and $S_V$ using eig() and solve for the associated $U_V$. Both are fine, but it should be clear that if we have $A$, $V$ and $S$, we can readily compute the product $AV$, and since we know the diagonal matrix $S$, the columns of $U_V$ are the columns of $AV$ divided by the corresponding diagonal entries of $S$. Since $S$ is a diagonal matrix, we can apply this scaling operation to each column of $AV$ in one operation in MATLAB/Octave.

% Compute Uv that will decompose the grey-scale image, using Vs and S
Uv = grey*Vs/S;

Note that this Matlab/Octave-specific syntax (the right division /S) should not be confused with proper matrix algebra (you cannot divide by a matrix!).
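
As a more explicit (but equivalent) sketch of what the right division does, assuming the image has at least as many columns as rows (so that every singular value in s has a corresponding column in grey*Vs), each column can be divided by its singular value directly:

% Explicit version of the column-wise scaling performed by /S
GV = grey*Vs;                         % columns of grey*Vs are scaled columns of U
Uv2 = GV(:, 1:numel(s)) ./ s(:)';     % divide column i by s(i) (implicit broadcasting)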

Now we can test the decomposition

13) Inspect the sizes of U, S, V, the singular values, and, using the code below, verify the decomposition.

gr = Uv*S*transpose(Vs);
imshow (gr, []);
Much better…
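
As an optional cross-check (a sketch, not part of the exercises), Octave/Matlab also provides a built-in svd function; its singular values should match the ones we constructed ourselves:

% Built-in singular value decomposition of the grey-scale image
[Ub, Sb, Vb] = svd(grey);
sb = diag(Sb);    % built-in singular values, largest first
% Compare the five largest singular values with our own construction
[sb(1:5), s(1:5)]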

Data compression.

A compressed representation of the matrix A only includes the r most prominent singular vectors. (Image by Mercurysheet, CC BY-SA 4.0)

In a final step, we select only the $r$ most prominent columns of $U$ ($U_r$) and $V$ ($V_r$) that are associated with the largest singular values in $S$ ($S_r$).

14) What are the sizes of the matrices $U_r$, $S_r$, and $V_r$?

We can use some Matlab/Octave coding to shrink the matrices.

% Set a low-rank value
r = 16;
% First r columns of Uv and Vs form Ur and Vr
Ur = Uv(:, 1:r);
Vr = Vs(:, 1:r);
% We should only keep the associated rows and columns of S as well,
% such that we only keep the upper-left r x r entries of S.
Sr = S(1:r, 1:r);
% Approximate the image using only r singular vectors in both Ur and Vr
gs = Ur*Sr*Vr'; 
imshow(gs, []);

15) Try a few different values of $r$ and enjoy your low-rank approximations.

A reconstruction using the 16 most prominent singular vectors.
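
To get a feeling for how much data such an approximation needs, a small sketch (assuming grey and r from above) counts the stored entries of the rank-r approximation and compares them with the full image:

% Entries stored by the full image versus the rank-r approximation
[m, n] = size(grey);
full_entries = m*n;
lowrank_entries = r*m + r + r*n;   % columns of Ur, diagonal of Sr, columns of Vr
compression_ratio = full_entries / lowrank_entries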

16) What type of images would be well approximated with a relatively small number of singular vectors?

Continue to the examples of week 5…
