## Regularization of singular least squares problems

### P. Carrette

Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden

WWW: http://www.control.isy.liu.se
Email: carrette@isy.liu.se

March 11, 1998

**REGLERTEKNIK**
**AUTOMATIC CONTROL**
**LINKÖPING**

Report no.: LiTH-ISY-R-2019. Submitted to BIT journal.

Technical reports from the Automatic Control group in Linköping are available by anonymous ftp at the address ftp.control.isy.liu.se. This report is contained in the compressed postscript file 2019.ps.Z.


Abstract

In this note, we analyze the influence of the regularization procedure applied to singular least squares (LS) problems. It appears that, due to the finite numerical accuracy of computer calculations, the regularization parameter has to belong to a particular range of values in order for the regularized solution to be close to that of the singular LS problem. Surprisingly enough, this range essentially depends on the square root of the computer precision, while the deficiency (or singularity) of the regularized LS problem is governed by this precision itself.

The analysis is based on matrix perturbation theory, for which the paper [12] is the key reference.

Keywords: Matrix perturbation, Tikhonov regularization, Singular value decomposition.

## 1 Introduction

In this contribution, we present results concerning the use of Tikhonov regularization (see [13, 10] and references therein) for solving singular least squares (LS) problems, i.e.

$$ x^0 = \arg\min_x \|Ax - b\|_2 \quad \text{subject to } \min \|x\|_2 \qquad (1) $$

for which the matrix $A \in \mathbb{R}^{N \times n}$ (for $N \ge n$) is (column) rank deficient. The corresponding regularized LS problem is as follows:

$$ \tilde{x}_\mu = \arg\min_x \left( \|Ax - b\|_2^2 + \mu^2 \|x\|_2^2 \right) \qquad (2) $$

for some value of $\mu$.

Here, we intend to investigate the influence of the regularization parameter $\mu$ upon the deviation between the regularized solution $\tilde{x}_\mu$ and the vector $x^0$ (the solution to problem (1)), i.e. $\tilde{x}_\mu - x^0$. The reason for this study is that we want to find a good approximation of $x^0$ without solving the singular LS problem itself (by, e.g., computing the pseudo-inverse [11] of the matrix $A$, i.e. $x^0 = A^\dagger b$).

[Figure 1: Quantities $\|\tilde{x}_\mu - x^0\|_2$ (solid) and $\|\tilde{x}_\mu\|_2$ (dashed) as functions of $\mu$.]

[Figure 2: $L$-curve example with the three points appearing in Figure 1, i.e. $\|\tilde{x}_\mu\|_2$ versus $\|A\tilde{x}_\mu - b\|_2$.]
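As a quick illustration of problems (1) and (2), the following numpy sketch computes the minimum-norm solution $x^0$ through the pseudo-inverse and a regularized solution through the stacked system $[A;\, \mu I]$. The matrix, right-hand side, seed and $\mu$ are hypothetical choices, not the paper's experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: A is 8x4 with column rank 3 (rank deficient).
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 4))
b = rng.standard_normal(8)
n = A.shape[1]

# Minimum-norm solution x0 of the singular LS problem (1).
x0 = np.linalg.pinv(A) @ b

# Tikhonov-regularized solution (2), solved as an ordinary LS problem
# with the stacked matrix [A; mu*I] and right-hand side [b; 0].
mu = 1e-5
A_mu = np.vstack([A, mu * np.eye(n)])
beta = np.concatenate([b, np.zeros(n)])
x_mu, *_ = np.linalg.lstsq(A_mu, beta, rcond=None)

deviation = np.linalg.norm(x_mu - x0)
print(deviation)  # small for a well-chosen mu
```

The stacked formulation is exactly the cost in (2): the extra rows contribute $\mu^2 \|x\|_2^2$ to the squared residual.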

As it appears in the literature (see [1, 5, 7, 4, 14] and the Matlab package presented in [8]), Tikhonov regularization gives us a grip on such an approximation while solving only the ordinary (full column rank) LS problem (2). The available results about Tikhonov regularization generally consider only the links between the two contributions to the cost of the regularized problem, i.e. $\|Ax - b\|_2$ and $\|x\|_2$. Indeed, roughly speaking, the solution to the original problem (1) can be obtained by making these two quantities simultaneously small for an appropriate value of the regularization parameter $\mu$. By this, we have in mind the reasoning that leads to the selection of "best" $\mu$ values by inspecting the $L$-curve, i.e. the representation of these two quantities at the regularized solutions: $\|\tilde{x}_\mu\|_2$ as a function of $\|A\tilde{x}_\mu - b\|_2$ for different values of $\mu$ (see [7] for more details).

Unfortunately, the "corner" of the $L$-curve does not provide $\mu$ values that are robust with respect to the 2-norm of the deviation we are interested in, i.e. $\|\tilde{x}_\mu - x^0\|_2$. Let us illustrate this fact by a simple example: $A$ is a $50 \times 5$ matrix with rank 4 and unit nonzero singular values, while the 50 elements of the column $b$ are samples of a uniformly distributed (in $[0,1]$) random variable. In Figure 1, we have presented the 2-norm of the deviation between the regularized solution $\tilde{x}_\mu$ and the original solution $x^0$, as well as the 2-norm of the regularized solution $\tilde{x}_\mu$, as functions of the parameter $\mu$. The graph of the former deviation 2-norm can be divided into two parts, in agreement with the decreasing and increasing behaviors of this error with $\mu$. Three points have been highlighted on the solid-line curve. In Figure 2, we have displayed the corresponding $L$-curve, i.e. $(\|A\tilde{x}_\mu - b\|_2, \|\tilde{x}_\mu\|_2)$ for different values of $\mu$. The preceding three points are concentrated in the lower-left corner of this curve, so that the curve does not succeed in giving a clear preference to the "o" point that is associated to the "best" approximation of the reference solution $x^0$, i.e. for $\mu \approx 5.6\,10^{-5}$.
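An experiment of the Figure 1 kind can be re-created along the following lines. This is a sketch only: the paper's experiments were run in Matlab, and the seed and $\mu$ grid below are arbitrary, so the numbers differ from the figure:

```python
import numpy as np

rng = np.random.default_rng(1)

# A is 50x5 with rank 4 and unit nonzero singular values.
N, n, r = 50, 5, 4
U, _ = np.linalg.qr(rng.standard_normal((N, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = U @ V.T                        # singular values 1, 1, 1, 1, 0
b = rng.uniform(0.0, 1.0, N)       # uniform samples in [0, 1]

x0 = np.linalg.pinv(A) @ b         # reference minimum-norm solution

errors = {}
for mu in (1e-12, 1e-8, 1e-5, 1e-2):
    A_mu = np.vstack([A, mu * np.eye(n)])
    beta = np.concatenate([b, np.zeros(n)])
    x_mu, *_ = np.linalg.lstsq(A_mu, beta, rcond=None)
    errors[mu] = np.linalg.norm(x_mu - x0)
    print(f"mu = {mu:8.0e}   ||x_mu - x0||_2 = {errors[mu]:.3e}")
```

Sweeping $\mu$ over several decades and recording both $\|\tilde{x}_\mu - x^0\|_2$ and $\|\tilde{x}_\mu\|_2$ reproduces the two curves of Figure 1 and, plotted against the residual norm, the $L$-curve of Figure 2.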

In order to overcome the poor capability of the $L$-curve (as well as of the usual studies of Tikhonov regularization) to end up with a $\mu$ value corresponding to a solution $\tilde{x}_\mu$ close to the reference vector $x^0$, we here provide an analysis of the deviation between these two solutions, i.e. $\tilde{x}_\mu - x^0$, on the basis of matrix perturbation theory. For this purpose, we make intensive use of the paper by Stewart [12], which presents a complete analysis of the perturbation of the singular value decomposition (SVD) of matrices. Note that in [5], Hansen provides expressions for the upper bound on the relative deviation between a perturbed and the unperturbed solution, denoted $x_\mu$, of a LS problem similar to (2), i.e. $\|\tilde{x}_\mu - x_\mu\|_2 / \|x_\mu\|_2$.

As a by-product of our analysis, intervals of regularization parameter values are found for any given admissible accuracy $\delta$ imposed on the regularized solution $\tilde{x}_\mu$, i.e. $\mu \in [\mu_-, \mu_+]$ implying that $\|\tilde{x}_\mu - x^0\|_2 \le \delta$.

The paper is organized as follows. In Section 2, we introduce the notations used in the sequel and discuss how numerical errors should be taken into account in order for the analysis of the regularized solution deviation to explain its observed behavior (as seen in Figure 1). In Section 3, we decompose the deviation between the regularized solution $\tilde{x}_\mu$ and the original $x^0$ into two components, lying either in or out of the kernel of the problem matrix $A$. In Section 4, we give expressions for the singular value decomposition (SVD) of the perturbed version of the matrix $A$ that enters the resolution of the regularized LS problem (2). This is achieved on the basis of results presented in [12]. In Section 5, we end up with a closed-form expression for the 2-norm of the regularized solution error. Its intrinsic characteristics with respect to the regularization parameter $\mu$ are commented on in some detail. Finally, simulation examples are presented in Section 6. They completely agree with the results derived in the preceding sections.

## 2 Notations and numerical error discussion

First, let us introduce notations for the two LS problems, i.e. the minimizations (1) and (2), respectively.

The matrix $A$ has column rank equal to $r < n$, while $b = Ax^0 + e$ where $e \in \ker A^T$ originates from the inconsistency of the LS problem (1). We also define the matrix $\mathcal{A} \in \mathbb{R}^{(N+n) \times n}$ as

$$ \mathcal{A} = \begin{bmatrix} A \\ 0 \end{bmatrix} \quad \text{and} \quad \beta = \begin{bmatrix} b \\ 0 \end{bmatrix}. $$

The SVD of $\mathcal{A}$ is written as

$$ \mathcal{A} = [U_1\; U_2] \begin{bmatrix} S_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^T \\ V_2^T \end{bmatrix} = U_1 S_1 V_1^T \qquad (3) $$

where $S_1 = \mathrm{diag}(s_1, \ldots, s_r)$ with $s_i$ the $i$-th (in decreasing order) nonzero singular value of $A$, while the matrices $[U_1\; U_2]$ and $[V_1\; V_2]$ are orthogonal of dimension $(N+n)$ and $n$, respectively. Then, the orthogonal projector onto the range of $A^T$ is $K_\parallel = V_1 V_1^T$, while its orthogonal counterpart, i.e. the orthogonal projector onto the kernel of $A$, is denoted $K_\perp = I - K_\parallel$ ($= V_2 V_2^T$).
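In code, these projectors come directly out of the SVD. A small numpy sketch (with a hypothetical rank-deficient $A$) checks the defining identities $K_\parallel + K_\perp = I$, idempotence, and $x^0 = K_\parallel x^0$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical rank-deficient matrix: 6x4 with column rank 2.
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
b = rng.standard_normal(6)
n, r = 4, 2

# Right singular vectors of A (identical to those of [A; 0]).
_, _, Vt = np.linalg.svd(A)
V1, V2 = Vt[:r].T, Vt[r:].T

K_par = V1 @ V1.T       # projector onto range(A^T)
K_perp = V2 @ V2.T      # projector onto ker(A)

# Defining identities of the two orthogonal projectors.
assert np.allclose(K_par + K_perp, np.eye(n))
assert np.allclose(K_par @ K_par, K_par)

# The minimum-norm solution x0 lies entirely in range(A^T).
x0 = np.linalg.pinv(A) @ b
assert np.allclose(K_par @ x0, x0)
print("projector identities hold")
```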

With these notations in mind, we can write the solution to the original LS problem as $x^0 = \mathcal{A}^\dagger \beta$ where $\mathcal{A}^\dagger$ is the (uniquely defined) pseudo-inverse of $\mathcal{A}$, i.e. $\mathcal{A}^\dagger = V_1 S_1^{-1} U_1^T$. It is worth mentioning that $x^0 = K_\parallel x^0$, so that $x^0$ lies entirely in the range of $A^T$.

**Remark 1** The "$\backslash$" operator provided in Matlab for solving LS problems, i.e. $A \backslash b$, leads to solutions that generally contain a component in the kernel space of $A$ (when it is non-trivial), i.e. $x_{LS} = x^0 + K_\perp \psi$ with nonzero $\psi \in \mathbb{R}^n$. The reason for this is that the procedure uses a $QR$ decomposition of the matrix $A$ and, in case the $(n-r)$ last rows of $R$ contain negligible elements with regard to the numerical accuracy of the computer, the $(n-r)$ last elements of the associated LS solution are fixed to zero. In fact, this corresponds to choosing $\psi$ so that $K_\perp \psi$ exactly compensates the $(n-r)$ last elements of the reference solution $x^0$ (after permutations, if necessary, in the $QR$ decomposition process).
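The remark can be probed numerically. The sketch below does not call a pivoted-QR solver; instead it builds a kernel-shifted LS solution by hand and shows that the shift is invisible to the residual, while projecting onto the range of $A^T$ recovers $x^0$:

```python
import numpy as np

rng = np.random.default_rng(3)

A = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 4))  # rank 2
b = rng.standard_normal(8)
r = 2

_, _, Vt = np.linalg.svd(A)
V2 = Vt[r:].T                       # orthonormal basis of ker(A)

x0 = np.linalg.pinv(A) @ b          # minimum-norm LS solution
psi = np.array([0.7, -1.3])         # arbitrary nonzero kernel coefficients
x_ls = x0 + V2 @ psi                # another solution of the same LS problem

# Same residual, larger norm: the kernel shift does not change ||Ax - b||.
assert np.allclose(A @ x_ls, A @ x0)
assert np.linalg.norm(x_ls) > np.linalg.norm(x0)

# Projecting any LS solution onto range(A^T) recovers x0.
K_par = Vt[:r].T @ Vt[:r]
assert np.allclose(K_par @ x_ls, x0)
print("kernel component removed by projection")
```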

Now, let us turn to the solution of the regularized LS problem presented in (2). This problem can be expressed in terms of a particular perturbation of the matrix $\mathcal{A}$. Obviously, the second term in the brackets of (2) makes the bottom part of the corresponding matrix $\mathcal{A}$, denoted $\mathcal{A}_\mu$, become an identity matrix scaled by the regularization parameter $\mu$, i.e.

$$ \mathcal{A}_\mu = \begin{bmatrix} A \\ \mu I \end{bmatrix}. $$

The corresponding solution is $x_\mu = \mathcal{A}_\mu^\dagger \beta$ (or $\mathcal{A}_\mu \backslash \beta$). It is unique because $\mathcal{A}_\mu$ has full column rank for $\mu > 0$. From a numerical point of view, we must ask for $\mu \gg \epsilon$ (e.g. $\mu > 10^2 \epsilon$) where $\epsilon$ denotes the accuracy of the computer.

Unfortunately, when we simulate the deviation between this "theoretically" regularized solution $x_\mu$ and the solution $x^0$, it is impossible to explain the deviation generally associated with the "practically" regularized solution (see the solid line in Figure 8 compared to that in Figure 1). This means that our assumption concerning the influence of the regularization upon the matrix $\mathcal{A}$ (leading to $\mathcal{A}_\mu$) is not correct. It can be viewed as a bad model of the regularization effect because it cannot reveal its intrinsic characteristics. It actually appears that additional perturbations must be considered: namely, the numerical errors associated with the computation of the regularized solution. More precisely, we deal with the following regularized matrix

$$ \tilde{\mathcal{A}}_\mu = \mathcal{A}_\mu + \epsilon E \qquad (4) $$

where $\epsilon$ denotes the numerical accuracy, i.e. $\epsilon = 2.2\,10^{-16}$ in Matlab 5.1, and $E \in \mathbb{R}^{(N+n) \times n}$ stands for the (normalized) numerical error matrix (whose structure is detailed below). The SVD of $\tilde{\mathcal{A}}_\mu$ is denoted

$$ \tilde{\mathcal{A}}_\mu = [\tilde{U}_1\; \tilde{U}_2\; \tilde{U}_3] \begin{bmatrix} \tilde{S}_1 & 0 \\ 0 & \tilde{S}_2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \tilde{V}_1^T \\ \tilde{V}_2^T \end{bmatrix} = \tilde{U}_1 \tilde{S}_1 \tilde{V}_1^T + \tilde{U}_2 \tilde{S}_2 \tilde{V}_2^T $$

where $\tilde{S}_1$ is a diagonal matrix containing the $r$ largest singular values of $\tilde{\mathcal{A}}_\mu$ (in increasing order!) and $\tilde{S}_2$ is a diagonal matrix containing its $(n-r)$ last nonzero singular values. The matrices $[\tilde{U}_1\; \tilde{U}_2\; \tilde{U}_3]$ and $[\tilde{V}_1\; \tilde{V}_2]$ are orthogonal of dimension $(N+n)$ and $n$, respectively.

The solution of the corresponding LS problem is written as

$$ \tilde{x}_\mu = \tilde{\mathcal{A}}_\mu^\dagger \beta \quad (\text{or } \tilde{\mathcal{A}}_\mu \backslash \beta) \qquad (5) $$

It is unique in case $\tilde{\mathcal{A}}_\mu$ has full column rank, e.g. for $\mu \gg \epsilon$. The purpose of the paper is then to analyze in detail the deviation between this solution $\tilde{x}_\mu$ and the reference solution $x^0$ defined for the original LS problem.

Finally, let us give a reasonable structure for the numerical error matrix $E$. To this end, we consider the left singular subspace associated to $U_1$ and show how the associated singular values may induce a particular scaling of the elements of this matrix. We can write

$$ \tilde{\mathcal{A}}_\mu^T U_1 = V_1 S_1 + \epsilon E^T U_1 \qquad (6) $$

The $i$-th column of this matrix has a 2-norm that is approximately equal to $s_i$. This means that the related numerical perturbation must take this scale into account, i.e.

$$ (V_1)_i\, s_i + \epsilon\, (E^T U_1)_i = s_i \left[ (V_1)_i + \epsilon (X_1^T)_i \right] $$

for an appropriate matrix $X_1$ whose definition is made regardless of the singular values of $A$. This leads to $E^T U_1 = X_1^T S_1$. For what concerns the $E^T U_2$ counterpart, it can only be said that the largest elements within $E$ will induce large contributions to this matrix product. Hence, we globally propose that the numerical error matrix $E$ can be written as

$$ E = U_1 S_1 X_1 + s_1 U_2 X_2 \qquad (7) $$

for which we point out that the matrices $X_1$ and $X_2$ have normalized elements (independently of the $s_i$'s and of $\mu$).

## 3 Deviation of the regularized solution

The deviation between the regularized solution $\tilde{x}_\mu$ and the reference solution $x^0$ can be decomposed into two parts, i.e.

$$ \tilde{x}_\mu - x^0 = \left[ K_\parallel \tilde{x}_\mu - x^0 \right] + K_\perp \tilde{x}_\mu \qquad (8) $$

The first term on the right-hand side belongs to the range of the transpose of the matrix $A$, while the second lies completely in its kernel (see the definitions of the orthogonal projectors $K_\parallel$ and $K_\perp$, respectively). Because of these orthogonal projectors, each of these contributions is orthogonal to the other, i.e.

$$ \|\tilde{x}_\mu - x^0\|_2^2 = \|K_\parallel \tilde{x}_\mu - x^0\|_2^2 + \|K_\perp \tilde{x}_\mu\|_2^2. $$
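Since the two projected contributions are orthogonal, their squared norms add up exactly; a short numerical check of this Pythagorean split (hypothetical data, $\mu$ chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(4)

A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 5))  # rank 3
b = rng.standard_normal(10)
n, r = 5, 3

_, _, Vt = np.linalg.svd(A)
K_par = Vt[:r].T @ Vt[:r]           # projector onto range(A^T)
K_perp = np.eye(n) - K_par          # projector onto ker(A)

x0 = np.linalg.pinv(A) @ b

mu = 1e-2
x_mu, *_ = np.linalg.lstsq(np.vstack([A, mu * np.eye(n)]),
                           np.concatenate([b, np.zeros(n)]), rcond=None)

# Equation (8): the deviation splits into two orthogonal pieces.
total = np.linalg.norm(x_mu - x0) ** 2
split = (np.linalg.norm(K_par @ x_mu - x0) ** 2
         + np.linalg.norm(K_perp @ x_mu) ** 2)
print(total, split)
```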

In order to analyze these contributions, let us develop the links that exist between the two solution vectors.

From its definition in equation (5), the regularized solution $\tilde{x}_\mu$ can be written as

$$ \tilde{x}_\mu = [\tilde{V}_1\; \tilde{V}_2] \begin{bmatrix} \tilde{S}_1 & 0 \\ 0 & \tilde{S}_2 \end{bmatrix}^{-2} \begin{bmatrix} \tilde{V}_1^T \\ \tilde{V}_2^T \end{bmatrix} \tilde{\mathcal{A}}_\mu^T \beta = \left( \tilde{V}_1 \tilde{S}_1^{-2} \tilde{V}_1^T + \tilde{V}_2 \tilde{S}_2^{-2} \tilde{V}_2^T \right) \left( V_1 S_1^2 V_1^T x^0 + \epsilon E^T \beta \right) $$

where we have used the fact that $\tilde{\mathcal{A}}_\mu^T \beta = \mathcal{A}_\mu^T \beta + \epsilon E^T \beta = V_1 S_1^2 V_1^T x^0 + \epsilon E^T \beta$ with $x^0 = V_1 S_1^{-1} U_1^T \beta$. With the structure we have introduced for the numerical error matrix $E$, we also have that

$$ E^T \beta = (U_1 S_1 X_1 + s_1 U_2 X_2)^T \beta = X_1^T S_1^2 V_1^T x^0 + s_1 X_2^T (U_2^T \beta) $$

where it is worth noting that $\|U_2^T \beta\|_2$ (identical to $\|e\|_2$) is the value of the original LS cost at $x^0$.

Coming back to the two contributions to the deviation of this regularized solution, we can write

$$ K_\parallel \tilde{x}_\mu - x^0 = V_1 \left[ (V_1^T \tilde{V}_1) \tilde{S}_1^{-2} (\tilde{V}_1^T V_1) S_1^2 - I \right] V_1^T x^0 \\ + V_1 (V_1^T \tilde{V}_1) \tilde{S}_1^{-2}\, \epsilon \left[ (\tilde{V}_1^T X_1^T) S_1^2 V_1^T x^0 + s_1 (\tilde{V}_1^T X_2^T)(U_2^T \beta) \right] \\ + V_1 (V_1^T \tilde{V}_2) \tilde{S}_2^{-2} \left[ \left[ (\tilde{V}_2^T V_1) + \epsilon (\tilde{V}_2^T X_1^T) \right] S_1^2 V_1^T x^0 + \epsilon s_1 (\tilde{V}_2^T X_2^T)(U_2^T \beta) \right] \qquad (9) $$

and

$$ K_\perp \tilde{x}_\mu = V_2 (V_2^T \tilde{V}_1) \tilde{S}_1^{-2} \left[ \left[ (\tilde{V}_1^T V_1) + \epsilon (\tilde{V}_1^T X_1^T) \right] S_1^2 V_1^T x^0 + \epsilon s_1 (\tilde{V}_1^T X_2^T)(U_2^T \beta) \right] \\ + V_2 (V_2^T \tilde{V}_2) \tilde{S}_2^{-2} \left[ \left[ (\tilde{V}_2^T V_1) + \epsilon (\tilde{V}_2^T X_1^T) \right] S_1^2 V_1^T x^0 + \epsilon s_1 (\tilde{V}_2^T X_2^T)(U_2^T \beta) \right] \qquad (10) $$

Let us give an interpretation of these two expressions. Because of the regularization of the singular LS problem, the right singular vectors of $A$ rotate a little, leading to $[\tilde{V}_1\; \tilde{V}_2]$, and its two singular value subsets (i.e. $s_i$ for $i = 1, \ldots, r$ as well as the remaining zero singular values) are also slightly altered, giving rise to $\tilde{S}_1$ and $\tilde{S}_2$. The regularized solution $\tilde{x}_\mu$ is naturally expressed in terms of these perturbed singular pairs, i.e. $(\tilde{S}_1, \tilde{V}_1)$ and $(\tilde{S}_2, \tilde{V}_2)$. In other words, two components are found for it according to the related subspaces, for which the orthogonal projectors are $\tilde{K}_\parallel = \tilde{V}_1 \tilde{V}_1^T$ and $\tilde{K}_\perp = \tilde{V}_2 \tilde{V}_2^T$, respectively.

[Figure 3: Schematic representation of the components of the regularized solution $\tilde{x}_\mu$, showing the projectors $\tilde{K}_\parallel$, $\tilde{K}_\perp$, $K_\parallel$, $K_\perp$ and the vectors $\tilde{x}_\mu$ and $x^0$.]

Hence, the above expressions for the contributions to the deviation of the regularized solution exhibit the projection of this solution back onto the subspaces associated to the original matrix $A$, i.e. by use of the orthogonal projectors $K_\parallel$ and $K_\perp$. In Figure 3, we have schematically drawn this back projection for a 2-dimensional case.

It is also worth noticing that the components in the perturbed subspaces are influenced by the inverse of the square of the corresponding singular values, i.e. $\tilde{S}_1^{-2}$ and $\tilde{S}_2^{-2}$, respectively. As the latter will be seen to behave similarly to $\mu$, the corresponding component will show an extreme sensitivity to this regularization parameter, i.e. as $1/\mu^2$.

In order to analyze these expressions, we must evaluate the role of the regularization parameter $\mu$ and of the numerical error matrix $E$ on the following quantities:

$$ V_1^T \tilde{V}_1, \quad V_1^T \tilde{V}_2, \quad V_2^T \tilde{V}_1 \quad \text{and} \quad V_2^T \tilde{V}_2, \quad \text{as well as} \quad \tilde{S}_1 \text{ and } \tilde{S}_2. $$

A simple way to achieve this goal is to use results concerning the SVD of perturbed matrices.
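Before invoking the perturbation results, it is instructive to look at these quantities numerically. For the $\mu$-part of the perturbation alone (the $\epsilon$-term is at roundoff level in this double-precision sketch), $\mathcal{A}_\mu^T \mathcal{A}_\mu = \mathcal{A}^T \mathcal{A} + \mu^2 I$, so the right singular subspaces should barely rotate:

```python
import numpy as np

rng = np.random.default_rng(5)

A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 5))  # rank 3
n, r = 5, 3
mu = 1e-3

_, _, Vt = np.linalg.svd(A)
V1 = Vt[:r].T

A_mu = np.vstack([A, mu * np.eye(n)])
_, _, Vt_mu = np.linalg.svd(A_mu)
V1t, V2t = Vt_mu[:r].T, Vt_mu[r:].T

# |V1^T V1~| is close to the identity (up to sign flips of the singular
# vectors), and the cross product V1^T V2~ is tiny: the right singular
# subspaces of A and of [A; mu*I] essentially coincide.
rot = np.abs(V1.T @ V1t) - np.eye(r)
cross = V1.T @ V2t
print(np.linalg.norm(rot), np.linalg.norm(cross))
```

Only $|V_1^T \tilde{V}_1|$ is compared to the identity here: the kernel-side singular value $\mu$ is multiple, so $\tilde{V}_2$ is determined only up to a rotation within the kernel subspace.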

## 4 SVD of perturbed matrices

In [12], Stewart shows that the right singular vectors of a perturbed matrix, say $\tilde{\mathcal{A}}_\mu$, can be expressed in terms of those of the original matrix, say $\mathcal{A}$, as follows:

$$ [\tilde{V}_1\; \tilde{V}_2] = [V_1\; V_2] \begin{bmatrix} I & -P^T \\ P & I \end{bmatrix} \begin{bmatrix} (I + P^T P)^{-1/2} & 0 \\ 0 & (I + P P^T)^{-1/2} \end{bmatrix} \qquad (11) $$

where $P \in \mathbb{R}^{(n-r) \times r}$ is a matrix satisfying the equation system

$$ \begin{cases} Q (S_1 + \epsilon E_{11}) - (\mu \Gamma_{22} + \epsilon E_{22}) P = (\mu \Gamma_{21} + \epsilon E_{21}) - \epsilon Q E_{12} P \\ P (S_1 + \epsilon E_{11}^T) - (\mu \Gamma_{22}^T + \epsilon E_{22}^T) Q = \epsilon E_{12}^T - P (\mu \Gamma_{21}^T + \epsilon E_{21}^T) Q \end{cases} \qquad (12) $$

for $Q \in \mathbb{R}^{(N+n-r) \times r}$ and $\Gamma_{2j} = U_2^T [0\; I]^T V_j$, while $E_{ij} = U_i^T E V_j$.

Note that $\Gamma_{2j}^T \Gamma_{2j} = I$ and $\Gamma_{21}^T \Gamma_{22} = 0$ because $([0\; I] U_2)([0\; I] U_2)^T = I$ together with the orthogonality of the original right singular vectors, i.e. $[V_1\; V_2]$.

From expression (11), we immediately have that

$$ V_1^T \tilde{V}_1 = (I + P^T P)^{-1/2}, \qquad V_1^T \tilde{V}_2 = -P^T (I + P P^T)^{-1/2}, \\ V_2^T \tilde{V}_1 = P (I + P^T P)^{-1/2}, \qquad V_2^T \tilde{V}_2 = (I + P P^T)^{-1/2}. \qquad (13) $$

He also states that the perturbed singular values belong to disjoint sets, so that

$$ \sigma_i(\tilde{S}_1) = \sigma_i \left[ (I + Q^T Q)^{1/2} \left( S_1 + \epsilon (E_{11} + E_{12} P) \right) (I + P^T P)^{-1/2} \right] \\ \sigma_i(\tilde{S}_2) = \sigma_i \left[ (I + Q Q^T)^{-1/2} \left( \mu \Gamma_{22} + \epsilon (E_{22} - Q E_{12}) \right) (I + P P^T)^{1/2} \right] \qquad (14) $$

where $\sigma_i(X)$ denotes the $i$-th singular value of the matrix $X$ (in increasing order).

After straightforward derivations, we end up with the following result.

**Proposition 1** Under the assumption that $\epsilon \ll \mu \ll s_r$, the solutions for the $P$ and $Q$ matrices of the equation system (12) respectively satisfy

$$ P \approx \epsilon V_2^T X_1^T \quad \text{and} \quad Q \approx \mu \Gamma_{21} S_1^{-1} $$

where the symbol "$\approx$" should be understood in the spectral sense, i.e. $B \approx C$ is equivalent to $\|B - C\|_2 \ll 1$. Furthermore, we have that

$$ V_1^T \tilde{V}_1 \approx I, \quad V_2^T \tilde{V}_2 \approx I \quad \text{and} \quad V_2^T \tilde{V}_1 \approx -[V_1^T \tilde{V}_2]^T \approx \epsilon V_2^T X_1^T $$

as well as $\sigma_i(\tilde{S}_1)/s_i \approx 1 + \mu^2/(2 s_i^2)$ and $\sigma_i(\tilde{S}_2) \approx \mu$ for appropriate $i$.
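The singular value claims of the proposition are easy to probe numerically. For the pure $\mu$-perturbation, the singular values of $[A;\, \mu I]$ are $\sqrt{s_i^2 + \mu^2} \approx s_i (1 + \mu^2 / 2 s_i^2)$ on the range directions and $\mu$ on the kernel directions, matching the stated approximations (sketch with hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(6)

A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 5))  # rank 3
n, r = 5, 3
mu = 1e-4

s = np.linalg.svd(A, compute_uv=False)[:r]   # nonzero singular values of A
s_mu = np.linalg.svd(np.vstack([A, mu * np.eye(n)]), compute_uv=False)

# sigma_i(S1~)/s_i ~ 1 + mu^2 / (2 s_i^2) on the range directions ...
ratio = s_mu[:r] / s
print(ratio - (1 + mu**2 / (2 * s**2)))

# ... and sigma_i(S2~) ~ mu on the kernel directions.
print(s_mu[r:] - mu)
```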

**Proof**: In the case of small perturbations, the last quadratic terms in equations (12) are generally not considered. Thus, we only have to solve

$$ \begin{cases} Q (S_1 + \epsilon E_{11}) - (\mu \Gamma_{22} + \epsilon E_{22}) P = \mu \Gamma_{21} + \epsilon E_{21} \\ P (S_1 + \epsilon E_{11}^T) - (\mu \Gamma_{22}^T + \epsilon E_{22}^T) Q = \epsilon E_{12}^T \end{cases} $$

As the numerical accuracy $\epsilon$ is negligible compared to the diagonal elements of $S_1$, we get

$$ \begin{cases} Q S_1 - (\mu \Gamma_{22} + \epsilon E_{22}) P = \mu \Gamma_{21} + \epsilon E_{21} \\ P = \epsilon E_{12}^T S_1^{-1} + (\mu \Gamma_{22}^T + \epsilon E_{22}^T) Q S_1^{-1} \end{cases} $$

Then, we can develop the first equation as follows:

$$ Q S_1^2 - (\mu \Gamma_{22} + \epsilon E_{22})(\mu \Gamma_{22}^T + \epsilon E_{22}^T) Q = \mu (\Gamma_{21} S_1 + \epsilon \Gamma_{22} E_{12}^T) + \epsilon (E_{21} S_1 + \epsilon E_{22} E_{12}^T) $$

leading to

$$ Q S_1^2 - \mu^2 \Gamma_{22} \Gamma_{22}^T Q \approx (\mu \Gamma_{21} + \epsilon E_{21}) S_1 $$

when considering $\epsilon \ll \mu$. Hence, for $\mu \ll s_r$, we end up with

$$ Q \approx (\mu \Gamma_{21} + \epsilon E_{21}) S_1^{-1} \approx \mu \Gamma_{21} S_1^{-1} $$

as well as

$$ P \approx \epsilon E_{12}^T S_1^{-1} + (\mu \Gamma_{22}^T + \epsilon E_{22}^T)(\mu \Gamma_{21} + \epsilon E_{21}) S_1^{-2} \approx \epsilon E_{12}^T S_1^{-1} $$

where we used the fact that $\Gamma_{22}^T \Gamma_{21} = 0$. So, from the structure of the numerical error matrix $E$ in expression (7), we get that

$$ P \approx \epsilon \left( V_2^T [X_1^T S_1 U_1^T + s_1 X_2^T U_2^T] U_1 \right) S_1^{-1} = \epsilon V_2^T X_1^T $$

which means that the approximation of $P$ is expressed regardless of the singular values of the original matrix $A$.

Finally, we are able to evaluate the expressions (13) and (14), respectively. To this end, we first consider approximations of the square root matrices as follows:

$$ (I + Q^T Q)^{1/2} \approx I + Q^T Q / 2 \approx I + \mu^2 S_1^{-2} / 2 \\ (I + Q Q^T)^{1/2} \approx I + Q Q^T / 2 \approx I + \mu^2 \Gamma_{21} S_1^{-2} \Gamma_{21}^T / 2 $$

as $\Gamma_{21}^T \Gamma_{21} = I$, and

$$ (I + P^T P)^{1/2} \approx I + P^T P / 2 \approx I + \epsilon^2 X_1 (V_2 V_2^T) X_1^T / 2 \approx I \\ (I + P P^T)^{1/2} \approx I + P P^T / 2 \approx I + \epsilon^2 V_2^T (X_1^T X_1) V_2 / 2 \approx I $$

Moreover, we have

$$ (I + Q^T Q)^{1/2} S_1 (I + P^T P)^{-1/2} \approx (I + \mu^2 S_1^{-2} / 2)\, S_1 $$

and

$$ (I + Q Q^T)^{-1/2}\, \mu \Gamma_{22}\, (I + P P^T)^{1/2} \approx \mu (I - \mu^2 \Gamma_{21} S_1^{-2} \Gamma_{21}^T / 2)\, \Gamma_{22} = \mu \Gamma_{22} $$

because $\Gamma_{21}^T \Gamma_{22} = 0$. Hence, from the definition (14), these expressions lead to approximations of the diagonal elements of the perturbed singular value matrices, i.e.

$$ \sigma_i(\tilde{S}_1)/s_i \approx 1 + \mu^2/(2 s_i^2) \quad \text{and} \quad \sigma_i(\tilde{S}_2) \approx \mu $$

as the diagonal elements of $\tilde{S}_1$ are increasingly ordered and $\Gamma_{22}^T \Gamma_{22} = I$.

From this proposition, it is immediately seen that the right singular vectors are robust with respect to the perturbation related to the regularization parameter $\mu$. In fact, this is a well-known result concerning the robustness of the eigenvectors of symmetric matrices additively perturbed by a scaled identity matrix, e.g. $\mu^2 I$.