QR decomposition


In linear algebra, a QR decomposition, also known as a QR factorization or QU factorization, is a decomposition of a matrix A into a product A = QR of an orthonormal matrix Q and an upper triangular matrix R. QR decomposition is often used to solve the linear least squares (LLS) problem and is the basis for a particular eigenvalue algorithm, the QR algorithm.

Cases and definitions

Square matrix

Any real square matrix A may be decomposed as

A=QR,

where Q is an orthogonal matrix (its columns are orthogonal unit vectors, meaning $Q^{\mathsf T} = Q^{-1}$) and R is an upper triangular matrix (also called right triangular matrix). If A is invertible, then the factorization is unique if we require the diagonal elements of R to be positive.

If instead A is a complex square matrix, then there is a decomposition A = QR where Q is a unitary matrix (so the conjugate transpose $Q^{\dagger} = Q^{-1}$).

If A has n linearly independent columns, then the first n columns of Q form an orthonormal basis for the column space of A. More generally, the first k columns of Q form an orthonormal basis for the span of the first k columns of A for any 1 ≤ k ≤ n.[1] The fact that any column k of A only depends on the first k columns of Q corresponds to the triangular form of R.[1]

Rectangular matrix

More generally, we can factor a complex m×n matrix A, with m ≥ n, as the product of an m×m unitary matrix Q and an m×n upper triangular matrix R. As the bottom (m − n) rows of an m×n upper triangular matrix consist entirely of zeroes, it is often useful to partition R, or both R and Q:

$$A = QR = Q \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1,$$

where $R_1$ is an n×n upper triangular matrix, 0 is an (m − n)×n zero matrix, $Q_1$ is m×n, $Q_2$ is m×(m − n), and $Q_1$ and $Q_2$ both have orthonormal columns.

Golub & Van Loan (1996, §5.2) call $Q_1 R_1$ the thin QR factorization of A; Trefethen and Bau call this the reduced QR factorization.[1] If A is of full rank n and we require that the diagonal elements of $R_1$ are positive, then $R_1$ and $Q_1$ are unique, but in general $Q_2$ is not. $R_1$ is then equal to the upper triangular factor of the Cholesky decomposition of $A^{*} A$ (= $A^{\mathsf T} A$ if A is real).
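The difference between the full and the thin factorization can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 4×2 matrix is an arbitrary illustrative choice, and `np.linalg.qr` does not force a positive diagonal, so the comparison with the Cholesky factor is made only up to row signs.

```python
import numpy as np

# Tall 4x2 example matrix (values chosen only for illustration).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
m, n = A.shape

# Full factorization: Q is m x m, R is m x n with zero rows at the bottom.
Q, R = np.linalg.qr(A, mode="complete")

# Thin (reduced) factorization: Q1 is m x n, R1 is n x n.
Q1, R1 = np.linalg.qr(A, mode="reduced")

full_ok = np.allclose(Q @ R, A)
thin_ok = np.allclose(Q1 @ R1, A)
bottom_rows_zero = np.allclose(R[n:, :], 0.0)

# Compare R1 with the upper Cholesky factor of A^T A, after flipping the
# rows of R1 so that its diagonal becomes positive.
L = np.linalg.cholesky(A.T @ A)      # lower triangular, A^T A = L L^T
signs = np.sign(np.diag(R1))
cholesky_ok = np.allclose(signs[:, None] * R1, L.T)

print(Q.shape, R.shape, Q1.shape, R1.shape)
print(full_ok, thin_ok, bottom_rows_zero, cholesky_ok)
```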

QL, RQ and LQ decompositions

Analogously, we can define QL, RQ, and LQ decompositions, with L being a lower triangular matrix.

Computing the QR decomposition

There are several methods for actually computing the QR decomposition, such as the Gram–Schmidt process, Householder transformations, or Givens rotations. Each has a number of advantages and disadvantages.

Using the Gram–Schmidt process

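Consider the Gram–Schmidt process applied to the columns of the full column rank matrix $A = \begin{bmatrix} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{bmatrix}$, with inner product $\langle \mathbf{v}, \mathbf{w} \rangle = \mathbf{v}^{\mathsf T} \mathbf{w}$ (or $\langle \mathbf{v}, \mathbf{w} \rangle = \mathbf{v}^{\dagger} \mathbf{w}$ for the complex case).

Define the projection:

$$\operatorname{proj}_{\mathbf{u}} \mathbf{a} = \frac{\langle \mathbf{u}, \mathbf{a} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle} \mathbf{u}$$

then:

$$\begin{aligned}
\mathbf{u}_1 &= \mathbf{a}_1, & \mathbf{e}_1 &= \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|} \\
\mathbf{u}_2 &= \mathbf{a}_2 - \operatorname{proj}_{\mathbf{u}_1} \mathbf{a}_2, & \mathbf{e}_2 &= \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} \\
\mathbf{u}_3 &= \mathbf{a}_3 - \operatorname{proj}_{\mathbf{u}_1} \mathbf{a}_3 - \operatorname{proj}_{\mathbf{u}_2} \mathbf{a}_3, & \mathbf{e}_3 &= \frac{\mathbf{u}_3}{\|\mathbf{u}_3\|} \\
&\;\;\vdots & &\;\;\vdots \\
\mathbf{u}_k &= \mathbf{a}_k - \sum_{j=1}^{k-1} \operatorname{proj}_{\mathbf{u}_j} \mathbf{a}_k, & \mathbf{e}_k &= \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|}
\end{aligned}$$

We can now express the $\mathbf{a}_i$s over our newly computed orthonormal basis:

$$\begin{aligned}
\mathbf{a}_1 &= \langle \mathbf{e}_1, \mathbf{a}_1 \rangle \mathbf{e}_1 \\
\mathbf{a}_2 &= \langle \mathbf{e}_1, \mathbf{a}_2 \rangle \mathbf{e}_1 + \langle \mathbf{e}_2, \mathbf{a}_2 \rangle \mathbf{e}_2 \\
\mathbf{a}_3 &= \langle \mathbf{e}_1, \mathbf{a}_3 \rangle \mathbf{e}_1 + \langle \mathbf{e}_2, \mathbf{a}_3 \rangle \mathbf{e}_2 + \langle \mathbf{e}_3, \mathbf{a}_3 \rangle \mathbf{e}_3 \\
&\;\;\vdots \\
\mathbf{a}_k &= \sum_{j=1}^{k} \langle \mathbf{e}_j, \mathbf{a}_k \rangle \mathbf{e}_j
\end{aligned}$$

where $\langle \mathbf{e}_i, \mathbf{a}_i \rangle = \|\mathbf{u}_i\|$. This can be written in matrix form:

$$A = QR$$

where:

$$Q = \begin{bmatrix} \mathbf{e}_1 & \cdots & \mathbf{e}_n \end{bmatrix}$$

and

$$R = \begin{bmatrix}
\langle \mathbf{e}_1, \mathbf{a}_1 \rangle & \langle \mathbf{e}_1, \mathbf{a}_2 \rangle & \langle \mathbf{e}_1, \mathbf{a}_3 \rangle & \cdots & \langle \mathbf{e}_1, \mathbf{a}_n \rangle \\
0 & \langle \mathbf{e}_2, \mathbf{a}_2 \rangle & \langle \mathbf{e}_2, \mathbf{a}_3 \rangle & \cdots & \langle \mathbf{e}_2, \mathbf{a}_n \rangle \\
0 & 0 & \langle \mathbf{e}_3, \mathbf{a}_3 \rangle & \cdots & \langle \mathbf{e}_3, \mathbf{a}_n \rangle \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \langle \mathbf{e}_n, \mathbf{a}_n \rangle
\end{bmatrix}.$$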

Example

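Consider the decomposition of

$$A = \begin{bmatrix} 12 & -51 & 4 \\ 6 & 167 & -68 \\ -4 & 24 & -41 \end{bmatrix}.$$

Recall that an orthonormal matrix Q has the property $Q^{\mathsf T} Q = I$.

Then, we can calculate Q by means of Gram–Schmidt as follows:

$$U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 \end{bmatrix} = \begin{bmatrix} 12 & -69 & -58/5 \\ 6 & 158 & 6/5 \\ -4 & 30 & -33 \end{bmatrix};$$

$$Q = \begin{bmatrix} \dfrac{\mathbf{u}_1}{\|\mathbf{u}_1\|} & \dfrac{\mathbf{u}_2}{\|\mathbf{u}_2\|} & \dfrac{\mathbf{u}_3}{\|\mathbf{u}_3\|} \end{bmatrix} = \begin{bmatrix} 6/7 & -69/175 & -58/175 \\ 3/7 & 158/175 & 6/175 \\ -2/7 & 6/35 & -33/35 \end{bmatrix}.$$

Thus, we have

$$Q^{\mathsf T} A = Q^{\mathsf T} Q R = R; \qquad R = Q^{\mathsf T} A = \begin{bmatrix} 14 & 21 & -14 \\ 0 & 175 & -70 \\ 0 & 0 & 35 \end{bmatrix}.$$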
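The worked example can be reproduced with a short routine. This is a minimal sketch of classical Gram–Schmidt, assuming NumPy and a real matrix with full column rank; the function name `gram_schmidt_qr` is a hypothetical helper, not a library routine.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR of a real matrix with full column rank."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        u = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]   # <e_j, a_k>
            u -= R[j, k] * Q[:, j]        # subtract the projection onto e_j
        R[k, k] = np.linalg.norm(u)       # <e_k, a_k> = ||u_k||
        Q[:, k] = u / R[k, k]
    return Q, R

A = np.array([[12.0, -51.0, 4.0],
              [6.0, 167.0, -68.0],
              [-4.0, 24.0, -41.0]])
Q, R = gram_schmidt_qr(A)
print(np.round(R, 10))   # upper triangular factor of the example
print(np.round(Q * 175)) # Q scaled by 175 to show the fractions' numerators
```

Because the diagonal of R is built from norms, this routine returns the factorization with positive diagonal entries, which is the unique one for an invertible A.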

Relation to RQ decomposition

The RQ decomposition transforms a matrix A into the product of an upper triangular matrix R (also known as right-triangular) and an orthogonal matrix Q. The only difference from QR decomposition is the order of these matrices.

QR decomposition is Gram–Schmidt orthogonalization of columns of A, started from the first column.

RQ decomposition is Gram–Schmidt orthogonalization of rows of A, started from the last row.
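One way to see the relation is to obtain an RQ decomposition from an ordinary QR routine by reversing the row order and transposing. The sketch below assumes NumPy and a real square matrix; `rq_via_qr` is a hypothetical helper name, and the random test matrix is only illustrative.

```python
import numpy as np

def rq_via_qr(A):
    """RQ decomposition (A = R Q) of a real square matrix via np.linalg.qr."""
    A = np.asarray(A, dtype=float)
    # Reverse the row order, transpose, and take an ordinary QR.
    Q1, R1 = np.linalg.qr(A[::-1, :].T)
    # Undo the reversal: reversing rows and columns of the lower triangular
    # R1^T gives an upper triangular R; reversing rows of Q1^T keeps it orthogonal.
    R = R1.T[::-1, ::-1]
    Q = Q1.T[::-1, :]
    return R, Q

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
R, Q = rq_via_qr(A)
print(np.allclose(R @ Q, A), np.allclose(np.tril(R, -1), 0.0))
```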

Advantages and disadvantages

The Gram–Schmidt process is inherently numerically unstable. While the application of the projections has an appealing geometric analogy to orthogonalization, the orthogonalization itself is prone to numerical error: in floating-point arithmetic the computed columns of Q can lose orthogonality, especially when A is ill-conditioned. A significant advantage is the ease of implementation.
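The loss of orthogonality can be observed on an ill-conditioned matrix. The sketch below assumes NumPy and uses a 10×10 Hilbert matrix as an illustrative test case (it is not taken from the article); `classical_gram_schmidt` is a hypothetical helper, and `np.linalg.qr` stands in for a Householder-based routine, since it calls LAPACK.

```python
import numpy as np

def classical_gram_schmidt(A):
    """Return Q from classical Gram-Schmidt applied to the columns of A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[1]
    Q = np.zeros_like(A)
    for k in range(n):
        # Subtract the projections of a_k onto all previous e_j at once.
        u = A[:, k] - Q[:, :k] @ (Q[:, :k].T @ A[:, k])
        Q[:, k] = u / np.linalg.norm(u)
    return Q

n = 10
i = np.arange(n)
H = 1.0 / (i[:, None] + i[None, :] + 1.0)   # Hilbert matrix, badly conditioned

Q_gs = classical_gram_schmidt(H)
Q_hh, _ = np.linalg.qr(H)

# Departure from orthogonality, measured as ||Q^T Q - I||.
err_gs = np.linalg.norm(Q_gs.T @ Q_gs - np.eye(n))
err_hh = np.linalg.norm(Q_hh.T @ Q_hh - np.eye(n))
print(f"Gram-Schmidt: {err_gs:.2e}   LAPACK QR: {err_hh:.2e}")
```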

Using Householder reflections

Householder reflection for QR-decomposition: The goal is to find a linear transformation that changes the vector $\mathbf{x}$ into a vector of the same length which is collinear to $\mathbf{e}_1$. We could use an orthogonal projection (Gram–Schmidt) but this will be numerically unstable if the vectors $\mathbf{x}$ and $\mathbf{e}_1$ are close to orthogonal. Instead, the Householder reflection reflects through the dotted line (chosen to bisect the angle between $\mathbf{x}$ and $\mathbf{e}_1$). The maximum angle with this transform is 45 degrees.

A Householder reflection (or Householder transformation) is a transformation that takes a vector and reflects it about some plane or hyperplane. We can use this operation to calculate the QR factorization of an m-by-n matrix A with m ≥ n.

Q can be used to reflect a vector in such a way that all coordinates but one disappear.

Let $\mathbf{x}$ be an arbitrary real m-dimensional column vector of A such that $\|\mathbf{x}\| = |\alpha|$ for a scalar α. If the algorithm is implemented using floating-point arithmetic, then α should get the opposite sign to the k-th coordinate of $\mathbf{x}$, where $x_k$ is to be the pivot coordinate after which all entries are 0 in matrix A's final upper triangular form. This avoids loss of significance: if α had the same sign as $x_k$ and $\mathbf{x}$ were almost collinear with $\mathbf{e}_1$, then $\|\mathbf{u}\|$ would become "small" and $\mathbf{u}/\|\mathbf{u}\|$ would be numerically unstable; the extreme case is $\|\mathbf{u}\| = 0$, which causes the division to result in NaN.

In the complex case, set[2]

$$\alpha = -e^{i \arg x_k} \|\mathbf{x}\|$$

and substitute transposition by conjugate transposition in the construction of Q below.

Then, where $\mathbf{e}_1$ is the vector $[1\ 0\ \cdots\ 0]^{\mathsf T}$, ‖·‖ is the Euclidean norm and I is an m×m identity matrix, set

$$\mathbf{u} = \mathbf{x} - \alpha \mathbf{e}_1, \qquad \mathbf{v} = \frac{\mathbf{u}}{\|\mathbf{u}\|}, \qquad Q = I - 2 \mathbf{v} \mathbf{v}^{\mathsf T}.$$

Or, if A is complex

$$Q = I - 2 \mathbf{v} \mathbf{v}^{\dagger}.$$

Q is an m-by-m Householder matrix, which is both symmetric and orthogonal (Hermitian and unitary in the complex case), and

$$Q \mathbf{x} = \begin{bmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$

This can be used to gradually transform an m-by-n matrix A to upper triangular form. First, we multiply A with the Householder matrix $Q_1$ we obtain when we choose the first matrix column for $\mathbf{x}$. This results in a matrix $Q_1 A$ with zeros in the left column (except for the first row).

$$Q_1 A = \begin{bmatrix} \alpha_1 & \star & \cdots & \star \\ 0 & & & \\ \vdots & & A' & \\ 0 & & & \end{bmatrix}$$

This can be repeated for $A'$ (obtained from $Q_1 A$ by deleting the first row and first column), resulting in a Householder matrix $Q'_2$. Note that $Q'_2$ is smaller than $Q_1$. Since we want it really to operate on $Q_1 A$ instead of $A'$ we need to expand it to the upper left, filling in a 1, or in general:

$$Q_k = \begin{bmatrix} I_{k-1} & 0 \\ 0 & Q_k' \end{bmatrix}.$$

After t iterations of this process, $t = \min(m-1, n)$,

$$R = Q_t \cdots Q_2 Q_1 A$$

is an upper triangular matrix. So, with

$$Q^{\mathsf T} = Q_t \cdots Q_2 Q_1, \qquad Q = Q_1^{\mathsf T} Q_2^{\mathsf T} \cdots Q_t^{\mathsf T}$$

A = QR is a QR decomposition of A.

This method has greater numerical stability than the Gram–Schmidt method above.
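The iteration described above can be sketched as follows for the real case, assuming NumPy. The function name `householder_qr` is hypothetical; the sketch chooses α with the sign opposite to the pivot entry and applies each reflection without forming the m×m Householder matrix explicitly.

```python
import numpy as np

def householder_qr(A):
    """Full QR of a real m x n matrix (m >= n) by Householder reflections."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue                      # nothing to eliminate in this column
        # alpha gets the sign opposite to the pivot entry to avoid cancellation.
        alpha = -norm_x if x[0] >= 0 else norm_x
        u = x.copy()
        u[0] -= alpha                     # u = x - alpha * e_1
        v = u / np.linalg.norm(u)
        # Apply Q_k = I - 2 v v^T to the trailing rows of R ...
        R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
        # ... and accumulate Q = Q_1 Q_2 ... Q_k (each Q_k is symmetric).
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, np.triu(R)                  # triu removes rounding residue

A = np.array([[12.0, -51.0, 4.0],
              [6.0, 167.0, -68.0],
              [-4.0, 24.0, -41.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))
```

With this sign convention the diagonal of R may be negative, so the factors can differ from a hand computation by the signs of matching columns of Q and rows of R.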

In numerical tests the computed factors $Q_c$ and $R_c$ satisfy $\frac{\|QR - Q_c R_c\|_\infty}{\|A\|_\infty} = O(\varepsilon)$ at machine precision. Also, orthogonality is preserved: $\|Q_c^{\mathsf T} Q_c - I\|_\infty = O(\varepsilon)$. However, the accuracy of $Q_c$ and $R_c$ decreases with condition number: $\|Q - Q_c\|_\infty = O(\varepsilon\,\kappa_\infty(A))$, $\frac{\|R - R_c\|_\infty}{\|R\|_\infty} = O(\varepsilon\,\kappa_\infty(A))$.

For a well-conditioned example ($n = 4000$, $\kappa_\infty(A) \approx 3 \times 10^{3}$): $\frac{\|QR - Q_c R_c\|_\infty}{\|A\|_\infty} \approx 1.6 \times 10^{-15}$, $\|Q - Q_c\|_\infty \approx 1.6 \times 10^{-15}$, $\frac{\|R - R_c\|_\infty}{\|R\|_\infty} \approx 4.3 \times 10^{-14}$, $\|Q_c^{\mathsf T} Q_c - I\|_\infty \approx 1.1 \times 10^{-13}$.

In an ill-conditioned test ($n = 4000$, $\kappa_\infty(A) \approx 4 \times 10^{18}$): $\frac{\|QR - Q_c R_c\|_\infty}{\|A\|_\infty} \approx 1.3 \times 10^{-15}$, $\|Q - Q_c\|_\infty \approx 5.2 \times 10^{-4}$, $\frac{\|R - R_c\|_\infty}{\|R\|_\infty} \approx 1.2 \times 10^{-4}$, $\|Q_c^{\mathsf T} Q_c - I\|_\infty \approx 1.1 \times 10^{-13}$.[3]

The following table gives the number of operations in the k-th step of the QR-decomposition by the Householder transformation, assuming a square matrix with size n.

Operation         Number of operations in the k-th step
Multiplications   $2(n-k+1)^2$
Additions         $(n-k+1)^2 + (n-k+1)(n-k) + 2$
Division          $1$
Square root       $1$

Summing these numbers over the n − 1 steps (for a square matrix of size n), the complexity of the algorithm (in terms of floating point multiplications) is given by

$$\frac{2}{3} n^3 + n^2 + \frac{1}{3} n - 2 = O(n^3).$$
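The multiplication count can be checked by summing the per-step figure from the table over k = 1, …, n − 1 and substituting j = n − k + 1, using the standard formula for a sum of squares:

$$\sum_{k=1}^{n-1} 2(n-k+1)^2 = 2 \sum_{j=2}^{n} j^2 = 2 \left( \frac{n(n+1)(2n+1)}{6} - 1 \right) = \frac{2}{3} n^3 + n^2 + \frac{1}{3} n - 2.$$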

Example

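Let us calculate the decomposition of

$$A = \begin{bmatrix} 12 & -51 & 4 \\ 6 & 167 & -68 \\ -4 & 24 & -41 \end{bmatrix}.$$

First, we need to find a reflection that transforms the first column of matrix A, vector $\mathbf{a}_1 = \begin{bmatrix} 12 & 6 & -4 \end{bmatrix}^{\mathsf T}$, into $\|\mathbf{a}_1\| \mathbf{e}_1 = \begin{bmatrix} \alpha & 0 & 0 \end{bmatrix}^{\mathsf T}$.

Now,

$$\mathbf{u} = \mathbf{x} - \alpha \mathbf{e}_1,$$

and

$$\mathbf{v} = \frac{\mathbf{u}}{\|\mathbf{u}\|}.$$

Here,

$\alpha = 14$ and $\mathbf{x} = \mathbf{a}_1 = \begin{bmatrix} 12 & 6 & -4 \end{bmatrix}^{\mathsf T}$

Therefore

$\mathbf{u} = \begin{bmatrix} -2 & 6 & -4 \end{bmatrix}^{\mathsf T} = 2 \begin{bmatrix} -1 & 3 & -2 \end{bmatrix}^{\mathsf T}$ and $\mathbf{v} = \frac{1}{\sqrt{14}} \begin{bmatrix} -1 & 3 & -2 \end{bmatrix}^{\mathsf T}$, and then

$$Q_1 = I - \frac{2}{\sqrt{14}\sqrt{14}} \begin{bmatrix} -1 \\ 3 \\ -2 \end{bmatrix} \begin{bmatrix} -1 & 3 & -2 \end{bmatrix} = I - \frac{1}{7} \begin{bmatrix} 1 & -3 & 2 \\ -3 & 9 & -6 \\ 2 & -6 & 4 \end{bmatrix} = \begin{bmatrix} 6/7 & 3/7 & -2/7 \\ 3/7 & -2/7 & 6/7 \\ -2/7 & 6/7 & 3/7 \end{bmatrix}.$$

Now observe:

$$Q_1 A = \begin{bmatrix} 14 & 21 & -14 \\ 0 & -49 & -14 \\ 0 & 168 & -77 \end{bmatrix},$$

so we already have almost a triangular matrix. We only need to zero the (3, 2) entry.

Take the (1, 1) minor, and then apply the process again to

$$A' = M_{11} = \begin{bmatrix} -49 & -14 \\ 168 & -77 \end{bmatrix}.$$

By the same method as above, we obtain the matrix of the Householder transformation

$$Q_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -7/25 & 24/25 \\ 0 & 24/25 & 7/25 \end{bmatrix}$$

after performing a direct sum with 1 to make sure the next step in the process works properly.

Now, we find

$$Q = Q_1^{\mathsf T} Q_2^{\mathsf T} = \begin{bmatrix} 6/7 & -69/175 & 58/175 \\ 3/7 & 158/175 & -6/175 \\ -2/7 & 6/35 & 33/35 \end{bmatrix}.$$

Or, to four decimal digits,

$$Q = Q_1^{\mathsf T} Q_2^{\mathsf T} = \begin{bmatrix} 0.8571 & -0.3943 & 0.3314 \\ 0.4286 & 0.9029 & -0.0343 \\ -0.2857 & 0.1714 & 0.9429 \end{bmatrix}$$

$$R = Q_2 Q_1 A = Q^{\mathsf T} A = \begin{bmatrix} 14 & 21 & -14 \\ 0 & 175 & -70 \\ 0 & 0 & -35 \end{bmatrix}.$$

The matrix Q is orthogonal and R is upper triangular, so A = QR is the required QR decomposition.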

Advantages and disadvantages

The use of Householder transformations is inherently the simplest of the numerically stable QR decomposition algorithms, because reflections are the mechanism for producing zeroes in the R matrix. However, the Householder reflection algorithm is bandwidth heavy and difficult to parallelize, as every reflection that produces a new zero element changes the entirety of both Q and R matrices.

Parallel implementation of Householder QR

The Householder QR method can be implemented in parallel with algorithms such as the TSQR algorithm (which stands for Tall Skinny QR). This algorithm can be applied when the matrix A has m ≫ n.[4] It uses a binary reduction tree to compute a local Householder QR decomposition at each node in the forward pass, and reconstitutes the Q matrix in the backward pass. The binary tree structure aims at decreasing the amount of communication between processors to increase performance.
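The forward and backward passes can be illustrated with a simplified, serial sketch. It assumes NumPy and, for brevity, uses a single combine step over all row blocks (a flat reduction) rather than the binary tree described above; `tsqr` and `num_blocks` are hypothetical names, and each block must have at least n rows.

```python
import numpy as np

def tsqr(A, num_blocks=4):
    """Reduced QR of a tall, skinny real matrix with one flat reduction step."""
    A = np.asarray(A, dtype=float)
    n = A.shape[1]
    blocks = np.array_split(A, num_blocks, axis=0)
    # Forward pass: independent local QR factorizations of the row blocks.
    local = [np.linalg.qr(B) for B in blocks]
    # Combine: QR of the stacked n x n triangular factors.
    R_stack = np.vstack([R_i for _, R_i in local])
    Q_top, R = np.linalg.qr(R_stack)
    # Backward pass: rebuild Q block by block from the local Q factors.
    Q = np.vstack([Q_i @ Q_top[i * n:(i + 1) * n, :]
                   for i, (Q_i, _) in enumerate(local)])
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 5))
Q, R = tsqr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(5)))
```

In a distributed setting only the small n×n triangular factors would need to be communicated between processors, which is the source of the reduced communication.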

Using Givens rotations

QR decompositions can also be computed with a series of Givens rotations. Each rotation zeroes an element in the subdiagonal of the matrix, forming the R matrix. The concatenation of all the Givens rotations forms the orthogonal Q matrix.

In practice, Givens rotations are not actually performed by building a whole matrix and doing a matrix multiplication. A Givens rotation procedure is used instead which does the equivalent of the sparse Givens matrix multiplication, without the extra work of handling the sparse elements. The Givens rotation procedure is useful in situations where only relatively few off-diagonal elements need to be zeroed, and is more easily parallelized than Householder transformations.

Example

Let us calculate the decomposition of

$$A = \begin{bmatrix} 12 & -51 & 4 \\ 6 & 167 & -68 \\ -4 & 24 & -41 \end{bmatrix}.$$

First, we need to form a rotation matrix that will zero the lowermost left element, $a_{31} = -4$. We form this matrix using the Givens rotation method, and call the matrix $G_1$. We will first rotate the vector $\begin{bmatrix} 12 & -4 \end{bmatrix}$ to point along the X axis. This vector has an angle $\theta = \arctan\left(\frac{-(-4)}{12}\right)$. We create the orthogonal Givens rotation matrix, $G_1$:

$$G_1 = \begin{bmatrix} \cos(\theta) & 0 & -\sin(\theta) \\ 0 & 1 & 0 \\ \sin(\theta) & 0 & \cos(\theta) \end{bmatrix} \approx \begin{bmatrix} 0.94868 & 0 & -0.31622 \\ 0 & 1 & 0 \\ 0.31622 & 0 & 0.94868 \end{bmatrix}$$

And the result of $G_1 A$ now has a zero in the $a_{31}$ element.

$$G_1 A \approx \begin{bmatrix} 12.64911 & -55.97231 & 16.76007 \\ 6 & 167 & -68 \\ 0 & 6.64078 & -37.6311 \end{bmatrix}$$

We can similarly form Givens matrices $G_2$ and $G_3$, which will zero the sub-diagonal elements $a_{21}$ and $a_{32}$, forming a triangular matrix R. The orthogonal matrix $Q^{\mathsf T}$ is formed from the product of all the Givens matrices $Q^{\mathsf T} = G_3 G_2 G_1$. Thus, we have $G_3 G_2 G_1 A = Q^{\mathsf T} A = R$, and the QR decomposition is A = QR.
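The same elimination order (first $a_{31}$, then $a_{21}$, then $a_{32}$) can be sketched in code. This assumes NumPy and a real matrix; `givens_qr` is a hypothetical helper that updates only the two affected rows for each rotation instead of building a full rotation matrix.

```python
import numpy as np

def givens_qr(A):
    """Full QR of a real m x n matrix by Givens rotations."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    for j in range(n):
        # Zero the entries below the diagonal of column j, bottom row first.
        for i in range(m - 1, j, -1):
            a, b = R[j, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])   # maps (a, b) to (r, 0)
            R[[j, i], :] = G @ R[[j, i], :]   # only rows j and i change
            Q[:, [j, i]] = Q[:, [j, i]] @ G.T # keep the product Q R equal to A
    return Q, np.triu(R)                      # triu removes rounding residue

A = np.array([[12.0, -51.0, 4.0],
              [6.0, 167.0, -68.0],
              [-4.0, 24.0, -41.0]])
Q, R = givens_qr(A)
print(np.round(R, 10))
```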

Advantages and disadvantages

The QR decomposition via Givens rotations is the most involved to implement, as the ordering of the rows required to fully exploit the algorithm is not trivial to determine. However, it has a significant advantage in that each new zero element $a_{ij}$ affects only the row with the element to be zeroed (i) and a row above (j). This makes the Givens rotation algorithm more bandwidth efficient and parallelizable than the Householder reflection technique.

Connection to a determinant or a product of eigenvalues

We can use QR decomposition to find the determinant of a square matrix. Suppose a matrix is decomposed as A = QR. Then we have $\det A = \det Q \det R$.

Q can be chosen such that $\det Q = 1$. Thus, $\det A = \det R = \prod_i r_{ii}$

where the $r_{ii}$ are the entries on the diagonal of R. Furthermore, because the determinant equals the product of the eigenvalues, we have $\prod_i r_{ii} = \prod_i \lambda_i$

where the $\lambda_i$ are eigenvalues of A.

We can extend the above properties to a non-square complex matrix A by introducing the definition of QR decomposition for non-square complex matrices and replacing eigenvalues with singular values.

Start with a QR decomposition for a non-square matrix A:

$$A = Q \begin{bmatrix} R \\ 0 \end{bmatrix}, \qquad Q^{\dagger} Q = I$$

where 0 denotes the zero matrix and Q is a unitary matrix.

From the properties of the singular value decomposition (SVD) and the determinant of a matrix, we have

$$\left| \prod_i r_{ii} \right| = \prod_i \sigma_i,$$

where the $\sigma_i$ are the singular values of A.

Note that the singular values of A and R are identical, although their complex eigenvalues may be different. However, if A is square, then

$$\prod_i \sigma_i = \left| \prod_i \lambda_i \right|.$$

It follows that the QR decomposition can be used to efficiently calculate the product of the eigenvalues or singular values of a matrix.
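These identities can be checked on the 3×3 matrix used in the examples above. The sketch assumes NumPy; since `np.linalg.qr` does not guarantee that det Q = 1, only absolute values are compared.

```python
import numpy as np

A = np.array([[12.0, -51.0, 4.0],
              [6.0, 167.0, -68.0],
              [-4.0, 24.0, -41.0]])

Q, R = np.linalg.qr(A)

# |product of the diagonal of R| equals |det A| ...
abs_det_from_qr = abs(np.prod(np.diag(R)))
abs_det_direct = abs(np.linalg.det(A))

# ... and also the product of singular values and |product of eigenvalues|.
prod_singular = np.prod(np.linalg.svd(A, compute_uv=False))
abs_prod_eigen = abs(np.prod(np.linalg.eigvals(A)))

print(abs_det_from_qr, abs_det_direct, prod_singular, abs_prod_eigen)
```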

Column pivoting

Pivoted QR differs from ordinary Gram–Schmidt in that it takes the largest remaining column at the beginning of each new step (column pivoting)[5] and thus introduces a permutation matrix P:

$$AP = QR \quad \iff \quad A = QRP^{\mathsf T}$$

Column pivoting is useful when A is (nearly) rank deficient, or is suspected of being so. It can also improve numerical accuracy. P is usually chosen so that the diagonal elements of R are non-increasing: $|r_{11}| \geq |r_{22}| \geq \cdots \geq |r_{nn}|$. This can be used to find the (numerical) rank of A at lower computational cost than a singular value decomposition, forming the basis of so-called rank-revealing QR algorithms.
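A minimal sketch of column pivoting, assuming NumPy, is given below. It uses modified Gram–Schmidt with a largest-remaining-norm pivot purely for illustration; `pivoted_qr` is a hypothetical helper, production libraries typically use Householder reflections instead, and the test matrix (whose third column is the sum of the first two) is an illustrative choice.

```python
import numpy as np

def pivoted_qr(A):
    """Column-pivoted reduced QR, A[:, perm] = Q R, via modified Gram-Schmidt."""
    W = np.array(A, dtype=float)          # working copy; columns get deflated
    m, n = W.shape
    r = min(m, n)
    Q = np.zeros((m, r))
    R = np.zeros((r, n))
    perm = np.arange(n)
    for k in range(r):
        # Pivot: move the remaining column with the largest norm to position k.
        p = k + np.argmax(np.linalg.norm(W[:, k:], axis=0))
        W[:, [k, p]] = W[:, [p, k]]
        R[:, [k, p]] = R[:, [p, k]]
        perm[[k, p]] = perm[[p, k]]
        R[k, k] = np.linalg.norm(W[:, k])
        if R[k, k] == 0.0:
            break                          # all remaining columns are zero
        Q[:, k] = W[:, k] / R[k, k]
        R[k, k + 1:] = Q[:, k] @ W[:, k + 1:]
        W[:, k + 1:] -= np.outer(Q[:, k], R[k, k + 1:])
    return Q, R, perm

# Rank-2 example: the third column is the sum of the first two.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0],
              [1.0, 0.0, 1.0]])
Q, R, perm = pivoted_qr(A)
d = np.abs(np.diag(R))
print(perm, d)   # the last diagonal entry is negligible, revealing rank 2
```

The permutation is returned as an index vector, so `A[:, perm]` plays the role of AP.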

Using for solution to linear inverse problems

Compared to the direct matrix inverse, inverse solutions using QR decomposition are more numerically stable as evidenced by their reduced condition numbers.[6]

To solve the underdetermined (m < n) linear problem $A\mathbf{x} = \mathbf{b}$ where the matrix A has dimensions m×n and rank m, first find the QR factorization of the transpose of A: $A^{\mathsf T} = QR$, where Q is an orthogonal matrix (i.e. $Q^{\mathsf T} = Q^{-1}$), and R has a special form: $R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$. Here $R_1$ is a square m×m right triangular matrix, and the zero matrix has dimension (n − m)×m. After some algebra, it can be shown that a solution to the inverse problem can be expressed as: $\mathbf{x} = Q \begin{bmatrix} (R_1^{\mathsf T})^{-1} \mathbf{b} \\ 0 \end{bmatrix}$ where one may either find $R_1^{-1}$ by Gaussian elimination or compute $(R_1^{\mathsf T})^{-1} \mathbf{b}$ directly by forward substitution. The latter technique is more accurate numerically and requires fewer computations.
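The underdetermined case can be sketched as follows, assuming NumPy. The 2×3 system is an arbitrary illustrative example and `forward_substitution` is a hypothetical helper written out to mirror the text; the result is compared against the pseudoinverse solution, which it should match because this construction yields the minimum-norm solution.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for a lower triangular matrix L."""
    y = np.zeros(len(b))
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # m = 2 < n = 3, rank 2
b = np.array([7.0, 8.0])
m = A.shape[0]

# QR of the transpose: Q is n x n, R is n x m with R1 in its top m rows.
Q, R = np.linalg.qr(A.T, mode="complete")
R1 = R[:m, :]

y = forward_substitution(R1.T, b)      # y = (R1^T)^{-1} b
x = Q[:, :m] @ y                       # same as Q @ [y; 0]

print(np.allclose(A @ x, b), np.allclose(x, np.linalg.pinv(A) @ b))
```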

To find a solution $\hat{\mathbf{x}}$ to the overdetermined (m ≥ n) problem $A\mathbf{x} = \mathbf{b}$ which minimizes the norm $\|A\hat{\mathbf{x}} - \mathbf{b}\|$, first find the QR factorization of A: A = QR. The solution can then be expressed as $\hat{\mathbf{x}} = R_1^{-1} (Q_1^{\mathsf T} \mathbf{b})$, where $Q_1$ is an m×n matrix containing the first n columns of the full orthonormal basis Q and where $R_1$ is as before. As in the underdetermined case, back substitution can be used to quickly and accurately find this $\hat{\mathbf{x}}$ without explicitly inverting $R_1$. ($Q_1$ and $R_1$ are often provided by numerical libraries as an "economic" QR decomposition.)
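A minimal least-squares sketch, assuming NumPy, is shown below. The data (a straight-line fit to four points) are illustrative, `back_substitution` is a hypothetical helper, and the result is checked against `np.linalg.lstsq`.

```python
import numpy as np

def back_substitution(U, c):
    """Solve U x = c for an upper triangular matrix U."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Fit y = c0 + c1 * t to four points (m = 4 equations, n = 2 unknowns).
t = np.array([0.0, 1.0, 2.0, 3.0])
A = np.column_stack([np.ones_like(t), t])
b = np.array([1.0, 3.0, 5.0, 8.0])

Q1, R1 = np.linalg.qr(A)                   # reduced ("economic") QR
x_hat = back_substitution(R1, Q1.T @ b)    # x_hat = R1^{-1} (Q1^T b)

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(x_hat, np.allclose(x_hat, x_ref))
```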

Generalizations

Iwasawa decomposition generalizes QR decomposition to semi-simple Lie groups.
