Schur complement

From Wikipedia, the free encyclopedia

The Schur complement is a key tool in the fields of linear algebra, the theory of matrices, numerical analysis, and statistics.

It is defined for a block matrix. Suppose p, q are nonnegative integers such that p + q > 0, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
so that M is a (p + q) × (p + q) matrix.

If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by
$$M/D := A - BD^{-1}C.$$
If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by
$$M/A := D - CA^{-1}B.$$
In the case that A or D is singular, substituting a generalized inverse for the inverses in M/D and M/A yields the generalized Schur complement.
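For concreteness, both Schur complements can be formed directly from the blocks. The following is a minimal sketch assuming NumPy, with an arbitrary well-conditioned test matrix (not from the article):

```python
import numpy as np

# Hypothetical 2x2 blocking of a 4x4 matrix (p = q = 2); the diagonal
# shift keeps the blocks A and D safely invertible for this illustration.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 4 * np.eye(4)
A, B = M[:2, :2], M[:2, 2:]
C, D = M[2:, :2], M[2:, 2:]

# Schur complement of D in M: M/D = A - B D^{-1} C  (a p x p matrix)
M_over_D = A - B @ np.linalg.solve(D, C)
# Schur complement of A in M: M/A = D - C A^{-1} B  (a q x q matrix)
M_over_A = D - C @ np.linalg.solve(A, B)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the usual numerically preferable way to evaluate the products $D^{-1}C$ and $A^{-1}B$.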

The Schur complement is named after Issai Schur,[1] who used it to prove Schur's lemma, although it had been used previously.[2] Emilie Virginia Haynsworth was the first to call it the Schur complement.[3] The Schur complement is sometimes referred to as the Feshbach map, after the physicist Herman Feshbach.[4]

Background


The Schur complement arises when performing a block Gaussian elimination on the matrix M. To eliminate the elements below the block diagonal, one multiplies the matrix M on the right by a block lower triangular matrix, as follows:
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \quad \to \quad \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix},$$
where $I_p$ denotes a p × p identity matrix. As a result, the Schur complement $M/D = A - BD^{-1}C$ appears in the upper-left p × p block.

Continuing the elimination process beyond this point (i.e., performing a block Gauss–Jordan elimination),
$$\begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix} \quad \to \quad \begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix},$$
leads to an LDU decomposition of M, which reads
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix} \begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix} \begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}.$$
Thus, the inverse of M may be expressed in terms of $D^{-1}$ and the inverse of the Schur complement, assuming it exists, as
$$\begin{aligned} M^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} &= \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix} \begin{bmatrix} (A - BD^{-1}C)^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix} \begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} \\ &= \begin{bmatrix} (M/D)^{-1} & -(M/D)^{-1}BD^{-1} \\ -D^{-1}C(M/D)^{-1} & D^{-1} + D^{-1}C(M/D)^{-1}BD^{-1} \end{bmatrix}. \end{aligned}$$
The above relationship comes from the elimination operations that involve $D^{-1}$ and M/D. An equivalent derivation can be done with the roles of A and D interchanged. By equating the expressions for $M^{-1}$ obtained in these two different ways, one can establish the matrix inversion lemma, which relates the two Schur complements of M: M/D and M/A (see "Derivation from LDU decomposition" in Woodbury matrix identity § Alternative proofs).
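The block-inverse expression above can be checked numerically. The following is a sketch assuming NumPy, with an arbitrary well-conditioned test matrix; it assembles $M^{-1}$ from $D^{-1}$ and $(M/D)^{-1}$:

```python
import numpy as np

# Arbitrary well-conditioned test matrix, blocked with p = 2, q = 3.
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5)) + 5 * np.eye(5)
p = 2
A, B = M[:p, :p], M[:p, p:]
C, D = M[p:, :p], M[p:, p:]

Dinv = np.linalg.inv(D)
S = A - B @ Dinv @ C          # Schur complement M/D
Sinv = np.linalg.inv(S)

# Assemble M^{-1} from D^{-1} and (M/D)^{-1} per the block formula.
Minv = np.block([
    [Sinv,              -Sinv @ B @ Dinv],
    [-Dinv @ C @ Sinv,   Dinv + Dinv @ C @ Sinv @ B @ Dinv],
])
```

This is also the basis of practical block-elimination solvers: only the smaller matrices D and M/D are ever inverted.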

Properties

  • If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
$$M^{-1} = \frac{1}{AD - BC} \begin{bmatrix} D & -B \\ -C & A \end{bmatrix},$$
provided that AD − BC is non-zero.
  • In general, if A is invertible, then
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ CA^{-1} & I_q \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{bmatrix} \begin{bmatrix} I_p & A^{-1}B \\ 0 & I_q \end{bmatrix},$$
$$M^{-1} = \begin{bmatrix} A^{-1} + A^{-1}B(M/A)^{-1}CA^{-1} & -A^{-1}B(M/A)^{-1} \\ -(M/A)^{-1}CA^{-1} & (M/A)^{-1} \end{bmatrix},$$
whenever this inverse exists.
  • (Schur's formula) When A, respectively D, is invertible, the determinant of M is also clearly seen to be given by
$$\det(M) = \det(A)\det(D - CA^{-1}B),$$ respectively
$$\det(M) = \det(D)\det(A - BD^{-1}C),$$
which generalizes the determinant formula for 2 × 2 matrices.
  • (Guttman rank additivity formula) If D is invertible, then the rank of M is given by
$$\operatorname{rank}(M) = \operatorname{rank}(D) + \operatorname{rank}(A - BD^{-1}C).$$
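Both Schur's determinant formula and the Guttman rank additivity formula can be exercised numerically. A sketch assuming NumPy, with an arbitrary test matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6)) + 6 * np.eye(6)  # arbitrary test matrix
p = 3
A, B = M[:p, :p], M[:p, p:]
C, D = M[p:, :p], M[p:, p:]
S = A - B @ np.linalg.solve(D, C)                # Schur complement M/D

# Schur's determinant formula: det(M) = det(D) det(A - B D^{-1} C)
lhs, rhs = np.linalg.det(M), np.linalg.det(D) * np.linalg.det(S)

# Guttman rank additivity: rank(M) = rank(D) + rank(M/D)
rank = np.linalg.matrix_rank
rank_identity = rank(M) == rank(D) + rank(S)
```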

Application to solving linear equations


The Schur complement arises naturally in solving a system of linear equations such as[7]

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix}.$$

Assuming that the submatrix A is invertible, we can eliminate x from the equations, as follows.

$$x = A^{-1}(u - By).$$

Substituting this expression into the second equation yields

$$(D - CA^{-1}B)\,y = v - CA^{-1}u.$$

We refer to this as the reduced equation obtained by eliminating x from the original equation. The matrix appearing in the reduced equation is called the Schur complement of the first block A in M:

$$S \overset{\mathrm{def}}{=} D - CA^{-1}B.$$

Solving the reduced equation, we obtain

$$y = S^{-1}(v - CA^{-1}u).$$

Substituting this into the first equation yields

$$x = \left(A^{-1} + A^{-1}BS^{-1}CA^{-1}\right)u - A^{-1}BS^{-1}v.$$

We can express the above two equations as:

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}.$$

Therefore, a formulation for the inverse of a block matrix is:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{bmatrix} = \begin{bmatrix} I_p & -A^{-1}B \\ 0 & I_q \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\ 0 & S^{-1} \end{bmatrix} \begin{bmatrix} I_p & 0 \\ -CA^{-1} & I_q \end{bmatrix}.$$

In particular, we see that the (2,2) block of $M^{-1}$ is $S^{-1}$; that is, the Schur complement is the inverse of the (2,2) block of the inverse of M.

In practice, one needs A to be well-conditioned in order for this algorithm to be numerically accurate.

This method is useful in electrical engineering to reduce the dimension of a network's equations. It is especially useful when one or more elements of the right-hand-side vector are zero. For example, when u or v is zero, we can eliminate the associated rows of the coefficient matrix without any changes to the rest of the output vector. If v is zero, then the above equation for x reduces to $x = (A^{-1} + A^{-1}BS^{-1}CA^{-1})u$, thus reducing the dimension of the coefficient matrix while leaving u unmodified. This is used to advantage in electrical engineering, where it is referred to as node elimination or Kron reduction.
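The elimination procedure above translates directly into code. The following sketch (assuming NumPy, with an arbitrary well-conditioned test system) solves the block system via the reduced equation:

```python
import numpy as np

# Arbitrary well-conditioned test system with p = 3, q = 3.
rng = np.random.default_rng(3)
n, p = 6, 3
M = rng.standard_normal((n, n)) + n * np.eye(n)
A, B = M[:p, :p], M[:p, p:]
C, D = M[p:, :p], M[p:, p:]
u, v = rng.standard_normal(p), rng.standard_normal(n - p)

# Reduced equation: (D - C A^{-1} B) y = v - C A^{-1} u
S = D - C @ np.linalg.solve(A, B)
y = np.linalg.solve(S, v - C @ np.linalg.solve(A, u))
# Back-substitute into the first block row: x = A^{-1}(u - B y)
x = np.linalg.solve(A, u - B @ y)
```

Only systems of size p and q are solved, never the full (p + q)-sized one, which is the point of the reduction.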

Applications to probability theory and statistics


Suppose the random column vectors X, Y live in $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, and the vector (X, Y) in $\mathbb{R}^{n+m}$ has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

$$\Sigma = \begin{bmatrix} A & B \\ B^{T} & C \end{bmatrix},$$

where $A \in \mathbb{R}^{n \times n}$ is the covariance matrix of X, $C \in \mathbb{R}^{m \times m}$ is the covariance matrix of Y, and $B \in \mathbb{R}^{n \times m}$ is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in Σ, and the conditional mean is obtained from the same blocks:[8]

$$\operatorname{Cov}(X \mid Y) = A - BC^{-1}B^{T},$$
$$\operatorname{E}(X \mid Y) = \operatorname{E}(X) + BC^{-1}\left(Y - \operatorname{E}(Y)\right).$$
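As an illustration, the conditional moments can be computed directly from the blocks of Σ. The numbers below are hypothetical, and NumPy is assumed:

```python
import numpy as np

# Hypothetical covariance blocks for (X, Y) with n = m = 2.
A = np.array([[2.0, 0.3], [0.3, 1.5]])   # Cov(X)
C = np.array([[1.0, 0.2], [0.2, 1.0]])   # Cov(Y)
B = np.array([[0.4, 0.1], [0.0, 0.2]])   # cross-covariance Cov(X, Y)
mu_x, mu_y = np.zeros(2), np.zeros(2)
y_obs = np.array([0.5, -1.0])            # an observed value of Y

# Schur complement of C in Sigma gives the conditional covariance,
# and the same blocks give the conditional mean.
cond_cov = A - B @ np.linalg.solve(C, B.T)
cond_mean = mu_x + B @ np.linalg.solve(C, y_obs - mu_y)
```

Note that the conditional covariance does not depend on the observed value of Y, a well-known property of the multivariate normal distribution.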

If we take the matrix Σ above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in Σ also has a Wishart distribution.[citation needed]

Conditions for positive definiteness and semi-definiteness


Let X be a symmetric matrix of real numbers given by
$$X = \begin{bmatrix} A & B \\ B^{T} & C \end{bmatrix}.$$
Then by the Haynsworth inertia additivity formula, we find:

  • If A is invertible, then X is positive definite if and only if A and its complement X/A are both positive definite:[2]: 34 
$$X \succ 0 \iff A \succ 0 \text{ and } X/A = C - B^{T}A^{-1}B \succ 0.$$
  • If C is invertible, then X is positive definite if and only if C and its complement X/C are both positive definite:
$$X \succ 0 \iff C \succ 0 \text{ and } X/C = A - BC^{-1}B^{T} \succ 0.$$
  • If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:[2]: 34 
$$\text{If } A \succ 0, \text{ then } X \succeq 0 \iff X/A = C - B^{T}A^{-1}B \succeq 0.$$
  • If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
$$\text{If } C \succ 0, \text{ then } X \succeq 0 \iff X/C = A - BC^{-1}B^{T} \succeq 0.$$

The first and third statements can also be derived[7] by considering the minimizer of the quantity $u^{T}Au + 2v^{T}B^{T}u + v^{T}Cv$ as a function of v (for fixed u).

Furthermore, since
$$\begin{bmatrix} A & B \\ B^{T} & C \end{bmatrix} \succ 0 \iff \begin{bmatrix} C & B^{T} \\ B & A \end{bmatrix} \succ 0,$$
and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third) statement.
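These equivalences can be checked numerically. The following sketch (hypothetical numbers, assuming NumPy) tests the first statement by comparing the eigenvalue signs of X against those of A and X/A:

```python
import numpy as np

def is_pd(M):
    # Positive definiteness of a symmetric matrix via its eigenvalues.
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# Hypothetical symmetric blocks.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[0.5], [0.2]])
C = np.array([[2.0]])
X = np.block([[A, B], [B.T, C]])

# First statement: X > 0 iff A > 0 and X/A = C - B^T A^{-1} B > 0.
X_over_A = C - B.T @ np.linalg.solve(A, B)
agrees = is_pd(X) == (is_pd(A) and is_pd(X_over_A))
```

This Schur-complement test is the basis of the block-wise definiteness checks used, for example, in semidefinite programming.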

There is also a necessary and sufficient condition for the positive semi-definiteness of X in terms of a generalized Schur complement.[2] Precisely,

  • $X \succeq 0 \iff A \succeq 0,\; C - B^{T}A^{g}B \succeq 0,\; (I - AA^{g})B = 0$, and
  • $X \succeq 0 \iff C \succeq 0,\; A - BC^{g}B^{T} \succeq 0,\; (I - CC^{g})B^{T} = 0$,

where $A^{g}$ denotes a generalized inverse of A.
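A minimal numerical sketch of the generalized criterion, assuming NumPy and using the Moore–Penrose pseudoinverse as the generalized inverse (the example matrices are hypothetical):

```python
import numpy as np

# Hypothetical rank-deficient PSD example: A is singular, so the
# Moore-Penrose pseudoinverse serves as the generalized inverse A^g.
A = np.array([[1.0, 1.0], [1.0, 1.0]])   # PSD, rank 1
B = np.array([[1.0], [1.0]])             # chosen so (I - A A^g) B = 0
C = np.array([[2.0]])
X = np.block([[A, B], [B.T, C]])

Ag = np.linalg.pinv(A)
psd = lambda M: bool(np.all(np.linalg.eigvalsh(M) >= -1e-12))

# Generalized criterion: A >= 0, C - B^T A^g B >= 0, (I - A A^g) B = 0.
criterion = (psd(A)
             and psd(C - B.T @ Ag @ B)
             and np.allclose((np.eye(2) - A @ Ag) @ B, 0))
```

The range condition $(I - AA^{g})B = 0$ is what replaces invertibility of A: it requires the columns of B to lie in the column space of A.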


References

  1. ^ (citation unavailable)
  2. ^ a b c d (citation unavailable)
  3. ^ Haynsworth, E. V., "On the Schur Complement", Basel Mathematical Notes, #BNB 20, 17 pages, June 1968.
  4. ^ (citation unavailable)
  5. ^ (citation unavailable)
  6. ^ (citation unavailable)
  7. ^ a b Boyd, S. and Vandenberghe, L. (2004), Convex Optimization, Cambridge University Press (Appendix A.5.5).
  8. ^ (citation unavailable)