Definite matrix

From Wikipedia, the free encyclopedia

In mathematics, a symmetric matrix $M$ with real entries is positive-definite if the real number $\mathbf{x}^\mathsf{T} M \mathbf{x}$ is positive for every nonzero real column vector $\mathbf{x}$, where $\mathbf{x}^\mathsf{T}$ is the row vector transpose of $\mathbf{x}$.[1] More generally, a Hermitian matrix (that is, a complex matrix equal to its conjugate transpose) is positive-definite if the real number $\mathbf{z}^* M \mathbf{z}$ is positive for every nonzero complex column vector $\mathbf{z}$, where $\mathbf{z}^*$ denotes the conjugate transpose of $\mathbf{z}$.

Positive semi-definite matrices are defined similarly, except that the scalars $\mathbf{x}^\mathsf{T} M \mathbf{x}$ and $\mathbf{z}^* M \mathbf{z}$ are required to be positive or zero (that is, nonnegative). Negative-definite and negative semi-definite matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called indefinite.

Some authors use more general definitions of definiteness, permitting the matrices to be non-symmetric or non-Hermitian. The properties of these generalized definite matrices are explored in § Extension for non-Hermitian square matrices, below, but are not the main focus of this article.

Definitions

In the following definitions, $\mathbf{x}^\mathsf{T}$ is the transpose of $\mathbf{x}$, $\mathbf{z}^*$ is the conjugate transpose of $\mathbf{z}$, and $\mathbf{0}$ denotes the $n$-dimensional zero vector.

Definitions for real matrices

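An $n \times n$ symmetric real matrix $M$ is said to be positive-definite if $\mathbf{x}^\mathsf{T} M \mathbf{x} > 0$ for all non-zero $\mathbf{x}$ in $\mathbb{R}^n$. Formally,

$M \text{ positive-definite} \iff \mathbf{x}^\mathsf{T} M \mathbf{x} > 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$

An $n \times n$ symmetric real matrix $M$ is said to be positive-semidefinite or non-negative-definite if $\mathbf{x}^\mathsf{T} M \mathbf{x} \geq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n$. Formally,

$M \text{ positive semi-definite} \iff \mathbf{x}^\mathsf{T} M \mathbf{x} \geq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n$

An $n \times n$ symmetric real matrix $M$ is said to be negative-definite if $\mathbf{x}^\mathsf{T} M \mathbf{x} < 0$ for all non-zero $\mathbf{x}$ in $\mathbb{R}^n$. Formally,

$M \text{ negative-definite} \iff \mathbf{x}^\mathsf{T} M \mathbf{x} < 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$

An $n \times n$ symmetric real matrix $M$ is said to be negative-semidefinite or non-positive-definite if $\mathbf{x}^\mathsf{T} M \mathbf{x} \leq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n$. Formally,

$M \text{ negative semi-definite} \iff \mathbf{x}^\mathsf{T} M \mathbf{x} \leq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n$

An $n \times n$ symmetric real matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.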

Definitions for complex matrices

The following definitions all involve the term $\mathbf{z}^* M \mathbf{z}$. Notice that this is always a real number for any Hermitian square matrix $M$.

An $n \times n$ Hermitian complex matrix $M$ is said to be positive-definite if $\mathbf{z}^* M \mathbf{z} > 0$ for all non-zero $\mathbf{z}$ in $\mathbb{C}^n$. Formally,

$M \text{ positive-definite} \iff \mathbf{z}^* M \mathbf{z} > 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}$

An $n \times n$ Hermitian complex matrix $M$ is said to be positive semi-definite or non-negative-definite if $\mathbf{z}^* M \mathbf{z} \geq 0$ for all $\mathbf{z}$ in $\mathbb{C}^n$. Formally,

$M \text{ positive semi-definite} \iff \mathbf{z}^* M \mathbf{z} \geq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative-definite if $\mathbf{z}^* M \mathbf{z} < 0$ for all non-zero $\mathbf{z}$ in $\mathbb{C}^n$. Formally,

$M \text{ negative-definite} \iff \mathbf{z}^* M \mathbf{z} < 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative semi-definite or non-positive-definite if $\mathbf{z}^* M \mathbf{z} \leq 0$ for all $\mathbf{z}$ in $\mathbb{C}^n$. Formally,

$M \text{ negative semi-definite} \iff \mathbf{z}^* M \mathbf{z} \leq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n$

An $n \times n$ Hermitian complex matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.

Consistency between real and complex definitions

Since every real matrix is also a complex matrix, the definitions of "definiteness" for the two classes must agree.

For complex matrices, the most common definition says that $M$ is positive-definite if and only if $\mathbf{z}^* M \mathbf{z}$ is real and positive for every non-zero complex column vector $\mathbf{z}$. This condition implies that $M$ is Hermitian (i.e. its transpose is equal to its conjugate): since $\mathbf{z}^* M \mathbf{z}$ is real, it equals its conjugate transpose $\mathbf{z}^* M^* \mathbf{z}$ for every $\mathbf{z}$, which implies $M = M^*$.

By this definition, a positive-definite real matrix $M$ is Hermitian, hence symmetric; and $\mathbf{z}^\mathsf{T} M \mathbf{z}$ is positive for all non-zero real column vectors $\mathbf{z}$. However, the last condition alone is not sufficient for $M$ to be positive-definite. For example, if $M = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$,

then for any real vector $\mathbf{z}$ with entries $a$ and $b$ we have $\mathbf{z}^\mathsf{T} M \mathbf{z} = (a+b)a + (-a+b)b = a^2 + b^2$, which is always positive if $\mathbf{z}$ is not zero. However, if $\mathbf{z}$ is the complex vector with entries $1$ and $i$, one gets

$\mathbf{z}^* M \mathbf{z} = \begin{bmatrix} 1 & -i \end{bmatrix} M \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} 1+i & 1-i \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = 2 + 2i,$

which is not real. Therefore, $M$ is not positive-definite.
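The counterexample above can be checked numerically. The following is a minimal sketch in plain Python; the helper name `hermitian_form` is an assumption introduced for illustration and is not part of the article.

```python
# Hypothetical helper (not from the article): evaluate z* M z for small
# matrices stored as nested lists.
def hermitian_form(M, z):
    """Return z* M z, where z* is the conjugate transpose of z."""
    n = len(z)
    Mz = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
    return sum(z[i].conjugate() * Mz[i] for i in range(n))

# The non-symmetric real matrix from the text.
M = [[1, 1], [-1, 1]]

# For a real vector (a, b) the form equals a^2 + b^2.
real_value = hermitian_form(M, [3.0, -2.0])

# For the complex vector (1, i) the form is not real.
complex_value = hermitian_form(M, [1, 1j])

print(real_value, complex_value)
```

Here `real_value` is the sum of squares of the entries, while `complex_value` has a nonzero imaginary part, which is why the matrix fails the complex definition.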

On the other hand, for a symmetric real matrix $M$, the condition "$\mathbf{z}^\mathsf{T} M \mathbf{z} > 0$ for all nonzero real vectors $\mathbf{z}$" does imply that $M$ is positive-definite in the complex sense.

Notation

If a Hermitian matrix $M$ is positive semi-definite, one sometimes writes $M \succeq 0$, and if $M$ is positive-definite one writes $M \succ 0$. To denote that $M$ is negative semi-definite one writes $M \preceq 0$, and to denote that $M$ is negative-definite one writes $M \prec 0$.

The notion comes from functional analysis, where positive semidefinite matrices define positive operators. If two matrices $A$ and $B$ satisfy $B - A \succeq 0$, we can define a non-strict partial order $B \succeq A$ that is reflexive, antisymmetric, and transitive. It is not a total order, however, as $B - A$ may, in general, be indefinite.

A common alternative notation is $M \geq 0$, $M > 0$, $M \leq 0$, and $M < 0$ for positive semi-definite, positive-definite, negative semi-definite, and negative-definite matrices, respectively. This may be confusing, as nonnegative matrices (respectively, nonpositive matrices) are sometimes also denoted in this way.

Ramifications


It follows from the above definitions that a Hermitian matrix is positive-definite if and only if it is the matrix of a positive-definite quadratic form or Hermitian form. In other words, a Hermitian matrix is positive-definite if and only if it defines an inner product.

Positive-definite and positive-semidefinite matrices can be characterized in many ways, which may explain the importance of the concept in various parts of mathematics. A Hermitian matrix $M$ is positive-definite if and only if it satisfies any of the following equivalent conditions, each of which is discussed in the sections below.

  • All of its eigenvalues are positive.
  • All of its leading principal minors are positive.
  • There exists an invertible matrix $B$ such that $M = B^* B$.

A matrix is positive semi-definite if it satisfies similar equivalent conditions where "positive" is replaced by "nonnegative", "invertible matrix" is replaced by "matrix", and the word "leading" is removed.

Positive-definite and positive-semidefinite real matrices are at the basis of convex optimization. Given a twice-differentiable function of several real variables, if its Hessian matrix (the matrix of its second partial derivatives) is positive-definite at a point $p$, then the function is convex near $p$; conversely, if the function is convex near $p$, then the Hessian matrix is positive-semidefinite at $p$.

The set of positive definite matrices is an open convex cone, while the set of positive semi-definite matrices is a closed convex cone.[2]

Examples

  • The identity matrix $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is positive-definite (and as such also positive semi-definite). It is a real symmetric matrix, and, for any non-zero column vector $\mathbf{z}$ with real entries $a$ and $b$, one has

    $\mathbf{z}^\mathsf{T} I \mathbf{z} = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + b^2.$ Seen as a complex matrix, for any non-zero column vector $\mathbf{z}$ with complex entries $a$ and $b$ one has $\mathbf{z}^* I \mathbf{z} = \begin{bmatrix} \overline{a} & \overline{b} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \overline{a}a + \overline{b}b = |a|^2 + |b|^2.$

    Either way, the result is positive since $\mathbf{z}$ is not the zero vector (that is, at least one of $a$ and $b$ is not zero).
  • The real symmetric matrix $M = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$ is positive-definite since for any non-zero column vector $\mathbf{z}$ with entries $a$, $b$ and $c$, we have $\mathbf{z}^\mathsf{T} M \mathbf{z} = (\mathbf{z}^\mathsf{T} M)\mathbf{z} = \begin{bmatrix} (2a-b) & (-a+2b-c) & (-b+2c) \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = (2a-b)a + (-a+2b-c)b + (-b+2c)c = 2a^2 - ba - ab + 2b^2 - cb - bc + 2c^2 = 2a^2 - 2ab + 2b^2 - 2bc + 2c^2 = a^2 + a^2 - 2ab + b^2 + b^2 - 2bc + c^2 + c^2 = a^2 + (a-b)^2 + (b-c)^2 + c^2.$ This result is a sum of squares, and therefore non-negative; and it is zero only if $a = b = c = 0$, that is, when $\mathbf{z}$ is the zero vector.
  • For any real invertible matrix $A$, the product $A^\mathsf{T} A$ is a positive definite matrix (if the means of the columns of $A$ are 0, then this is also called the covariance matrix). A simple proof is that for any non-zero vector $\mathbf{z}$, we have $\mathbf{z}^\mathsf{T} A^\mathsf{T} A \mathbf{z} = (A\mathbf{z})^\mathsf{T}(A\mathbf{z}) = \|A\mathbf{z}\|^2 > 0$, since the invertibility of $A$ means that $A\mathbf{z} \neq \mathbf{0}$.
  • The example $M$ above shows that a matrix in which some elements are negative may still be positive definite. Conversely, a matrix whose entries are all positive is not necessarily positive definite, as for example $N = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$, for which $\begin{bmatrix} -1 & 1 \end{bmatrix} N \begin{bmatrix} -1 & 1 \end{bmatrix}^\mathsf{T} = -2 < 0.$
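The arithmetic in the examples above can be spot-checked with a short script. This is only a sketch in plain Python: the helper names `quadratic_form` and `sum_of_squares` and the sample vectors are assumptions chosen for illustration, and checking a few vectors does not replace the algebraic argument.

```python
def quadratic_form(M, x):
    """Return x^T M x for a real square matrix M (nested lists) and vector x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def sum_of_squares(a, b, c):
    """Right-hand side of the identity derived for the 3x3 example."""
    return a**2 + (a - b)**2 + (b - c)**2 + c**2

M = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
N = [[1, 2], [2, 1]]

# Compare both sides of the identity on a few arbitrary integer vectors.
samples = [(1, 0, 0), (1, 1, 1), (-2, 3, 5), (4, -1, 2)]
identity_holds = all(
    quadratic_form(M, list(v)) == sum_of_squares(*v) for v in samples
)

# The all-positive matrix N takes a negative value at (-1, 1).
n_value = quadratic_form(N, [-1, 1])

print(identity_holds, n_value)
```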

Eigenvalues

Let $M$ be an $n \times n$ Hermitian matrix (this includes real symmetric matrices). All eigenvalues of $M$ are real, and their signs characterize its definiteness:

  • M is positive definite if and only if all of its eigenvalues are positive.
  • M is positive semi-definite if and only if all of its eigenvalues are non-negative.
  • M is negative definite if and only if all of its eigenvalues are negative.
  • M is negative semi-definite if and only if all of its eigenvalues are non-positive.
  • M is indefinite if and only if it has both positive and negative eigenvalues.

Let $P D P^{-1}$ be an eigendecomposition of $M$, where $P$ is a unitary complex matrix whose columns comprise an orthonormal basis of eigenvectors of $M$, and $D$ is a real diagonal matrix whose main diagonal contains the corresponding eigenvalues. The matrix $M$ may be regarded as a diagonal matrix $D$ that has been re-expressed in coordinates of the (eigenvector) basis $P$. Put differently, applying $M$ to some vector $\mathbf{z}$, giving $M\mathbf{z}$, is the same as changing the basis to the eigenvector coordinate system using $P^{-1}$, giving $P^{-1}\mathbf{z}$, applying the stretching transformation $D$ to the result, giving $D P^{-1}\mathbf{z}$, and then changing the basis back using $P$, giving $P D P^{-1}\mathbf{z}$.

With this in mind, the one-to-one change of variable $\mathbf{y} = P^{-1}\mathbf{z} = P^*\mathbf{z}$ shows that $\mathbf{z}^* M \mathbf{z}$ is real and positive for every nonzero complex vector $\mathbf{z}$ if and only if $\mathbf{y}^* D \mathbf{y}$ is real and positive for every nonzero $\mathbf{y}$; in other words, if $D$ is positive definite. For a diagonal matrix, this is true only if each element of the main diagonal (that is, every eigenvalue of $M$) is positive. Since the spectral theorem guarantees all eigenvalues of a Hermitian matrix to be real, the positivity of eigenvalues can be checked using Descartes' rule of signs when the characteristic polynomial of a real, symmetric matrix $M$ is available.
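As an illustration of the eigenvalue criteria listed in this section, the sketch below classifies a real symmetric 2×2 matrix using the closed-form eigenvalues of $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$. The function names and the tolerance `tol` are assumptions made for this example; larger matrices would normally be handled with a numerical eigenvalue routine instead.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], smallest first (closed form)."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def classify(a, b, d, tol=1e-12):
    """Classify [[a, b], [b, d]] by the signs of its eigenvalues.

    tol is an assumed threshold for treating an eigenvalue as zero.
    The zero matrix is reported as positive semi-definite, although it
    is also negative semi-definite.
    """
    lo, hi = eigenvalues_2x2_symmetric(a, b, d)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo >= -tol:
        return "positive semi-definite"
    if hi <= tol:
        return "negative semi-definite"
    return "indefinite"

labels = [
    classify(1, 0, 1),    # identity matrix
    classify(1, 1, 1),    # eigenvalues 0 and 2
    classify(1, 2, 1),    # eigenvalues -1 and 3
    classify(-2, 0, -3),  # eigenvalues -3 and -2
]
print(labels)
```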

Decomposition

Let $M$ be an $n \times n$ Hermitian matrix. $M$ is positive semidefinite if and only if it can be decomposed as a product $M = B^* B$ of a matrix $B$ with its conjugate transpose.

When $M$ is real, $B$ can be real as well and the decomposition can be written as $M = B^\mathsf{T} B$.

$M$ is positive definite if and only if such a decomposition exists with $B$ invertible. More generally, $M$ is positive semidefinite with rank $k$ if and only if a decomposition exists with a $k \times n$ matrix $B$ of full row rank (i.e. of rank $k$). Moreover, for any decomposition $M = B^* B$, $\operatorname{rank}(M) = \operatorname{rank}(B)$.[3]

Proof

If $M = B^* B$, then $\mathbf{x}^* M \mathbf{x} = (\mathbf{x}^* B^*)(B\mathbf{x}) = \|B\mathbf{x}\|^2 \geq 0$, so $M$ is positive semidefinite. If moreover $B$ is invertible then the inequality is strict for $\mathbf{x} \neq \mathbf{0}$, so $M$ is positive definite. If $B$ is $k \times n$ of rank $k$, then $\operatorname{rank}(M) = \operatorname{rank}(B^*) = k$.

In the other direction, suppose $M$ is positive semidefinite. Since $M$ is Hermitian, it has an eigendecomposition $M = Q^{-1} D Q$ where $Q$ is unitary and $D$ is a diagonal matrix whose entries are the eigenvalues of $M$. Since $M$ is positive semidefinite, the eigenvalues are non-negative real numbers, so one can define $D^{1/2}$ as the diagonal matrix whose entries are the non-negative square roots of the eigenvalues. Then $M = Q^{-1} D Q = Q^* D Q = Q^* D^{1/2} D^{1/2} Q = Q^* (D^{1/2})^* D^{1/2} Q = B^* B$ for $B = D^{1/2} Q$. If moreover $M$ is positive definite, then the eigenvalues are (strictly) positive, so $D^{1/2}$ is invertible, and hence $B = D^{1/2} Q$ is invertible as well. If $M$ has rank $k$, then it has exactly $k$ positive eigenvalues and the others are zero, hence in $B = D^{1/2} Q$ all but $k$ rows are zero. Cutting the zero rows gives a $k \times n$ matrix $B'$ such that $B'^* B' = B^* B = M$.

The columns $b_1, \dots, b_n$ of $B$ can be seen as vectors in the complex or real vector space $\mathbb{C}^k$ or $\mathbb{R}^k$, respectively. Then the entries of $M$ are inner products (that is, dot products in the real case) of these vectors: $M_{ij} = \langle b_i, b_j \rangle$. In other words, a Hermitian matrix $M$ is positive semidefinite if and only if it is the Gram matrix of some vectors $b_1, \dots, b_n$. It is positive definite if and only if it is the Gram matrix of some linearly independent vectors. In general, the rank of the Gram matrix of vectors $b_1, \dots, b_n$ equals the dimension of the space spanned by these vectors.[4]
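A small real example may make the Gram-matrix description concrete. The sketch below, in plain Python, builds the Gram matrix of three vectors in $\mathbb{R}^2$; the vectors and the helper name `gram_matrix` are assumptions chosen for illustration. Because three vectors in a plane are linearly dependent, the resulting matrix is positive semidefinite but not positive definite.

```python
def gram_matrix(vectors):
    """Return the matrix of dot products G[i][j] = <b_i, b_j> (real case)."""
    return [[sum(p * q for p, q in zip(u, v)) for v in vectors] for u in vectors]

def quadratic_form(M, x):
    """Return x^T M x for nested-list M and vector x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

# Three vectors in R^2, so the Gram matrix is 3x3 with rank at most 2.
vectors = [(1, 0), (1, 1), (0, 1)]
G = gram_matrix(vectors)

# b_1 - b_2 + b_3 = 0, so x = (1, -1, 1) gives x^T G x = 0 with x nonzero.
null_form = quadratic_form(G, [1, -1, 1])

# A generic vector gives a positive value, equal to |sum_i x_i b_i|^2.
generic_form = quadratic_form(G, [1, 2, 3])

print(G, null_form, generic_form)
```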

Uniqueness up to unitary transformations

The decomposition is not unique: if $M = B^* B$ for some $k \times n$ matrix $B$ and if $Q$ is any unitary $k \times k$ matrix (meaning $Q^* Q = Q Q^* = I$), then $M = B^* B = B^* Q^* Q B = A^* A$ for $A = QB$.

However, this is the only way in which two decompositions can differ: the decomposition is unique up to unitary transformations. More formally, if $A$ is a $k \times n$ matrix and $B$ is an $\ell \times n$ matrix such that $A^* A = B^* B$, then there is an $\ell \times k$ matrix $Q$ with orthonormal columns (meaning $Q^* Q = I_{k \times k}$) such that $B = QA$.[5] When $\ell = k$ this means $Q$ is unitary.

This statement has an intuitive geometric interpretation in the real case: let the columns of $A$ and $B$ be the vectors $a_1, \dots, a_n$ and $b_1, \dots, b_n$ in $\mathbb{R}^k$. A real unitary matrix is an orthogonal matrix, which describes a rigid transformation (an isometry of Euclidean space $\mathbb{R}^k$) preserving the 0 point (i.e. rotations and reflections, without translations). Therefore, the dot products $a_i \cdot a_j$ and $b_i \cdot b_j$ are equal if and only if some rigid transformation of $\mathbb{R}^k$ transforms the vectors $a_1, \dots, a_n$ to $b_1, \dots, b_n$ (and 0 to 0).

Square root

A Hermitian matrix $M$ is positive semidefinite if and only if there is a positive semidefinite matrix $B$ (in particular $B$ is Hermitian, so $B^* = B$) satisfying $M = BB$. This matrix $B$ is unique,[6] is called the non-negative square root of $M$, and is denoted by $B = M^{1/2}$. When $M$ is positive definite, so is $M^{1/2}$, hence it is also called the positive square root of $M$.

The non-negative square root should not be confused with other decompositions $M = B^* B$. Some authors use the name square root and $M^{1/2}$ for any such decomposition, or specifically for the Cholesky decomposition, or any decomposition of the form $M = BB$; others only use it for the non-negative square root.

If $M \succ N \succ 0$ then $M^{1/2} \succ N^{1/2} \succ 0$.

Cholesky decomposition

A Hermitian positive semidefinite matrix $M$ can be written as $M = L L^*$, where $L$ is lower triangular with non-negative diagonal (equivalently $M = B^* B$ where $B = L^*$ is upper triangular); this is the Cholesky decomposition. If $M$ is positive definite, then the diagonal of $L$ is positive and the Cholesky decomposition is unique. Conversely, if $L$ is lower triangular with non-negative diagonal then $L L^*$ is positive semidefinite. The Cholesky decomposition is especially useful for efficient numerical calculations. A closely related decomposition is the LDL decomposition, $M = L D L^*$, where $D$ is diagonal and $L$ is lower unitriangular.
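The following is a minimal sketch of the Cholesky factorization for a real symmetric matrix, written in plain Python for readability rather than speed. The function name `cholesky` and the convention of returning `None` when a pivot is not strictly positive are assumptions made for this example; the sketch therefore doubles as a positive-definiteness test, and it does not handle the semidefinite or complex cases.

```python
import math

def cholesky(M):
    """Return lower-triangular L with M = L L^T for real symmetric M.

    Returns None if a pivot is not strictly positive, which signals that
    M is not positive definite (assumed convention for this sketch).
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = M[i][i] - s
                if pivot <= 0:
                    return None
                L[i][i] = math.sqrt(pivot)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

M = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # positive definite
N = [[1, 2], [2, 1]]                         # indefinite

L = cholesky(M)
# Rebuild L L^T and measure how far it is from M.
rebuilt = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]
max_error = max(abs(rebuilt[i][j] - M[i][j]) for i in range(3) for j in range(3))

L_bad = cholesky(N)   # factorization breaks down for N
print(max_error, L_bad)
```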

Williamson theorem

Any $2n \times 2n$ positive definite real symmetric matrix $M$ can be diagonalized via symplectic (real) matrices. More precisely, Williamson's theorem ensures the existence of a symplectic $S \in \mathbf{Sp}(2n, \mathbb{R})$ and a diagonal real positive $D \in \mathbb{R}^{n \times n}$ such that $S M S^\mathsf{T} = D \oplus D$.

Other characterizations

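Let $M$ be an $n \times n$ real symmetric matrix, and let $B_1(M) \equiv \{\mathbf{x} \in \mathbb{R}^n : \mathbf{x}^\mathsf{T} M \mathbf{x} \leq 1\}$ be the "unit ball" defined by $M$. Then we have the following:

  • $B_1(\mathbf{v}\mathbf{v}^\mathsf{T})$ is a solid slab sandwiched between $\pm\{\mathbf{w} : \langle \mathbf{w}, \mathbf{v} \rangle = 1\}$.
  • $M \succeq 0$ if and only if $B_1(M)$ is an ellipsoid, or an ellipsoidal cylinder.
  • $M \succ 0$ if and only if $B_1(M)$ is bounded, that is, it is an ellipsoid.
  • If $N \succ 0$, then $M \succeq N$ if and only if $B_1(M) \subseteq B_1(N)$; $M \succ N$ if and only if $B_1(M) \subseteq \operatorname{int}(B_1(N))$.
  • If $N \succ 0$, then $M \succeq \frac{\mathbf{v}\mathbf{v}^\mathsf{T}}{\mathbf{v}^\mathsf{T} N \mathbf{v}}$ for all $\mathbf{v} \neq \mathbf{0}$ if and only if $B_1(M) \subset \bigcap_{\mathbf{v}^\mathsf{T} N \mathbf{v} = 1} B_1(\mathbf{v}\mathbf{v}^\mathsf{T})$. So, since the polar dual of an ellipsoid is also an ellipsoid with the same principal axes, with inverse lengths, we have $B_1(N^{-1}) = \bigcap_{\mathbf{v}^\mathsf{T} N \mathbf{v} = 1} B_1(\mathbf{v}\mathbf{v}^\mathsf{T}) = \bigcap_{\mathbf{v}^\mathsf{T} N \mathbf{v} = 1} \{\mathbf{w} : |\langle \mathbf{w}, \mathbf{v} \rangle| \leq 1\}$. That is, if $N$ is positive-definite, then $M \succeq \frac{\mathbf{v}\mathbf{v}^\mathsf{T}}{\mathbf{v}^\mathsf{T} N \mathbf{v}}$ for all $\mathbf{v} \neq \mathbf{0}$ if and only if $M \succeq N^{-1}$.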

Let $M$ be an $n \times n$ Hermitian matrix. The following properties are equivalent to $M$ being positive definite:

The associated sesquilinear form is an inner product
The sesquilinear form defined by $M$ is the function $\langle \cdot, \cdot \rangle$ from $\mathbb{C}^n \times \mathbb{C}^n$ to $\mathbb{C}$ such that $\langle \mathbf{x}, \mathbf{y} \rangle \equiv \mathbf{y}^* M \mathbf{x}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{C}^n$, where $\mathbf{y}^*$ is the conjugate transpose of $\mathbf{y}$. For any complex matrix $M$, this form is linear in $\mathbf{x}$ and semilinear in $\mathbf{y}$. Therefore, the form is an inner product on $\mathbb{C}^n$ if and only if $\langle \mathbf{z}, \mathbf{z} \rangle$ is real and positive for all nonzero $\mathbf{z}$; that is, if and only if $M$ is positive definite. (In fact, every inner product on $\mathbb{C}^n$ arises in this fashion from a Hermitian positive definite matrix.)
Its leading principal minors are all positive
The $k$th leading principal minor of a matrix $M$ is the determinant of its upper-left $k \times k$ sub-matrix. It turns out that a matrix is positive definite if and only if all these determinants are positive. This condition is known as Sylvester's criterion, and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an upper triangular matrix by using elementary row operations, as in the first part of the Gaussian elimination method, taking care to preserve the sign of its determinant during the pivoting process. Since the $k$th leading principal minor of a triangular matrix is the product of its diagonal elements up to row $k$, Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row $k$ of the triangular matrix is obtained.
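Sylvester's criterion can be sketched in a few lines of plain Python. For clarity, this version computes each leading principal minor as a separate determinant, using exact rational arithmetic from the standard `fractions` module, rather than reading the pivots off a single elimination pass as described above; it is therefore simpler but less efficient. The function names are assumptions introduced for this example.

```python
from fractions import Fraction

def determinant(A):
    """Determinant by Gaussian elimination with row swaps, in exact arithmetic."""
    A = [[Fraction(v) for v in row] for row in A]
    n = len(A)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot_row = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot_row is None:
            return Fraction(0)
        if pivot_row != col:
            A[col], A[pivot_row] = A[pivot_row], A[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det

def leading_principal_minors(M):
    """Determinants of the upper-left k x k sub-matrices, k = 1..n."""
    return [determinant([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

def is_positive_definite(M):
    """Sylvester's criterion for a symmetric real matrix M."""
    return all(m > 0 for m in leading_principal_minors(M))

M = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
N = [[1, 2], [2, 1]]

minors_M = leading_principal_minors(M)
minors_N = leading_principal_minors(N)
print(minors_M, is_positive_definite(M))
print(minors_N, is_positive_definite(N))
```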

A positive semidefinite matrix is positive definite if and only if it is invertible.[7] A matrix $M$ is negative (semi)definite if and only if $-M$ is positive (semi)definite.

Quadratic forms

The (purely) quadratic form associated with a real $n \times n$ matrix $M$ is the function $Q : \mathbb{R}^n \to \mathbb{R}$ such that $Q(\mathbf{x}) = \mathbf{x}^\mathsf{T} M \mathbf{x}$ for all $\mathbf{x}$. $M$ can be assumed symmetric by replacing it with $\tfrac{1}{2}(M + M^\mathsf{T})$, since any asymmetric part will be zeroed-out in the double-sided product.

A symmetric matrix $M$ is positive definite if and only if its quadratic form is a strictly convex function.

More generally, any quadratic function from $\mathbb{R}^n$ to $\mathbb{R}$ can be written as $\mathbf{x}^\mathsf{T} M \mathbf{x} + \mathbf{b}^\mathsf{T} \mathbf{x} + c$ where $M$ is a symmetric $n \times n$ matrix, $\mathbf{b}$ is a real $n$-vector, and $c$ a real constant. In the $n = 1$ case, this is a parabola, and just like in the $n = 1$ case, we have

Theorem: This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if M is positive definite.

Proof: If $M$ is positive definite, then the function is strictly convex. Its gradient $2M\mathbf{x} + \mathbf{b}$ is zero at the unique point $\mathbf{x} = -\tfrac{1}{2} M^{-1} \mathbf{b}$, which must be the global minimum since the function is strictly convex. If $M$ is not positive definite, then there exists some nonzero vector $\mathbf{v}$ such that $\mathbf{v}^\mathsf{T} M \mathbf{v} \leq 0$, so the function $f(t) \equiv (t\mathbf{v})^\mathsf{T} M (t\mathbf{v}) + \mathbf{b}^\mathsf{T}(t\mathbf{v}) + c$ is a line or a downward parabola, thus not strictly convex and not having a unique global minimum.

For this reason, positive definite matrices play an important role in optimization problems.

Simultaneous diagonalization


One symmetric matrix and another matrix that is both symmetric and positive definite can be simultaneously diagonalized. This is so although simultaneous diagonalization is not necessarily performed with a similarity transformation. This result does not extend to the case of three or more matrices. In this section we write for the real case. Extension to the complex case is immediate.

Let $M$ be a symmetric and $N$ a symmetric and positive definite matrix. Write the generalized eigenvalue equation as $(M - \lambda N)\mathbf{x} = 0$ where we impose that $\mathbf{x}$ be normalized, i.e. $\mathbf{x}^\mathsf{T} N \mathbf{x} = 1$. Now we use Cholesky decomposition to write the inverse of $N$ as $Q^\mathsf{T} Q$. Multiplying by $Q$ and letting $\mathbf{x} = Q^\mathsf{T} \mathbf{y}$, we get $Q(M - \lambda N)Q^\mathsf{T} \mathbf{y} = 0$, which can be rewritten as $(Q M Q^\mathsf{T})\mathbf{y} = \lambda \mathbf{y}$ where $\mathbf{y}^\mathsf{T} \mathbf{y} = 1$. Manipulation now yields $MX = NX\Lambda$ where $X$ is a matrix having as columns the generalized eigenvectors and $\Lambda$ is a diagonal matrix of the generalized eigenvalues. Now premultiplication with $X^\mathsf{T}$ gives the final result: $X^\mathsf{T} M X = \Lambda$ and $X^\mathsf{T} N X = I$, but note that this is no longer an orthogonal diagonalization with respect to the inner product where $\mathbf{y}^\mathsf{T} \mathbf{y} = 1$. In fact, we diagonalized $M$ with respect to the inner product induced by $N$.[8]

Note that this result does not contradict what is said on simultaneous diagonalization in the article Diagonalizable matrix, which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.

Properties

Induced partial ordering

For arbitrary square matrices $M$, $N$ we write $M \geq N$ if $M - N \geq 0$, i.e., $M - N$ is positive semi-definite. This defines a partial ordering on the set of all square matrices. One can similarly define a strict partial ordering $M > N$. The ordering is called the Loewner order.

Inverse of positive definite matrix

Every positive definite matrix is invertible and its inverse is also positive definite.[9] If $M \geq N > 0$ then $N^{-1} \geq M^{-1} > 0$.[10] Moreover, by the min-max theorem, the $k$th largest eigenvalue of $M$ is greater than or equal to the $k$th largest eigenvalue of $N$.

Scaling


If M is positive definite and r>0 is a real number, then rM is positive definite.[11]

Addition

  • If M and N are positive-definite, then the sum M+N is also positive-definite.[11]
  • If M and N are positive-semidefinite, then the sum M+N is also positive-semidefinite.
  • If M is positive-definite and N is positive-semidefinite, then the sum M+N is also positive-definite.

Multiplication

  • If M and N are positive definite, then the products MNM and NMN are also positive definite. If MN=NM, then MN is also positive definite.
  • If M is positive semidefinite, then A*MA is positive semidefinite for any (possibly rectangular) matrix A. If M is positive definite and A has full column rank, then A*MA is positive definite.[12]

Trace

The diagonal entries $m_{ii}$ of a positive-semidefinite matrix are real and non-negative. As a consequence the trace satisfies $\operatorname{tr}(M) \geq 0$. Furthermore,[13] since every principal sub-matrix (in particular, 2-by-2) is positive semidefinite, $|m_{ij}| \leq \sqrt{m_{ii} m_{jj}} \quad \forall i, j$ and thus, when $n \geq 1$, $\max_{i,j} |m_{ij}| \leq \max_i m_{ii}$.

An $n \times n$ Hermitian matrix $M$ is positive definite if it satisfies the following trace inequalities:[14] $\operatorname{tr}(M) > 0$ and $\frac{(\operatorname{tr}(M))^2}{\operatorname{tr}(M^2)} > n - 1$.

Another important result is that for any positive-semidefinite matrices $M$ and $N$, $\operatorname{tr}(MN) \geq 0$. This follows by writing $\operatorname{tr}(MN) = \operatorname{tr}(M^{1/2} N M^{1/2})$. The matrix $M^{1/2} N M^{1/2}$ is positive-semidefinite and thus has non-negative eigenvalues, whose sum, the trace, is therefore also non-negative.

Hadamard product

If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Hadamard product is: $M \circ N \geq 0$ (this result is often called the Schur product theorem).[15]

Regarding the Hadamard product of two positive semidefinite matrices $M = (m_{ij}) \geq 0$, $N \geq 0$, there are two notable inequalities:

  • Oppenheim's inequality: $\det(M \circ N) \geq \det(N) \prod_i m_{ii}$.[16]
  • $\det(M \circ N) \geq \det(M) \det(N)$.[17]

Kronecker product

If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Kronecker product satisfies $M \otimes N \geq 0$.

Frobenius product

If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Frobenius inner product satisfies $M : N \geq 0$ (Lancaster–Tismenetsky, The Theory of Matrices, p. 218).

Convexity

The set of positive semidefinite symmetric matrices is convex. That is, if $M$ and $N$ are positive semidefinite, then for any $\alpha$ between 0 and 1, $\alpha M + (1 - \alpha) N$ is also positive semidefinite. For any vector $\mathbf{x}$: $\mathbf{x}^\mathsf{T}(\alpha M + (1 - \alpha) N)\mathbf{x} = \alpha \mathbf{x}^\mathsf{T} M \mathbf{x} + (1 - \alpha) \mathbf{x}^\mathsf{T} N \mathbf{x} \geq 0.$

This property is what makes semidefinite programming a convex optimization problem, so that any locally optimal solution is also globally optimal.

Relation with cosine

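The positive-definiteness of a matrix $A$ expresses that the angle $\theta$ between any nonzero vector $\mathbf{x}$ and its image $A\mathbf{x}$ always satisfies $-\pi/2 < \theta < +\pi/2$:

$\cos\theta = \frac{\mathbf{x}^\mathsf{T} A \mathbf{x}}{\|\mathbf{x}\| \|A\mathbf{x}\|} = \frac{\langle \mathbf{x}, A\mathbf{x} \rangle}{\|\mathbf{x}\| \|A\mathbf{x}\|}, \quad \theta = \theta(\mathbf{x}, A\mathbf{x}) \equiv \widehat{(\mathbf{x}, A\mathbf{x})} \equiv$ the angle between $\mathbf{x}$ and $A\mathbf{x}$.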

Further properties

  1. If $M$ is a symmetric Toeplitz matrix, i.e. the entries $m_{ij}$ are given as a function of their absolute index differences: $m_{ij} = h(|i - j|)$, and the strict inequality $\sum_{j \neq 0} |h(j)| < h(0)$ holds, then $M$ is strictly positive definite.
  2. Let $M > 0$ and $N$ Hermitian. If $MN + NM \geq 0$ (resp., $MN + NM > 0$) then $N \geq 0$ (resp., $N > 0$).[18]
  3. If $M > 0$ is real, then there is a $\delta > 0$ such that $M > \delta I$, where $I$ is the identity matrix.
  4. If $M_k$ denotes the leading $k \times k$ principal submatrix, $\det(M_k)/\det(M_{k-1})$ is the $k$th pivot during LU decomposition.
  5. A matrix is negative definite if its $k$th order leading principal minor is negative when $k$ is odd, and positive when $k$ is even.
  6. If $M$ is a real positive definite matrix, then there exists a positive real number $m$ such that for every vector $\mathbf{v}$, $\mathbf{v}^\mathsf{T} M \mathbf{v} \geq m \|\mathbf{v}\|_2^2$.
  7. A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is however not enough to consider the leading principal minors only, as can be seen from the diagonal matrix with entries 0 and −1.

Block matrices and submatrices

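A positive semidefinite $2n \times 2n$ matrix may also be defined by blocks: $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$

where each block is $n \times n$. By applying the positivity condition, it immediately follows that $A$ and $D$ are Hermitian, and $C = B^*$.

We have that $\mathbf{z}^* M \mathbf{z} \geq 0$ for all complex $\mathbf{z}$, and in particular for $\mathbf{z} = [\mathbf{v}, 0]^\mathsf{T}$. Then $\begin{bmatrix} \mathbf{v}^* & 0 \end{bmatrix} \begin{bmatrix} A & B \\ B^* & D \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ 0 \end{bmatrix} = \mathbf{v}^* A \mathbf{v} \geq 0.$

A similar argument can be applied to $D$, and thus we conclude that both $A$ and $D$ must be positive semidefinite (and positive definite when $M$ is positive definite, since the inequality is then strict for $\mathbf{v} \neq \mathbf{0}$). The argument can be extended to show that any principal submatrix of a positive semidefinite (respectively positive definite) matrix $M$ is itself positive semidefinite (respectively positive definite).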

Converse results can be proved with stronger conditions on the blocks, for instance, using the Schur complement.
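As an illustration of such a converse, the standard Schur complement condition can be written out as follows. This is a sketch of a well-known result, not a statement taken from the text above; the symbol $S$ for the Schur complement is notation introduced here, and $A$ is assumed to be invertible.

```latex
% Assumption: A is invertible; S denotes the Schur complement of A in M.
S = D - B^* A^{-1} B,
\qquad
M = \begin{bmatrix} A & B \\ B^* & D \end{bmatrix}
  = \begin{bmatrix} I & 0 \\ B^* A^{-1} & I \end{bmatrix}
    \begin{bmatrix} A & 0 \\ 0 & S \end{bmatrix}
    \begin{bmatrix} I & A^{-1} B \\ 0 & I \end{bmatrix},
\qquad
M \succ 0 \iff A \succ 0 \text{ and } S \succ 0 .
```

The factorization is a congruence by an invertible matrix, which preserves definiteness, so $M$ is positive definite exactly when the block-diagonal middle factor is.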

Local extrema

A general quadratic form $f(\mathbf{x})$ on $n$ real variables $x_1, \ldots, x_n$ can always be written as $\mathbf{x}^\mathsf{T} M \mathbf{x}$ where $\mathbf{x}$ is the column vector with those variables, and $M$ is a symmetric real matrix. Therefore, the matrix being positive definite means that $f$ has a unique minimum (zero) when $\mathbf{x}$ is zero, and is strictly positive for any other $\mathbf{x}$.

More generally, a twice-differentiable real function $f$ on $n$ real variables has a strict local minimum at arguments $x_1, \ldots, x_n$ if its gradient is zero and its Hessian (the matrix of all second derivatives) is positive definite at that point. Conversely, at any local minimum the gradient is zero and the Hessian is positive semi-definite. Similar statements can be made for negative definite and semi-definite matrices.

Covariance


In statistics, the covariance matrix of a multivariate probability distribution is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.

Extension for non-Hermitian square matrices

The definition of positive definite can be generalized by designating any complex matrix $M$ (e.g. real non-symmetric) as positive definite if $\operatorname{Re}\{\mathbf{z}^* M \mathbf{z}\} > 0$ for all non-zero complex vectors $\mathbf{z}$, where $\operatorname{Re}\{c\}$ denotes the real part of a complex number $c$.[19] Only the Hermitian part $\tfrac{1}{2}(M + M^*)$ determines whether the matrix is positive definite, and is assessed in the narrower sense above. Similarly, if $\mathbf{x}$ and $M$ are real, we have $\mathbf{x}^\mathsf{T} M \mathbf{x} > 0$ for all real nonzero vectors $\mathbf{x}$ if and only if the symmetric part $\tfrac{1}{2}(M + M^\mathsf{T})$ is positive definite in the narrower sense. It is immediately clear that $\mathbf{x}^\mathsf{T} M \mathbf{x} = \sum_{ij} x_i M_{ij} x_j$ is insensitive to transposition of $M$.

A non-symmetric real matrix with only positive eigenvalues may have a symmetric part with negative eigenvalues, in which case it will not be positive (semi)definite. For example, the matrix $M = \begin{bmatrix} 4 & 9 \\ 1 & 4 \end{bmatrix}$ has positive eigenvalues 1 and 7, yet $\mathbf{x}^\mathsf{T} M \mathbf{x} = -2$ with the choice $\mathbf{x} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$.
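The numbers in this example can be reproduced with a short script. The sketch below, in plain Python, computes the eigenvalues of the 2×2 matrix and of its symmetric part from the trace and determinant; the helper names are assumptions for illustration, and `eigenvalues_2x2` assumes the eigenvalues are real.

```python
import math

def quadratic_form(M, x):
    """Return x^T M x for nested-list M and vector x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def eigenvalues_2x2(M):
    """Roots of t^2 - trace*t + det, smallest first.

    Assumes the discriminant is non-negative, i.e. the eigenvalues are real.
    """
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

M = [[4, 9], [1, 4]]
# Symmetric part (M + M^T) / 2.
S = [[(M[i][j] + M[j][i]) / 2 for j in range(2)] for i in range(2)]
x = [-1, 1]

eig_M = eigenvalues_2x2(M)     # both positive
eig_S = eigenvalues_2x2(S)     # one negative eigenvalue
form_M = quadratic_form(M, x)  # negative despite positive eigenvalues of M
form_S = quadratic_form(S, x)  # same value: only the symmetric part matters

print(eig_M, eig_S, form_M, form_S)
```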

In summary, the distinguishing feature between the real and complex cases is that a bounded positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint. The general claim can be argued using the polarization identity. That is no longer true in the real case.

Applications

Heat conductivity matrix

Fourier's law of heat conduction, giving heat flux $\mathbf{q}$ in terms of the temperature gradient $\mathbf{g} = \nabla T$, is written for anisotropic media as $\mathbf{q} = -K\mathbf{g}$, in which $K$ is the thermal conductivity matrix. The negative is inserted in Fourier's law to reflect the expectation that heat will always flow from hot to cold. In other words, since the temperature gradient $\mathbf{g}$ always points from cold to hot, the heat flux $\mathbf{q}$ is expected to have a negative inner product with $\mathbf{g}$ so that $\mathbf{q}^\mathsf{T} \mathbf{g} < 0$. Substituting Fourier's law then gives this expectation as $\mathbf{g}^\mathsf{T} K \mathbf{g} > 0$, implying that the conductivity matrix should be positive definite. Ordinarily $K$ should be symmetric; however, it becomes nonsymmetric in the presence of a magnetic field, as in a thermal Hall effect.

More generally in thermodynamics, the flow of heat and particles is a fully coupled system as described by the Onsager reciprocal relations, and the coupling matrix is required to be positive semi-definite (possibly non-symmetric) in order that entropy production be nonnegative.


References

  1. ^ [citation details missing in source], print ed.
  2. ^ [citation details missing in source]
  3. ^ Horn & Johnson (2013), p. 440, Theorem 7.2.7
  4. ^ Horn & Johnson (2013), p. 441, Theorem 7.2.10
  5. ^ Horn & Johnson (2013), p. 452, Theorem 7.3.11
  6. ^ Horn & Johnson (2013), p. 439, Theorem 7.2.6 with k=2
  7. ^ Horn & Johnson (2013), p. 431, Corollary 7.1.7
  8. ^ Horn & Johnson (2013), p. 485, Theorem 7.6.1
  9. ^ Horn & Johnson (2013), p. 438, Theorem 7.2.1
  10. ^ Horn & Johnson (2013), p. 495, Corollary 7.7.4(a)
  11. ^ a b Horn & Johnson (2013), p. 430, Observation 7.1.3
  12. ^ Horn & Johnson (2013), p. 431, Observation 7.1.8
  13. ^ Horn & Johnson (2013), p. 430
  14. ^ [citation details missing in source]
  15. ^ Horn & Johnson (2013), p. 479, Theorem 7.5.3
  16. ^ Horn & Johnson (2013), p. 509, Theorem 7.8.16
  17. ^ [citation details missing in source], Corollary 3.6, p. 227
  18. ^ [citation details missing in source]
  19. ^ [citation details missing in source]

Sources

  • Horn, Roger A.; Johnson, Charles R. (2013). Matrix Analysis (2nd ed.). Cambridge University Press.