Block matrix pseudoinverse

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

In mathematics, a block matrix pseudoinverse is a formula for the pseudoinverse of a partitioned matrix. This is useful for decomposing or approximating many algorithms updating parameters in signal processing, which are based on the least squares method.

Derivation

[edit | edit source]

Consider a column-wise partitioned matrix:

[𝐀𝐁],𝐀m×n,𝐁m×p,mn+p.

If the above matrix is full column rank, the Moore–Penrose inverse matrices of it and its transpose are

[𝐀𝐁]+=([𝐀𝐁]𝖳[𝐀𝐁])1[𝐀𝐁]𝖳,[𝐀𝖳𝐁𝖳]+=[𝐀𝐁]([𝐀𝐁]𝖳[𝐀𝐁])1.

This computation of the pseudoinverse requires (n + p)-square matrix inversion and does not take advantage of the block form.

To reduce computational costs to n- and p-square matrix inversions and to introduce parallelism, treating the blocks separately, one derives [1]

[𝐀𝐁]+=[𝐏B𝐀(𝐀𝖳𝐏B𝐀)1𝐏A𝐁(𝐁𝖳𝐏A𝐁)1]=[(𝐏B𝐀)+(𝐏A𝐁)+],[𝐀𝖳𝐁𝖳]+=[𝐏B𝐀(𝐀𝖳𝐏B𝐀)1,𝐏A𝐁(𝐁𝖳𝐏A𝐁)1]=[(𝐀𝖳𝐏B)+(𝐁𝖳𝐏A)+],

where orthogonal projection matrices are defined by

𝐏A=𝐈𝐀(𝐀𝖳𝐀)1𝐀𝖳,𝐏B=𝐈𝐁(𝐁𝖳𝐁)1𝐁𝖳.

The above formulas are not necessarily valid if [𝐀𝐁] does not have full rank – for example, if 𝐀0, then

[𝐀𝐀]+=12[𝐀+𝐀+][(𝐏A𝐀)+(𝐏A𝐀)+]=0

Application to least squares problems

[edit | edit source]

Given the same matrices as above, we consider the following least squares problems, which appear as multiple objective optimizations or constrained problems in signal processing. Eventually, we can implement a parallel algorithm for least squares based on the following results.

Column-wise partitioning in over-determined least squares

[edit | edit source]

Suppose a solution 𝐱=[𝐱1𝐱2] solves an over-determined system:

[𝐀,𝐁][𝐱1𝐱2]=𝐝,𝐝m×1.

Using the block matrix pseudoinverse, we have

𝐱=[𝐀,𝐁]+𝐝=[(𝐏B𝐀)+(𝐏A𝐁)+]𝐝.

Therefore, we have a decomposed solution:

𝐱1=(𝐏B𝐀)+𝐝,𝐱2=(𝐏A𝐁)+𝐝.

Row-wise partitioning in under-determined least squares

[edit | edit source]

Suppose a solution 𝐱 solves an under-determined system:

[𝐀𝖳𝐁𝖳]𝐱=[𝐞𝐟],𝐞n×1,𝐟p×1.

The minimum-norm solution is given by

𝐱=[𝐀𝖳𝐁𝖳]+[𝐞𝐟].

Using the block matrix pseudoinverse, we have

𝐱=[(𝐀𝖳𝐏B)+(𝐁𝖳𝐏A)+][𝐞𝐟]=(𝐀𝖳𝐏B)+𝐞+(𝐁𝖳𝐏A)+𝐟.

Comments on matrix inversion

[edit | edit source]

Instead of ([𝐀𝐁]𝖳[𝐀𝐁])1, we need to calculate directly or indirectly[citation needed][original research?]

(𝐀𝖳𝐀)1,(𝐁𝖳𝐁)1,(𝐀𝖳𝐏B𝐀)1,(𝐁𝖳𝐏A𝐁)1.

In a dense and small system, we can use singular value decomposition, QR decomposition, or Cholesky decomposition to replace the matrix inversions with numerical routines. In a large system, we may employ iterative methods such as Krylov subspace methods.

Considering parallel algorithms, we can compute (𝐀𝖳𝐀)1 and (𝐁𝖳𝐁)1 in parallel. Then, we finish to compute (𝐀𝖳𝐏B𝐀)1 and (𝐁𝖳𝐏A𝐁)1 also in parallel.

See also

[edit | edit source]

References

[edit | edit source]
  1. ^ Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
[edit | edit source]