Title: The Pseudoinverse of $A = CR$ is $A^+ = R^+C^+$ (?)

URL Source: https://arxiv.org/html/2305.01716

Markdown Content:

License: arXiv.org perpetual non-exclusive license

arXiv:2305.01716v3 [math.NA] 26 Mar 2024

The Pseudoinverse of $A = CR$ is $A^+ = R^+C^+$ (?)

Michał P. Karpowicz (NASK PIB), michal.karpowicz@nask.pl
Gilbert Strang (MIT), gilstrang@gmail.com

Abstract

This paper gives three formulas for the pseudoinverse of a matrix product $A = CR$. The first is sometimes correct, the second is always correct, and the third is almost never correct. But that third randomized pseudoinverse $A_r^+$ may be very useful when $A$ is a very large matrix.

1. $A^+ = R^+C^+$ when $A = CR$ and $C$ has independent columns and $R$ has independent rows.

2. $A^+ = (C^+CR)^+(CRR^+)^+$ is always correct.

3. $A_r^+ = (P^TCR)^+\,P^TCRQ\,(CRQ)^+ = A^+$ only when $\mathrm{rank}(P^TA) = \mathrm{rank}(AQ) = \mathrm{rank}(A)$ with $A = CR$.

The statement in the title is not generally true, as the following example shows:

$$C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad R = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad (CR)^+ = \begin{bmatrix} 1 \end{bmatrix}, \quad R^+C^+ = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} \end{bmatrix}.$$
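This failure is easy to confirm numerically; a minimal MATLAB sketch, using the built-in `pinv`:

```matlab
% Counterexample from the title: the reverse order law fails here
C = [1 0];        % 1 by 2, full row rank
R = [1; 1];       % 2 by 1, full column rank
pinv(C*R)         % returns 1
pinv(R)*pinv(C)   % returns 0.5, not equal to pinv(C*R)
```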

Here, $C$ has full row rank and $R$ has full column rank. Theorem 1 needs it the other way around. Then Theorem 2 will give a formula for the pseudoinverse of $A = CR$ that applies in every case.

When the π‘š by π‘Ÿ matrix 𝐢 has π‘Ÿ independent columns (full column rank  π‘Ÿ ), and the π‘Ÿ by 𝑛 matrix 𝑅 has π‘Ÿ independent rows (full row rank π‘Ÿ ), the pseudoinverse 𝐢 + is the left inverse of 𝐢 , and the pseudoinverse 𝑅 + is the right inverse of 𝑅  :

𝐢 +

( 𝐢 T ⁒ 𝐢 ) βˆ’ 1 ⁒ 𝐢 T ⁒  has  ⁒ 𝐢 + ⁒ 𝐢

𝐼 π‘Ÿ and 𝑅 +

𝑅 T ⁒ ( 𝑅 ⁒ 𝑅 T ) βˆ’ 1 ⁒  has  ⁒ 𝑅 ⁒ 𝑅 +

𝐼 π‘Ÿ .

In this case, the π‘š by 𝑛 matrix 𝐴

𝐢 ⁒ 𝑅 will also have rank π‘Ÿ and the statement correctly claims that its 𝑛 by π‘š pseudoinverse is 𝐴 +

𝑅 + ⁒ 𝐢 + .

Theorem 1.

The pseudoinverse $A^+$ of a product $A = CR$ is the product of the pseudoinverses $R^+C^+$ (as for inverses) when $C$ has full column rank $r$ and $R$ has full row rank $r$.

The simplest proof verifies the four Penrose identities [4] that determine the pseudoinverse $A^+$ of any matrix $A$:

$$A A^+ A = A, \quad A^+ A A^+ = A^+, \quad (A^+A)^T = A^+A, \quad (AA^+)^T = AA^+. \tag{1}$$
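These four conditions are easy to test numerically. A small helper sketch (the function name and tolerance are our choices, not from the paper); save it as `penrose_check.m`:

```matlab
function ok = penrose_check(A, Aplus, tol)
% penrose_check: verify the four Penrose identities (1) up to a tolerance
if nargin < 3, tol = 1e-10; end
ok = norm(A*Aplus*A - A)         < tol && ... % A A+ A = A
     norm(Aplus*A*Aplus - Aplus) < tol && ... % A+ A A+ = A+
     norm((Aplus*A)' - Aplus*A)  < tol && ... % A+ A is symmetric
     norm((A*Aplus)' - A*Aplus)  < tol;       % A A+ is symmetric
end
```

For the counterexample above, `penrose_check(C*R, pinv(R)*pinv(C))` returns false; it returns true whenever Theorem 1 applies.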

Our goal is a different proof of $A^+ = R^+C^+$, starting from first principles. We begin with the "four fundamental subspaces" associated with any $m$ by $n$ matrix $A$ of rank $r$. Those are the column space $\mathbf{C}$ and nullspace $\mathbf{N}$ of $A$ and $A^T$.

Note that every matrix $A$ gives an invertible map (Figure 1) from its row space $\mathbf{C}(A^T)$ to its column space $\mathbf{C}(A)$. If $x$ and $y$ are in the row space and $Ax = Ay$, then $A(x - y) = 0$. Therefore $x - y$ is in the nullspace $\mathbf{N}(A)$ as well as the row space. So $x - y$ is orthogonal to itself and $x = y$.

The pseudoinverse $A^+$ in Figure 2 inverts the row space to column space map in Figure 1 when $\mathbf{N}(A^T) = \mathbf{N}(A^+)$. If $b_c$ and $d_c$ are in the column space $\mathbf{C}(A)$ and $x = A^+b_c = A^+d_c$, then $A^+(b_c - d_c) = 0$ and $b_c - d_c$ is in the nullspace $\mathbf{N}(A^+)$. Therefore, if $\mathbf{N}(A^+) = \mathbf{N}(A^T)$, then $b_c - d_c$ is in the nullspace of $A^T$ and the column space of $A$; it is orthogonal to itself, so $(b_c - d_c)^T(b_c - d_c) = 0$ and $b_c = d_c$. For all vectors $b_n$ in the orthogonal complement of $\mathbf{C}(A)$, we have $A^Tb_n = 0$ and $A^+b_n = 0$. Theorem 1 shows why $\mathbf{N}(A^T) = \mathbf{N}(A^+)$ and $A^+$ is the inverse map.

That upper map from the row space $\mathbf{C}(A^T)$ to the column space $\mathbf{C}(A)$ is inverted by the pseudoinverse $A^+$ in Figure 2. And the nullspace of $A^+$ is the same as the nullspace of $A^T$: the orthogonal complement of $\mathbf{C}(A)$. Thus $1/0 = 0$ for $A^+$. In the extreme case, the pseudoinverse of the $m$ by $n$ zero matrix $A$ is the $n$ by $m$ zero matrix $A^+$.

Figure 1: The four fundamental subspaces of $A$. $Ax_r = b$ is in the column space of $A$ and $Ax_n = 0$. The complete solution to $Ax = b$ is $x = (\text{one } x_r) + (\text{any } x_n)$.

Figure 2: The four subspaces for $A^+$ are the four subspaces for $A^T$.

The proof of Theorem 1 begins with a simple Lemma.

Lemma 1.

We have $\mathbf{C}(A) = \mathbf{C}(AB)$ if and only if $\mathrm{rank}(A) = \mathrm{rank}(AB)$, and $\mathbf{N}(B) = \mathbf{N}(AB)$ if and only if $\mathrm{rank}(B) = \mathrm{rank}(AB)$.

Proof: Always $\mathbf{C}(A) \supset \mathbf{C}(AB)$ and $\mathbf{N}(AB) \supset \mathbf{N}(B)$. In each pair, equality of dimensions guarantees equality of spaces. The column space can only become smaller, and the nullspace larger.

Proof of Theorem 1: We will show that $\mathbf{N}(A^T) \subset \mathbf{N}(A^+)$ and $\mathbf{N}(A^T) \supset \mathbf{N}(A^+)$, and that $\mathbf{C}(A^T) \subset \mathbf{C}(A^+)$ and $\mathbf{C}(A^T) \supset \mathbf{C}(A^+)$, when $A^+ = R^+C^+$.

Step 1: $\mathbf{N}(A^T) \subset \mathbf{N}(A^+)$. Every vector $b_n$ in the nullspace of $A^T$ is also in the nullspace of $A^+$.

If $A^Tb_n = 0$, then $R^TC^Tb_n = 0$. Since $R^T$ has full column rank, we conclude that $C^Tb_n = 0$. Then $C^+b_n = (C^TC)^{-1}C^Tb_n = 0$ and $b_n$ is in the nullspace of $C^+$. By Lemma 1, we also have $\mathbf{N}(C^+) = \mathbf{N}(R^+C^+) = \mathbf{N}(A^+)$.

Step 2: $\mathbf{N}(A^T) \supset \mathbf{N}(A^+)$. Every vector $b_n$ in the nullspace of $A^+$ is also in the nullspace of $A^T$.

Suppose that $A^+b_n = R^+C^+b_n = 0$. Since $RR^+ = I_r$, we have $AA^+b_n = CRR^+C^+b_n = C(C^TC)^{-1}C^Tb_n = 0$. Since $C(C^TC)^{-1}$ has full column rank, we see that $C^Tb_n = 0$, so $b_n$ is orthogonal to every column of $C$. Therefore $b_n \in \mathbf{N}(A^T)$.

Step 3: $\mathbf{C}(A^T) \subset \mathbf{C}(A^+)$. Every vector $x$ in the row space of $A$ is also in the column space of $A^+$.

We have $x = A^Ty = R^TC^Ty = R^T[(RR^T)^{-1}RR^T]C^Ty = R^+R(R^TC^Ty) = R^+Rx$. So by Lemma 1, we conclude that $x \in \mathbf{C}(R^+) = \mathbf{C}(R^+C^+) = \mathbf{C}(A^+)$.

Step 4: $\mathbf{C}(A^T) \supset \mathbf{C}(A^+)$. Every vector $x$ in the column space of $A^+$ is also in the row space of $A$.

Given $b = Ax = CRx$, we have

$$x = A^+b = R^+C^+CRx = R^T(RR^T)^{-1}(C^TC)^{-1}C^TCRx = R^T(RR^T)^{-1}Rx.$$

So again by Lemma 1, we conclude that $x \in \mathbf{C}(R^T) = \mathbf{C}(R^TC^T) = \mathbf{C}(A^T)$.

An π‘š by 𝑛 matrix of rank π‘Ÿ has many factorizations. The simplest form of 𝐴

𝐢 ⁒ 𝑅 fills 𝐢 with the first π‘Ÿ independent columns of 𝐴 . Those columns are a basis for the column space of 𝐴 . Then column 𝑗 of 𝑅 specifies the combination of columns of 𝐢 that produces column 𝑗 of 𝐴 . This example has column 3 = column 1 + column 2:

𝐴

[ 1

4

5

2
3
5 ]

[ 1

4

2

3 ] ⁒ [ 1

0

1

0
1
1 ]

𝐢 ⁒ 𝑅 .

That matrix $R$ contains the nonzero rows of the reduced row echelon form of $A$. It also reveals the nullspace: $\mathbf{N}(A) = \mathbf{N}(R)$.
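In MATLAB, this factorization can be read off from the reduced row echelon form. A sketch (`rref` and its pivot-column output are built in; the variable names are ours):

```matlab
A = [1 4 5; 2 3 5];
[R0, piv] = rref(A);        % piv lists the pivot (independent) columns
C = A(:, piv);              % first r independent columns of A
R = R0(1:numel(piv), :);    % nonzero rows of rref(A)
norm(A - C*R)               % zero: A = C*R exactly
```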

$A = CR$ expresses this classical "elimination by row operations" as a matrix factorization [5, 6]. It is slower and less stable numerically than the SVD, but rational $A$ produces rational $C$ and $R$ and $C^+$ and $R^+$. The following useful formula shows that explicitly:

$$A^+ = R^+C^+ = R^T(RR^T)^{-1}(C^TC)^{-1}C^T = R^T(C^TAR^T)^{-1}C^T. \tag{2}$$

The inverses of the square rational matrices $RR^T$ and $C^TC$ are formed by elementary operations and division by a rational determinant. Therefore, the pseudoinverse of $A$ is rational when $A$ is rational.
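Formula (2) is easy to confirm numerically for the example above; a sketch, comparing against MATLAB's SVD-based `pinv`:

```matlab
A = [1 4 5; 2 3 5];  C = [1 4; 2 3];  R = [1 0 1; 0 1 1];
Aplus = R' / (C'*A*R') * C';   % formula (2): R^T (C^T A R^T)^{-1} C^T
norm(Aplus - pinv(A))          % zero up to roundoff
```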

The ranks of $A^+$ and $A$ are equal when $A^+$ is the inverse map. Then

$$b = Ax + (I_m - AA^+)w$$

is a solution of $A^+b = x$ for arbitrary $w$. We have $A^+b = A^+Ax = x$, since $A^+AA^+ = A^+$ when $A^+$ and $A$ have equal ranks.
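A numerical illustration of that solution formula; a sketch where the vectors $x$ and $w$ are arbitrary choices of ours:

```matlab
A = [1 4 5; 2 3 5];  Ap = pinv(A);   % full row rank: A and A+ have equal ranks
x = A' * [1; 2];                     % a vector in the row space C(A^T)
w = randn(2, 1);                     % arbitrary w
b = A*x + (eye(2) - A*Ap) * w;
norm(Ap*b - x)                       % zero: A+ b recovers x
```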

That is not always the case and not always necessary. Consider the rank-deficient

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = C_0 R_0.$$

When we complete $C_0$ and $R_0$ to get square invertible $[\,C_0\ \ C_1\,]$ and $[\,R_0^T\ \ R_1^T\,]^T$, then the generalized $A = CR$ becomes

$$A = \begin{bmatrix} C_0 & C_1 \end{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} R_0 \\ R_1 \end{bmatrix} = \bar{C}\bar{R}$$

and its generalized inverse is

$$G = \begin{bmatrix} R_0 \\ R_1 \end{bmatrix}^{-1} \begin{bmatrix} I_r & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} \begin{bmatrix} C_0 & C_1 \end{bmatrix}^{-1}$$

for arbitrary $Z_{ij}$ of proper size [8]. The first Penrose identity holds, $AGA = A$, but the second one gives

$$GAG = \begin{bmatrix} R_0 \\ R_1 \end{bmatrix}^{-1} \begin{bmatrix} I_r & Z_{12} \\ Z_{21} & Z_{21}Z_{12} \end{bmatrix} \begin{bmatrix} C_0 & C_1 \end{bmatrix}^{-1}.$$

We have $GAG = G$ only if $Z_{22} = Z_{21}Z_{12}$. Then $A$ and $G$ have equal ranks and $G$ is the inverse map. Otherwise, the nullspaces $\mathbf{N}(A^T)$ and $\mathbf{N}(G)$ are different. For

$$A^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} = C_2 R_2,$$

the nullspace $\mathbf{N}(A^T) = \mathbf{N}(R_2)$ is spanned by $(0,1,0)^T$ and $(0,0,1)^T$. For

$$G = \begin{bmatrix} 1 & 3 & 2 \\ 3 & 3 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 3 & 3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2/3 \end{bmatrix} = C_3 R_3,$$

we see that $\mathbf{N}(G) = \mathbf{N}(R_3)$ is spanned by $(0, -2/3, 1)^T$. Here $AGA = A$, but $GAG$ is not equal to $G$, with $1 = \mathrm{rank}(A) < \mathrm{rank}(G) = 2$. Also, neither $AG$ nor $GA$ gives an identity matrix. Still,

$$x = Gb + (I_n - GA)z$$

solves $Ax = AGb + (A - AGA)z = AGb = b$ for arbitrary $z$.
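All of these claims can be verified directly; a sketch with the numbers from this example:

```matlab
A = [1 0; 0 0; 0 0];            % rank 1
G = [1 3 2; 3 3 2];             % a generalized inverse of A, rank 2
norm(A*G*A - A)                 % zero: first Penrose identity holds
norm(G*A*G - G)                 % nonzero: second identity fails
b = A * [5; 7];                 % any b in the column space of A
z = randn(2, 1);                % arbitrary z
x = G*b + (eye(2) - G*A) * z;
norm(A*x - b)                   % zero: x still solves A x = b
```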

The necessary and sufficient conditions for $A^+ = (CR)^+ = R^+C^+$ are surprisingly complex. Greville formulated them in [7]. The reverse order law for the pseudoinverse, $(CR)^+ = R^+C^+$, holds if and only if

$$\mathbf{C}(RR^TC^T) \subset \mathbf{C}(C^T) \quad \text{and} \quad \mathbf{C}(C^TCR) \subset \mathbf{C}(R). \tag{3}$$

Matrix 𝑅 maps 𝒙 π‘Ÿ

𝑅 𝑇 ⁒ 𝐢 𝑇 ⁒ π’š from the row space C ⁒ ( 𝐴 𝑇 ) into the row space C ⁒ ( 𝐢 𝑇 ) . Then matrix 𝐢 takes 𝑅 ⁒ 𝒙 π‘Ÿ to the column space C ⁒ ( 𝐴 ) , so we have 𝐢 ⁒ 𝑅 ⁒ 𝒙 π‘Ÿ

𝐴 ⁒ 𝒙 π‘Ÿ

𝒃 𝑐 . Matrix 𝐢 𝑇 maps 𝒃 𝑐 into the column space C ⁒ ( 𝑅 ) . Then 𝑅 𝑇 ⁒ 𝐢 𝑇 ⁒ 𝒃 𝑐 is also in the row space C ⁒ ( 𝐴 𝑇 ) . The inverse map 𝐢 + identifies 𝑅 ⁒ 𝒙 π‘Ÿ in the column space C ⁒ ( 𝑅 ) and then 𝑅 + gives 𝒙 π‘Ÿ

𝑅 + ⁒ 𝐢 + ⁒ 𝒃 𝑐 in Figure  3. The reverse order law demands that

𝐢 + ⁒ 𝐢 ⁒ ( 𝑅 ⁒ 𝐴 𝑇 )

𝑅 ⁒ 𝐴 𝑇 and 𝑅 ⁒ 𝑅 + ⁒ ( 𝐢 𝑇 ⁒ 𝐴 )

𝐢 𝑇 ⁒ 𝐴 .

(4)

It follows that $C$ and $R$ solve the two-sided projection equation:

$$C^+C\,(RR^TC^TC)\,RR^+ = RR^TC^TC. \tag{5}$$

For the full-rank factorization of Theorem 1 the equation holds with $C^+C = I_r = RR^+$.
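For the counterexample at the start of the paper, both equalities in (4) fail, as a quick sketch confirms:

```matlab
C = [1 0];  R = [1; 1];  A = C*R;
norm(pinv(C)*C*(R*A') - R*A')   % nonzero: first condition in (4) fails
norm(R*pinv(R)*(C'*A) - C'*A)   % nonzero: second condition in (4) fails
```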

Figure 3: Given a full rank decomposition $A = CR$, matrix $R$ maps $\mathbf{C}(A^T)$ into $\mathbf{C}(C^T)$ and matrix $C^T$ maps $\mathbf{C}(A)$ into $\mathbf{C}(R)$ when $(CR)^+ = R^+C^+$.

Figure 4: The pseudoinverse $A^+ = (CR)^+ = (C^+CR)^+(CRR^+)^+$ decomposes into the product of the pseudoinverse of $R$ projected on the row space of $C^T$ and the pseudoinverse of $C^T$ projected on the column space of $R$.

In general, the inverse map $A^+ = (CR)^+$ takes $b_c$ in the column space $\mathbf{C}(C)$ and projects it into $Rx_r$ in the part of the column space $\mathbf{C}(R)$ intersecting with $\mathbf{C}(C^T)$. That is ensured by $(CRR^+)^+$, since by first principles

$$\mathbf{C}((CRR^+)^+) = \mathbf{C}((CRR^+)^T) = RR^+\,\mathbf{C}(C^T) = RR^+\,\mathbf{C}(C^+).$$

Then $(C^+CR)^+$ maps $Rx_r$ into the row space $\mathbf{C}(R^TC^T)$. We have

$$\mathbf{C}((C^+CR)^+) = \mathbf{C}((C^+CR)^T) = R^T\,\mathbf{C}(C^+C) = R^T\,\mathbf{C}(C^T) = \mathbf{C}(R^TC^T).$$

Therefore, $\mathbf{C}(R^TC^T) = \mathbf{C}(A^+)$ in Figure 4. Also, for any $b_n$ in the nullspace of $R^TC^T$ we have $RR^+C^Tb_n = 0$, which shows that $b_n$ is in the nullspace

$$\mathbf{N}(RR^+C^T) = \mathbf{N}((CRR^+)^+) \subset \mathbf{N}((C^+CR)^+(CRR^+)^+) = \mathbf{N}(A^+).$$

Conversely, any $b_n$ in the nullspace $\mathbf{N}((CRR^+)^+) = RR^+\,\mathbf{N}(C^T)$ is also in the nullspace $\mathbf{N}(C^T)$. It is orthogonal to the columns of $C$ spanning $\mathbf{C}(A)$. Therefore, it is also in the nullspace $\mathbf{N}(A^T)$. That proves the following formula for $A^+$:

Theorem 2.

The pseudoinverse $A^+$ of a product $A = CR$ is given by the product

$$A^+ = (CR)^+ = (C^+CR)^+(CRR^+)^+. \tag{6}$$

The statement in the title of this paper is not generally true. But the statement of Theorem 2 corrects the mistake, as the following example shows:

$$C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad R = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad C^+C = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad RR^+ = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},$$

$$(C^+CR)^+(CRR^+)^+ = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \end{bmatrix} = (CR)^+.$$
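Theorem 2 can also be checked on factorizations where both factors are rank deficient, so that $R^+C^+$ has no reason to work. A minimal sketch with random matrices of our choosing:

```matlab
C = randn(6,2) * [1 0 2; 0 1 1];    % 6 by 3 with dependent columns (rank 2)
R = [1 0; 2 1; 0 1] * randn(2,5);   % 3 by 5 with dependent rows (rank 2)
A = C*R;
F = pinv(pinv(C)*C*R) * pinv(C*R*pinv(R));   % formula (6)
norm(F - pinv(A))                            % zero up to roundoff
```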

Theorem 3 gives an even more general statement (following from [3]).

Theorem 3.

The pseudoinverse $A^+$ of a product $A = CR$ is given by the product

$$A^+ = \big((P^TCR)^+P^TC\big)\big(RQ\,(CRQ)^+\big), \tag{7}$$

for any $P$ and $Q$ satisfying

$$\mathrm{rank}(P^TA) = \mathrm{rank}(AQ) = \mathrm{rank}(A). \tag{8}$$

Proof: When $\mathrm{rank}(P^TA) = \mathrm{rank}(AQ) = \mathrm{rank}(A)$ for $P$ and $Q$ of proper size, then

$$\mathbf{C}((P^TA)^T) = \mathbf{C}(A^T) \quad \text{and} \quad \mathbf{C}(AQ) = \mathbf{C}(A).$$

In that case, $(P^TA)^+(P^TA) = A^+A$ and $(AQ)(AQ)^+ = AA^+$, and (7) is true:

$$A^+ = (P^TA)^+\,P^TAQ\,(AQ)^+.$$
Figure 5: Computation time and relative approximation error of the randomized $A_r^+ = $ `rpinv(A)`, the randomized SVD-based $A_s^+ = $ `rsvdi(A)`, and the direct $A^+ = $ `pinv(A)` methods of calculating the pseudoinverse of $A$. In the experiment, we generate a random test matrix `A = gallery('randsvd', n, 1e100)` with $n$ ranging from 100 to 1000. Then we calculate the approximation $A_r^+$ taking a random $m$ by $p$ matrix $P$ and an $n$ by $q$ matrix $Q$ (in Theorem 3) with $p = q = \lceil \alpha \cdot n \rceil$ for $\alpha = 0.4$ (top) and $\alpha = 0.1$ (bottom). The randomized SVD-based pseudoinverse is calculated from the rank $s = \mathrm{rank}(A)$ approximation $A_s = QU_s\Sigma_sV_s^T$ for `[Q,~] = qr(A*randn(n,s))`.

Matrices $P$ and $Q$ must be rank-preserving to reconstruct $A^+$. But apart from that requirement, $P$ and $Q$ can be random matrices. The pseudoinverse of

$$A = \begin{bmatrix} 1 & 4 & 5 \\ 2 & 3 & 5 \end{bmatrix} \ \text{ is equal to } \ A^+ = \frac{1}{15}\begin{bmatrix} -8 & 9 \\ 7 & -6 \\ -1 & 3 \end{bmatrix}.$$

Randomly selected matrices

$$P = \begin{bmatrix} 2 & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix} \quad \text{and} \quad Q = \begin{bmatrix} 1 & 1 \\ 0 & 2 \\ 0 & 0 \end{bmatrix}$$

preserve $\mathrm{rank}(A) = \mathrm{rank}(P^TA) = \mathrm{rank}(AQ) = 2$. Therefore,

$$(P^TCR)^+P^TC = \frac{1}{3}\begin{bmatrix} 2 & -1 \\ -1 & 2 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad RQ\,(CRQ)^+ = \frac{1}{5}\begin{bmatrix} -3 & 4 \\ 2 & -1 \end{bmatrix}$$

reconstruct

$$A^+ = \frac{1}{3}\begin{bmatrix} 2 & -1 \\ -1 & 2 \\ 1 & 1 \end{bmatrix} \cdot \frac{1}{5}\begin{bmatrix} -3 & 4 \\ 2 & -1 \end{bmatrix} = \frac{1}{15}\begin{bmatrix} -8 & 9 \\ 7 & -6 \\ -1 & 3 \end{bmatrix}.$$
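The arithmetic above is easy to reproduce; a sketch using formula (7) in the equivalent form $(P^TA)^+P^TAQ(AQ)^+$:

```matlab
A = [1 4 5; 2 3 5];
P = [2 2 2; 1 2 2];          % rank-preserving: rank(P'*A) = 2
Q = [1 1; 0 2; 0 0];         % rank-preserving: rank(A*Q) = 2
Aplus = pinv(P'*A) * (P'*A) * Q * pinv(A*Q);
norm(Aplus - pinv(A))        % zero up to roundoff
```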

Random sampling matrices $P$ and $Q$ that are not rank-preserving produce a low-rank approximation of $A^+$ from sampled bases for the column space and row space of $A$. Therefore, the formula in Theorem 3 describes a randomized algorithm approximating $A^+$:

```matlab
function Arplus = rpinv(A,p,q)
% rpinv: randomized pseudoinverse of A (formula (7) with random P and Q)
[m,n] = size(A);
P = randn(m,p);                      % random sampling matrices
Q = randn(n,q);
PTA = P'*A;
AQ  = A*Q;
Arplus = pinv(PTA)*PTA*Q*pinv(AQ);   % (P'A)+ (P'A) Q (AQ)+
end
```

When $P$ and $Q$ are small ($m$ by $p$ and $n$ by $q$ matrices), it is more time-efficient to compute $A_r^+$ using two pseudoinverses, $(P^TA)^+$ and $(AQ)^+$, than to calculate $A^+$ directly. Also, that approach may lead to smaller relative errors (under circumstances still to be investigated) than the randomized SVD method while maintaining similar time efficiency. Figure 5 demonstrates these points by comparing each method.
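A usage sketch in the spirit of Figure 5 (the size, condition number, and $\alpha$ below are illustrative choices of ours, not the paper's exact setup):

```matlab
n = 800;  A = gallery('randsvd', n, 1e10);   % ill-conditioned test matrix
p = ceil(0.4*n);  q = ceil(0.4*n);           % alpha = 0.4
tic;  Ar = rpinv(A, p, q);  t_rand = toc;    % rpinv from the listing above
tic;  Ap = pinv(A);         t_full = toc;
fprintf('rpinv: %.3fs  pinv: %.3fs  rel. error: %.2e\n', ...
        t_rand, t_full, norm(Ar - Ap)/norm(Ap));
```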

We have given three formulas for the pseudoinverse of a matrix product $A = CR$ and derived them from the four fundamental subspaces of $A$. Every matrix $A$ gives an invertible map from its row space to its column space. The pseudoinverse $A^+$ inverts the row space to column space map when the nullspaces of $A^T$ and $A^+$ match. The first formula, in Theorem 1, is correct if $C$ has independent columns and $R$ has independent rows, or when Greville's conditions hold. The second, in Theorem 2, is always correct. The third, in Theorem 3, is correct when the sampling matrices $P$ and $Q$ preserve rank. Otherwise it is not correct, but potentially very useful, especially for large $A$.

References

[1] Adi Ben-Israel and Thomas N.E. Greville, Generalized Inverses: Theory and Applications. Springer (2003).
[2] Keaton Hamm and Longxiu Huang, Perspectives on CUR Decompositions. Applied and Computational Harmonic Analysis, 48 (2020) 1088–1099.
[3] Michał P. Karpowicz, A Theory of Meta-factorization. arXiv:2111.14385 (2021).
[4] Roger Penrose, A Generalized Inverse for Matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51 (1955) 406–413.
[5] Gilbert Strang and Daniel Drucker, Three Matrix Factorizations from the Steps of Elimination. Analysis and Applications, 20 (2022) 1147–1157.
[6] Gilbert Strang, Introduction to Linear Algebra, 6th edition. Wellesley-Cambridge Press (2023).
[7] Thomas N.E. Greville, Note on the Generalized Inverse of a Matrix Product. SIAM Review, 8 (1966) 518–521.
[8] George Marsaglia and George Styan, Equalities and Inequalities for Ranks of Matrices. Linear and Multilinear Algebra, 3 (1974) 269–292.
