Appendix
1 Matrix Algebra Review
We briefly review, without proof, the basic concepts and results taught in an elementary linear algebra course (see [1-3] for details).
1.1 Definition
An $m \times n$ matrix $\mathbf{A}$ over the complex field $\mathbb{C}$ is a rectangular array of entries in $\mathbb{C}$ with $m$ rows and $n$ columns:
$$
\mathbf{A} = [a_{ij}] =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\in \mathbb{C}^{m \times n}
\tag{1.1}
$$
where $a_{ij} \in \mathbb{C}$ denotes the $(i,j)$th element of the matrix, i.e., the entry in the $i$th row and $j$th column.
1.2 Basic Matrix Operations, Functions, and Norms
1.2.1 Addition, Subtraction, and Multiplication

$\mathbf{A} + \mathbf{B}$, $\mathbf{A} - \mathbf{B}$, $\mathbf{A}\mathbf{B}$
1.2.2 Conjugate, Transpose, and Conjugate Transpose
• (Complex) conjugate:
$$
\mathbf{A}^{*} = [a_{ij}^{*}] \in \mathbb{C}^{m \times n}
\tag{1.2}
$$
• Transpose:
$$
\mathbf{A}^{T} = [a_{ji}] \in \mathbb{C}^{n \times m}
\tag{1.3}
$$
• Conjugate transpose (or Hermitian operation):
$$
\mathbf{A}^{\dagger} = (\mathbf{A}^{*})^{T} = [a_{ji}^{*}] \in \mathbb{C}^{n \times m}
\tag{1.4}
$$
The transpose or conjugate transpose of a row vector yields a column vector:
$$
\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}^{\dagger}
=
\begin{bmatrix} a_1^{*} \\ a_2^{*} \\ \vdots \\ a_n^{*} \end{bmatrix}
\in \mathbb{C}^{n}.
\tag{1.5}
$$
Similarly,
$$
\begin{bmatrix} \mathbf{A}_1 & \mathbf{A}_2 & \cdots & \mathbf{A}_p \end{bmatrix}^{\dagger}
=
\begin{bmatrix} \mathbf{A}_1^{\dagger} \\ \mathbf{A}_2^{\dagger} \\ \vdots \\ \mathbf{A}_p^{\dagger} \end{bmatrix}
\in \mathbb{C}^{np \times m}
\tag{1.6}
$$
where $\mathbf{A}_k \in \mathbb{C}^{m \times n}$, $k = 1, 2, \ldots, p$. For products,
$$
(\mathbf{A}\mathbf{B}\mathbf{C})^{\dagger} = \mathbf{C}^{\dagger}\mathbf{B}^{\dagger}\mathbf{A}^{\dagger}.
\tag{1.7}
$$
1.2.3 Vectorization
The vec operator creates a column vector by stacking the columns of $\mathbf{A} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$:
$$
\operatorname{vec}(\mathbf{A}) =
\begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_n \end{bmatrix}
\in \mathbb{C}^{mn}
\tag{1.8}
$$
where $\mathbf{a}_k \in \mathbb{C}^{m}$, $k = 1, 2, \ldots, n$, is the $k$th ($m$-dimensional) column vector of $\mathbf{A}$.
1.2.4 Kronecker Product, Hadamard Product, and Direct Sum
• Kronecker (or direct) product:
$$
\mathbf{A} \otimes \mathbf{B} =
\begin{bmatrix}
a_{11}\mathbf{B} & a_{12}\mathbf{B} & \cdots & a_{1n}\mathbf{B} \\
a_{21}\mathbf{B} & a_{22}\mathbf{B} & \cdots & a_{2n}\mathbf{B} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1}\mathbf{B} & a_{m2}\mathbf{B} & \cdots & a_{mn}\mathbf{B}
\end{bmatrix}
\in \mathbb{C}^{mp \times nq}
\tag{1.9}
$$
for $\mathbf{A} \in \mathbb{C}^{m \times n}$ and $\mathbf{B} \in \mathbb{C}^{p \times q}$.
• Hadamard (or Schur) product:
$$
\mathbf{A} \circ \mathbf{B} = [a_{ij} b_{ij}] \in \mathbb{C}^{m \times n}
\tag{1.10}
$$
for $\mathbf{A} = [a_{ij}] \in \mathbb{C}^{m \times n}$ and $\mathbf{B} = [b_{ij}] \in \mathbb{C}^{m \times n}$.

• Direct sum:
$$
\mathbf{A} \oplus \mathbf{B} =
\begin{bmatrix}
\mathbf{A} & \mathbf{0} \\
\mathbf{0} & \mathbf{B}
\end{bmatrix}
\in \mathbb{C}^{(m+p) \times (n+q)}
\tag{1.11}
$$
for $\mathbf{A} \in \mathbb{C}^{m \times n}$ and $\mathbf{B} \in \mathbb{C}^{p \times q}$.
1.2.5 Trace
The trace of an $n$-square matrix $\mathbf{A} = [a_{ij}]$ is the sum of the diagonal elements:
$$
\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^{n} a_{ii}.
\tag{1.12}
$$
1.2.6 Frobenius Norm
The Frobenius (or Euclidean) norm of a matrix $\mathbf{A} = [a_{ij}] \in \mathbb{C}^{m \times n}$ is
$$
\begin{aligned}
\|\mathbf{A}\|_F &= \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2} \qquad (1.13) \\
&= \sqrt{\operatorname{tr}\left(\mathbf{A}\mathbf{A}^{\dagger}\right)}. \qquad (1.14)
\end{aligned}
$$
• Summation using vector notation:
Let $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^{T} \in \mathbb{C}^{n}$. Then,
$$
\sum_{i=1}^{n} |x_i|^2 = \mathbf{x}^{\dagger}\mathbf{x} = \operatorname{tr}\left(\mathbf{x}\mathbf{x}^{\dagger}\right) = \|\mathbf{x}\|^2.
\tag{1.15}
$$
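As an illustrative aside (not part of the original development), the two forms of the Frobenius norm in (1.13) and (1.14) are easy to confirm numerically with NumPy on a random complex matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

# Entrywise form (1.13) and trace form (1.14) of the Frobenius norm.
fro_entrywise = np.sqrt(np.sum(np.abs(A) ** 2))
fro_trace = np.sqrt(np.trace(A @ A.conj().T).real)

assert np.isclose(fro_entrywise, fro_trace)
# NumPy's built-in Frobenius norm agrees as well.
assert np.isclose(fro_entrywise, np.linalg.norm(A, 'fro'))
```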
1.2.7 Determinant
The determinant of an $n$-square matrix $\mathbf{A} = [a_{ij}]$ is
$$
\det(\mathbf{A}) = \sum_{\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n)} \operatorname{sgn}(\sigma)\, a_{1\sigma_1} a_{2\sigma_2} \cdots a_{n\sigma_n}
\tag{1.16}
$$
where $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n)$ is a permutation of the integers $1, 2, \ldots, n$, $\operatorname{sgn}(\sigma)$ denotes the sign of the permutation $\sigma$, and the summation is over all $n!$ permutations of $1, 2, \ldots, n$.
Another method for computing determinants is based on their reduction to determinants of smaller size: cofactor expansion.

• $\mathbf{A}_{ij}$ = the $(n-1) \times (n-1)$ submatrix of $\mathbf{A}$ obtained by deleting the $i$th row and $j$th column
• Minor: $M_{ij} = \det\left(\mathbf{A}_{ij}\right)$
• Cofactor: $C_{ij} = (-1)^{i+j} M_{ij}$

(Cofactor expansion):
$$
\begin{aligned}
\det(\mathbf{A}) &= \sum_{j=1}^{n} a_{ij} \underbrace{(-1)^{i+j} M_{ij}}_{C_{ij}} \quad \text{for any } i \in \{1, 2, \ldots, n\} \qquad (1.17) \\
&= \sum_{i=1}^{n} a_{ij} \underbrace{(-1)^{i+j} M_{ij}}_{C_{ij}} \quad \text{for any } j \in \{1, 2, \ldots, n\}. \qquad (1.18)
\end{aligned}
$$
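The cofactor expansion can be coded directly. The following sketch (a numerical aside using NumPy, with 0-based indexing so the sign factor is $(-1)^{j}$ for expansion along the first row) agrees with the library determinant:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row, as in (1.17)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Submatrix A_{1j}: delete row 0 and column j (0-based indices).
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += A[0, j] * (-1) ** j * det_cofactor(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
assert np.isclose(det_cofactor(A), np.linalg.det(A))
```

Note that this recursion costs $O(n!)$ operations and is only for illustration; LU-based routines such as `np.linalg.det` are the practical choice.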
1.2.8 Adjoint and Inverse
• Adjoint (adjugate):
$$
\operatorname{adj}(\mathbf{A}) = [C_{ij}]^{T}
\tag{1.19}
$$
where $C_{ij}$ are the cofactors of $\mathbf{A}$.

• Inverse (when it exists, i.e., when $\mathbf{A}$ is nonsingular):
$$
\mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \operatorname{adj}(\mathbf{A})
\tag{1.20}
$$
If $\mathbf{A}\mathbf{B} = \mathbf{I}_n$, then $\mathbf{B} = \mathbf{A}^{-1}$. Moreover,
$$
(\mathbf{A}\mathbf{B}\mathbf{C})^{-1} = \mathbf{C}^{-1}\mathbf{B}^{-1}\mathbf{A}^{-1}
\tag{1.21}
$$
$$
\left(\mathbf{A}^{-1}\right)^{\dagger} = \left(\mathbf{A}^{\dagger}\right)^{-1}.
\tag{1.22}
$$
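As a quick numerical check of (1.19) and (1.20) (an aside, not part of the original text), the adjugate built from cofactors reproduces the inverse:

```python
import numpy as np

def adjugate(A):
    """Adjugate: transpose of the cofactor matrix, as in (1.19)."""
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            # Cofactor C_ij = (-1)^(i+j) * det of the (i,j) minor.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[4., 7.],
              [2., 6.]])
inv = adjugate(A) / np.linalg.det(A)   # (1.20)
assert np.allclose(inv, np.linalg.inv(A))
```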
1.3 Eigenvalues and Eigenvectors
1.3.1 Definition
Let $\mathbf{A}$ be an $n$-square matrix. Then, a scalar $\lambda$ is defined as an eigenvalue of $\mathbf{A}$ if there exists a nonzero vector $\mathbf{x} \in \mathbb{C}^{n}$ such that
$$
\mathbf{A}\mathbf{x} = \lambda\mathbf{x}
\tag{1.23}
$$
yielding
$$
\det(\mathbf{A} - \lambda\mathbf{I}_n) = 0
\tag{1.24}
$$
where $\mathbf{x}$ is called an eigenvector corresponding to the eigenvalue $\lambda$.
1.3.2 General Rules of Eigenvalues
Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $\mathbf{A} \in \mathbb{C}^{n \times n}$.

• Sum of all eigenvalues = trace:
$$
\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^{n} \lambda_i
\tag{1.25}
$$
• Product of all eigenvalues = determinant:
$$
\det(\mathbf{A}) = \prod_{i=1}^{n} \lambda_i
\tag{1.26}
$$
• The rank of $\mathbf{A}$ is equal to the number of nonzero eigenvalues.
• Eigenvalues of $\alpha\mathbf{A}$ for a scalar $\alpha$ ⇒ $\alpha\lambda_1, \alpha\lambda_2, \ldots, \alpha\lambda_n$
• Eigenvalues of $\alpha\mathbf{I}_n + \mathbf{A}$ for a scalar $\alpha$ ⇒ $\alpha + \lambda_1, \alpha + \lambda_2, \ldots, \alpha + \lambda_n$
• Eigenvalues of $\mathbf{A}^{T}$ ⇒ $\lambda_1, \lambda_2, \ldots, \lambda_n$
• Eigenvalues of $\mathbf{A}^{-1}$ ⇒ $\lambda_1^{-1}, \lambda_2^{-1}, \ldots, \lambda_n^{-1}$
• For $\mathbf{B} \in \mathbb{C}^{m \times m}$ with the eigenvalues $\mu_1, \mu_2, \ldots, \mu_m$:
  eigenvalues of $\mathbf{A} \otimes \mathbf{B}$ ⇒ $\lambda_i \mu_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$;
  eigenvalues of $\mathbf{A} \oplus \mathbf{B}$ ⇒ $\lambda_1, \lambda_2, \ldots, \lambda_n, \mu_1, \mu_2, \ldots, \mu_m$
• For $\mathbf{A} \in \mathbb{C}^{m \times n}$ and $\mathbf{B} \in \mathbb{C}^{n \times m}$, $\mathbf{A}\mathbf{B}$ and $\mathbf{B}\mathbf{A}$ have the same nonzero eigenvalues.
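Several of these rules can be checked numerically. The sketch below (an illustrative aside using NumPy) uses a Hermitian matrix so that `eigvalsh` returns real, sorted eigenvalues, which makes the comparisons unambiguous:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.conj().T                      # Hermitian, so eigenvalues are real
lam = np.linalg.eigvalsh(A)             # sorted ascending

# Sum of eigenvalues = trace (1.25); product = determinant (1.26).
assert np.isclose(lam.sum(), np.trace(A).real)
assert np.isclose(lam.prod(), np.linalg.det(A).real)

# Eigenvalues of A^{-1} are the reciprocals of those of A.
assert np.allclose(np.sort(1.0 / lam), np.linalg.eigvalsh(np.linalg.inv(A)))

# Eigenvalues of alpha*I + A are alpha + lambda_i.
assert np.allclose(lam + 2.0, np.linalg.eigvalsh(2.0 * np.eye(4) + A))
```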
1.4 Some Properties of Vectorization Operator
For any conformable matrices:

• $\operatorname{vec}(\mathbf{A}\mathbf{B}\mathbf{C}) = \left(\mathbf{C}^{T} \otimes \mathbf{A}\right) \operatorname{vec}(\mathbf{B})$
• $\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{vec}\left(\mathbf{A}^{\dagger}\right)^{\dagger} \operatorname{vec}(\mathbf{B})$
• $\operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) = \operatorname{vec}\left(\mathbf{B}^{\dagger}\right)^{\dagger} \left(\mathbf{A}^{T} \otimes \mathbf{C}\right) \operatorname{vec}(\mathbf{D})$
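These vec identities are convenient to verify numerically (an aside, real matrices for simplicity, so $\mathbf{A}^{\dagger} = \mathbf{A}^{T}$); column-major (`order='F'`) reshaping implements the column-stacking vec operator of (1.8):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.reshape(-1, order='F')

# vec(ABC) = (C^T kron A) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))

# tr(AB) = vec(A^T)^T vec(B) (real-matrix case of the second identity)
A2 = rng.standard_normal((3, 3))
B2 = rng.standard_normal((3, 3))
assert np.isclose(np.trace(A2 @ B2), vec(A2.T) @ vec(B2))
```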
1.5 Some Properties of Kronecker Product
For any conformable matrices and scalars $\alpha, \beta$:

• $\mathbf{A} \otimes (\mathbf{B} \otimes \mathbf{C}) = (\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C}$
• $\mathbf{A} \otimes (\mathbf{B} + \mathbf{C}) = (\mathbf{A} \otimes \mathbf{B}) + (\mathbf{A} \otimes \mathbf{C})$
• $\alpha \otimes \mathbf{A} = \mathbf{A} \otimes \alpha = \alpha\mathbf{A}$
• $\alpha\mathbf{A} \otimes \beta\mathbf{B} = \alpha\beta\, \mathbf{A} \otimes \mathbf{B}$
• $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D}$
• $(\mathbf{A} \otimes \mathbf{B})^{T} = \mathbf{A}^{T} \otimes \mathbf{B}^{T}$ and $(\mathbf{A} \otimes \mathbf{B})^{\dagger} = \mathbf{A}^{\dagger} \otimes \mathbf{B}^{\dagger}$
• $(\mathbf{A} \otimes \mathbf{B})^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}$
• $\operatorname{tr}(\mathbf{A} \otimes \mathbf{B}) = \operatorname{tr}(\mathbf{A}) \operatorname{tr}(\mathbf{B})$
• $\det(\mathbf{A} \otimes \mathbf{B}) = \det(\mathbf{A})^{n} \det(\mathbf{B})^{m}$ for $\mathbf{A} \in \mathbb{C}^{m \times m}$ and $\mathbf{B} \in \mathbb{C}^{n \times n}$
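Two of the less obvious rules, the mixed-product rule and the determinant formula, can be confirmed numerically (an illustrative aside):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3)); B = rng.standard_normal((4, 2))
C = rng.standard_normal((3, 5)); D = rng.standard_normal((2, 3))

# Mixed-product rule: (A kron B)(C kron D) = AC kron BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# det(A kron B) = det(A)^n det(B)^m for A (m x m), B (n x n); here m=3, n=2.
Am = rng.standard_normal((3, 3))
Bn = rng.standard_normal((2, 2))
assert np.isclose(np.linalg.det(np.kron(Am, Bn)),
                  np.linalg.det(Am) ** 2 * np.linalg.det(Bn) ** 3)
```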
1.6 Some Properties of Trace
• $\operatorname{tr}(\mathbf{A}) = \sum \text{eigenvalues}$
• $\operatorname{tr}(\mathbf{A} + \mathbf{B}) = \operatorname{tr}(\mathbf{A}) + \operatorname{tr}(\mathbf{B})$
• $\operatorname{tr}(\alpha\mathbf{A}) = \alpha \operatorname{tr}(\mathbf{A})$
• $\operatorname{tr}\left(\mathbf{A}^{T}\right) = \operatorname{tr}(\mathbf{A})$
• $\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{B}\mathbf{A})$
1.7 Some Properties of Determinant
• The determinant changes sign if two rows (or columns) are interchanged.
• The determinant is unchanged if a constant multiple of one row (or column) is added to another row (or column).
• $\det(\mathbf{A}) = \prod \text{eigenvalues}$
• $\det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A}) \det(\mathbf{B})$ for square matrices $\mathbf{A}$ and $\mathbf{B}$
• $\det(\alpha\mathbf{A}) = \alpha^{n} \det(\mathbf{A})$ for $\mathbf{A} \in \mathbb{C}^{n \times n}$
• $\det\left(\mathbf{A}^{T}\right) = \det(\mathbf{A})$
• $\det\left(\mathbf{A}^{\dagger}\right) = \det(\mathbf{A})^{*}$
• $\det\left(\mathbf{A}^{-1}\right) = \det(\mathbf{A})^{-1}$
• $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$
• $\dfrac{\partial}{\partial z} \ln \det \mathbf{A}(z) = \operatorname{tr}\left( \mathbf{A}^{-1} \dfrac{\partial \mathbf{A}}{\partial z} \right)$
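The last two properties are worth checking numerically (an aside): the identity $\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$ holds even for rectangular factors, and the log-determinant derivative can be compared against a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((2, 4))

# det(I + AB) = det(I + BA), with AB (4x4) and BA (2x2).
assert np.isclose(np.linalg.det(np.eye(4) + A @ B),
                  np.linalg.det(np.eye(2) + B @ A))

# d/dz ln det A(z) = tr(A^{-1} dA/dz), checked by central differences.
A0 = rng.standard_normal((3, 3))
A1 = rng.standard_normal((3, 3))
Az = lambda z: A0 + z * A1                  # A(z), so dA/dz = A1
h = 1e-6
numeric = (np.log(abs(np.linalg.det(Az(h)))) -
           np.log(abs(np.linalg.det(Az(-h))))) / (2 * h)
analytic = np.trace(np.linalg.inv(Az(0.0)) @ A1)
assert np.isclose(numeric, analytic, atol=1e-5)
```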
1.8 Classes of Matrices
A square complex matrix $\mathbf{A} = [a_{ij}]$ is said to be

diagonal if $a_{ij} = 0$ for $i \neq j$,
symmetric if $\mathbf{A}^{T} = \mathbf{A}$,
Hermitian if $\mathbf{A}^{\dagger} = \mathbf{A}$,
skew-Hermitian if $\mathbf{A}^{\dagger} = -\mathbf{A}$,
orthogonal if $\mathbf{A}^{T}\mathbf{A} = \mathbf{A}\mathbf{A}^{T} = \mathbf{I}$,
unitary if $\mathbf{A}^{\dagger}\mathbf{A} = \mathbf{A}\mathbf{A}^{\dagger} = \mathbf{I}$,
normal if $\mathbf{A}^{\dagger}\mathbf{A} = \mathbf{A}\mathbf{A}^{\dagger}$.
1.8.1 Hermitian Matrices: $\mathbf{A}^{\dagger} = \mathbf{A}$

• An $n$-square complex matrix $\mathbf{A}$ is Hermitian if and only if there exists a unitary matrix $\mathbf{U}$ such that
$$
\mathbf{A} = \mathbf{U} \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \mathbf{U}^{\dagger}
\tag{1.27}
$$
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $\mathbf{A}$ and are real.
⇒ The eigenvalues of Hermitian matrices are always real.

• If $\mathbf{A}$ and $\mathbf{B}$ are Hermitian, then
$$
\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{A}\mathbf{B})^{*}.
\tag{1.28}
$$
⇒ The trace of a product of two Hermitian matrices is always real, although the product is generally not Hermitian.

• The eigenvalues of skew-Hermitian matrices are always purely imaginary or zero.
• The trace of a product of two skew-Hermitian matrices is always real.
• The trace of a product of a Hermitian matrix and a skew-Hermitian matrix is always purely imaginary or zero.
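These facts are easy to confirm numerically (an illustrative aside): build a Hermitian and a skew-Hermitian matrix from the same random matrix and inspect eigenvalues and traces.

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
N = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = M + M.conj().T      # Hermitian
B = N + N.conj().T      # Hermitian
S = M - M.conj().T      # skew-Hermitian

# Hermitian eigenvalues are real; skew-Hermitian ones are purely imaginary.
assert np.allclose(np.linalg.eigvals(A).imag, 0, atol=1e-10)
assert np.allclose(np.linalg.eigvals(S).real, 0, atol=1e-10)

# tr(AB) is real (1.28), even though AB is generally not Hermitian.
assert abs(np.trace(A @ B).imag) < 1e-10
# tr of (Hermitian x skew-Hermitian) is purely imaginary or zero.
assert abs(np.trace(A @ S).real) < 1e-10
```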
1.8.2 Positive Semidefinite and Definite Matrices: $\mathbf{A} \geq 0$, $\mathbf{A} > 0$

• An $n$-square complex matrix $\mathbf{A}$ is positive semidefinite, denoted by $\mathbf{A} \geq 0$, if
$$
\mathbf{x}^{\dagger}\mathbf{A}\mathbf{x} \geq 0 \quad \text{for all } \mathbf{x} \in \mathbb{C}^{n}.
\tag{1.29}
$$
• An $n$-square complex matrix $\mathbf{A}$ is positive semidefinite if and only if there exists a unitary matrix $\mathbf{U}$ such that
$$
\mathbf{A} = \mathbf{U} \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \mathbf{U}^{\dagger}
\tag{1.30}
$$
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $\mathbf{A}$ and are nonnegative.
⇒ The eigenvalues of positive semidefinite matrices are nonnegative real.
⇒ Positive semidefinite matrices are necessarily Hermitian.

• For every $\mathbf{A} \geq 0$, there exists a unique $\mathbf{B} \geq 0$ such that
$$
\mathbf{A} = \mathbf{B}^{2}.
\tag{1.31}
$$
• An $n$-square complex matrix $\mathbf{A}$ is positive definite, denoted by $\mathbf{A} > 0$, if
$$
\mathbf{x}^{\dagger}\mathbf{A}\mathbf{x} > 0 \quad \text{for all nonzero } \mathbf{x} \in \mathbb{C}^{n}.
\tag{1.32}
$$
⇒ The eigenvalues of positive definite matrices are positive real.
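The unique positive semidefinite square root of (1.31) can be constructed directly from the spectral decomposition (1.30). A numerical sketch (an aside using NumPy):

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M @ M.conj().T + np.eye(4)          # Hermitian positive definite

lam, U = np.linalg.eigh(A)
assert np.all(lam > 0)                  # positive definite: positive eigenvalues

# PSD square root B with A = B^2, built from the spectral decomposition.
B = U @ np.diag(np.sqrt(lam)) @ U.conj().T
assert np.allclose(B @ B, A)
assert np.allclose(B, B.conj().T)       # B is itself Hermitian (and PSD)
```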
1.8.3 Vandermonde Matrices
An $n$-square matrix $\mathbf{A}$ is called a Vandermonde matrix if it is of the form
$$
\mathbf{A} = \left[ a_j^{\,i-1} \right] =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
a_1 & a_2 & a_3 & \cdots & a_n \\
a_1^2 & a_2^2 & a_3^2 & \cdots & a_n^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_1^{n-1} & a_2^{n-1} & a_3^{n-1} & \cdots & a_n^{n-1}
\end{bmatrix}.
\tag{1.33}
$$
The Vandermonde determinant is given by
$$
\begin{aligned}
\det(\mathbf{A}) &= (a_2 - a_1)(a_3 - a_1)(a_4 - a_1) \cdots (a_n - a_1) \\
&\quad \times (a_3 - a_2)(a_4 - a_2) \cdots (a_n - a_2) \\
&\quad \times (a_4 - a_3) \cdots (a_n - a_3) \\
&\qquad \vdots \\
&\quad \times (a_n - a_{n-1}) \\
&= \prod_{1 \leq i < j \leq n} (a_j - a_i).
\end{aligned}
\tag{1.34}
$$
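The product formula (1.34) is quick to confirm numerically (an aside; `np.vander` with `increasing=True` builds the transpose of the matrix in (1.33), which has the same determinant):

```python
import numpy as np
from itertools import combinations

a = np.array([1.0, 2.5, -0.5, 4.0])
n = len(a)
V = np.vander(a, increasing=True).T     # V[i, j] = a_j ** i, as in (1.33)

# prod over all pairs i < j of (a_j - a_i)
prod = np.prod([a[j] - a[i] for i, j in combinations(range(n), 2)])
assert np.isclose(np.linalg.det(V), prod)
```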
2 Frequently Used Theorems
2.1 Cauchy-Schwarz inequality
Let $x_1, x_2, \ldots, x_n \in \mathbb{C}$ and $y_1, y_2, \ldots, y_n \in \mathbb{C}$ be two arbitrary sets of complex numbers. Then,
$$
\left| \sum_{i=1}^{n} x_i y_i^{*} \right|^2
\leq
\left( \sum_{i=1}^{n} |x_i|^2 \right) \left( \sum_{i=1}^{n} |y_i|^2 \right)
\tag{2.1}
$$
with equality if and only if $x_i$ and $y_i$ are proportional, i.e., $x_i = c\, y_i$ where $c$ is an arbitrary constant.

Vector notation: Let
$$
\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^{T} \in \mathbb{C}^{n}, \qquad
\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}^{T} \in \mathbb{C}^{n}.
$$
Then,
$$
\mathbf{x}^{\dagger}\mathbf{y}\mathbf{y}^{\dagger}\mathbf{x} \leq \left(\mathbf{x}^{\dagger}\mathbf{x}\right)\left(\mathbf{y}^{\dagger}\mathbf{y}\right)
\tag{2.2}
$$
$$
\Leftrightarrow \quad |\mathbf{x}^{\dagger}\mathbf{y}|^2 \leq \|\mathbf{x}\|^2 \|\mathbf{y}\|^2
\tag{2.3}
$$
with equality if and only if $\mathbf{x} = c\,\mathbf{y}$.
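A numerical illustration of (2.3) and its equality condition (an aside; `np.vdot` conjugates its first argument, so `np.vdot(x, y)` computes $\mathbf{x}^{\dagger}\mathbf{y}$):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

lhs = abs(np.vdot(x, y)) ** 2                       # |x^dagger y|^2
rhs = np.vdot(x, x).real * np.vdot(y, y).real       # ||x||^2 ||y||^2
assert lhs <= rhs

# Equality holds when x is proportional to y: x = c y.
c = 2.0 - 1.0j
x2 = c * y
assert np.isclose(abs(np.vdot(x2, y)) ** 2,
                  np.vdot(x2, x2).real * np.vdot(y, y).real)
```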
2.2 Spectral Decomposition Theorem
Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ be a normal matrix with the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then, there exists an $n \times n$ unitary matrix $\mathbf{U}$ such that
$$
\mathbf{A} = \mathbf{U} \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \mathbf{U}^{\dagger}.
\tag{2.4}
$$
In particular,

• if $\mathbf{A}$ is positive semidefinite (or definite), then $\lambda_i \geq 0$ (or $\lambda_i > 0$);
• if $\mathbf{A}$ is Hermitian, then the $\lambda_i$ are real;
• if $\mathbf{A}$ is unitary, then $|\lambda_i| = 1$.
2.3 Singular-Value Decomposition Theorem
• (Singular value):
For an arbitrary $\mathbf{A} \in \mathbb{C}^{m \times n}$, the $n \times n$ matrix $\mathbf{A}^{\dagger}\mathbf{A}$ is positive semidefinite. Therefore, the matrix $\mathbf{A}^{\dagger}\mathbf{A}$ has a positive semidefinite square root $\mathbf{B}$ such that $\mathbf{A}^{\dagger}\mathbf{A} = \mathbf{B}^{2}$. The eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $\mathbf{B} = \left(\mathbf{A}^{\dagger}\mathbf{A}\right)^{1/2}$ are called the singular values $\sigma_1, \sigma_2, \ldots, \sigma_n$ of $\mathbf{A}$.

Let $\mathbf{A} \in \mathbb{C}^{m \times n}$ be of rank $r$. Then, there exist unitary matrices $\mathbf{U} \in \mathbb{C}^{m \times m}$ and $\mathbf{V} \in \mathbb{C}^{n \times n}$ such that
$$
\mathbf{A} = \mathbf{U}
\begin{bmatrix}
\mathbf{D} & \mathbf{0} \\
\mathbf{0} & \mathbf{0}
\end{bmatrix}
\mathbf{V}^{\dagger}
\tag{2.5}
$$
where $\mathbf{D} = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$ and $\sigma_1, \sigma_2, \ldots, \sigma_r$ are the nonzero singular values of $\mathbf{A}$.
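Both statements are directly checkable with NumPy (an illustrative aside): the squared singular values match the leading eigenvalues of $\mathbf{A}^{\dagger}\mathbf{A}$, and the factorization (2.5) reconstructs $\mathbf{A}$.

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))

# Singular values are the nonnegative square roots of the eigenvalues of A^dagger A.
s = np.linalg.svd(A, compute_uv=False)            # descending, length 3
eig = np.linalg.eigvalsh(A.conj().T @ A)[::-1]    # descending, length 5
assert np.allclose(s ** 2, eig[:3])

# Full decomposition A = U [D 0] V^dagger as in (2.5).
U, s, Vh = np.linalg.svd(A)
D = np.zeros((3, 5))
D[:3, :3] = np.diag(s)
assert np.allclose(A, U @ D @ Vh)
```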
2.4 Rayleigh-Ritz Theorem
Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ be Hermitian. Then,
$$
\lambda_{\min}(\mathbf{A}) = \min_{\substack{\mathbf{x} \in \mathbb{C}^n \\ \mathbf{x}^{\dagger}\mathbf{x} = 1}} \mathbf{x}^{\dagger}\mathbf{A}\mathbf{x}
\tag{2.6}
$$
$$
\lambda_{\max}(\mathbf{A}) = \max_{\substack{\mathbf{x} \in \mathbb{C}^n \\ \mathbf{x}^{\dagger}\mathbf{x} = 1}} \mathbf{x}^{\dagger}\mathbf{A}\mathbf{x}.
\tag{2.7}
$$
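A numerical illustration (an aside): Rayleigh quotients of random unit vectors stay between the extreme eigenvalues, and the bounds are attained at the corresponding eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(9)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.conj().T                          # Hermitian
lam_min, lam_max = np.linalg.eigvalsh(A)[[0, -1]]

for _ in range(200):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    x /= np.linalg.norm(x)                  # unit vector: x^dagger x = 1
    q = np.vdot(x, A @ x).real              # Rayleigh quotient x^dagger A x
    assert lam_min - 1e-10 <= q <= lam_max + 1e-10

# The minimum is attained at the eigenvector of lam_min.
w, V = np.linalg.eigh(A)
assert np.isclose(np.vdot(V[:, 0], A @ V[:, 0]).real, lam_min)
```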
2.5 Integral-Type Cauchy-Binet Formula
Let $f_i$ and $g_i$, $i = 1, 2, \ldots, n$, be arbitrary integrable functions over $[a, b]$. Then, for the $n \times n$ matrices
$$
\mathbf{F}(x_1, x_2, \ldots, x_n) = [f_j(x_i)] =
\begin{bmatrix}
f_1(x_1) & f_2(x_1) & \cdots & f_n(x_1) \\
f_1(x_2) & f_2(x_2) & \cdots & f_n(x_2) \\
\vdots & \vdots & \ddots & \vdots \\
f_1(x_n) & f_2(x_n) & \cdots & f_n(x_n)
\end{bmatrix}
\tag{2.8}
$$
$$
\mathbf{G}(x_1, x_2, \ldots, x_n) = [g_j(x_i)] =
\begin{bmatrix}
g_1(x_1) & g_2(x_1) & \cdots & g_n(x_1) \\
g_1(x_2) & g_2(x_2) & \cdots & g_n(x_2) \\
\vdots & \vdots & \ddots & \vdots \\
g_1(x_n) & g_2(x_n) & \cdots & g_n(x_n)
\end{bmatrix},
\tag{2.9}
$$
we have
$$
\underbrace{\int_a^b \cdots \int_a^b}_{n\text{-fold}} \det(\mathbf{F}) \det(\mathbf{G}) \prod_{\ell=1}^{n} w(x_\ell)\, dx_1 dx_2 \cdots dx_n
= n!\, \det \Big( \underbrace{\Big[ \int_a^b f_i(x)\, g_j(x)\, w(x)\, dx \Big]}_{\mathbf{S} \in \mathbb{C}^{n \times n}} \Big)
\tag{2.10}
$$
where $w(\cdot)$ is an arbitrary weight function.

(Ordered region): Let $D = \{ a \leq x_1 \leq x_2 \leq \ldots \leq x_n \leq b \}$. Then,
$$
\int \cdots \int_{D} \det(\mathbf{F}) \det(\mathbf{G}) \prod_{\ell=1}^{n} w(x_\ell)\, dx_1 dx_2 \cdots dx_n = \det(\mathbf{S}).
\tag{2.11}
$$
3 Jacobians of Matrix Transformations
In studying the distribution theory of random matrices, we often need the Jacobians of matrix transformations (see [4] for details).
3.1 Exterior Product
Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)^{T} \in \mathbb{R}^{n}$ be transformed to $\mathbf{y} = (y_1, y_2, \ldots, y_n)^{T} \in \mathbb{R}^{n}$ by a one-to-one transformation. Then, the determinant of the matrix $\left( \frac{\partial x_i}{\partial y_j} \right)$ of all first-order partial derivatives is known as the Jacobian of the transformation from $\mathbf{x}$ to $\mathbf{y}$, written as
$$
J(\mathbf{x} \to \mathbf{y}) = \det
\begin{bmatrix}
\frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\
\frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n}
\end{bmatrix}.
\tag{3.1}
$$
Often, in deriving the Jacobians of transformations involving many variables, it is tedious to calculate the determinant of (3.1) explicitly. The exterior differential calculus gives an equivalent way of obtaining the Jacobians; it is essential in the theory of integration on manifolds and is based on anticommutative (skew-symmetric) multiplication of differentials (see, e.g., [5, 6]).
Consider a multiple integral of a function $f : \mathbb{R}^n \to \mathbb{R}$ over a domain $D \subset \mathbb{R}^n$:
$$
I = \int_{D} f(x_1, x_2, \ldots, x_n)\, dx_1 dx_2 \cdots dx_n.
\tag{3.2}
$$
Making a change of variables
$$
\begin{aligned}
x_1 &= x_1(y_1, y_2, \ldots, y_n) \\
x_2 &= x_2(y_1, y_2, \ldots, y_n) \\
&\ \ \vdots \\
x_n &= x_n(y_1, y_2, \ldots, y_n),
\end{aligned}
\tag{3.3}
$$
we have
$$
I = \int_{D'} f(\mathbf{x}(\mathbf{y})) \det\left( \frac{\partial x_i}{\partial y_j} \right) dy_1 dy_2 \cdots dy_n
\tag{3.4}
$$
where $D'$ denotes the image of $D$ under the transformations (3.3). To determine the Jacobian, instead of calculating the determinant of the matrix $\left( \frac{\partial x_i}{\partial y_j} \right)$ of partial derivatives, we can evaluate it by the exterior calculus. Noting that the differentials of the transformations (3.3) are
$$
dx_i = \frac{\partial x_i}{\partial y_1} dy_1 + \frac{\partial x_i}{\partial y_2} dy_2 + \ldots + \frac{\partial x_i}{\partial y_n} dy_n
\tag{3.5}
$$
and substituting the linear differential forms (3.5) into (3.2), we can rewrite (3.4) as
$$
I = \int_{D'} f(\mathbf{x}(\mathbf{y})) \left( \sum_{j=1}^{n} \frac{\partial x_1}{\partial y_j} dy_j \right) \cdots \left( \sum_{j=1}^{n} \frac{\partial x_n}{\partial y_j} dy_j \right).
\tag{3.6}
$$
Comparing (3.4) and (3.6), we must multiply out the differential forms in (3.6) in such a manner that the result is equal to $\det\left( \frac{\partial x_i}{\partial y_j} \right) dy_1 \cdots dy_n$. Therefore, we multiply the differential forms in (3.6) in a formal way using the associative and distributive laws. However, instead of the commutative law, we use an anticommutative rule for multiplying differentials, that is,
$$
dy_i\, dy_j =
\begin{cases}
-dy_j\, dy_i, & i \neq j \\
0, & i = j.
\end{cases}
\tag{3.7}
$$
Such a product is called the exterior product or wedge product, denoted by the symbol "$\wedge$". If $y_i = y_j$ for some $i, j = 1, 2, \ldots, n$, then the matrix $\left( \frac{\partial x_i}{\partial y_j} \right)$ has two equal columns and thus the determinant vanishes. Also, if we interchange $y_i$ and $y_j$, the determinant changes sign. This motivates the rules (3.7) for the multiplication of differentials.
Example 3.1. Consider the case $n = 2$. We then have
$$
I = \iint f(x_1(y_1, y_2), x_2(y_1, y_2)) \det\left(
\begin{bmatrix}
\frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} \\
\frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2}
\end{bmatrix}
\right) dy_1 dy_2
\tag{3.8}
$$
$$
= \iint f(x_1(y_1, y_2), x_2(y_1, y_2)) \left( \frac{\partial x_1}{\partial y_1} dy_1 + \frac{\partial x_1}{\partial y_2} dy_2 \right) \left( \frac{\partial x_2}{\partial y_1} dy_1 + \frac{\partial x_2}{\partial y_2} dy_2 \right).
\tag{3.9}
$$
Multiplying the two differential forms in (3.9) in a formal way using the associative and distributive laws, we get
$$
\left( \frac{\partial x_1}{\partial y_1} dy_1 + \frac{\partial x_1}{\partial y_2} dy_2 \right) \left( \frac{\partial x_2}{\partial y_1} dy_1 + \frac{\partial x_2}{\partial y_2} dy_2 \right)
= \frac{\partial x_1}{\partial y_1} \frac{\partial x_2}{\partial y_1} dy_1 dy_1
+ \frac{\partial x_1}{\partial y_1} \frac{\partial x_2}{\partial y_2} dy_1 dy_2
+ \frac{\partial x_1}{\partial y_2} \frac{\partial x_2}{\partial y_1} dy_2 dy_1
+ \frac{\partial x_1}{\partial y_2} \frac{\partial x_2}{\partial y_2} dy_2 dy_2.
\tag{3.10}
$$
Comparing (3.10) with
$$
\det\left(
\begin{bmatrix}
\frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} \\
\frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2}
\end{bmatrix}
\right) dy_1 dy_2
= \left( \frac{\partial x_1}{\partial y_1} \frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2} \frac{\partial x_2}{\partial y_1} \right) dy_1 dy_2,
\tag{3.11}
$$
it is clear that
$$
dy_i\, dy_j = -dy_j\, dy_i
\tag{3.12}
$$
and, in particular, $dy_i\, dy_i = -dy_i\, dy_i = 0$. Hence, using the exterior product, the right-hand side of (3.10) can be written as
$$
\left( \frac{\partial x_1}{\partial y_1} \frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2} \frac{\partial x_2}{\partial y_1} \right) dy_1 \wedge dy_2.
\tag{3.13}
$$
The formal procedure of multiplying differential forms is equivalent to calculating the Jacobian, as shown by the following theorem.
Theorem 3.1. Let $d\mathbf{y} = (dy_1, dy_2, \ldots, dy_n)^{T} \in \mathbb{R}^{n}$ be a column vector of $n$ differentials and $d\mathbf{x} = (dx_1, dx_2, \ldots, dx_n)^{T} = \mathbf{A}\, d\mathbf{y} \in \mathbb{R}^{n}$, where $\mathbf{A} = (a_{ij})$ is an $n \times n$ nonsingular matrix and thus $d\mathbf{x}$ is a column vector of linear differential forms. Then, we have
$$
\bigwedge_{i=1}^{n} dx_i = \det(\mathbf{A}) \bigwedge_{i=1}^{n} dy_i.
\tag{3.14}
$$

Proof. Note that
$$
\bigwedge_{i=1}^{n} dx_i = \bigwedge_{i=1}^{n} \left( \sum_{j=1}^{n} a_{ij}\, dy_j \right)
= \sum_{\tau_1=1}^{n} \sum_{\tau_2=1}^{n} \cdots \sum_{\tau_n=1}^{n} \left( \prod_{i=1}^{n} a_{i\tau_i} \right) \left( \bigwedge_{i=1}^{n} dy_{\tau_i} \right).
\tag{3.15}
$$
If any two of $\tau_1, \tau_2, \ldots, \tau_n$ are equal, then $\bigwedge_{i=1}^{n} dy_{\tau_i} = 0$. Hence, letting $\tau = (\tau_1, \tau_2, \ldots, \tau_n)$ be a permutation of $\{1, 2, \ldots, n\}$, (3.15) can be written as
$$
\begin{aligned}
\bigwedge_{i=1}^{n} dx_i
&= \sum_{\tau} \left( \prod_{i=1}^{n} a_{i\tau_i} \right) \left( \bigwedge_{i=1}^{n} dy_{\tau_i} \right) \\
&= \sum_{\tau} \operatorname{sgn}(\tau) \left( \prod_{i=1}^{n} a_{i\tau_i} \right) \left( \bigwedge_{i=1}^{n} dy_i \right) \\
&= \det(\mathbf{A}) \bigwedge_{i=1}^{n} dy_i
\end{aligned}
\tag{3.16}
$$
where the second equality follows from the fact that if $\tau$ is any permutation of $\{1, 2, \ldots, n\}$, then
$$
\bigwedge_{i=1}^{n} dy_{\tau_i} = \operatorname{sgn}(\tau) \bigwedge_{i=1}^{n} dy_i.
\tag{3.17}
$$
Returning to the transformations (3.3), it follows from (3.5) that
$$
d\mathbf{x} =
\begin{bmatrix}
\frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\
\frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n}
\end{bmatrix}
d\mathbf{y}
\tag{3.18}
$$
and hence, by Theorem 3.1, we have
$$
\bigwedge_{i=1}^{n} dx_i = \det\left( \frac{\partial x_i}{\partial y_j} \right) \bigwedge_{i=1}^{n} dy_i.
\tag{3.19}
$$

Since the elements of a matrix can be written as a vector containing all these elements, we can treat matrix transformations in the same manner. For any $m \times n$ real matrix $\mathbf{X} = (x_{ij})$ of $mn$ functionally independent variables $x_{ij}$, $d\mathbf{X}$ denotes the matrix of differentials $dx_{ij}$, and the symbol $(d\mathbf{X})$ denotes the exterior product of the $mn$ elements of $d\mathbf{X}$, i.e.,
$$
(d\mathbf{X}) \triangleq \bigwedge_{j=1}^{n} \bigwedge_{i=1}^{m} dx_{ij}.
\tag{3.20}
$$
Before calculating more Jacobians, we make a convention explicit: the signs of exterior differential forms can be ignored, because we will be integrating exterior differential forms representing a probability density function (pdf); hence, we can avoid any difficulty with sign simply by defining only positive integrals. Therefore, $(d\mathbf{X})$ can also be treated as
$$
(d\mathbf{X}) = \bigwedge_{i=1}^{m} \bigwedge_{j=1}^{n} dx_{ij}.
\tag{3.21}
$$
If $\mathbf{X}$ is a symmetric $n \times n$ matrix, the symbol $(d\mathbf{X})$ denotes the exterior product of the $\frac{1}{2} n(n+1)$ distinct elements of $d\mathbf{X}$:
$$
(d\mathbf{X}) \triangleq \bigwedge_{i \leq j}^{n} dx_{ij}.
\tag{3.22}
$$
Similarly, if $\mathbf{X}$ is skew-symmetric, $(d\mathbf{X})$ denotes the exterior product of the $\frac{1}{2} n(n-1)$ distinct elements of $d\mathbf{X}$, and if $\mathbf{X}$ is upper triangular, then also
$$
(d\mathbf{X}) \triangleq \bigwedge_{i \leq j}^{n} dx_{ij}.
\tag{3.23}
$$
If $\mathbf{X}$ is a diagonal matrix, then we let
$$
(d\mathbf{X}) \triangleq \bigwedge_{i=1}^{n} dx_{ii}.
\tag{3.24}
$$
Theorem 3.2. Let $\mathbf{X} = \mathbf{A}\mathbf{Y}$ where $\mathbf{X}, \mathbf{Y} \in \mathbb{R}^{m \times n}$ and $\mathbf{A}$ is an $m \times m$ nonsingular matrix. Then, we have
$$
(d\mathbf{X}) = \det(\mathbf{A})^{n} (d\mathbf{Y})
\tag{3.25}
$$
and hence $J(\mathbf{X} \to \mathbf{Y}) = \det(\mathbf{A})^{n}$.

Proof. Let
$$
d\mathbf{X} = \begin{bmatrix} d\mathbf{x}_1 & d\mathbf{x}_2 & \cdots & d\mathbf{x}_n \end{bmatrix}, \qquad
d\mathbf{Y} = \begin{bmatrix} d\mathbf{y}_1 & d\mathbf{y}_2 & \cdots & d\mathbf{y}_n \end{bmatrix}
$$
where $\mathbf{x}_j = (x_{j1}, x_{j2}, \ldots, x_{jm})^{T}$ and $\mathbf{y}_j = (y_{j1}, y_{j2}, \ldots, y_{jm})^{T}$, $j = 1, 2, \ldots, n$. Then, we have $d\mathbf{x}_j = \mathbf{A}\, d\mathbf{y}_j$ and hence, by Theorem 3.1,
$$
(d\mathbf{X}) = \bigwedge_{j=1}^{n} (d\mathbf{x}_j) = \bigwedge_{j=1}^{n} \det(\mathbf{A}) (d\mathbf{y}_j) = \det(\mathbf{A})^{n} (d\mathbf{Y})
\tag{3.26}
$$
as desired.

Using a similar argument, we get the Jacobian of the transformation $\mathbf{X} = \mathbf{Y}\mathbf{B}$, where $\mathbf{B}$ is an $n \times n$ nonsingular matrix:
$$
J(\mathbf{X} \to \mathbf{Y}) = \det(\mathbf{B})^{m}.
\tag{3.27}
$$
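Since $\mathbf{X} = \mathbf{A}\mathbf{Y}$ is equivalent to $\operatorname{vec}(\mathbf{X}) = (\mathbf{I}_n \otimes \mathbf{A}) \operatorname{vec}(\mathbf{Y})$, the Jacobian $\det(\mathbf{A})^n$ of (3.25) is just the determinant of that $mn \times mn$ linear map. A quick numerical confirmation (an aside using NumPy):

```python
import numpy as np

rng = np.random.default_rng(10)
m, n = 3, 4
A = rng.standard_normal((m, m))

# X = A Y  <=>  vec(X) = (I_n kron A) vec(Y), so the Jacobian determinant
# of the map vec(Y) -> vec(X) is det(I_n kron A) = det(A)^n, as in (3.25).
J = np.linalg.det(np.kron(np.eye(n), A))
assert np.isclose(J, np.linalg.det(A) ** n)
```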
3.2 Jacobians in the Complex Case
Let us examine the exterior product of the differentials in $\mathbf{X} \in \mathbb{C}^{m \times n}$ with complex elements. In general, there are $mn$ real variables in $\operatorname{Re}\mathbf{X}$ and another $mn$ real variables in $\operatorname{Im}\mathbf{X}$. Therefore, $\mathbf{X} = \operatorname{Re}\mathbf{X} + i \operatorname{Im}\mathbf{X}$ is a function of $2mn$ real variables, and the exterior product of the differentials will be denoted by
$$
(d\mathbf{X}) = \left( \begin{bmatrix} d\operatorname{Re}\mathbf{X} & d\operatorname{Im}\mathbf{X} \end{bmatrix} \right)
= \left( \begin{bmatrix} d\operatorname{Re}\mathbf{X} \\ d\operatorname{Im}\mathbf{X} \end{bmatrix} \right)
= (d\operatorname{Re}\mathbf{X})(d\operatorname{Im}\mathbf{X}).
\tag{3.28}
$$
In this convention, an empty product is interpreted as unity. Hence, if the matrix $\mathbf{X}$ is real, then $\operatorname{Im}\mathbf{X}$ is null and $(d\mathbf{X}) = (d\operatorname{Re}\mathbf{X})$.

The next theorems give the Jacobians of some transformations which are commonly used in matrix-variate distribution theory.
Lemma 3.1. Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ have a nonsingular real part $\operatorname{Re}\mathbf{A}$. Then,
$$
\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right) = \det\left(
\begin{bmatrix}
\operatorname{Re}\mathbf{A} & -\operatorname{Im}\mathbf{A} \\
\operatorname{Im}\mathbf{A} & \operatorname{Re}\mathbf{A}
\end{bmatrix}
\right)
\tag{3.29}
$$
$$
= \det\left(
\begin{bmatrix}
\operatorname{Re}\mathbf{A} & \operatorname{Im}\mathbf{A} \\
-\operatorname{Im}\mathbf{A} & \operatorname{Re}\mathbf{A}
\end{bmatrix}
\right).
\tag{3.30}
$$

Proof. Since $\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right) = \det(\mathbf{A}) \det(\mathbf{A})^{*}$, it follows from the determinant of a partitioned matrix that
$$
\begin{aligned}
\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)
&= \det\left(
\begin{bmatrix}
\operatorname{Re}\mathbf{A} + i \operatorname{Im}\mathbf{A} & \mathbf{0} \\
\mathbf{0} & \operatorname{Re}\mathbf{A} - i \operatorname{Im}\mathbf{A}
\end{bmatrix}
\right) \\
&= \det\left(
\begin{bmatrix}
2\operatorname{Re}\mathbf{A} & \operatorname{Re}\mathbf{A} - i \operatorname{Im}\mathbf{A} \\
\operatorname{Re}\mathbf{A} - i \operatorname{Im}\mathbf{A} & \operatorname{Re}\mathbf{A} - i \operatorname{Im}\mathbf{A}
\end{bmatrix}
\right) \\
&= \det\left(
\begin{bmatrix}
2\operatorname{Re}\mathbf{A} & -i \operatorname{Im}\mathbf{A} \\
-i \operatorname{Im}\mathbf{A} & \frac{1}{2}\operatorname{Re}\mathbf{A}
\end{bmatrix}
\right) \\
&= \det(\operatorname{Re}\mathbf{A}) \det\left( \operatorname{Re}\mathbf{A} + \operatorname{Im}\mathbf{A} (\operatorname{Re}\mathbf{A})^{-1} \operatorname{Im}\mathbf{A} \right)
\end{aligned}
\tag{3.31}
$$
where the second equality can be obtained by adding the last $n$ columns to the first $n$ columns and then adding the last $n$ rows to the first $n$ rows, and the third equality can be obtained by similar steps. The right-hand sides of (3.29) and (3.30) reduce, by the same partitioned-determinant formula, to the expression (3.31), and hence we complete the proof of the lemma.
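The identity of Lemma 3.1 can be confirmed numerically (an illustrative aside): the determinant of the $2n \times 2n$ real block representation of $\mathbf{A}$ equals $\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right) = |\det \mathbf{A}|^2$.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Real 2n x 2n block representation of A, as in (3.29).
block = np.block([[A.real, -A.imag],
                  [A.imag,  A.real]])
assert np.isclose(np.linalg.det(A @ A.conj().T).real, np.linalg.det(block))
```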
Theorem 3.3 (Linear Transformation). Let $\mathbf{X}$ and $\mathbf{Y}$ be $m \times n$ matrices of $mn$ functionally independent variables. Let $\mathbf{A} \in \mathbb{C}^{m \times m}$, $\mathbf{B} \in \mathbb{C}^{n \times n}$, and $\mathbf{C} \in \mathbb{C}^{m \times n}$ be matrices of constants, where $\mathbf{A}$ and $\mathbf{B}$ are nonsingular. Then, the Jacobian of the nonsingular linear transformation
$$
\mathbf{X} = \mathbf{A}\mathbf{Y}\mathbf{B} + \mathbf{C}
\tag{3.32}
$$
is given by
$$
(d\mathbf{X}) = \underbrace{\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)^{n} \det\left(\mathbf{B}\mathbf{B}^{\dagger}\right)^{m}}_{= J(\mathbf{X} \to \mathbf{Y})} (d\mathbf{Y}).
\tag{3.33}
$$

Proof. Since $\mathbf{C}$ is a constant, $(d\mathbf{C}) = 0$ and hence we may ignore $\mathbf{C}$. Let $\mathbf{X} = \mathbf{A}\mathbf{Z}$ where $\mathbf{Z} = \mathbf{Y}\mathbf{B}$. Then, since
$$
\begin{bmatrix} \operatorname{Re}\mathbf{X} \\ \operatorname{Im}\mathbf{X} \end{bmatrix}
=
\begin{bmatrix}
\operatorname{Re}\mathbf{A} & -\operatorname{Im}\mathbf{A} \\
\operatorname{Im}\mathbf{A} & \operatorname{Re}\mathbf{A}
\end{bmatrix}
\begin{bmatrix} \operatorname{Re}\mathbf{Z} \\ \operatorname{Im}\mathbf{Z} \end{bmatrix},
\tag{3.34}
$$
it follows from Theorem 3.2 and Lemma 3.1 that
$$
(d\mathbf{X}) = \det\left(
\begin{bmatrix}
\operatorname{Re}\mathbf{A} & -\operatorname{Im}\mathbf{A} \\
\operatorname{Im}\mathbf{A} & \operatorname{Re}\mathbf{A}
\end{bmatrix}
\right)^{n} (d\mathbf{Z})
= \det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)^{n} (d\mathbf{Z}).
\tag{3.35}
$$
Similarly, we have
$$
\begin{bmatrix} \operatorname{Re}\mathbf{Z} & \operatorname{Im}\mathbf{Z} \end{bmatrix}
=
\begin{bmatrix} \operatorname{Re}\mathbf{Y} & \operatorname{Im}\mathbf{Y} \end{bmatrix}
\begin{bmatrix}
\operatorname{Re}\mathbf{B} & \operatorname{Im}\mathbf{B} \\
-\operatorname{Im}\mathbf{B} & \operatorname{Re}\mathbf{B}
\end{bmatrix}
\tag{3.36}
$$
and
$$
(d\mathbf{Z}) = \det\left(
\begin{bmatrix}
\operatorname{Re}\mathbf{B} & \operatorname{Im}\mathbf{B} \\
-\operatorname{Im}\mathbf{B} & \operatorname{Re}\mathbf{B}
\end{bmatrix}
\right)^{m} (d\mathbf{Y})
= \det\left(\mathbf{B}\mathbf{B}^{\dagger}\right)^{m} (d\mathbf{Y}).
\tag{3.37}
$$
Combining (3.35) and (3.37) completes the proof.
Theorem 3.4 (Hermitian Transformation). Let $\mathbf{X}, \mathbf{Y} \in \mathbb{C}^{n \times n}$ be Hermitian matrices of functionally independent variables and $\mathbf{A} \in \mathbb{C}^{n \times n}$ be a nonsingular matrix of constants. Then, the Jacobian of the Hermitian transformation $\mathbf{X} = \mathbf{A}\mathbf{Y}\mathbf{A}^{\dagger}$ is given by
$$
(d\mathbf{X}) = \underbrace{\det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)^{n}}_{= J(\mathbf{X} \to \mathbf{Y})} (d\mathbf{Y}).
\tag{3.38}
$$
Proof. Since $\mathbf{A}$ is nonsingular, it can be written as a product of elementary matrices. Let $\mathbf{E}_1, \mathbf{E}_2, \ldots, \mathbf{E}_k$ be $n \times n$ elementary matrices such that
$$
\mathbf{A} = \mathbf{E}_k \mathbf{E}_{k-1} \cdots \mathbf{E}_1
\tag{3.39}
$$
and
$$
\mathbf{E}_1 =
\begin{bmatrix}
a & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\tag{3.40}
$$
where $a \in \mathbb{C}$. Then,
$$
\mathbf{X} = \mathbf{E}_k \mathbf{E}_{k-1} \cdots \mathbf{E}_1 \mathbf{Y} \mathbf{E}_1^{\dagger} \mathbf{E}_2^{\dagger} \cdots \mathbf{E}_k^{\dagger}.
\tag{3.41}
$$
Let $\mathbf{Z}_1 = \mathbf{E}_1 \mathbf{Y} \mathbf{E}_1^{\dagger}$, $\mathbf{Z}_2 = \mathbf{E}_2 \mathbf{Z}_1 \mathbf{E}_2^{\dagger}$, $\ldots$, $\mathbf{Z}_k = \mathbf{X} = \mathbf{E}_k \mathbf{Z}_{k-1} \mathbf{E}_k^{\dagger}$, and then we have
$$
J(\mathbf{X} \to \mathbf{Y}) = J(\mathbf{X} \to \mathbf{Z}_{k-1}) J(\mathbf{Z}_{k-1} \to \mathbf{Z}_{k-2}) \cdots J(\mathbf{Z}_1 \to \mathbf{Y}).
\tag{3.42}
$$
Note that forming $\mathbf{E}_1 \mathbf{Y} \mathbf{E}_1^{\dagger}$ multiplies the first row of $\mathbf{Y}$ by $a$ and the first column of $\mathbf{Y}$ by $a^{*}$, i.e.,
$$
\mathbf{E}_1 \mathbf{Y} \mathbf{E}_1^{\dagger} =
\begin{bmatrix}
|a|^2 y_{11} & a y_{12} & \cdots & a y_{1n} \\
a^{*} y_{12}^{*} & y_{22} & \cdots & y_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a^{*} y_{1n}^{*} & y_{2n}^{*} & \cdots & y_{nn}
\end{bmatrix}
\tag{3.43}
$$
yielding
$$
(d\mathbf{Z}_1) = |a|^{2n} (d\mathbf{Y}) = \det\left(\mathbf{E}_1 \mathbf{E}_1^{\dagger}\right)^{n} (d\mathbf{Y}).
\tag{3.44}
$$
Since elementary row operations produce only a change in the sign of the determinant, an elementary matrix $\mathbf{E}_1$ of any type will produce $\det\left(\mathbf{E}_1 \mathbf{E}_1^{\dagger}\right)^{n}$ in the Jacobian, ignoring the sign. Using the same steps for $\mathbf{Z}_2, \mathbf{Z}_3, \ldots, \mathbf{Z}_k$, we obtain
$$
(d\mathbf{X}) = \det\left(\mathbf{E}_k \mathbf{E}_k^{\dagger}\right)^{n} \det\left(\mathbf{E}_{k-1} \mathbf{E}_{k-1}^{\dagger}\right)^{n} \cdots \det\left(\mathbf{E}_1 \mathbf{E}_1^{\dagger}\right)^{n} (d\mathbf{Y}) = \det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)^{n} (d\mathbf{Y})
\tag{3.45}
$$
as desired.
Remark 3.1. If $\mathbf{X}$ and $\mathbf{Y}$ are skew-Hermitian, then the diagonal elements are purely imaginary. It is easy to show that the structure of the Jacobian for the transformation $\mathbf{X} = \mathbf{A}\mathbf{Y}\mathbf{A}^{\dagger}$ remains the same as in the Hermitian case of Theorem 3.4. Hence, ignoring the sign, we have
$$
(d\mathbf{X}) = \det\left(\mathbf{A}\mathbf{A}^{\dagger}\right)^{n} (d\mathbf{Y}).
\tag{3.46}
$$
Theorem 3.5 (Cholesky Decomposition). Let $\mathbf{X} \in \mathbb{C}^{n \times n}$ be a Hermitian positive-definite matrix of functionally independent variables. Let $\mathbf{T} = (t_{ij}) \in \mathbb{C}^{n \times n}$ and $\mathbf{L} = (\ell_{ij}) \in \mathbb{C}^{n \times n}$ be upper and lower triangular matrices of functionally independent variables with positive diagonal elements, respectively. Then, the Jacobians of the Cholesky decompositions $\mathbf{X} = \mathbf{T}^{\dagger}\mathbf{T}$ and $\mathbf{X} = \mathbf{L}^{\dagger}\mathbf{L}$ are given respectively by
$$
(d\mathbf{X}) = \underbrace{2^n \left\{ \prod_{i=1}^{n} t_{ii}^{2(n-i)+1} \right\}}_{= J(\mathbf{X} \to \mathbf{T})} (d\mathbf{T})
\tag{3.47}
$$
and
$$
(d\mathbf{X}) = \underbrace{2^n \left\{ \prod_{i=1}^{n} \ell_{ii}^{2(i-1)+1} \right\}}_{= J(\mathbf{X} \to \mathbf{L})} (d\mathbf{L}).
\tag{3.48}
$$
Proof. When the diagonal elements of the triangular matrix are positive, there exists a unique representation $\mathbf{X} = \mathbf{T}^{\dagger}\mathbf{T}$:
$$
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{12}^{*} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1n}^{*} & x_{2n}^{*} & \cdots & x_{nn}
\end{bmatrix}
=
\begin{bmatrix}
t_{11} & 0 & \cdots & 0 \\
t_{12}^{*} & t_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
t_{1n}^{*} & t_{2n}^{*} & \cdots & t_{nn}
\end{bmatrix}
\begin{bmatrix}
t_{11} & t_{12} & \cdots & t_{1n} \\
0 & t_{22} & \cdots & t_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & t_{nn}
\end{bmatrix}.
\tag{3.49}
$$
Expressing each of the elements of $\mathbf{X}$ on and above the diagonal in terms of the elements of $\mathbf{T}$ gives
$$
\begin{aligned}
x_{11} &= t_{11}^2 \\
x_{12} &= t_{11} t_{12} \\
&\ \ \vdots \\
x_{1n} &= t_{11} t_{1n} \\
x_{22} &= t_{12}^{*} t_{12} + t_{22}^2 \\
x_{23} &= t_{12}^{*} t_{13} + t_{22} t_{23} \\
&\ \ \vdots \\
x_{2n} &= t_{12}^{*} t_{1n} + t_{22} t_{2n} \\
&\ \ \vdots \\
x_{nn} &= t_{1n}^{*} t_{1n} + t_{2n}^{*} t_{2n} + \ldots + t_{nn}^2.
\end{aligned}
$$
Let us now take the exterior product of differentials. Since the products of repeated differentials are zero, we need not keep track of differentials in the elements of $\mathbf{T}$ that have previously occurred. Hence,
$$
\begin{aligned}
(dx_{11}) &= 2 t_{11} (dt_{11}) \\
(dx_{12}) &= t_{11}^2 (dt_{12}) + \ldots \\
&\ \ \vdots \\
(dx_{1n}) &= t_{11}^2 (dt_{1n}) + \ldots \\
(dx_{22}) &= 2 t_{22} (dt_{22}) + \ldots \\
(dx_{23}) &= t_{22}^2 (dt_{23}) + \ldots \\
&\ \ \vdots \\
(dx_{2n}) &= t_{22}^2 (dt_{2n}) + \ldots \\
&\ \ \vdots \\
(dx_{nn}) &= 2 t_{nn} (dt_{nn}) + \ldots
\end{aligned}
$$
giving
$$
(d\mathbf{X}) = 2^n t_{11}^{2(n-1)+1} t_{22}^{2(n-2)+1} \cdots t_{nn} (d\mathbf{T})
\tag{3.50}
$$
as desired. Using similar steps, we obtain (3.48).
Theorem 3.6. Let $\mathbf{X}, \mathbf{Y} \in \mathbb{C}^{n \times n}$ be lower triangular matrices of $n(n+1)/2$ functionally independent variables and $\mathbf{L} = (\ell_{ij}) \in \mathbb{C}^{n \times n}$ be a nonsingular lower triangular matrix of constants. Then, the Jacobian of the transformation $\mathbf{X} = \mathbf{L}\mathbf{Y}$ is given by
$$
(d\mathbf{X}) = \underbrace{\left\{ \prod_{i=1}^{n} |\ell_{ii}|^{2i} \right\}}_{J(\mathbf{X} \to \mathbf{Y})} (d\mathbf{Y}).
\tag{3.51}
$$
If all the diagonal elements of the lower triangular matrices are real, then
$$
(d\mathbf{X}) = \underbrace{\left\{ \prod_{i=1}^{n} \ell_{ii}^{2i-1} \right\}}_{J(\mathbf{X} \to \mathbf{Y})} (d\mathbf{Y}).
\tag{3.52}
$$
Proof. Note that
$$
\begin{bmatrix}
x_{11} & 0 & \cdots & 0 \\
x_{21} & x_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nn}
\end{bmatrix}
=
\begin{bmatrix}
\ell_{11} & 0 & \cdots & 0 \\
\ell_{21} & \ell_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\ell_{n1} & \ell_{n2} & \cdots & \ell_{nn}
\end{bmatrix}
\begin{bmatrix}
y_{11} & 0 & \cdots & 0 \\
y_{21} & y_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
y_{n1} & y_{n2} & \cdots & y_{nn}
\end{bmatrix}.
\tag{3.53}
$$
Expressing each of the elements of $\mathbf{X}$ on and below the diagonal in terms of the elements of $\mathbf{Y}$ gives
$$
\begin{aligned}
x_{11} &= \ell_{11} y_{11} \\
x_{21} &= \ell_{21} y_{11} + \ell_{22} y_{21} \\
x_{31} &= \ell_{31} y_{11} + \ell_{32} y_{21} + \ell_{33} y_{31} \\
&\ \ \vdots \\
x_{n1} &= \ell_{n1} y_{11} + \ell_{n2} y_{21} + \ldots + \ell_{nn} y_{n1} \\
x_{22} &= \ell_{22} y_{22} \\
x_{32} &= \ell_{32} y_{22} + \ell_{33} y_{32} \\
x_{42} &= \ell_{42} y_{22} + \ell_{43} y_{32} + \ell_{44} y_{42} \\
&\ \ \vdots \\
x_{n2} &= \ell_{n2} y_{22} + \ell_{n3} y_{32} + \ldots + \ell_{nn} y_{n2} \\
&\ \ \vdots \\
x_{nn} &= \ell_{nn} y_{nn}.
\end{aligned}
$$
Similar to the proof of Theorem 3.5, taking the exterior product of differentials for the case of all complex elements, we have
$$
\begin{aligned}
(dx_{11}) &= |\ell_{11}|^2 (dy_{11}) \\
(dx_{21}) &= |\ell_{22}|^2 (dy_{21}) + \ldots \\
(dx_{31}) &= |\ell_{33}|^2 (dy_{31}) + \ldots \\
&\ \ \vdots \\
(dx_{n1}) &= |\ell_{nn}|^2 (dy_{n1}) + \ldots \\
(dx_{22}) &= |\ell_{22}|^2 (dy_{22}) \\
(dx_{32}) &= |\ell_{33}|^2 (dy_{32}) + \ldots \\
(dx_{42}) &= |\ell_{44}|^2 (dy_{42}) + \ldots \\
&\ \ \vdots \\
(dx_{n2}) &= |\ell_{nn}|^2 (dy_{n2}) + \ldots \\
&\ \ \vdots \\
(dx_{nn}) &= |\ell_{nn}|^2 (dy_{nn})
\end{aligned}
$$
yielding
$$
(d\mathbf{X}) = |\ell_{11}|^{2 \cdot 1} |\ell_{22}|^{2 \cdot 2} \cdots |\ell_{nn}|^{2 \cdot n} (d\mathbf{Y})
\tag{3.54}
$$
as desired in (3.51). Similar steps give the result (3.52) for the case of real diagonal elements.
Definition 3.1 (Complex Matrix-Variate Gamma Function). The complex matrix-variate gamma function, denoted by $\tilde{\Gamma}_n(\alpha)$,¹ is defined as
$$
\tilde{\Gamma}_n(\alpha) \triangleq \int_{\mathbf{A} = \mathbf{A}^{\dagger} > 0} \det(\mathbf{A})^{\alpha - n} \operatorname{etr}(-\mathbf{A}) (d\mathbf{A})
\tag{3.55}
$$
where $\operatorname{Re}\alpha > n - 1$ and the integral is over the space of $n \times n$ Hermitian positive-definite matrices.

Corollary 3.1. For $\operatorname{Re}\alpha > n - 1$, we have
$$
\tilde{\Gamma}_n(\alpha) = \pi^{n(n-1)/2} \prod_{i=0}^{n-1} \Gamma(\alpha - i).
\tag{3.56}
$$

¹ To distinguish a distribution or function of a complex variable from a real one, we use a tilde "~" for the complex case.
Proof. Let $\mathbf{X} \in \mathbb{C}^{n \times n}$ be Hermitian positive-definite. Then, by definition,
$$
\tilde{\Gamma}_n(\alpha) \triangleq \int_{\mathbf{X} = \mathbf{X}^{\dagger} > 0} \det(\mathbf{X})^{\alpha - n} \operatorname{etr}(-\mathbf{X}) (d\mathbf{X}).
$$
Let $\mathbf{T} = (t_{ij}) \in \mathbb{C}^{n \times n}$ be an upper triangular matrix with positive diagonal elements such that $\mathbf{X} = \mathbf{T}^{\dagger}\mathbf{T}$. Then, it follows from Theorem 3.5 that
$$
\det(\mathbf{X})^{\alpha - n} (d\mathbf{X}) = 2^n \left\{ \prod_{i=1}^{n} t_{ii}^{2\alpha - 2i + 1} \right\} (d\mathbf{T}).
\tag{3.57}
$$
Since
$$
\operatorname{tr}(\mathbf{X}) = \operatorname{tr}\left(\mathbf{T}^{\dagger}\mathbf{T}\right) = \sum_{i=1}^{n} t_{ii}^2 + \sum_{i<j}^{n} |t_{ij}|^2
\tag{3.58}
$$
and
$$
\det(\mathbf{X}) = \det\left(\mathbf{T}^{\dagger}\right) \det(\mathbf{T}) = \prod_{i=1}^{n} t_{ii}^2,
\tag{3.59}
$$
we have
$$
\begin{aligned}
\tilde{\Gamma}_n(\alpha)
&= \int_{\mathbf{T}} 2^n \operatorname{etr}\left(-\mathbf{T}^{\dagger}\mathbf{T}\right) \left\{ \prod_{i=1}^{n} t_{ii}^{2\alpha - 2i + 1} \right\} (d\mathbf{T}) \\
&= \prod_{i=1}^{n} \underbrace{2 \int_{0}^{\infty} t_{ii}^{2\alpha - 2i + 1} e^{-t_{ii}^2}\, dt_{ii}}_{= \Gamma(\alpha - i + 1)}
\times \prod_{i<j}^{n} \underbrace{\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left[ -\left\{ (\operatorname{Re} t_{ij})^2 + (\operatorname{Im} t_{ij})^2 \right\} \right] d\operatorname{Re} t_{ij}\, d\operatorname{Im} t_{ij}}_{= \pi}
\end{aligned}
\tag{3.60}
$$
from which, together with the fact that $\prod_{i<j}^{n} \pi = \pi^{n(n-1)/2}$, we complete the proof.
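The product formula (3.56) can be evaluated directly with the standard library. A small numerical sketch (an aside; `math.gamma` is the ordinary gamma function):

```python
import math

def complex_mv_gamma(n, alpha):
    """Complex matrix-variate gamma via the product formula (3.56)."""
    return math.pi ** (n * (n - 1) / 2) * math.prod(
        math.gamma(alpha - i) for i in range(n))

# n = 1 reduces to the ordinary gamma function.
assert math.isclose(complex_mv_gamma(1, 3.5), math.gamma(3.5))
# n = 2, alpha = 3: pi * Gamma(3) * Gamma(2) = 2 pi.
assert math.isclose(complex_mv_gamma(2, 3.0), 2 * math.pi)
```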
Next, we give the Jacobians of transformations involving unitary matrices. When dealing with unitary matrices, a basic property to be noted is the following: if $\mathbf{U}$ is unitary, we have
$$
d\left(\mathbf{U}^{\dagger}\mathbf{U}\right) = d\mathbf{U}^{\dagger}\, \mathbf{U} + \mathbf{U}^{\dagger}\, d\mathbf{U} = \mathbf{0}
\tag{3.61}
$$
and hence,
$$
\mathbf{U}^{\dagger}\, d\mathbf{U} = -d\mathbf{U}^{\dagger}\, \mathbf{U} = -\left(\mathbf{U}^{\dagger}\, d\mathbf{U}\right)^{\dagger}
\tag{3.62}
$$
which implies that $\mathbf{U}^{\dagger} d\mathbf{U}$ is skew-Hermitian. The exterior product in $\mathbf{U}^{\dagger} d\mathbf{U}$ comes into the picture when evaluating Jacobians involving unitary transformations. Starting instead from $\mathbf{U}\mathbf{U}^{\dagger} = \mathbf{I}$, one obtains $d\mathbf{U}\, \mathbf{U}^{\dagger}$; therefore, $\mathbf{U}^{\dagger} d\mathbf{U}$ and $d\mathbf{U}\, \mathbf{U}^{\dagger}$ play the same role in these problems.
Theorem 3.7 (Eigenvalue Decomposition). Let $\mathbf{X} \in \mathbb{C}^{n \times n}$ be a Hermitian matrix of functionally independent variables with the ordered eigenvalues $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$, and let $\mathbf{U} \in \mathbb{C}^{n \times n}$ be a unitary matrix with real diagonal elements such that $\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{U}^{\dagger}$ where $\mathbf{D} = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then, the Jacobian of this eigenvalue decomposition is given by
$$
(d\mathbf{X}) = \underbrace{\prod_{i<j}^{n} (\lambda_j - \lambda_i)^2}_{= J(\mathbf{X} \to \mathbf{D}, \mathbf{U})} (d\mathbf{D}) \left(\mathbf{U}^{\dagger} d\mathbf{U}\right)
\tag{3.63}
$$
with
$$
\left(\mathbf{U}^{\dagger} d\mathbf{U}\right) = \bigwedge_{i<j}^{n} \mathbf{u}_j^{\dagger}\, d\mathbf{u}_i
\tag{3.64}
$$
where $\mathbf{u}_i$, $i = 1, 2, \ldots, n$, is the $i$th column of $\mathbf{U}$.
Proof. When $\mathbf{D}$ is the diagonal matrix of the ordered eigenvalues of $\mathbf{X}$ and the diagonal elements of $\mathbf{U}$ are real, there exists a unique representation $\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{U}^{\dagger}$. Taking the differentials in $\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{U}^{\dagger}$, we have
$$
d\mathbf{X} = d\mathbf{U}\, \mathbf{D}\mathbf{U}^{\dagger} + \mathbf{U}\, d\mathbf{D}\, \mathbf{U}^{\dagger} + \mathbf{U}\mathbf{D}\, d\mathbf{U}^{\dagger}.
\tag{3.65}
$$
Multiplying by $\mathbf{U}^{\dagger}$ on the left and $\mathbf{U}$ on the right, we get
$$
\mathbf{U}^{\dagger}\, d\mathbf{X}\, \mathbf{U} = \mathbf{U}^{\dagger}\, d\mathbf{U}\, \mathbf{D} + d\mathbf{D} + \mathbf{D}\, d\mathbf{U}^{\dagger}\, \mathbf{U}
= \mathbf{U}^{\dagger}\, d\mathbf{U}\, \mathbf{D} + d\mathbf{D} - \mathbf{D}\, \mathbf{U}^{\dagger}\, d\mathbf{U}.
\tag{3.66}
$$
Using Theorem 3.4, the exterior product in $\mathbf{U}^{\dagger}\, d\mathbf{X}\, \mathbf{U}$ can be written as
$$
\left(\mathbf{U}^{\dagger}\, d\mathbf{X}\, \mathbf{U}\right) = \det\left(\mathbf{U}^{\dagger}\mathbf{U}\right)^{n} (d\mathbf{X}) = (d\mathbf{X})
\tag{3.67}
$$
(unitary invariance). Since the diagonal elements of $\mathbf{U}^{\dagger} d\mathbf{U}\, \mathbf{D} - \mathbf{D}\, \mathbf{U}^{\dagger} d\mathbf{U}$ are zeros, the exterior product of the diagonal elements on the right-hand side of (3.66) is equal to $(d\mathbf{D})$. Moreover, letting $dz_{ij}$ be the $(i,j)$th element of $\mathbf{U}^{\dagger} d\mathbf{U}$, the $(i,j)$th element of $\mathbf{U}^{\dagger} d\mathbf{U}\, \mathbf{D} - \mathbf{D}\, \mathbf{U}^{\dagger} d\mathbf{U}$ is equal to $(\lambda_j - \lambda_i)\, dz_{ij}$. Therefore, the exterior product of the right-hand side of (3.66) is
$$
\left(\mathbf{U}^{\dagger} d\mathbf{U}\, \mathbf{D} + d\mathbf{D} - \mathbf{D}\, \mathbf{U}^{\dagger} d\mathbf{U}\right)
= \prod_{i<j}^{n} (\lambda_j - \lambda_i)^2 \, (d\mathbf{D}) \left(\mathbf{U}^{\dagger} d\mathbf{U}\right)
\tag{3.68}
$$
from which, together with (3.67), we complete the proof.
Theorem 3.8 (LQ Decomposition). Let $\mathbf{X} \in \mathbb{C}^{m \times n}$, $m \leq n$, be a full-rank matrix of functionally independent variables. Then, $\mathbf{X}$ can be factorized as $\mathbf{X} = \mathbf{L}\mathbf{Q}$, where $\mathbf{L} = (\ell_{ij}) \in \mathbb{C}^{m \times m}$ is a nonsingular lower triangular matrix and $\mathbf{Q} \in \mathbb{C}^{m \times n}$ is a semiunitary matrix such that $\mathbf{Q}\mathbf{Q}^{\dagger} = \mathbf{I}_m$. Let $\mathbf{V}$ be an $(n - m) \times n$ matrix such that $\mathbf{U} = \begin{bmatrix} \mathbf{Q} \\ \mathbf{V} \end{bmatrix}$ is unitary. Then, the Jacobian of this LQ decomposition is given by:

1. when all the elements of $\mathbf{L}$ are complex and all the leading diagonal elements of $\mathbf{Q}$ are real,
$$
(d\mathbf{X}) = \underbrace{\left\{ \prod_{i=1}^{m} |\ell_{ii}|^{2(n-i)} \right\}}_{J(\mathbf{X} \to \mathbf{L}, \mathbf{Q})} (d\mathbf{L})\, g_{m,n}(\mathbf{Q})
\tag{3.69}
$$

2. when all the diagonal elements of $\mathbf{L}$ are positive and all the elements of $\mathbf{Q}$ are complex,
$$
(d\mathbf{X}) = \underbrace{\left\{ \prod_{i=1}^{m} \ell_{ii}^{2(n-i)+1} \right\}}_{J(\mathbf{X} \to \mathbf{L}, \mathbf{Q})} (d\mathbf{L})\, g_{m,n}(\mathbf{Q})
\tag{3.70}
$$

with
$$
g_{m,n}(\mathbf{Q}) = \bigwedge_{i=1}^{m} \bigwedge_{j=i+1}^{n} d\mathbf{q}_i\, \mathbf{u}_j^{\dagger}
\tag{3.71}
$$
where $d\mathbf{q}_i$ and $\mathbf{u}_j$ are the $i$th row of $d\mathbf{Q}$ and the $j$th row of $\mathbf{U}$, respectively.
Proof. If all the elements of $\mathbf{L}$ are complex and all the leading diagonal elements of $\mathbf{Q}$ are real (Case 1), or if all the diagonal elements of $\mathbf{L}$ are positive and all the elements of $\mathbf{Q}$ are complex (Case 2), then $\mathbf{X}$ can be uniquely factorized as $\mathbf{X} = \mathbf{L}\mathbf{Q}$. Taking the differentials in $\mathbf{X} = \mathbf{L}\mathbf{Q}$ and then multiplying by $\mathbf{L}^{-1}$ on the left and $\mathbf{U}^{\dagger}$ on the right, we get
$$
\mathbf{L}^{-1}\, d\mathbf{X}\, \mathbf{U}^{\dagger}
= \mathbf{L}^{-1}\, d\mathbf{L}\, \mathbf{Q}\mathbf{U}^{\dagger} + d\mathbf{Q}\, \mathbf{U}^{\dagger}
= \mathbf{L}^{-1}\, d\mathbf{L} \begin{bmatrix} \mathbf{I}_m & \mathbf{0} \end{bmatrix} + \begin{bmatrix} d\mathbf{Q}\, \mathbf{Q}^{\dagger} & d\mathbf{Q}\, \mathbf{V}^{\dagger} \end{bmatrix}.
\tag{3.72}
$$

Case 1. Using Theorem 3.3 and (3.51), we have
$$
\left(\mathbf{L}^{-1}\, d\mathbf{X}\, \mathbf{U}^{\dagger}\right)
= \det\left(\mathbf{L}\mathbf{L}^{\dagger}\right)^{-n} \det\left(\mathbf{U}\mathbf{U}^{\dagger}\right)^{-m} (d\mathbf{X})
= \left\{ \prod_{i=1}^{m} |\ell_{ii}|^{-2n} \right\} (d\mathbf{X})
\tag{3.73}
$$
and
$$
\left(\mathbf{L}^{-1}\, d\mathbf{L}\right) = \left\{ \prod_{i=1}^{m} |\ell_{ii}|^{-2i} \right\} (d\mathbf{L}).
\tag{3.74}
$$
From (3.72)-(3.74), it follows that
$$
\left\{ \prod_{i=1}^{m} |\ell_{ii}|^{-2n} \right\} (d\mathbf{X}) = \left\{ \prod_{i=1}^{m} |\ell_{ii}|^{-2i} \right\} (d\mathbf{L})\, g_{m,n}(\mathbf{Q})
\tag{3.75}
$$
leading to the desired result (3.69).

Case 2. Similar to Case 1, using Theorem 3.3 and (3.52), we have
$$
\left(\mathbf{L}^{-1}\, d\mathbf{X}\, \mathbf{U}^{\dagger}\right)
= \det\left(\mathbf{L}\mathbf{L}^{\dagger}\right)^{-n} \det\left(\mathbf{U}\mathbf{U}^{\dagger}\right)^{-m} (d\mathbf{X})
= \left\{ \prod_{i=1}^{m} \ell_{ii}^{-2n} \right\} (d\mathbf{X})
\tag{3.76}
$$
and
$$
\left(\mathbf{L}^{-1}\, d\mathbf{L}\right) = \left\{ \prod_{i=1}^{m} \ell_{ii}^{-(2i-1)} \right\} (d\mathbf{L}).
\tag{3.77}
$$
Hence,
$$
\left\{ \prod_{i=1}^{m} \ell_{ii}^{-2n} \right\} (d\mathbf{X}) = \left\{ \prod_{i=1}^{m} \ell_{ii}^{-(2i-1)} \right\} (d\mathbf{L})\, g_{m,n}(\mathbf{Q})
\tag{3.78}
$$
yielding the desired result (3.70).
3.3 Stiefel Manifold and Invariant Measure
The set of all $m \times n$ semiunitary matrices $\mathbf{Q}$ with $m \leq n$ such that $\mathbf{Q}\mathbf{Q}^{\dagger} = \mathbf{I}_m$ is called the Stiefel manifold, denoted by $\mathcal{O}(m, n)$, i.e.,
$$
\mathcal{O}(m, n) = \left\{ \mathbf{Q} \in \mathbb{C}^{m \times n} : \mathbf{Q}\mathbf{Q}^{\dagger} = \mathbf{I}_m \right\}.
\tag{3.79}
$$
As a subspace of $\mathbb{R}^{2mn}$, these matrices form a submanifold of dimension $2mn - m^2$, which is a direct extension of the real case (see, e.g., [7, 8]). As a subset of $\mathcal{O}(m, n)$, let the manifold $\mathcal{O}^{(1)}(m, n)$ be the set of all $m \times n$ semiunitary matrices with real leading diagonal elements, i.e.,
$$
\mathcal{O}^{(1)}(m, n) = \left\{ \mathbf{Q} \in \mathbb{C}^{m \times n} : \mathbf{Q}\mathbf{Q}^{\dagger} = \mathbf{I}_m,\ q_{ii} \in \mathbb{R} \right\}.
\tag{3.80}
$$
Theorem 3.9. The volume of the Stiefel manifold $\mathcal{O}(m, n)$ is given by
$$
\operatorname{Vol}\left(\mathcal{O}(m, n)\right) = \int_{\mathbf{Q} \in \mathcal{O}(m,n)} g_{m,n}(\mathbf{Q}) = \frac{2^m \pi^{mn}}{\tilde{\Gamma}_m(n)}.
\tag{3.81}
$$

Proof. Let $\mathbf{X} = (x_{ij}) \in \mathbb{C}^{m \times n}$ be a full-rank matrix of functionally independent variables. Then,
$$
\int_{\mathbf{X} \in \mathbb{C}^{m \times n}} \operatorname{etr}\left(-\mathbf{X}\mathbf{X}^{\dagger}\right) (d\mathbf{X})
= \int_{\mathbf{X} \in \mathbb{C}^{m \times n}} \exp\left( -\sum_{i=1}^{m} \sum_{j=1}^{n} |x_{ij}|^2 \right) (d\mathbf{X}) = \pi^{mn}.
\tag{3.82}
$$
Making the transformation used in Case 2 of Theorem 3.8, we have
$$
\operatorname{tr}\left(\mathbf{X}\mathbf{X}^{\dagger}\right) = \operatorname{tr}\left(\mathbf{L}\mathbf{L}^{\dagger}\right) = \sum_{i=1}^{m} \ell_{ii}^2 + \sum_{i>j}^{m} |\ell_{ij}|^2.
\tag{3.83}
$$
Hence, (3.82) can be written as
$$
\begin{aligned}
\pi^{mn}
&= \int_{\mathbf{L}} \int_{\mathbf{Q} \in \mathcal{O}(m,n)} \exp\left( -\sum_{i=1}^{m} \ell_{ii}^2 - \sum_{i>j}^{m} |\ell_{ij}|^2 \right) \left\{ \prod_{i=1}^{m} \ell_{ii}^{2(n-i)+1} \right\} (d\mathbf{L})\, g_{m,n}(\mathbf{Q}) \\
&= \underbrace{\int_{\mathbf{L}} \exp\left( -\sum_{i=1}^{m} \ell_{ii}^2 - \sum_{i>j}^{m} |\ell_{ij}|^2 \right) \left\{ \prod_{i=1}^{m} \ell_{ii}^{2(n-i)+1} \right\} (d\mathbf{L})}_{2^{-m} \tilde{\Gamma}_m(n)}
\underbrace{\int_{\mathbf{Q} \in \mathcal{O}(m,n)} g_{m,n}(\mathbf{Q})}_{\operatorname{Vol}(\mathcal{O}(m,n))}
\end{aligned}
\tag{3.84}
$$
as desired.
Theorem 3.10. The volume of the manifold $\mathcal{O}^{(1)}(m, n)$ is given by
$$
\operatorname{Vol}\left(\mathcal{O}^{(1)}(m, n)\right) = \int_{\mathbf{Q} \in \mathcal{O}^{(1)}(m,n)} g_{m,n}(\mathbf{Q}) = \frac{\pi^{m(n-1)}}{\tilde{\Gamma}_m(n)}.
\tag{3.85}
$$

Proof. Take steps similar to those in the proof of Theorem 3.9, using the transformation in Case 1 of Theorem 3.8.
If $m = n$, then the Stiefel manifold $\mathcal{O}(m, n)$ becomes the unitary group
$$
\mathcal{U}(m) = \left\{ \mathbf{U} \in \mathbb{C}^{m \times m} : \mathbf{U}\mathbf{U}^{\dagger} = \mathbf{I}_m \right\}
\tag{3.86}
$$
and $\mathcal{O}^{(1)}(m, n)$ becomes the unitary manifold
$$
\mathcal{U}^{(1)}(m) = \left\{ \mathbf{U} \in \mathbb{C}^{m \times m} : \mathbf{U}\mathbf{U}^{\dagger} = \mathbf{I}_m,\ u_{ii} \in \mathbb{R} \right\}.
\tag{3.87}
$$
Corollary 3.2. The volume of the unitary group $\mathcal{U}(m)$ is given by
$$
\operatorname{Vol}\left(\mathcal{U}(m)\right) = \int_{\mathbf{U} \in \mathcal{U}(m)} g_{m,m}(\mathbf{U}) = \frac{2^m \pi^{m^2}}{\tilde{\Gamma}_m(m)}.
\tag{3.88}
$$

Proof. It follows immediately from Theorem 3.9.

Corollary 3.3. The volume of the unitary manifold $\mathcal{U}^{(1)}(m)$ is given by
$$
\operatorname{Vol}\left(\mathcal{U}^{(1)}(m)\right) = \int_{\mathbf{U} \in \mathcal{U}^{(1)}(m)} g_{m,m}(\mathbf{U}) = \frac{\pi^{m(m-1)}}{\tilde{\Gamma}_m(m)}.
\tag{3.89}
$$

Proof. It follows immediately from Theorem 3.10.
Note that $g_{m,n}(\mathbf{Q})$ defines the invariant measure on the Stiefel manifold $\mathcal{O}(m, n)$, and $\int_{\mathbf{Q} \in \mathcal{D}} g_{m,n}(\mathbf{Q})$, $\mathcal{D} \subset \mathcal{O}(m, n)$, represents the volume of the region $\mathcal{D}$ on the Stiefel manifold. Dividing $g_{m,n}(\mathbf{Q})$ by the volume of the Stiefel manifold, we obtain the unit invariant measure on the Stiefel manifold $\mathcal{O}(m, n)$, denoted by $[d\mathbf{Q}]$:
$$
[d\mathbf{Q}] = \frac{g_{m,n}(\mathbf{Q})}{\operatorname{Vol}\left(\mathcal{O}(m, n)\right)}.
\tag{3.90}
$$
Similarly, dividing $g_{m,m}(\mathbf{U})$ by the volume of $\mathcal{U}(m)$, we get a probability measure known as the unit invariant Haar measure on the unitary group $\mathcal{U}(m)$, denoted by $[d\mathbf{U}]$:
$$
[d\mathbf{U}] = \frac{g_{m,m}(\mathbf{U})}{\operatorname{Vol}\left(\mathcal{U}(m)\right)}.
\tag{3.91}
$$
References
[1] P. Lancaster and M. Tismenetsky, The Theory of Matrices, 2nd ed. San Diego, CA: Academic, 1985.
[2] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1985.
[3] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1991.
[4] A. M. Mathai, Jacobians of Matrix Transformations and Functions of Matrix Arguments. Singapore: World Scientific, 1997.
[5] H. Flanders, Differential Forms with Applications to the Physical Sciences. New York: Dover, 1989.
[6] A. T. James, "Normal multivariate analysis and the orthogonal group," Ann. Math. Statist., vol. 25, no. 1, pp. 40-75, Mar. 1954.
[7] A. K. Gupta and D. K. Nagar, Matrix Variate Distributions. Boca Raton, FL: Chapman & Hall/CRC, 2000.
[8] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York: Wiley, 1982.