F. R. Gantmacher, The Theory of Matrices, Vol. 2


THE THEORY OF MATRICES

F. R. GANTMACHER

VOLUME TWO

AMS CHELSEA PUBLISHING
American Mathematical Society, Providence, Rhode Island

THE PRESENT WORK, PUBLISHED IN TWO VOLUMES, IS AN ENGLISH TRANSLATION, BY K. A. HIRSCH, OF THE RUSSIAN-LANGUAGE BOOK TEORIYA MATRITS BY F. R. GANTMACHER (Гантмахер)

2000 Mathematics Subject Classification. Primary 15-02.

Library of Congress Catalog Card Number 59-11779
International Standard Book Number 0-8218-2664-6 (Vol. II)
International Standard Book Number 0-8218-1393-5 (Set)

Copyright 1959, 1987, 1989 by Chelsea Publishing Company
Printed in the United States of America.

Reprinted by the American Mathematical Society, 2000. The American Mathematical Society retains all rights except those granted to the United States Government. The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at URL: http://www.ams.org/

PREFACE

THE MATRIX CALCULUS is widely applied nowadays in various branches of mathematics, mechanics, theoretical physics, theoretical electrical engineering, etc. However, neither in the Soviet nor the foreign literature is there a book that gives a sufficiently complete account of the problems of matrix theory and of its diverse applications. The present book is an attempt to fill this gap in the mathematical literature.

The book is based on lecture courses on the theory of matrices and its applications that the author has given several times in the course of the last seventeen years at the Universities of Moscow and Tiflis and at the Moscow Institute of Physical Technology.

The book is meant not only for mathematicians (undergraduates and research students) but also for specialists in allied fields (physics, engineering) who are interested in mathematics and its applications. Therefore the author has endeavoured to make his account of the material as accessible as possible, assuming only that the reader is acquainted with the theory of determinants and with the usual course of higher mathematics within the programme of higher technical education. Only a few isolated sections in the last chapters of the book require additional mathematical knowledge on the part of the reader. Moreover, the author has tried to keep the individual chapters as far as possible independent of each other. For example, Chapter V, Functions of Matrices, does not depend on the material contained in Chapters II and III. At those places of Chapter V where fundamental concepts introduced in Chapter IV are being used for the first time, the corresponding references are given. Thus, a reader who is acquainted with the rudiments of the theory of matrices can immediately begin with reading the chapters that interest him.

The book consists of two parts, containing fifteen chapters.

In Chapters I and III, information about matrices and linear operators is developed ab initio and the connection between operators and matrices is introduced.

Chapter II expounds the theoretical basis of Gauss's elimination method and certain associated effective methods of solving a system of n linear equations, for large n. In this chapter the reader also becomes acquainted with the technique of operating with matrices that are divided into rectangular 'blocks.'


In Chapter IV we introduce the extremely important 'characteristic' and 'minimal' polynomials of a square matrix, and the 'adjoint' and 'reduced adjoint' matrices.

In Chapter V, which is devoted to functions of matrices, we give the general definition of f(A) as well as concrete methods of computing it, where f(λ) is a function of a scalar argument λ and A is a square matrix. The concept of a function of a matrix is used in §§ 5 and 6 of this chapter for a complete investigation of the solutions of a system of linear differential equations of the first order with constant coefficients. Both the concept of a function of a matrix and this latter investigation of differential equations are based entirely on the concept of the minimal polynomial of a matrix and, in contrast to the usual exposition, do not use the so-called theory of elementary divisors, which is treated in Chapters VI and VII.

These five chapters constitute a first course on matrices and their applications. Very important problems in the theory of matrices arise in connection with the reduction of matrices to a normal form. This reduction is carried out on the basis of Weierstrass' theory of elementary divisors. In view of the importance of this theory we give two expositions in this book: an analytic one in Chapter VI and a geometric one in Chapter VII. We draw the reader's attention to §§ 7 and 8 of Chapter VI, where we study effective methods of finding a matrix that transforms a given matrix to normal form. In § 9 of Chapter VII we investigate in detail the method of A. N. Krylov for the practical computation of the coefficients of the characteristic polynomial.

In Chapter VIII certain types of matrix equations are solved. We also consider here the problem of determining all the matrices that are permutable with a given matrix and we study in detail the many-valued functions of matrices ᵐ√A and ln A.

Chapters IX and X deal with the theory of linear operators in a unitary space and the theory of quadratic and hermitian forms. These chapters do not depend on Weierstrass' theory of elementary divisors and use, of the preceding material, only the basic information on matrices and linear operators contained in the first three chapters of the book. In § 9 of Chapter X we apply the theory of forms to the study of the principal oscillations of a system with n degrees of freedom. In § 11 of this chapter we give an account of Frobenius' deep results on the theory of Hankel forms. These results are used later, in Chapter XV, to study special cases of the Routh-Hurwitz problem.

The last five chapters form the second part of the book [the second volume, in the present English translation]. In Chapter XI we determine normal forms for complex symmetric, skew-symmetric, and orthogonal matrices and establish interesting connections of these matrices with real matrices of the same classes and with unitary matrices.

In Chapter XII we expound the general theory of pencils of matrices of the form A + λB, where A and B are arbitrary rectangular matrices of the same dimensions. Just as the study of regular pencils of matrices A + λB is based on Weierstrass' theory of elementary divisors, so the study of singular pencils is built upon Kronecker's theory of minimal indices, which is, as it were, a further development of Weierstrass's theory. By means of Kronecker's theory (the author believes that he has succeeded in simplifying the exposition of this theory) we establish in Chapter XII canonical forms of the pencil of matrices A + λB in the most general case. The results obtained there are applied to the study of systems of linear differential equations with constant coefficients.

In Chapter XIII we explain the remarkable spectral properties of matrices with non-negative elements and consider two important applications of matrices of this class: 1) homogeneous Markov chains in the theory of probability and 2) oscillatory properties of elastic vibrations in mechanics. The matrix method of studying homogeneous Markov chains was developed in the book [46] by V. I. Romanovskii and is based on the fact that the matrix of transition probabilities in a homogeneous Markov chain with a finite number of states is a matrix with non-negative elements of a special type (a 'stochastic' matrix).

The oscillatory properties of elastic vibrations are connected with another important class of non-negative matrices, the 'oscillation matrices.' These matrices and their applications were studied by M. G. Krein jointly with the author of this book. In Chapter XIII, only certain basic results in this domain are presented. The reader can find a detailed account of the whole material in the monograph [17].

In Chapter XIV we compile the applications of the theory of matrices to systems of differential equations with variable coefficients. The central place (§§ 5-9) in this chapter belongs to the theory of the multiplicative integral (Produktintegral) and its connection with Volterra's infinitesimal calculus. These problems are almost entirely unknown in Soviet mathematical literature. In the first sections and in § 11, we study reducible systems (in the sense of Lyapunov) in connection with the problem of stability of motion; we also give certain results of N. P. Erugin. Sections 9-11 refer to the analytic theory of systems of differential equations. Here we clarify an inaccuracy in Birkhoff's fundamental theorem, which is usually applied to the investigation of the solution of a system of differential equations in the neighborhood of a singular point, and we establish a canonical form of the solution in the case of a regular singular point.


In § 12 of Chapter XIV we give a brief survey of some results of the fundamental investigations of I. A. Lappo-Danilevskii on analytic functions of several matrices and their applications to differential systems.

The last chapter, Chapter XV, deals with the applications of the theory of quadratic forms (in particular, of Hankel forms) to the Routh-Hurwitz problem of determining the number of roots of a polynomial in the right half-plane (Re z > 0). The first sections of the chapter contain the classical treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in which a stability criterion is set up which is equivalent to the Routh-Hurwitz criterion. Together with the stability criterion of Routh-Hurwitz we give, in § 11 of this chapter, the comparatively little known criterion of Liénard and Chipart, in which the number of determinantal inequalities is only about half of that in the Routh-Hurwitz criterion.

At the end of Chapter XV we exhibit the close connection between stability problems and two remarkable theorems of A. A. Markov and P. L. Chebyshev, which were obtained by these celebrated authors on the basis of the expansion of certain continued fractions of special types in series of decreasing powers of the argument. Here we give a matrix proof of these theorems.

    This, then, is a brief summary of the contents of this book.

F. R. Gantmacher

    PUBLISHERS' PREFACE

THE PUBLISHERS WISH TO thank Professor Gantmacher for his kindness in communicating to the translator new versions of several paragraphs of the original Russian-language book.

The Publishers also take pleasure in thanking the VEB Deutscher Verlag der Wissenschaften, whose many published translations of Russian scientific books into the German language include a counterpart of the present work, for their kind spirit of cooperation in agreeing to the use of their formulas in the preparation of the present work.

No material changes have been made in the text in translating the present work from the Russian except for the replacement of several paragraphs by the new versions supplied by Professor Gantmacher. Some changes in the references and in the Bibliography have been made for the benefit of the English-language reader.

CONTENTS

PREFACE iii

PUBLISHERS' PREFACE vi

XI. COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES 1

1. Some formulas for complex orthogonal and unitary matrices 1
2. Polar decomposition of a complex matrix 6
3. The normal form of a complex symmetric matrix 9
4. The normal form of a complex skew-symmetric matrix 12
5. The normal form of a complex orthogonal matrix 18

XII. SINGULAR PENCILS OF MATRICES 24

1. Introduction 24
2. Regular pencils of matrices 25
3. Singular pencils. The reduction theorem 29
4. The canonical form of a singular pencil of matrices 35
5. The minimal indices of a pencil. Criterion for strong equivalence of pencils 37
6. Singular pencils of quadratic forms 40
7. Application to differential equations 45

XIII. MATRICES WITH NON-NEGATIVE ELEMENTS 50

1. General properties 50
2. Spectral properties of irreducible non-negative matrices 53
3. Reducible matrices 66
4. The normal form of a reducible matrix 74
5. Primitive and imprimitive matrices 80
6. Stochastic matrices 82
7. Limiting probabilities for a homogeneous Markov chain with a finite number of states 87
8. Totally non-negative matrices 98
9. Oscillatory matrices 103

XIV. APPLICATIONS OF THE THEORY OF MATRICES TO THE INVESTIGATION OF SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS 113

1. Systems of linear differential equations with variable coefficients. General concepts 113
2. Lyapunov transformations 116
3. Reducible systems 118
4. The canonical form of a reducible system. Erugin's theorem 121
5. The matricant 125
6. The multiplicative integral. The infinitesimal calculus of Volterra 131
7. Differential systems in a complex domain. General properties 135
8. The multiplicative integral in a complex domain 138
9. Isolated singular points 142
10. Regular singularities 148
11. Reducible analytic systems 164
12. Analytic functions of several matrices and their application to the investigation of differential systems. The papers of Lappo-Danilevskii 168

XV. THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS 172

1. Introduction 172
2. Cauchy indices 173
3. Routh's algorithm 177
4. The singular case. Examples 181
5. Lyapunov's theorem 185
6. The theorem of Routh-Hurwitz 190
7. Orlando's formula 196
8. Singular cases in the Routh-Hurwitz theorem 198
9. The method of quadratic forms. Determination of the number of distinct real roots of a polynomial 201
10. Infinite Hankel matrices of finite rank 204
11. Determination of the index of an arbitrary rational fraction by the coefficients of numerator and denominator 208
12. Another proof of the Routh-Hurwitz theorem 216
13. Some supplements to the Routh-Hurwitz theorem. Stability criterion of Liénard and Chipart 220
14. Some properties of Hurwitz polynomials. Stieltjes' theorem. Representation of Hurwitz polynomials by continued fractions 225
15. Domain of stability. Markov parameters 232
16. Connection with the problem of moments 236
17. Theorems of Markov and Chebyshev 240
18. The generalized Routh-Hurwitz problem 248

BIBLIOGRAPHY 251

INDEX 268

CHAPTER XI

COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES

In Volume I, Chapter IX, in connection with the study of linear operators in a euclidean space, we investigated real symmetric, skew-symmetric, and orthogonal matrices, i.e., real square matrices characterized by the relations†

S^T = S,   K^T = -K,   and   Q^T = Q^{-1},

respectively (here Q^T denotes the transpose of the matrix Q). We have shown that in the field of complex numbers all these matrices have linear elementary divisors and we have set up normal forms for them, i.e., 'simplest' real symmetric, skew-symmetric, and orthogonal matrices to which arbitrary matrices of the types under consideration are real-similar and orthogonally similar.

The present chapter deals with the investigation of complex symmetric, skew-symmetric, and orthogonal matrices. We shall clarify the question of what elementary divisors these matrices can have and shall set up normal forms for them. These forms have a considerably more complicated structure than the corresponding normal forms in the real case. As a preliminary, we shall establish in the first section interesting connections between complex orthogonal and unitary matrices on the one hand, and real symmetric, skew-symmetric, and orthogonal matrices on the other hand.

1. Some Formulas for Complex Orthogonal and Unitary Matrices

1. We begin with a lemma:

LEMMA 1:¹ 1. If a matrix G is both hermitian and orthogonal (Ḡ = G^T = G^{-1}), then it can be represented in the form

G = Ie^{iK},   (1)

where I is a real symmetric involutory matrix and K a real skew-symmetric matrix permutable with it:

¹ See [169], pp. 223-225.
† In this and in the following chapters, a matrix denoted by the letter Q is not necessarily orthogonal.


Ī = I = I^T,  I² = E,   K̄ = K = -K^T.   (2)

2. If, in addition, G is a positive-definite hermitian matrix,² then in (1) I = E and

G = e^{iK}.   (3)

Proof. Let

G = S + iT,   (4)

where S and T are real matrices. Then

Ḡ = S - iT  and  G^T = S^T + iT^T.   (5)

Therefore the equation Ḡ = G^T implies that S = S^T and T = -T^T, i.e., S is symmetric and T skew-symmetric.

Moreover, when the expressions for G and Ḡ from (4) and (5) are substituted in the complex equation GḠ = E, it breaks up into two real equations:

S² + T² = E  and  ST = TS.   (6)

The second of these equations shows that S and T commute. By Theorem 12' of Chapter IX (Vol. I, p. 292), the commuting normal matrices S and T can be carried simultaneously into quasi-diagonal form by a real orthogonal transformation. Therefore³

S = Q {s_1, s_1, s_2, s_2, …, s_q, s_q, s_{2q+1}, …, s_n} Q^{-1},

T = Q {[0 t_1; -t_1 0], [0 t_2; -t_2 0], …, [0 t_q; -t_q 0], 0, …, 0} Q^{-1}   (Q̄ = Q = (Q^T)^{-1})   (7)

(the numbers s_k and t_k are real; square brackets denote 2×2 blocks written row by row). Hence

G = S + iT = Q {[s_1 it_1; -it_1 s_1], [s_2 it_2; -it_2 s_2], …, [s_q it_q; -it_q s_q], s_{2q+1}, …, s_n} Q^{-1}.   (8)

On the other hand, when we compare the expressions (7) for S and T with the first of the equations (6), we find:

s_1^2 - t_1^2 = 1,  s_2^2 - t_2^2 = 1,  …,  s_q^2 - t_q^2 = 1,  s_{2q+1} = ±1, …, s_n = ±1.   (9)

² I.e., G is the coefficient matrix of a hermitian form (see Vol. I, Chapter X, § 9).
³ See also the Note following Theorem 12' of Vol. I, Chapter IX (p. 293).


Now it is easy to verify that a matrix of the type [s it; -it s] with s^2 - t^2 = 1 can always be represented in the form

[s it; -it s] = ε e^{iφJ},   J = [0 1; -1 0],

where

ε = sign s,   |s| = cosh φ,   εt = sinh φ.

Therefore we have from (8) and (9):

G = Q {±e^{iφ_1 J}, …, ±e^{iφ_q J}, ±1, …, ±1} Q^{-1},   (10)

i.e.,

G = Ie^{iK},

where

I = Q {±1, ±1, …, ±1} Q^{-1},   K = Q {φ_1 J, …, φ_q J, 0, …, 0} Q^{-1}   (11)

and IK = KI.

From (11) there follows the equation (2).

2. If, in addition, it is known that G is a positive-definite hermitian matrix, then we can state that all the characteristic values of G are positive (see Volume I, Chapter IX, p. 270). But by (10) these characteristic values are

±e^{φ_1}, ±e^{-φ_1}, …, ±e^{φ_q}, ±e^{-φ_q}, ±1, …, ±1

(here the signs correspond to the signs in (10)).

Therefore in the formula (10) and the first formula of (11), wherever the sign ± occurs, the + sign must hold. Hence

I = Q {1, 1, …, 1} Q^{-1} = E,

and (1) goes over into (3). This proves the lemma.
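Lemma 1 lends itself to a quick numerical check. The following sketch in Python with NumPy (an illustration of our own, not part of Gantmacher's text; the Taylor-series `expm` helper and the random test matrix are assumptions of the sketch) builds G = e^{iK} from a real skew-symmetric K and confirms that G is hermitian, orthogonal, and positive definite, which is the case I = E of formula (1):

```python
import numpy as np

def expm(M, terms=60):
    # Matrix exponential via its Taylor series; adequate for the
    # small-norm matrices used in this illustration.
    X = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        X = X + term
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
K = (A - A.T) / 4            # real skew-symmetric: K^T = -K
G = expm(1j * K)             # G = e^{iK}

E = np.eye(4)
assert np.allclose(G.conj().T, G)          # G is hermitian
assert np.allclose(G.T @ G, E)             # G is orthogonal
assert np.all(np.linalg.eigvalsh(G) > 0)   # G is positive definite
```

The positive eigenvalues come in pairs e^{φ}, e^{-φ}, exactly as in the proof above.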


By means of the lemma we shall now prove the following theorem:

THEOREM 1: Every complex orthogonal matrix Q can be represented in the form

Q = Re^{iK},   (12)

where R is a real orthogonal matrix and K a real skew-symmetric matrix:

R̄ = R = (R^T)^{-1},   K̄ = K = -K^T.   (13)

Proof. Suppose that (12) holds. Then

Q* = Q̄^T = e^{iK}R^T

and

Q*Q = e^{iK}R^T Re^{iK} = e^{2iK}.

By the preceding lemma the required real skew-symmetric matrix K can be determined from the equation

Q*Q = e^{2iK},   (14)

because the matrix Q*Q is positive-definite hermitian and orthogonal. After K has been determined from (14) we can find R from (12):

R = Qe^{-iK}.   (15)

Then

R*R = e^{-iK}Q*Qe^{-iK} = E,

i.e., R is unitary. On the other hand, it follows from (15) that R, as the product of two orthogonal matrices, is itself orthogonal: R^T R = E. Thus R is at the same time unitary and orthogonal, and hence real. The formula (15) can be written in the form (12).

This proves the theorem.⁴

Now we establish the following lemma:

LEMMA 2: If a matrix D is both symmetric and unitary (D = D^T, D̄ = D^{-1}), then it can be represented in the form

    then it can be represented in the form

    D=eis, (16)where 9 is a real symmetric matrix (S =S = ST).

    4 The formula (13), like the polar decomposition of a complex matrix (in connectionwith the formulas (87), (88) on p. 278 of Vol. 1) has a close connection with the importantTheorem of Cart: n which establishes a certain rcpresentation for the automorphisms ofthe complex Lie groups; see [169J, pp. 232-233.


Proof. We set

D = U + iV   (Ū = U, V̄ = V).   (17)

Then

D̄ = U - iV,   D^T = U^T + iV^T.

The complex equation D = D^T splits into the two real equations

U = U^T,   V = V^T.

Thus, U and V are real symmetric matrices.

The equation DD̄ = E implies:

U² + V² = E,   UV = VU.   (18)

By the second of these equations, U and V commute. When we apply Theorem 12' (together with the Note) of Chapter IX (Vol. I, pp. 292-3) to them, we obtain:

U = Q {s_1, s_2, …, s_n} Q^{-1},   V = Q {t_1, t_2, …, t_n} Q^{-1}.   (19)

Here s_k and t_k (k = 1, 2, …, n) are real numbers. Now the first of the equations (18) yields:

s_k^2 + t_k^2 = 1   (k = 1, 2, …, n).

Therefore there exist real numbers φ_k (k = 1, 2, …, n) such that

s_k = cos φ_k,   t_k = sin φ_k   (k = 1, 2, …, n).

Substituting these expressions for s_k and t_k in (19) and using (17), we find:

D = Q {e^{iφ_1}, e^{iφ_2}, …, e^{iφ_n}} Q^{-1} = e^{iS},

where

S = Q {φ_1, φ_2, …, φ_n} Q^{-1}.   (20)

From (20) it follows that S̄ = S = S^T.

This proves the lemma.

Using the lemma we shall now prove the following theorem:

THEOREM 2: Every unitary matrix U can be represented in the form

U = Re^{iS},   (21)

where R is a real orthogonal matrix and S a real symmetric matrix:

R̄ = R = (R^T)^{-1},   S̄ = S = S^T.   (22)


Proof. From (21) it follows that

U^T = e^{iS}R^T.   (23)

Multiplying (21) and (23), we obtain from (22):

U^T U = e^{iS}R^T Re^{iS} = e^{2iS}.

By Lemma 2, the real symmetric matrix S can be determined from the equation

U^T U = e^{2iS},   (24)

because U^T U is symmetric and unitary. After S has been determined, we determine R by the equation

R = Ue^{-iS}.   (25)

Then

R^T = e^{-iS}U^T   (26)

and so from (24), (25), and (26) it follows that

R^T R = e^{-iS}U^T Ue^{-iS} = E,

i.e., R is orthogonal.

On the other hand, by (25) R is the product of two unitary matrices and is therefore itself unitary. Since R is both orthogonal and unitary, it is real. Formula (25) can be written in the form (21).

This proves the theorem.
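The proofs of Theorems 1 and 2 are constructive, and the constructions can be carried out numerically. The Python/NumPy sketch below (an added illustration, not from the text; the `expm` helper and random test matrices are our own) manufactures a complex orthogonal Q = R₀e^{iK₀}, recovers the real skew-symmetric K from Q*Q = e^{2iK} by diagonalizing this positive-definite hermitian matrix as in (14), obtains the real orthogonal R = Qe^{-iK} as in (15), and then verifies the analogous identity U^T U = e^{2iS} of (24) for a unitary matrix of the form (21):

```python
import numpy as np

def expm(M, terms=60):
    # Taylor-series matrix exponential; fine for small-norm M.
    X = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        X = X + term
    return X

rng = np.random.default_rng(1)
n = 4

# --- Theorem 1: Q = R e^{iK} for a complex orthogonal Q ---
B = rng.standard_normal((n, n))
K0 = (B - B.T) / 4                                 # real skew-symmetric
R0, _ = np.linalg.qr(rng.standard_normal((n, n)))  # real orthogonal
Q = R0 @ expm(1j * K0)                             # complex orthogonal
assert np.allclose(Q.T @ Q, np.eye(n))

G = Q.conj().T @ Q                   # positive-definite hermitian, cf. (14)
w, U = np.linalg.eigh(G)
K = ((U * np.log(w)) @ U.conj().T / 2j).real       # K = log(Q*Q) / (2i)
R = Q @ expm(-1j * K)                              # R = Q e^{-iK}, cf. (15)

assert np.allclose(K, -K.T)                        # K real skew-symmetric
assert np.allclose(R.imag, 0) and np.allclose(R.T @ R, np.eye(n))
assert np.allclose(R @ expm(1j * K), Q)            # Q = R e^{iK}

# --- Theorem 2: for U = R e^{iS}, the key identity is U^T U = e^{2iS} ---
A = rng.standard_normal((n, n))
S0 = (A + A.T) / 2                                 # real symmetric
w2, V = np.linalg.eigh(S0)                         # S0 = V diag(w2) V^T
Uu = R0 @ (V * np.exp(1j * w2)) @ V.T              # unitary, of the form (21)
assert np.allclose(Uu.conj().T @ Uu, np.eye(n))
assert np.allclose(Uu.T @ Uu, (V * np.exp(2j * w2)) @ V.T)   # cf. (24)
```

Since Q*Q is positive definite, its principal logarithm is the hermitian matrix 2iK, which is why the eigendecomposition recovers K uniquely.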

2. Polar Decomposition of a Complex Matrix

1. We shall prove the following theorem:

THEOREM 3: If A = ‖a_{ik}‖₁ⁿ is a non-singular matrix with complex elements, then

A = SQ   (27)

and

A = Q_1 S_1,   (28)

where S and S_1 are complex symmetric matrices, Q and Q_1 complex orthogonal matrices. Moreover,

S = √(AA^T) = f(AA^T),   S_1 = √(A^T A) = f_1(A^T A),

where f(λ), f_1(λ) are polynomials in λ.⁵

The factors S and Q in (27) (Q_1 and S_1 in (28)) are permutable if and only if A and A^T are permutable.


Proof. It is sufficient to establish (27), for when we apply this decomposition to the matrix A^T and determine A from the formula thus obtained, we arrive at (28).

If (27) holds, then

A = SQ,   A^T = Q^{-1}S,

and therefore

AA^T = S².   (29)

Conversely, since AA^T is non-singular (|AA^T| = |A|² ≠ 0), the function √λ is defined on the spectrum of this matrix and therefore an interpolation polynomial f(λ) exists such that

√(AA^T) = f(AA^T).   (30)

We denote the symmetric matrix (30) by

S = √(AA^T).

Then (29) holds, and so |S| ≠ 0. Determining Q from (27),

Q = S^{-1}A,

we verify easily that it is an orthogonal matrix. Thus (27) is established.

If the factors S and Q in (27) are permutable, then the matrices

A = SQ  and  A^T = Q^{-1}S

are permutable, since

AA^T = S²,   A^T A = Q^{-1}S²Q.

Conversely, if AA^T = A^T A, then

S² = Q^{-1}S²Q,

i.e., Q is permutable with S² = AA^T. But then Q is also permutable with the matrix S = f(AA^T).

Thus the theorem is proved completely.

2. Using the polar decomposition we shall now prove the following theorem:

⁵ See Vol. I, Chapter V, § 1. We choose a single-valued branch of the function √λ in a simply connected domain containing all the characteristic values of AA^T, but not the number zero.


THEOREM 4: If two complex symmetric or skew-symmetric or orthogonal matrices are similar:

B = T^{-1}AT,   (31)

then they are orthogonally similar; i.e., there exists an orthogonal matrix Q such that

B = Q^{-1}AQ.   (32)

Proof. From the conditions of the theorem there follows the existence of a polynomial q(λ) such that

A^T = q(A),   B^T = q(B).   (33)

In the case of symmetric matrices this polynomial q(λ) is identically equal to λ and, in the case of skew-symmetric matrices, to -λ. If A and B are orthogonal matrices, then q(λ) is the interpolation polynomial for 1/λ on the common spectrum of A and B.

Using (33), we conduct the proof of our theorem exactly as we did the proof of the corresponding Theorem 10 of Chapter IX in the real case (Vol. I, p. 289). From (31) we deduce

q(B) = T^{-1}q(A)T,

or by (33)

B^T = T^{-1}A^T T.

Hence

B = T^T A(T^T)^{-1}.

Comparing this equation with (31), we easily find:

TT^T A = ATT^T.   (34)

Let us apply the polar decomposition to the non-singular matrix T:

T = SQ   (S = S^T = f(TT^T), Q^T = Q^{-1}).

Since by (34) the matrix TT^T is permutable with A, the matrix S = f(TT^T) is also permutable with A. Therefore, when we substitute the product SQ for T in (31), we have

B = Q^{-1}S^{-1}ASQ = Q^{-1}AQ.

This completes the proof of the theorem.
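The polar decomposition of Theorem 3 can likewise be computed directly. In the sketch below (Python/NumPy, our own illustration; it assumes the generic case in which AA^T has distinct eigenvalues and is therefore diagonalizable), the symmetric square root S = √(AA^T) is formed eigenvalue by eigenvalue, and Q = S^{-1}A then comes out complex orthogonal:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# S = sqrt(AA^T) via diagonalization of the complex symmetric matrix
# M = AA^T. Any branch of sqrt chosen per eigenvalue gives S^2 = M,
# and a function of the symmetric M is itself symmetric.
M = A @ A.T
w, P = np.linalg.eig(M)
S = (P * np.sqrt(w)) @ np.linalg.inv(P)
Q = np.linalg.inv(S) @ A                     # Q = S^{-1} A

assert np.allclose(S, S.T)                   # S is complex symmetric
assert np.allclose(S @ S, M)                 # S^2 = AA^T  (formula (29))
assert np.allclose(Q.T @ Q, np.eye(n))       # Q is complex orthogonal
assert np.allclose(S @ Q, A)                 # A = SQ      (formula (27))
```

Note that, unlike the hermitian polar decomposition, S here is complex symmetric rather than positive definite, so an eigenvalue-wise square root (not `eigh`) is the appropriate tool.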


3. The Normal Form of a Complex Symmetric Matrix

1. We shall prove the following theorem:

THEOREM 5: There exists a complex symmetric matrix with arbitrary preassigned elementary divisors.⁶

Proof. We consider the matrix H of order n in which the elements of the first superdiagonal are 1 and all the remaining elements are zero. We shall show that there exists a symmetric matrix S similar to H:

S = THT^{-1}.   (35)

We shall look for the transforming matrix T starting from the conditions:

S = THT^{-1} = S^T = (T^T)^{-1}H^T T^T.

This equation can be rewritten as

VH = H^T V,   (36)

where V is the symmetric matrix connected with T by the equation⁷

T^T T = -2iV.   (37)

Recalling the properties of the matrices H and F = H^T (Vol. I, pp. 13-14), we find that every solution V of the matrix equation (36) has the form

V = [0 … 0 a_0; 0 … a_0 a_1; … ; a_0 a_1 … a_{n-1}],   (38)

i.e., V has zeros above the secondary diagonal, a_0 along the secondary diagonal, a_1 along the diagonal next below it, and so on, where a_0, a_1, …, a_{n-1} are arbitrary complex numbers.

Since it is sufficient for us to find a single transforming matrix T, we set a_0 = 1, a_1 = … = a_{n-1} = 0 in this formula and define V by the equation⁸

V = [0 … 0 1; 0 … 1 0; … ; 1 … 0 0].   (39)

⁶ In connection with the contents of the present section as well as the two sections that follow, §§ 4 and 5, see [378].
⁷ To simplify the following formulas it is convenient to introduce the factor -2i.
⁸ The matrix V is both symmetric and orthogonal.


Furthermore, we shall require the transforming matrix T to be symmetric:

T = T^T.   (40)

Then the equation (37) for T can be written as:

T² = -2iV.   (41)

We shall now look for the required matrix T in the form of a polynomial in V. Since V² = E, this can be taken as a polynomial of the first degree:

T = αE + βV.

From (41), taking into account that V² = E, we find:

α² + β² = 0,   2αβ = -2i.

We can satisfy these relations by setting α = 1, β = -i. Then

T = E - iV.   (42)

T is a non-singular symmetric matrix.⁹ At the same time, from (41):

T^{-1} = (i/2)V^{-1}T = (i/2)VT,

i.e.,

T^{-1} = ½(E + iV).   (43)

Thus, a symmetric form S of H is determined by

S = THT^{-1} = ½(E - iV)H(E + iV),   V = [0 … 0 1; 0 … 1 0; … ; 1 … 0 0].   (44)

Since V satisfies the equation (36) and V² = E, the equation (44) can be rewritten as follows:

2S = (H + H^T) + i(HV - VH);   (45)

here H + H^T has units on the first superdiagonal and the first subdiagonal and zeros elsewhere, while HV - VH has entries +1 along the diagonal immediately above the secondary diagonal and -1 along the diagonal immediately below it.

⁹ The fact that T is non-singular follows, in particular, from (41), because V is non-singular.
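The construction just given can be checked numerically. The short Python/NumPy sketch below (our own illustration, not part of the text) builds V, T = E - iV, and S = THT^{-1} for n = 4 and verifies formulas (37), (43), (44), and (45):

```python
import numpy as np

n = 4
H = np.diag(np.ones(n - 1), 1)     # ones on the first superdiagonal
E = np.eye(n)
V = np.fliplr(E)                   # formula (39): ones on the secondary diagonal
T = E - 1j * V                     # formula (42)
Tinv = (E + 1j * V) / 2            # formula (43)

assert np.allclose(T @ Tinv, E)
assert np.allclose(T.T @ T, -2j * V)        # formula (37), with T = T^T
S = T @ H @ Tinv                            # formula (44)
assert np.allclose(S, S.T)                  # S is a symmetric form of H
assert np.allclose(2 * S, H + H.T + 1j * (H @ V - V @ H))   # formula (45)
```

The identity VHV = H^T (V reverses rows and columns) is what makes the cross terms in ½(E - iV)H(E + iV) collapse to formula (45).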


The formula (45) determines a symmetric form S of the matrix H.

In what follows, if n is the order of H, H = H^{(n)}, then we shall denote the corresponding matrices T, V, and S by T^{(n)}, V^{(n)}, and S^{(n)}.

Suppose that arbitrary elementary divisors are given:

(λ - λ_1)^{p_1}, (λ - λ_2)^{p_2}, …, (λ - λ_u)^{p_u}.   (46)

We form the corresponding Jordan matrix

J = {λ_1 E^{(p_1)} + H^{(p_1)}, λ_2 E^{(p_2)} + H^{(p_2)}, …, λ_u E^{(p_u)} + H^{(p_u)}}.

For every matrix H^{(p_j)} we introduce the corresponding symmetric form S^{(p_j)}. From

S^{(p_j)} = T^{(p_j)} H^{(p_j)} [T^{(p_j)}]^{-1}   (j = 1, 2, …, u)

it follows that

λ_j E^{(p_j)} + S^{(p_j)} = T^{(p_j)} [λ_j E^{(p_j)} + H^{(p_j)}] [T^{(p_j)}]^{-1}.

Therefore, setting

S̃ = {λ_1 E^{(p_1)} + S^{(p_1)}, λ_2 E^{(p_2)} + S^{(p_2)}, …, λ_u E^{(p_u)} + S^{(p_u)}},   (47)
T = {T^{(p_1)}, T^{(p_2)}, …, T^{(p_u)}},   (48)

we have:

S̃ = TJT^{-1}.

S̃ is a symmetric form of J; it is similar to J and has the same elementary divisors (46) as J. This proves the theorem.

COROLLARY 1. Every square complex matrix A = ‖a_{ik}‖₁ⁿ is similar to a symmetric matrix.

Applying Theorem 4, we obtain:

COROLLARY 2. Every complex symmetric matrix S = ‖a_{ik}‖₁ⁿ is orthogonally similar to a symmetric matrix with the normal form S̃, i.e., there exists an orthogonal matrix Q such that

S̃ = QSQ^{-1}.   (49)

The normal form of a complex symmetric matrix has the quasi-diagonal form

S̃ = {λ_1 E^{(p_1)} + S^{(p_1)}, λ_2 E^{(p_2)} + S^{(p_2)}, …, λ_u E^{(p_u)} + S^{(p_u)}},   (50)

where the blocks S^{(p)} are defined as follows (see (44), (45)):


S^{(p)} = ½[E^{(p)} - iV^{(p)}] H^{(p)} [E^{(p)} + iV^{(p)}]
       = ½[H^{(p)} + (H^{(p)})^T + i(H^{(p)}V^{(p)} - V^{(p)}H^{(p)})];   (51)

here H^{(p)} + (H^{(p)})^T has units on the first superdiagonal and subdiagonal, and H^{(p)}V^{(p)} - V^{(p)}H^{(p)} has entries +1 immediately above and -1 immediately below the secondary diagonal.

4. The Normal Form of a Complex Skew-symmetric Matrix

1. We shall examine what restrictions the skew symmetry of a matrix imposes on its elementary divisors. In this task we shall make use of the following theorem:

THEOREM 6: A skew-symmetric matrix always has even rank.

Proof. Let r be the rank of the skew-symmetric matrix K. Then K has r linearly independent rows, say those numbered i_1, i_2, …, i_r; all the remaining rows are linear combinations of these r rows. Since the columns of K are obtained from the corresponding rows by multiplying the elements by -1, every column of K is a linear combination of the columns numbered i_1, i_2, …, i_r. Therefore every minor of order r of K can be represented in the form

a · K(i_1 i_2 … i_r; i_1 i_2 … i_r),

where a is a constant. Hence it follows that

K(i_1 i_2 … i_r; i_1 i_2 … i_r) ≠ 0.

But a skew-symmetric determinant of odd order is always zero. Therefore r is even, and the theorem is proved.

THEOREM 7: 1. If λ_0 is a characteristic value of the skew-symmetric matrix K with the corresponding elementary divisors

(λ - λ_0)^{f_1}, (λ - λ_0)^{f_2}, …, (λ - λ_0)^{f_s},

then -λ_0 is also a characteristic value of K with the same number and the same powers of the corresponding elementary divisors of K:


    (A+A0)1', (,1+A,)1', ..., (I+A )1e.2. If zero is a characteristic value of the skew-symmetric matrix K,1o

    then in the system of elementary divisors of K all those of even degree cor-responding to the characteristic value zero are repeated an even numberof times.

Proof. 1. The transposed matrix Kᵀ has the same elementary divisors as K. But Kᵀ = −K, and the elementary divisors of −K are obtained from those of K by replacing the characteristic values λ₁, λ₂, ... by −λ₁, −λ₂, .... Hence the first part of our theorem follows.

2. Suppose that to the characteristic value zero of K there correspond δ₁ elementary divisors of the form λ, δ₂ of the form λ², etc. In general, we denote by δ_p the number of elementary divisors of the form λ^p (p = 1, 2, ...). We shall show that δ₂, δ₄, ... are even numbers.

The defect d of K is equal to the number of linearly independent characteristic vectors corresponding to the characteristic value zero or, what is the same, to the number of elementary divisors of the form λ, λ², λ³, .... Therefore

    d = δ₁ + δ₂ + δ₃ + ⋯.    (52)

Since, by Theorem 6, the rank of K is even and d = n − r, d has the same parity as n. The same statement can be made about the defects d₃, d₅, ... of the matrices K³, K⁵, ..., because odd powers of a skew-symmetric matrix are themselves skew-symmetric. Therefore all the numbers d₁ = d, d₃, d₅, ... have the same parity.

On the other hand, when K is raised to the m-th power, every elementary divisor λ^p for p ≤ m splits into p elementary divisors (of the first degree) and for p > m into m elementary divisors.¹¹ Therefore the numbers of elementary divisors of the matrices K, K³, ... that are powers of λ are determined by the formulas¹²

    d₃ = δ₁ + 2δ₂ + 3(δ₃ + δ₄ + δ₅ + ⋯),
    d₅ = δ₁ + 2δ₂ + 3δ₃ + 4δ₄ + 5(δ₅ + δ₆ + ⋯),
    ................................................    (53)

Comparing (52) with (53) and bearing in mind that all the numbers d₁ = d, d₃, d₅, ... are of the same parity, we conclude easily that δ₂, δ₄, ... are even numbers.

    This completes the proof of the theorem.
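Both statements are easy to check numerically. A minimal sketch in numpy (the random test matrix and the odd order n = 7 are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random complex skew-symmetric matrix of odd order: K^T = -K.
n = 7
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = M - M.T

# Theorem 6: the rank of a skew-symmetric matrix is even
# (for odd n we necessarily have |K| = 0, so here r < 7).
r = np.linalg.matrix_rank(K)
assert r % 2 == 0

# Theorem 7, part 1: the characteristic values occur in pairs λ0, -λ0,
# so the spectrum is invariant under negation.
ev = np.sort_complex(np.linalg.eigvals(K))
assert np.allclose(np.sort_complex(-ev), ev)
```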

¹⁰ I.e., if |K| = 0. For odd n we always have |K| = 0.
¹¹ See Vol. I, Chapter VI, Theorem 9, p. 158.
¹² These formulas were introduced (without reference to Theorem 9) in Vol. I, Chapter VI (see formulas (49) on p. 155).

XI. COMPLEX SYMMETRIC, SKEW-SYMMETRIC, ORTHOGONAL MATRICES

2. THEOREM 8: There exists a skew-symmetric matrix with arbitrary preassigned elementary divisors subject to the restrictions 1., 2. of the preceding theorem.

Proof. To begin with, we shall find a skew-symmetric form for the quasi-diagonal matrix of order 2p

    J^{(pp)}_{λ₀} = { λ₀E + H, −λ₀E − H }    (54)

having two elementary divisors (λ − λ₀)^p and (λ + λ₀)^p; here E = E^{(p)}, H = H^{(p)}.

We shall look for a transforming matrix T such that

    T J^{(pp)}_{λ₀} T^{−1}

is skew-symmetric, i.e., such that the following equation holds:

    T J^{(pp)}_{λ₀} T^{−1} + [T^{−1}]ᵀ [J^{(pp)}_{λ₀}]ᵀ Tᵀ = 0,

or

    W J^{(pp)}_{λ₀} + [J^{(pp)}_{λ₀}]ᵀ W = 0,    (55)

where W is the symmetric matrix connected with T by the equation¹³

    Tᵀ T = −2iW.    (56)

We dissect W into four square blocks each of order p:

    W = ( W₁₁  W₁₂ )
        ( W₂₁  W₂₂ ).

Then (55) can be written as follows:

    ( W₁₁  W₁₂ ) ( λ₀E + H      0      )   ( λ₀E + Hᵀ      0       ) ( W₁₁  W₁₂ )
    ( W₂₁  W₂₂ ) (    0     −λ₀E − H  ) + (     0      −λ₀E − Hᵀ  ) ( W₂₁  W₂₂ ) = 0.    (57)

When we perform the indicated operations on the partitioned matrices on the left-hand side of (57), we replace this equation by four matrix equations:

    1. Hᵀ W₁₁ + W₁₁ (2λ₀E + H) = 0,
    2. Hᵀ W₁₂ − W₁₂ H = 0,
    3. Hᵀ W₂₁ − W₂₁ H = 0,
    4. Hᵀ W₂₂ + W₂₂ (2λ₀E + H) = 0.    (58)

¹³ See footnote 7 on p. 9.


The equation AX − XB = 0, where A and B are square matrices without common characteristic values, has only the trivial solution X = 0.¹⁴ Therefore the first and fourth of the equations (58) yield: W₁₁ = W₂₂ = 0.¹⁵ As regards the second of these equations, it can be satisfied, as we have seen in the proof of Theorem 5, by setting

              ( 0 ... 0 1 )
    W₁₂ = V = ( 0 ... 1 0 )
              ( . . . . . )
              ( 1 ... 0 0 ),    (59)

since (cf. (36))

    V H − Hᵀ V = 0.

From the symmetry of W and V it follows that

    W₂₁ = W₁₂ᵀ = V.

The third equation is then automatically satisfied. Thus,

    W = ( 0  V )
        ( V  0 ).    (60)

But then, as has become apparent on page 10, the equation (56) will be satisfied if we set

    T = E^{(2p)} − iV^{(2p)}.    (61)

Then

    T^{−1} = ½ (E^{(2p)} + iV^{(2p)}).    (62)

Therefore the required skew-symmetric matrix can be found by the formula¹⁶

    K^{(pp)}_{λ₀} = T J^{(pp)}_{λ₀} T^{−1} = ½ [E^{(2p)} − iV^{(2p)}] J^{(pp)}_{λ₀} [E^{(2p)} + iV^{(2p)}]
                  = ½ { J^{(pp)}_{λ₀} − [J^{(pp)}_{λ₀}]ᵀ + i (J^{(pp)}_{λ₀} V^{(2p)} − V^{(2p)} J^{(pp)}_{λ₀}) }.    (63)

When we substitute for J^{(pp)}_{λ₀} and V^{(2p)} the corresponding partitioned matrices from (54) and (60), we find:

¹⁴ See Vol. I, Chapter VIII, §1.
¹⁵ For λ₀ ≠ 0 the equations 1. and 4. have no solutions other than zero. For λ₀ = 0 there exist other solutions, but we choose the zero solution.
¹⁶ Here we use equations (55) and (60). From these it follows that

    V^{(2p)} J^{(pp)}_{λ₀} V^{(2p)} = −[J^{(pp)}_{λ₀}]ᵀ.


    K^{(pp)}_{λ₀} = ½ (        H − Hᵀ            i (2λ₀V + HV + VH) )
                      ( −i (2λ₀V + HV + VH)          Hᵀ − H         ).    (64)

Written out explicitly, (64) gives the normal form of order 2p: in the diagonal blocks ½(H − Hᵀ) and ½(Hᵀ − H) the only non-zero elements are ±½ on the first superdiagonal and first subdiagonal, while in the off-diagonal blocks ±(i/2)(2λ₀V + HV + VH) the only non-zero elements are ±iλ₀ on the antidiagonal and ±i/2 on the two adjacent antidiagonals.    (65)
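The construction (54)–(63) can be carried out numerically. A sketch in numpy (the values p = 3, λ₀ = 2 are arbitrary illustrative choices):

```python
import numpy as np

# Illustrative values; any p >= 1 and lam0 != 0 will do.
p, lam0 = 3, 2.0
E = np.eye(p)
H = np.diag(np.ones(p - 1), 1)          # H^(p): ones on the first superdiagonal

# (54): J = {lam0*E + H, -lam0*E - H}, quasi-diagonal of order 2p
Z = np.zeros((p, p))
J = np.block([[lam0 * E + H, Z], [Z, -lam0 * E - H]])

V2p = np.fliplr(np.eye(2 * p))          # V^(2p): ones on the antidiagonal
T = np.eye(2 * p) - 1j * V2p            # (61)
Tinv = (np.eye(2 * p) + 1j * V2p) / 2   # (62), since (V^(2p))^2 = E

K = T @ J @ Tinv                        # (63): K = T J T^(-1)

assert np.allclose(T @ Tinv, np.eye(2 * p))
assert np.allclose(K, -K.T)             # K is skew-symmetric
# one elementary divisor (λ - λ0)^p and one (λ + λ0)^p
# (rank 2p - 1 means a single Jordan block for each characteristic value):
assert np.linalg.matrix_rank(K - lam0 * np.eye(2 * p)) == 2 * p - 1
assert np.linalg.matrix_rank(K + lam0 * np.eye(2 * p)) == 2 * p - 1
```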

We shall now construct a skew-symmetric matrix K^{(q)} of order q having one elementary divisor λ^q, where q is odd. Obviously, the required skew-symmetric matrix will be similar to the matrix

               ( 0  1  0  .  .  .  0 )
               ( 0  0  1  .  .  .  0 )
               ( .  .  .  .  .  .  . )
    J̃^{(q)} =  ( .  .  .  .  −1  .  . )
               ( .  .  .  .  .  .  −1 )
               ( 0  0  0  .  .  .  0 ).    (66)

In this matrix all the elements outside the first superdiagonal are equal to zero, and along the first superdiagonal there are at first (q − 1)/2 elements 1 and then (q − 1)/2 elements −1. Setting

    K^{(q)} = T J̃^{(q)} T^{−1},    (67)

we find from the condition of skew-symmetry:

    W₁ J̃^{(q)} + [J̃^{(q)}]ᵀ W₁ = 0,    (68)

where

    Tᵀ T = −2iW₁.    (69)

By direct verification we can convince ourselves that the matrix

                   ( 0 ... 0 1 )
    W₁ = V^{(q)} = ( 0 ... 1 0 )
                   ( . . . . . )
                   ( 1 ... 0 0 )

satisfies the condition (68). Taking this value for W₁, we find from (69), as before:

    T = E^{(q)} − iV^{(q)},  T^{−1} = ½ [E^{(q)} + iV^{(q)}],    (70)

    K^{(q)} = ½ [E^{(q)} − iV^{(q)}] J̃^{(q)} [E^{(q)} + iV^{(q)}]
            = ½ { J̃^{(q)} − [J̃^{(q)}]ᵀ + i (J̃^{(q)} V^{(q)} − V^{(q)} J̃^{(q)}) }.    (71)

When we perform the corresponding computation, we find:

    2 K^{(q)} = ( J̃^{(q)} − [J̃^{(q)}]ᵀ ) + i ( J̃^{(q)} V^{(q)} − V^{(q)} J̃^{(q)} ),    (72)

a skew-symmetric matrix whose real part has the elements of J̃^{(q)} on the first superdiagonal and their negatives on the first subdiagonal, and whose imaginary part has non-zero elements ±i only on the two antidiagonals adjacent to the main antidiagonal.
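This construction, too, is easy to verify numerically; a sketch in numpy (q = 5 is an arbitrary odd order):

```python
import numpy as np

q = 5                                        # any odd q
half = (q - 1) // 2
d = np.concatenate([np.ones(half), -np.ones(half)])
Jq = np.diag(d, 1)                           # (66): superdiagonal 1,...,1,-1,...,-1

V = np.fliplr(np.eye(q))                     # V^(q) satisfying (68)
T = np.eye(q) - 1j * V                       # (70)
Tinv = (np.eye(q) + 1j * V) / 2

K = T @ Jq @ Tinv                            # (71)

assert np.allclose(K, -K.T)                  # skew-symmetric
# a single elementary divisor λ^q: K is nilpotent of index exactly q
assert np.allclose(np.linalg.matrix_power(K, q), 0)
assert not np.allclose(np.linalg.matrix_power(K, q - 1), 0)
```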

Suppose that arbitrary elementary divisors are given, subject to the conditions of Theorem 7:

    (λ − λ_j)^{p_j}, (λ + λ_j)^{p_j}  (j = 1, 2, ..., u),
    λ^{q_k}  (k = 1, 2, ..., v; q₁, q₂, ..., q_v are odd numbers).¹⁷    (73)

Then the quasi-diagonal skew-symmetric matrix

    K̃ = { K^{(p₁p₁)}_{λ₁}, ..., K^{(p_u p_u)}_{λ_u}, K^{(q₁)}, ..., K^{(q_v)} }    (74)

has the elementary divisors (73). This concludes the proof of the theorem.

COROLLARY: Every complex skew-symmetric matrix K is orthogonally similar to a skew-symmetric matrix having the normal form K̃ determined by (74), (65), and (72); i.e., there exists a (complex) orthogonal matrix Q such that

    K = Q K̃ Q^{−1}.    (75)

Note. If K is a real skew-symmetric matrix, then it has linear elementary divisors (see Vol. I, Chapter IX, §13):

    λ − iφ₁, λ + iφ₁, ..., λ − iφ_t, λ + iφ_t, λ, ..., λ  (φ₁, ..., φ_t are real numbers).

In this case, setting all the p_j = 1 and all the q_k = 1 in (74), we obtain as the normal form of a real skew-symmetric matrix

    K̃ = { (  0   φ₁ ), ..., (  0   φ_t ), 0, ..., 0 }.
         ( −φ₁   0 )       ( −φ_t   0 )

5. The Normal Form of a Complex Orthogonal Matrix

1. Let us begin by examining what restrictions the orthogonality of a matrix imposes on its elementary divisors.

THEOREM 9: 1. If λ₀ (λ₀² ≠ 1) is a characteristic value of an orthogonal matrix Q and if the elementary divisors

    (λ − λ₀)^{p_1}, (λ − λ₀)^{p_2}, ..., (λ − λ₀)^{p_t}

¹⁷ Some of the numbers λ_j may be zero. Moreover, one of the numbers u and v may be zero; i.e., in some cases there may be elementary divisors of only one type.


correspond to this characteristic value, then 1/λ₀ is also a characteristic value of Q and it has the same corresponding elementary divisors:

    (λ − λ₀^{−1})^{p_1}, (λ − λ₀^{−1})^{p_2}, ..., (λ − λ₀^{−1})^{p_t}.

2. If λ₀ = ±1 is a characteristic value of the orthogonal matrix Q, then the elementary divisors of even degree corresponding to λ₀ are repeated an even number of times.

Proof. 1. For every non-singular matrix Q, on passing from Q to Q^{−1} each elementary divisor (λ − λ₀)^p is replaced by the elementary divisor (λ − λ₀^{−1})^p.¹⁸ On the other hand, the matrices Q and Qᵀ always have the same elementary divisors. Therefore the first part of our theorem follows at once from the orthogonality condition Qᵀ = Q^{−1}.

2. Let us assume that the number 1 is a characteristic value of Q, while −1 is not (|E − Q| = 0, |E + Q| ≠ 0). Then we apply Cayley's formulas (see Vol. I, Chapter IX, §14), which remain valid for complex matrices. We define a matrix K by the equation

    K = (E − Q)(E + Q)^{−1}.    (76)

Direct verification shows that Kᵀ = −K, so that K is skew-symmetric. When we solve the equation (76) for Q, we find:¹⁹

    Q = (E − K)(E + K)^{−1}.

Setting f(λ) = (1 − λ)/(1 + λ), we have f′(λ) = −2/(1 + λ)² ≠ 0. Therefore in the transition from K to Q = f(K) the elementary divisors do not split.²⁰ Hence in the system of elementary divisors of Q those of the form (λ − 1)^{2p} are repeated an even number of times, because this holds for the elementary divisors of the form λ^{2p} of K (see Theorem 7).

The case where Q has the characteristic value −1, but not +1, is reduced to the preceding case by considering the orthogonal matrix −Q.
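Cayley's formulas are easily checked numerically. A sketch in numpy (a random skew-symmetric K of order 4 as illustration; we assume |E + K| ≠ 0, which holds generically):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = M - M.T                          # complex skew-symmetric: K^T = -K
E = np.eye(n)

# Q = (E - K)(E + K)^(-1) is (complex) orthogonal ...
Q = (E - K) @ np.linalg.inv(E + K)
assert np.allclose(Q.T @ Q, E)       # Q^T = Q^(-1)

# ... and (76) recovers K from Q
K2 = (E - Q) @ np.linalg.inv(E + Q)
assert np.allclose(K2, K)
```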

We now proceed to the most complicated case, where Q has both the characteristic values +1 and −1. We denote by ψ(λ) the minimal polynomial of Q. Using the first part of the theorem, which has already been proved, we can write ψ(λ) in the form

¹⁸ See Vol. I, Chapter VI, §7. Setting f(λ) = 1/λ, we have f′(λ) = −1/λ² ≠ 0. Hence it follows that in the transition from Q to Q^{−1} the elementary divisors do not split (see Vol. I, p. 158).
¹⁹ Note that (76) implies that E + K = 2(E + Q)^{−1} and therefore |E + K| ≠ 0.
²⁰ See Vol. I, p. 158.


    ψ(λ) = (λ − 1)^{m₁} (λ + 1)^{m₂} ∏_{j=1}^{s} (λ − λ_j)^{l_j} (λ − λ_j^{−1})^{l_j}  (λ_j² ≠ 1; j = 1, 2, ..., s).

We consider the polynomial g(λ) of degree less than m (m is the degree of ψ(λ)) for which g(1) = 1 and all the remaining m − 1 values on the spectrum of Q are zero; and we set:²¹

    P = g(Q).    (77)

Note that the functions (g(λ))² and g(1/λ) assume on the spectrum of Q the same values as g(λ). Therefore

    P² = P,  Pᵀ = g(Qᵀ) = g(Q^{−1}) = P,    (78)

i.e., P is a symmetric projective matrix.²²

We define a polynomial h(λ) and a matrix N by the equations

    h(λ) = (λ − 1) g(λ),    (79)
    N = h(Q) = (Q − E) P.    (80)

Since (h(λ))^{m₁} vanishes on the spectrum of Q, it is divisible by ψ(λ) without remainder. Hence:

    N^{m₁} = 0,

i.e., N is a nilpotent matrix with m₁ as index of nilpotency. From (80) we find:²³

    Nᵀ = (Qᵀ − E) P.    (81)

²¹ From the fundamental formula (see Vol. I, p. 104)

    g(Q) = Σ_k [ g(λ_k) Z_{k1} + g′(λ_k) Z_{k2} + ⋯ ]

it follows that P = Z₁₁.
²² A hermitian operator P is called projective if P² = P. In accordance with this, a hermitian matrix P for which P² = P is called projective. An example of a projective operator P in a unitary space R is the operator of the orthogonal projection of a vector x ∈ R onto a subspace S = PR, i.e., Px = x_S, where x_S ∈ S and (x − x_S) ⊥ S (see Vol. I, p. 248).
²³ All the matrices that occur here, P, N, Nᵀ, Qᵀ, are permutable among each other and with Q, since they are all functions of Q.


Let us consider the matrix

    R = N (Nᵀ + 2E).    (82)

From (78), (80), and (81) it follows that

    R = N Nᵀ + 2N = (Q − Qᵀ) P.

From this representation of R it is clear that R is skew-symmetric. On the other hand, from (82)

    R^k = N^k (Nᵀ + 2E)^k  (k = 1, 2, ...).    (83)

But Nᵀ, like N, is nilpotent, and therefore

    |Nᵀ + 2E| ≠ 0.

Hence it follows from (83) that the matrices R^k and N^k have the same rank for every k.

Now for odd k the matrix R^k is skew-symmetric and therefore (see p. 12) has even rank. Therefore each of the matrices

    N, N³, N⁵, ...

has even rank.

By repeating verbatim for N the arguments that were used on p. 13 for K we may therefore state that among the elementary divisors of N those of the form λ^{2p} are repeated an even number of times. But to each elementary divisor λ^{2p} of N there corresponds an elementary divisor (λ − 1)^{2p} of Q, and vice versa.²⁴ Hence it follows that among the elementary divisors of Q those of the form (λ − 1)^{2p} are repeated an even number of times.

We obtain a similar statement for the elementary divisors of the form (λ + 1)^{2p} by applying what has just been proved to the matrix −Q.

Thus, the proof of the theorem is complete.

2. We shall now prove the converse theorem.

²⁴ Since h(1) = 0, h′(1) ≠ 0, in passing from Q to N = h(Q) the elementary divisors of the form (λ − 1)^p of Q do not split and are therefore replaced by elementary divisors λ^p (see Vol. I, Chapter VI, §7).


THEOREM 10: Every system of powers of the form

    (λ − λ_j)^{p_j}, (λ − λ_j^{−1})^{p_j}  (λ_j ≠ 0; j = 1, 2, ..., u),
    (λ − 1)^{q₁}, (λ − 1)^{q₂}, ..., (λ − 1)^{q_v},
    (λ + 1)^{t₁}, (λ + 1)^{t₂}, ..., (λ + 1)^{t_w}
    (q₁, ..., q_v, t₁, ..., t_w are odd numbers)    (84)

is the system of elementary divisors of some complex orthogonal matrix Q.²⁵

Proof. We denote by μ_j the numbers connected with the numbers λ_j (j = 1, 2, ..., u) by the equations

    λ_j = e^{μ_j}  (j = 1, 2, ..., u).

We now introduce the 'canonical' skew-symmetric matrices (see the preceding section)

    K^{(p_j p_j)}_{μ_j}  (j = 1, 2, ..., u);  K^{(q₁)}, ..., K^{(q_v)};  K^{(t₁)}, ..., K^{(t_w)}

with the elementary divisors

    (λ − μ_j)^{p_j}, (λ + μ_j)^{p_j}  (j = 1, 2, ..., u);  λ^{q₁}, ..., λ^{q_v};  λ^{t₁}, ..., λ^{t_w}.

If K is a skew-symmetric matrix, then

    Q = e^K

is orthogonal (Qᵀ = e^{Kᵀ} = e^{−K} = Q^{−1}). Moreover, to each elementary divisor (λ − μ)^p of K there corresponds an elementary divisor (λ − e^μ)^p of Q.²⁶

Therefore the quasi-diagonal matrix

    Q̃ = { e^{K^{(p₁p₁)}_{μ₁}}, ..., e^{K^{(p_u p_u)}_{μ_u}}; e^{K^{(q₁)}}, ..., e^{K^{(q_v)}}; −e^{K^{(t₁)}}, ..., −e^{K^{(t_w)}} }    (85)

is orthogonal and has the elementary divisors (84).

This proves the theorem. From Theorems 4, 9, and 10 we obtain:

²⁵ Some (or even all) of the numbers λ_j may be ±1. One or two of the numbers u, v, w may be zero. Then the elementary divisors of the corresponding type are absent in Q.
²⁶ This follows from the fact that for f(λ) = e^λ we have f′(λ) = e^λ ≠ 0 for every λ.
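The map K ↦ e^K can be checked numerically. A sketch using scipy's matrix exponential (a random skew-symmetric K of order 5 as illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = M - M.T                          # complex skew-symmetric

Q = expm(K)                          # Q = e^K

# Q^T = e^(K^T) = e^(-K) = Q^(-1): Q is (complex) orthogonal
assert np.allclose(Q.T @ Q, np.eye(n))

# each characteristic value μ of K yields the characteristic value e^μ of Q
muK = np.sort_complex(np.exp(np.linalg.eigvals(K)))
muQ = np.sort_complex(np.linalg.eigvals(Q))
assert np.allclose(muK, muQ)
```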


COROLLARY: Every (complex) orthogonal matrix Q is orthogonally similar to an orthogonal matrix having the normal form Q̃; i.e., there exists an orthogonal matrix Q₁ such that

    Q = Q₁ Q̃ Q₁^{−1}.    (86)

Note. Just as we have given a concrete form to the diagonal blocks in the skew-symmetric matrix K̃, so we could for the normal form Q̃.²⁷

²⁷ See [378].

CHAPTER XII

SINGULAR PENCILS OF MATRICES

1. Introduction

1. The present chapter deals with the following problem:

Given four matrices A, B, A₁, B₁, all of dimension m × n, with elements from a number field F, it is required to find under what conditions there exist two square non-singular matrices P and Q of orders m and n, respectively, such that¹

    P A Q = A₁,  P B Q = B₁.    (1)

By introduction of the pencils of matrices A + λB and A₁ + λB₁, the two matrix equations (1) can be replaced by the single equation

    P (A + λB) Q = A₁ + λB₁.    (2)

DEFINITION 1: Two pencils of rectangular matrices A + λB and A₁ + λB₁ of the same dimensions m × n connected by the equation (2), in which P and Q are constant square non-singular matrices (i.e., matrices independent of λ) of orders m and n, respectively, will be called strictly equivalent.²

According to the general definition of equivalence of λ-matrices (see Vol. I, Chapter VI, p. 132), the pencils A + λB and A₁ + λB₁ are equivalent if an equation of the form (2) holds in which P and Q are two square λ-matrices with constant non-vanishing determinants. For strict equivalence it is required in addition that P and Q do not depend on λ.³

A criterion for equivalence of the pencils A + λB and A₁ + λB₁ follows from the general criterion for equivalence of λ-matrices and consists in the equality of the invariant polynomials or, what is the same, of the elementary divisors of the pencils A + λB and A₁ + λB₁ (see Vol. I, Chapter VI, p. 141).

¹ If such matrices P and Q exist, then their elements can be taken from the field F. This follows from the fact that the equations (1) can be written in the form PA = A₁Q^{−1}, PB = B₁Q^{−1} and are therefore equivalent to a certain system of linear homogeneous equations for the elements of P and Q^{−1} with coefficients in F.
² See Vol. I, Chapter VI, p. 145.
³ We have replaced the term 'equivalent pencils' that occurs in the literature by 'strictly equivalent pencils,' in order to draw a sharp distinction between Definition 1 and the definition of equivalence in Vol. I, Chapter VI.


In this chapter, we shall establish a criterion for strict equivalence of two pencils of matrices and we shall determine for each pencil a strictly equivalent canonical form.

2. The task we have set ourselves has a natural geometrical interpretation. We consider a pencil of linear operators A + λB mapping R_n into R_m. For a definite choice of bases in these spaces the pencil of operators A + λB corresponds to a pencil of rectangular matrices A + λB (of dimension m × n); under a change of bases in R_n and R_m the pencil A + λB is replaced by a strictly equivalent pencil P(A + λB)Q, where P and Q are square non-singular matrices of orders m and n (see Vol. I, Chapter III, §§2 and 4). Thus, a criterion for strict equivalence gives a characterization of the class of matrix pencils A + λB (of dimension m × n) which describe one and the same pencil of operators A + λB mapping R_n into R_m for various choices of bases in these spaces.

In order to obtain a canonical form for a pencil it is necessary to find bases for R_n and R_m in which the pencil of operators A + λB is described by matrices of the simplest possible form.

Since a pencil of operators is given by two operators A and B, we can also say: the present chapter deals with the simultaneous investigation of two operators A and B mapping R_n into R_m.

3. All the pencils of matrices A + λB of dimension m × n fall into two basic types: regular and singular pencils.

DEFINITION 2: A pencil of matrices A + λB is called regular if

1) A and B are square matrices of the same order n; and
2) the determinant |A + λB| does not vanish identically.

In all other cases (m ≠ n; or m = n but |A + λB| ≡ 0), the pencil is called singular.

A criterion for strict equivalence of regular pencils of matrices and also a canonical form for such pencils were established by Weierstrass in 1867 [377] on the basis of his theory of elementary divisors, which we have expounded in Chapters VI and VII. The analogous problems for singular pencils were solved later, in 1890, by the investigations of Kronecker [249].⁴ Kronecker's results form the primary content of this chapter.

2. Regular Pencils of Matrices

1. We consider the special case where the pencils A + λB and A₁ + λB₁ consist of square matrices (m = n) with |B| ≠ 0, |B₁| ≠ 0. In this case, as we have shown in Chapter VI (Vol. I, pp. 145-146), the two concepts of 'equivalence' and 'strict equivalence' of pencils coincide. Therefore, by applying to the pencils the general criterion for equivalence of λ-matrices (Vol. I, p. 141) we are led to the following theorem:

⁴ Of more recent papers dealing with singular pencils of matrices we mention [234], [369], and [255].

THEOREM 1: Two pencils of square matrices of the same order, A + λB and A₁ + λB₁, for which |B| ≠ 0 and |B₁| ≠ 0, are strictly equivalent if and only if the pencils have the same elementary divisors in F.

A pencil of square matrices A + λB with |B| ≠ 0 was called regular in Chapter VI, because it represents a special case of a regular matrix polynomial in λ (see Vol. I, Chapter IV, p. 76). In the preceding section of this chapter we have given a wider definition of regularity. According to this definition it is quite possible in a regular pencil to have |B| = 0 (and even |A| = |B| = 0).

In order to find out whether Theorem 1 remains valid for regular pencils (with the extended Definition 2), we consider the following example:

    A + λB =  ( 2 1 3 )       ( 1 1 2 )
              ( 3 2 5 )  + λ  ( 1 1 2 ) ,
              ( 3 2 6 )       ( 1 1 3 )
                                                    (3)
    A₁ + λB₁ = ( 2 1 1 )       ( 1 1 1 )
               ( 1 2 1 )  + λ  ( 1 1 1 ) .
               ( 1 1 1 )       ( 1 1 1 )

It is easy to see that here each of the pencils A + λB and A₁ + λB₁ has only one elementary divisor, λ + 1. However, the pencils are not strictly equivalent, since the matrices B and B₁ are of ranks 2 and 1, respectively; whereas if an equation (2) were to hold, it would follow from it that the ranks of B and B₁ are equal. Nevertheless, the pencils (3) are regular according to Definition 2, since

    |A + λB| = |A₁ + λB₁| ≡ λ + 1.

This example shows that Theorem 1 is not true with the extended definition of regularity of a pencil.
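The claims made about the pencils (3) can be verified directly with sympy. A sketch (the matrix entries below are those of (3); their reconstruction from the printed text is an assumption):

```python
from functools import reduce
import sympy as sp

lam = sp.symbols('lam')

A  = sp.Matrix([[2, 1, 3], [3, 2, 5], [3, 2, 6]])
B  = sp.Matrix([[1, 1, 2], [1, 1, 2], [1, 1, 3]])
A1 = sp.Matrix([[2, 1, 1], [1, 2, 1], [1, 1, 1]])
B1 = sp.Matrix([[1, 1, 1], [1, 1, 1], [1, 1, 1]])

# both pencils are regular, with determinant λ + 1
assert sp.expand((A + lam * B).det()) == lam + 1
assert sp.expand((A1 + lam * B1).det()) == lam + 1

# D_2(λ) = gcd of all minors of order 2 equals 1 in both cases,
# so λ + 1 is the only (finite) elementary divisor of each pencil
for P in (A + lam * B, A1 + lam * B1):
    assert reduce(sp.gcd, list(P.adjugate())) == 1

# ... and yet the pencils cannot be strictly equivalent: rank B ≠ rank B1
assert B.rank() == 2 and B1.rank() == 1
```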

    1,(A,)= D. (A. ,12(A,)=nn-2 (x, ;)P...here all the Dk(A, /I) and i;(A, p) are homogeneous polynomials in i. and It.

  • 2. REGULAR PENCILS OF MATRICES 27

    Splitting the invariant polynomials into powers of homogeneous polynomialsirreducible over F, we obtain the elementary divisors ea (A, p) (a = 1, 2....of the pencil pA + AB in F.

    It is quite obvious that if we set p =1 in ea(A, p) we are back to the ele-mentary divisors ea(A) of the pencil A + AB. Conversely, from each ele-mentary divisor ea(A) of degree q we obtain the correspondingly elementarydivisor ea(A, p) by the formula ea (A,p) =742e). We can obtain in this wayall the elementary divisors of the pencil uA + AB apart from those of theform pa.

    Elementary divisors of the form pq exist if and only if I B I = 0 and arecalled `infinite' elementary divisors of the pencil A + AB.

Since strict equivalence of the pencils A + λB and A₁ + λB₁ implies strict equivalence of the pencils μA + λB and μA₁ + λB₁, we see that for strictly equivalent pencils A + λB and A₁ + λB₁ not only their 'finite' but also their 'infinite' elementary divisors must coincide.

Suppose now that A + λB and A₁ + λB₁ are two regular pencils for which all the elementary divisors coincide (including the infinite ones). We introduce homogeneous parameters: μA + λB, μA₁ + λB₁. Let us now transform the parameters:

    λ = α₁λ̃ + α₂μ̃,  μ = β₁λ̃ + β₂μ̃  (α₁β₂ − α₂β₁ ≠ 0).

In the new parameters the pencils are written as follows:

    μ̃Ã + λ̃B̃,  μ̃Ã₁ + λ̃B̃₁,  where B̃ = β₁A + α₁B, B̃₁ = β₁A₁ + α₁B₁.

From the regularity of the pencils μA + λB and μA₁ + λB₁ it follows that we can choose the numbers α₁ and β₁ such that |B̃| ≠ 0 and |B̃₁| ≠ 0.

Therefore by Theorem 1 the pencils Ã + λ̃B̃ and Ã₁ + λ̃B̃₁, and consequently the original pencils μA + λB and μA₁ + λB₁ (or, what is the same, A + λB and A₁ + λB₁), are strictly equivalent. Thus, we have arrived at the following generalization of Theorem 1:

THEOREM 2: Two regular pencils A + λB and A₁ + λB₁ are strictly equivalent if and only if they have the same ('finite' and 'infinite') elementary divisors.

In our example above, the pencils (3) had the same 'finite' elementary divisor λ + 1, but different 'infinite' elementary divisors (the first pencil has one 'infinite' elementary divisor, μ²; the second has two: μ, μ). Therefore these pencils turn out to be not strictly equivalent.

3. Suppose now that A + λB is an arbitrary regular pencil. Then there exists a number c such that |A + cB| ≠ 0. We represent the given pencil in the form A₁ + (λ − c)B, where A₁ = A + cB, so that |A₁| ≠ 0. We multiply the pencil on the left by A₁^{−1}: E + (λ − c)A₁^{−1}B. By a similarity transformation we put the pencil in the form⁵

    E + (λ − c) {J₀, J₁} = { E − cJ₀ + λJ₀, E − cJ₁ + λJ₁ },    (4)

where {J₀, J₁} is the quasi-diagonal normal form of A₁^{−1}B, J₀ is a nilpotent Jordan matrix,⁶ and |J₁| ≠ 0.

We multiply the first diagonal block on the right-hand side of (4) by (E − cJ₀)^{−1} and obtain: E + λ(E − cJ₀)^{−1}J₀. Here the coefficient of λ is a nilpotent matrix.⁷ Therefore by a similarity transformation we can put this pencil into the form⁸

    E + λJ̃₀ = { N^{(u₁)}, N^{(u₂)}, ..., N^{(u_s)} }  (N^{(u)} = E^{(u)} + λH^{(u)}).    (5)

We multiply the second diagonal block on the right-hand side of (4) by J₁^{−1}; it can then be put into the form J + λE by a similarity transformation, where J is a matrix of normal form⁹ and E the unit matrix. We have thus arrived at the following theorem:

THEOREM 3: Every regular pencil A + λB can be reduced to a (strictly equivalent) canonical quasi-diagonal form

    { N^{(u₁)}, N^{(u₂)}, ..., N^{(u_s)}, J + λE }  (N^{(u)} = E^{(u)} + λH^{(u)}),    (6)

where the first s diagonal blocks correspond to infinite elementary divisors μ^{u₁}, μ^{u₂}, ..., μ^{u_s} of the pencil A + λB and where the normal form of the last diagonal block J + λE is uniquely determined by the finite elementary divisors of the given pencil.

⁵ The unit matrices E in the diagonal blocks on the right-hand side of (4) have the same orders as J₀ and J₁.
⁶ I.e., J₀^l = 0 for some integer l > 0.
⁷ From J₀^l = 0 it follows that [(E − cJ₀)^{−1}J₀]^l = 0.
⁸ Here E^{(u)} is a unit matrix of order u and H^{(u)} is a matrix of order u whose elements in the first superdiagonal are 1, while the remaining elements are zero.
⁹ Since the matrix J can be replaced here by an arbitrary similar matrix, we may assume that J has one of the normal forms (for example, the natural form of the first or second kind, or the Jordan form; see Vol. I, Chapter VI).


3. Singular Pencils. The Reduction Theorem

1. We now proceed to consider a singular pencil of matrices A + λB of dimension m × n. We denote by r the rank of the pencil, i.e., the largest of the orders of minors that do not vanish identically. From the singularity of the pencil it follows that at least one of the inequalities r < n and r < m holds, say r < n. Then the columns of the λ-matrix A + λB are linearly dependent, i.e., the equation

    (A + λB) x = 0,    (7)

where x is an unknown column matrix, has a non-zero solution. Every non-zero solution of this equation determines some dependence among the columns of A + λB. We restrict ourselves to only such solutions x(λ) of (7) as are polynomials in λ,¹⁰ and among these solutions we choose one of least possible degree ε:

    x(λ) = x₀ − λx₁ + λ²x₂ − ⋯ + (−1)^ε λ^ε x_ε  (x_ε ≠ 0).    (8)

Substituting this solution in (7) and equating to zero the coefficients of the powers of λ, we obtain:

    Ax₀ = 0,  Bx₀ − Ax₁ = 0,  Bx₁ − Ax₂ = 0, ...,  Bx_{ε−1} − Ax_ε = 0,  Bx_ε = 0.    (9)

Considering this as a system of linear homogeneous equations for the elements of the columns x₀, −x₁, +x₂, ..., (−1)^ε x_ε, we deduce that the coefficient matrix of the system

                              ( A  0  .  .  .  0 )
                              ( B  A  .  .  .  0 )
    M_ε = M_ε[A + λB] =       ( 0  B  .  .  .  . )      (ε + 1 block columns)    (10)
                              ( .  .  .  .  .  A )
                              ( 0  0  .  .  .  B )

is of rank ρ_ε < (ε + 1)n. At the same time, by the minimal property of ε, the ranks ρ₀, ρ₁, ..., ρ_{ε−1} of the matrices

          ( A )         ( A  0 )
    M₀ =  ( B ) ,  M₁ = ( B  A ) ,  ...,  M_{ε−1}    (10′)
                        ( 0  B )

satisfy the equations ρ₀ = n, ρ₁ = 2n, ..., ρ_{ε−1} = εn.

Thus: The number ε is the least value of the index k for which the sign < holds in the relation ρ_k ≤ (k + 1)n.

Now we can formulate and prove the following fundamental theorem:

¹⁰ For the actual determination of the elements of the column x satisfying (7) it is convenient to solve a system of linear homogeneous equations in which the coefficients of the unknowns depend linearly on λ. The fundamental linearly independent solutions can always be chosen such that their elements are polynomials in λ.
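This rank characterization of ε can be tried out on a small singular pencil. A sketch in numpy (the 2 × 3 pencil with rows (λ, 1, 0) and (0, λ, 1) is an illustrative choice; its minimal polynomial solution is x(λ) = (1, −λ, λ²), of degree 2):

```python
import numpy as np

# Sample singular pencil: A + λB = [[λ, 1, 0], [0, λ, 1]]  (m = 2, n = 3)
A = np.array([[0., 1., 0.], [0., 0., 1.]])
B = np.array([[1., 0., 0.], [0., 1., 0.]])
m, n = A.shape

def M(k):
    """The block matrix (10): k + 2 block rows, k + 1 block columns."""
    Mk = np.zeros(((k + 2) * m, (k + 1) * n))
    for j in range(k + 1):
        Mk[j * m:(j + 1) * m, j * n:(j + 1) * n] = A
        Mk[(j + 1) * m:(j + 2) * m, j * n:(j + 1) * n] = B
    return Mk

# ε = least k for which rank M_k < (k + 1) n
eps = next(k for k in range(n)
           if np.linalg.matrix_rank(M(k)) < (k + 1) * n)
assert eps == 2      # matches the minimal solution x(λ) = (1, -λ, λ²) of (7)
```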

2. THEOREM 4: If the equation (7) has a solution of minimal degree ε and ε > 0, then the given pencil A + λB is strictly equivalent to a pencil of the form

    ( L_ε      0      )
    (  0    Â + λB̂  ),    (11)

where

           ( λ  1  0  .  .  .  0 )
    L_ε =  ( 0  λ  1  .  .  .  0 )      (ε rows, ε + 1 columns)    (12)
           ( .  .  .  .  .  .  . )
           ( 0  0  .  .  .  λ  1 )

and Â + λB̂ is a pencil of matrices for which the equation analogous to (7) has no solution of degree less than ε.

Proof. We shall conduct the proof of the theorem in three stages. First, we shall show that the given pencil A + λB is strictly equivalent to a pencil of the form

    ( L_ε   D + λF  )
    (  0    Â + λB̂ ),    (13)

where D, F, Â, B̂ are constant rectangular matrices of the appropriate dimensions. Then we shall establish that the equation (Â + λB̂)x̂ = 0 has no solution x̂(λ) of degree less than ε. Finally, we shall prove that by further transformations the pencil (13) can be brought into the quasi-diagonal form (11).


1. The first part of the proof will be couched in geometrical terms. Instead of the pencil of matrices A + λB we consider a pencil of operators A + λB mapping R_n into R_m and show that with a suitable choice of bases in the spaces the matrix corresponding to the operator A + λB assumes the form (13).

Instead of (7) we take the vector equation

    (A + λB) x = 0    (14)

with the vector solution

    x(λ) = x₀ − λx₁ + λ²x₂ − ⋯ + (−1)^ε λ^ε x_ε;    (15)

the equations (9) are replaced by the vector equations

    Ax₀ = 0,  Ax₁ = Bx₀,  Ax₂ = Bx₁, ...,  Ax_ε = Bx_{ε−1},  Bx_ε = 0.    (16)

Below we shall show that the vectors

    Ax₁, Ax₂, ..., Ax_ε    (17)

are linearly independent. Hence it will be easy to deduce the linear independence of the vectors

    x₀, x₁, ..., x_ε.    (18)

For since Ax₀ = 0, we have from α₀x₀ + α₁x₁ + ⋯ + α_ε x_ε = 0 that α₁Ax₁ + ⋯ + α_ε Ax_ε = 0, so that by the linear independence of the vectors (17) α₁ = α₂ = ⋯ = α_ε = 0. But x₀ ≠ 0, since otherwise λ^{−1}x(λ) would be a solution of (14) of degree ε − 1, which is impossible. Therefore α₀ = 0 also.

Now if we take the vectors (17) and (18) as the first ε and ε + 1 vectors for new bases in R_m and R_n, respectively, then in these new bases the operators A and B, by (16), will correspond to matrices of the form

        ( 0 1 0 . . . 0   * . . . * )        ( 1 0 . . . 0 0   * . . . * )
        ( 0 0 1 . . . 0   * . . . * )        ( 0 1 . . . 0 0   * . . . * )
        ( . . . . . . .   . . . . . )        ( . . . . . . .   . . . . . )
        ( 0 0 0 . . . 1   * . . . * )        ( 0 0 . . . 1 0   * . . . * )
        ( 0 0 0 . . . 0   * . . . * )        ( 0 0 . . . 0 0   * . . . * )
        ( . . . . . . .   . . . . . )        ( . . . . . . .   . . . . . )
        ( 0 0 0 . . . 0   * . . . * )        ( 0 0 . . . 0 0   * . . . * )

(in each matrix the first ε + 1 columns are shown on the left; outside the first ε rows these columns contain only zeros);


hence the λ-matrix A + λB is of the form (13). All the preceding arguments will be justified if we can show that the vectors (17) are linearly independent. Assume the contrary and let Ax_h (h ≥ 1) be the first vector in (17) that is linearly dependent on the preceding ones:

    Ax_h = α₁Ax_{h−1} + α₂Ax_{h−2} + ⋯ + α_{h−1}Ax₁.

By (16) this equation can be rewritten as follows:

    Bx_{h−1} = α₁Bx_{h−2} + α₂Bx_{h−3} + ⋯ + α_{h−1}Bx₀,

i.e.,

    Bx*_{h−1} = 0,

where

    x*_{h−1} = x_{h−1} − α₁x_{h−2} − α₂x_{h−3} − ⋯ − α_{h−1}x₀.

Furthermore, again by (16),

    Ax*_{h−1} = B(x_{h−2} − α₁x_{h−3} − ⋯ − α_{h−2}x₀) = Bx*_{h−2},

where

    x*_{h−2} = x_{h−2} − α₁x_{h−3} − ⋯ − α_{h−2}x₀.

Continuing the process and introducing the vectors

    x*_{h−3} = x_{h−3} − α₁x_{h−4} − ⋯ − α_{h−3}x₀, ...,  x*₁ = x₁ − α₁x₀,  x*₀ = x₀,

we obtain a chain of equations

    Bx*_{h−1} = 0,  Ax*_{h−1} = Bx*_{h−2}, ...,  Ax*₁ = Bx*₀,  Ax*₀ = 0.    (19)

From (19) it follows that

    x*(λ) = x*₀ − λx*₁ + ⋯ + (−1)^{h−1} λ^{h−1} x*_{h−1}  (x*₀ = x₀ ≠ 0)

is a non-zero solution of (14) of degree ≤ h − 1 < ε, which is impossible. Thus, the vectors (17) are linearly independent.

2. We shall now show that the equation (Â + λB̂)x̂ = 0 has no solutions of degree less than ε. To begin with, we observe that the equation L_ε y = 0, like (7), has a non-zero solution of least degree ε. We can see this immediately if we replace the matrix equation L_ε y = 0 by the system of ordinary equations

    λy₁ + y₂ = 0,  λy₂ + y₃ = 0, ...,  λy_ε + y_{ε+1} = 0  (y = (y₁, y₂, ..., y_{ε+1}));

hence

    y_k = (−1)^{k−1} λ^{k−1} y₁  (k = 1, 2, ..., ε + 1).


On the other hand, if the pencil has the 'triangular' form (13), then the corresponding matrix M_k (k = 0, 1, ..., ε) (see (10) and (10′) on pp. 29 and 30) can also be brought into triangular form, after a suitable permutation of rows and columns:

    ( M_k[L_ε]   M_k[D + λF]  )
    (    0       M_k[Â + λB̂] ).    (20)

For k = ε − 1 all the columns of this matrix, like those of M_{ε−1}[L_ε], are linearly independent.¹¹ But M_{ε−1}[L_ε] is a square matrix of order ε(ε + 1). Therefore in M_{ε−1}[Â + λB̂] also, all the columns are linearly independent and, as we have explained at the beginning of the section, this means that the equation (Â + λB̂)x̂ = 0 has no solution of degree less than or equal to ε − 1, which is what we had to prove.

3. Let us replace the pencil (13) by the strictly equivalent pencil

( E₁  Y  ) ( L_ε  D + λF ) ( E₃  −X )     ( L_ε   D + λF + Y(A + λB) − L_ε X )
( O   E₂ ) ( O    A + λB ) ( O   E₄ )  =  ( O     A + λB                    ),   (21)

where E₁, E₂, E₃, and E₄ are square unit matrices of orders ε, m − ε, ε + 1, and n − ε − 1, respectively, and X, Y are arbitrary constant rectangular matrices of the appropriate dimensions. Our theorem will be completely proved if we can show that the matrices X and Y can be chosen such that the matrix equation

L_ε X = D + λF + Y(A + λB)   (22)

holds.

We introduce a notation for the elements of D, F, X, for the rows of Y, and for the columns of A and B:

D = ‖d_{ik}‖,  F = ‖f_{ik}‖,  X = ‖x_{ik}‖;  y₁, y₂, ..., y_ε are the rows of Y;

A = (a₁, a₂, ..., a_{n−ε−1}),  B = (b₁, b₂, ..., b_{n−ε−1}).

Then the matrix equation (22) can be replaced by a system of scalar equations that expresses the equality of the elements of the k-th column on the right-hand and left-hand sides of (22) (k = 1, 2, ..., n − ε − 1):

¹¹ This follows from the fact that the rank of the matrix (20) for k = ε − 1 is equal to εn; a similar equation holds for the rank of the matrix M_{ε−1}[L_ε].

34   XII. SINGULAR PENCILS OF MATRICES

x_{2k} + λx_{1k} = d_{1k} + λf_{1k} + y₁a_k + λy₁b_k,
x_{3k} + λx_{2k} = d_{2k} + λf_{2k} + y₂a_k + λy₂b_k,
x_{4k} + λx_{3k} = d_{3k} + λf_{3k} + y₃a_k + λy₃b_k,   (23)
. . . . . . . . . . . . . . . . . . . . . . . . .
x_{ε+1,k} + λx_{εk} = d_{εk} + λf_{εk} + y_ε a_k + λy_ε b_k
(k = 1, 2, ..., n − ε − 1).

The left-hand sides of these equations are linear binomials in λ. The free term of each of the first ε − 1 of these binomials is equal to the coefficient of λ in the next binomial. But then the right-hand sides must also satisfy this condition. Therefore

y₁a_k − y₂b_k = f_{2k} − d_{1k},
y₂a_k − y₃b_k = f_{3k} − d_{2k},   (24)
. . . . . . . . . . . . . .
y_{ε−1}a_k − y_ε b_k = f_{εk} − d_{ε−1,k}
(k = 1, 2, ..., n − ε − 1).

If (24) holds, then the required elements of X can obviously be determined from (23).

It now remains to show that the system of equations (24) for the elements of Y always has a solution for arbitrary d_{ik} and f_{ik} (i = 1, 2, ..., ε; k = 1, 2, ..., n − ε − 1). Indeed, the matrix formed from the coefficients of the unknown elements of the rows y₁, −y₂, y₃, −y₄, ... can be written, after transposition, in the form

But this is the matrix M_{ε−2} for the pencil of rectangular matrices A + λB (see (10′) on p. 30). The rank of this matrix is (ε − 1)(n − ε − 1), because the equation (A + λB)x = o, by what we have shown, has no solutions of degree less than ε. Thus, the rank of the system of equations (24) is equal to the number of equations, and such a system is consistent (non-contradictory) for arbitrary free terms.

    This completes the proof of the theorem.

  • 4. CANONICAL FORM OF SINGULAR PENCIL 35

4. The Canonical Form of a Singular Pencil of Matrices

1. Let A + λB be an arbitrary singular pencil of matrices of dimension m × n. To begin with, we shall assume that neither among the columns nor among the rows of the pencil is there a linear dependence with constant coefficients.

Let r < n, where r is the rank of the pencil, so that the columns of A + λB are linearly dependent. In this case the equation (A + λB)x = o has a non-zero solution of minimal degree ε₁. From the restriction made at the beginning of this section it follows that ε₁ > 0. Therefore by Theorem 4 the given pencil can be transformed into the form

( L_{ε₁}   O        )
( O        A₁ + λB₁ ),

where the equation (A₁ + λB₁)x⁽¹⁾ = o has no solution x⁽¹⁾ of degree less than ε₁.

If this equation has a non-zero solution of minimal degree ε₂ (where, necessarily, ε₂ ≥ ε₁), then by applying Theorem 4 to the pencil A₁ + λB₁ we can transform the given pencil into the form

( L_{ε₁}   O        O        )
( O        L_{ε₂}   O        )
( O        O        A₂ + λB₂ ).

Continuing this process, we can put the given pencil into the quasi-diagonal form

{L_{ε₁}, L_{ε₂}, ..., L_{ε_p}, A_p + λB_p},   (25)

where 0 < ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p and the equation (A_p + λB_p)x⁽ᵖ⁾ = o has no non-zero solution, so that the columns of A_p + λB_p are linearly independent.¹²

If the rows of A_p + λB_p are linearly dependent, then the transposed pencil A_pᵀ + λB_pᵀ can be put into the form (25), where instead of ε₁, ε₂, ..., ε_p there occur the numbers 0 < η₁ ≤ η₂ ≤ ⋯ ≤ η_q.¹³ But then the given pencil A + λB turns out to be transformable into the quasi-diagonal form

¹² In the special case where ε₁ + ε₂ + ⋯ + ε_p = m the block A_p + λB_p is absent.

¹³ Since no linear dependence with constant coefficients exists among the rows of the pencil A + λB, and consequently of A_p + λB_p, we have η₁ > 0.


{L_{ε₁}, ..., L_{ε_p}, Lᵀ_{η₁}, ..., Lᵀ_{η_q}, A₀ + λB₀}.   (26)

  • 5. MINIMAL INDICES. CRITERION FOR STRONG EQUIVALENCE 37

where there is no longer any linear dependence with constant coefficients among the rows or the columns of the pencil Ã + λB̃. The pencil Ã + λB̃ can now be represented in the form (26). Thus, in the general case, the pencil A + λB can always be put into the canonical quasi-diagonal form

{ʰOᵍ, L_{ε_{g+1}}, ..., L_{ε_p}, Lᵀ_{η_{h+1}}, ..., Lᵀ_{η_q}, A₀ + λB₀},   (29)

where ʰOᵍ denotes the zero matrix with h rows and g columns.

The choice of indices for ε and η is due to the fact that it is convenient here to take ε₁ = ε₂ = ⋯ = ε_g = 0 and η₁ = η₂ = ⋯ = η_h = 0.

When we replace the regular pencil A₀ + λB₀ in (29) by its canonical form (6) (see 2, p. 28), we finally obtain the following quasi-diagonal matrix

{ʰOᵍ; L_{ε_{g+1}}, ..., L_{ε_p}; Lᵀ_{η_{h+1}}, ..., Lᵀ_{η_q}; N^(u₁), ..., N^(u_s); J + λE},   (30)

where the matrix J is of Jordan normal form or of natural normal form and N^(u) = E^(u) + λH^(u).

The matrix (30) is the canonical form of the pencil A + λB in the most general case.

In order to determine the canonical form (30) of a given pencil immediately, without carrying out the successive reduction processes, we shall, following Kronecker, introduce in the next section the concept of the minimal indices of a pencil.
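For experiments it is convenient to evaluate the canonical quasi-diagonal matrix (30) at a numerical value of λ. The sketch below (my own helper names, assuming numpy and scipy) assembles (30) from given h, g, the column indices ε, the row indices η, the orders u of the blocks N^(u), and a Jordan matrix J:

```python
import numpy as np
from scipy.linalg import block_diag

def L(eps, lam):
    """The eps x (eps+1) block L_eps at a numerical value lam:
    lam on the diagonal, 1 on the superdiagonal."""
    return lam * np.eye(eps, eps + 1) + np.eye(eps, eps + 1, k=1)

def N(u, lam):
    """N^(u) = E^(u) + lam*H^(u), for an infinite elementary divisor mu^u."""
    return np.eye(u) + lam * np.eye(u, k=1)

def canonical_pencil(lam, h, g, eps, eta, us, J):
    """The quasi-diagonal matrix (30) evaluated at lam: an h x g zero block,
    the blocks L_eps, the transposed blocks L_eta^T, the N^(u), and J + lam*E."""
    blocks = ([np.zeros((h, g))]
              + [L(e, lam) for e in eps]
              + [L(e, lam).T for e in eta]
              + [N(u, lam) for u in us]
              + [J + lam * np.eye(J.shape[0])])
    return block_diag(*blocks)

# Data matching the worked illustration of the next section (an assumption of
# this sketch): eps = (0, 1, 2), eta = (0, 0, 2), elementary divisors
# lambda^2 and (lambda + 2)^2 (in J) and one infinite divisor mu (an N^(1)).
J = block_diag(np.array([[0., 1.], [0., 0.]]),
               np.array([[2., 1.], [0., 2.]]))
P = canonical_pencil(0.5, h=2, g=1, eps=[1, 2], eta=[2], us=[1], J=J)
print(P.shape)   # prints (13, 13)
```

At a generic value of λ such a pencil has rank 13 − 3 = 10: each L_ε loses one column, each Lᵀ_η one row, and the zero block loses both.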

5. The Minimal Indices of a Pencil. Criterion for Strong Equivalence of Pencils

1. Let A + λB be an arbitrary singular pencil of rectangular matrices. Then k polynomial columns x₁(λ), x₂(λ), ..., x_k(λ) that are solutions of the equation

(A + λB)x = o   (31)

are linearly dependent if the rank of the polynomial matrix formed from these columns, X = [x₁(λ), x₂(λ), ..., x_k(λ)], is less than k. In that case there exist k polynomials p₁(λ), p₂(λ), ..., p_k(λ), not all identically zero, such that

p₁(λ)x₁(λ) + p₂(λ)x₂(λ) + ⋯ + p_k(λ)x_k(λ) ≡ 0.

But if the rank of X is k, then such a dependence does not exist and the solutions x₁(λ), x₂(λ), ..., x_k(λ) are linearly independent.


Among all the solutions of (31) we choose a non-zero solution x₁(λ) of least degree ε₁. Among all the solutions of the same equation that are linearly independent of x₁(λ) we take a solution x₂(λ) of least degree ε₂. Obviously, ε₁ ≤ ε₂. We continue the process, choosing among the solutions that are linearly independent of x₁(λ) and x₂(λ) a solution x₃(λ) of minimal degree ε₃, etc. Since the number of linearly independent solutions of (31) is always at most n, the process must come to an end. We obtain a fundamental series of solutions of (31):

x₁(λ), x₂(λ), ..., x_p(λ)   (32)

having the degrees

ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p.   (33)

In general, a fundamental series of solutions is not uniquely determined (to within scalar factors) by the pencil A + λB. However, two distinct fundamental series of solutions always have one and the same series of degrees ε₁, ε₂, ..., ε_p. For let us consider, in addition to (32), another fundamental series of solutions x̃₁(λ), x̃₂(λ), ... with the degrees ε̃₁, ε̃₂, .... Suppose that in (33)

ε₁ = ⋯ = ε_{n₁} < ε_{n₁+1} = ⋯ = ε_{n₂} < ε_{n₂+1} ≤ ⋯,

and similarly, in the series ε̃₁, ε̃₂, ...,

ε̃₁ = ⋯ = ε̃_{ñ₁} < ε̃_{ñ₁+1} = ⋯ = ε̃_{ñ₂} < ε̃_{ñ₂+1} ≤ ⋯.

Obviously, ε₁ = ε̃₁. Every column x̃ᵢ(λ) (i = 1, 2, ..., ñ₁) is a linear combination of the columns x₁(λ), x₂(λ), ..., x_{n₁}(λ), since otherwise the solution x_{n₁+1}(λ) in (32) could be replaced by x̃ᵢ(λ), which is of smaller degree. It is obvious that, conversely, every column xᵢ(λ) (i = 1, 2, ..., n₁) is a linear combination of the columns x̃₁(λ), x̃₂(λ), ..., x̃_{ñ₁}(λ). Therefore n₁ = ñ₁ and ε_{n₁+1} = ε̃_{n₁+1}. Now by a similar argument we obtain that n₂ = ñ₂ and ε_{n₂+1} = ε̃_{n₂+1}, etc.

2. Every solution x_k(λ) of the fundamental series (32) yields a linear dependence of degree ε_k among the columns of A + λB (k = 1, 2, ..., p). Therefore the numbers ε₁, ε₂, ..., ε_p are called the minimal indices for the columns of the pencil A + λB.

The minimal indices η₁, η₂, ..., η_q for the rows of the pencil A + λB are introduced similarly. Here the equation (A + λB)x = o is replaced by (Aᵀ + λBᵀ)y = o, and η₁, η₂, ..., η_q are defined as the minimal indices for the columns of the transposed pencil Aᵀ + λBᵀ.
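The whole list of minimal indices can be computed from the ranks of the matrices M_k of §3: the nullity d_k of M_k equals the sum of (k + 1 − εᵢ) over all εᵢ ≤ k, so the difference d_k − d_{k−1} counts the minimal indices that are ≤ k. This counting formula is standard but is not spelled out in the text; a sketch assuming numpy:

```python
import numpy as np

def Mk(A, B, k):
    """As in Section 3: the null space of M_k corresponds to the polynomial
    solutions of (A + t*B)x = 0 of degree <= k (coefficients stacked)."""
    m, n = A.shape
    out = np.zeros(((k + 2) * m, (k + 1) * n))
    for j in range(k + 1):
        out[j * m:(j + 1) * m, j * n:(j + 1) * n] = A
        out[(j + 1) * m:(j + 2) * m, j * n:(j + 1) * n] = B
    return out

def column_minimal_indices(A, B, kmax):
    """Multiset of column minimal indices <= kmax via nullity differences."""
    n = A.shape[1]
    d = [0]                                   # d[k+1] = nullity of M_k
    for k in range(kmax + 1):
        d.append((k + 1) * n - np.linalg.matrix_rank(Mk(A, B, k)))
    eps = []
    for k in range(kmax + 1):
        new = (d[k + 1] - d[k]) - (d[k] - d[k - 1] if k > 0 else 0)
        eps += [k] * new                      # indices equal exactly to k
    return eps

# L_2 with an extra zero column glued on: minimal indices 0 and 2.
A = np.array([[0., 0., 1., 0.],
              [0., 0., 0., 1.]])
B = np.array([[0., 1., 0., 0.],
              [0., 0., 1., 0.]])
print(column_minimal_indices(A, B, 3))   # prints [0, 2]
```

Applying the same routine to Aᵀ, Bᵀ yields the row minimal indices η₁, η₂, ....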


Strictly equivalent pencils have the same minimal indices. For let A + λB and P(A + λB)Q be two such pencils (P and Q are non-singular square matrices). Then the equation (31) for the first pencil can be written, after multiplication on the left by P, as follows:

P(A + λB)Q · Q⁻¹x = o.

Hence it is clear that all the solutions of (31), after multiplication on the left by Q⁻¹, give rise to a complete system of solutions of the equation

P(A + λB)Qz = o.

Therefore the pencils A + λB and P(A + λB)Q have the same minimal indices for the columns. That the minimal indices for the rows also coincide can be established by going over to the transposed pencils.

Let us compute the minimal indices for the canonical quasi-diagonal matrix

{ʰOᵍ, L_{ε_{g+1}}, ..., L_{ε_p}, Lᵀ_{η_{h+1}}, ..., Lᵀ_{η_q}, A₀ + λB₀}   (34)

(A₀ + λB₀ is a regular pencil having the normal form (6)).

We note first of all that: The complete system of minimal indices for the columns (rows) of a quasi-diagonal matrix is obtained as the union of the corresponding systems of minimal indices of the individual diagonal blocks. The matrix L_ε has only one index ε for the columns, and its rows are linearly independent. Similarly, the matrix Lᵀ_η has only one index η for the rows, and its columns are linearly independent. Therefore the matrix (34) has as its minimal indices for the columns

ε₁ = ⋯ = ε_g = 0,  ε_{g+1}, ..., ε_p,

and for the rows

η₁ = ⋯ = η_h = 0,  η_{h+1}, ..., η_q.

We note further that L_ε has no elementary divisors, since among its minors of maximal order ε there is one equal to 1 and one equal to λ^ε. The same statement is, of course, true for the transposed matrix Lᵀ_η. Since the elementary divisors of a quasi-diagonal matrix are obtained by combining those of the individual diagonal blocks (see Vol. I, Chapter VI, p. 141), the elementary divisors of the λ-matrix (34) coincide with those of its regular 'kernel' A₀ + λB₀.

The canonical form (34) of the pencil is completely determined by the minimal indices ε₁, ..., ε_p, η₁, ..., η_q and by the elementary divisors of the pencil (34) or, what is the same, of the strictly equivalent pencil A + λB. Since


two pencils having one and the same canonical form are strictly equivalent, we have proved the following theorem:

THEOREM 5 (Kronecker): Two arbitrary pencils A + λB and A₁ + λB₁ of rectangular matrices of the same dimension m × n are strictly equivalent if and only if they have the same minimal indices and the same (finite and infinite) elementary divisors.

In conclusion, we write down, for purposes of illustration, the canonical form of a pencil A + λB with the minimal indices ε₁ = 0, ε₂ = 1, ε₃ = 2, η₁ = 0, η₂ = 0, η₃ = 2 and the elementary divisors λ², (λ + 2)², μ:¹⁵ it is the quasi-diagonal matrix

{²O¹, L₁, L₂, Lᵀ₂, N^(1), λE^(2) + H^(2), (λ + 2)E^(2) + H^(2)}   (35)

with the diagonal blocks

L₁ = (λ  1),   L₂ = ( λ  1  0 ),   Lᵀ₂,   N^(1) = (1),
                    ( 0  λ  1 )

λE^(2) + H^(2) = ( λ  1 ),   (λ + 2)E^(2) + H^(2) = ( λ+2  1   ).
                 ( 0  λ )                           ( 0    λ+2 )

6. Singular Pencils of Quadratic Forms

1. Suppose given two complex quadratic forms:

A(x, x) = Σ_{i,k=1}^n a_{ik} x_i x_k,   B(x, x) = Σ_{i,k=1}^n b_{ik} x_i x_k;   (36)

they generate a pencil of quadratic forms A(x, x) + λB(x, x). This pencil of forms corresponds to a pencil of symmetric matrices A + λB (Aᵀ = A, Bᵀ = B). If we subject the variables in the pencil of forms A(x, x) + λB(x, x) to a non-singular linear transformation x = Tz (|T| ≠ 0), then the transformed pencil of forms Ã(z, z) + λB̃(z, z) corresponds to the pencil of matrices


¹⁵ All the elements of the matrix (35) that are not mentioned expressly are zero.

  • 6. SINGULAR PENCILS OF QUADRATIC FORMS 41

Ã + λB̃ = Tᵀ(A + λB)T;   (37)

here T is a constant (i.e., independent of λ) non-singular square matrix of order n.

Two pencils of matrices A + λB and Ã + λB̃ that are connected by a relation (37) are called congruent (see Definition 1 of Chapter X; Vol. I, p. 296).

Obviously, congruence is a special case of equivalence of pencils of matrices. However, if congruence of two pencils of symmetric (or skew-symmetric) matrices is under consideration, then the concept of congruence coincides with that of equivalence. This is the content of the following theorem.

THEOREM 6: Two strictly equivalent pencils of complex symmetric (or skew-symmetric) matrices are always congruent.

Proof. Write D = A + λB and D̃ = Ã + λB̃ for two strictly equivalent pencils of symmetric (skew-symmetric) matrices:

D̃ = PDQ   (Dᵀ = ±D, D̃ᵀ = ±D̃; |P| ≠ 0, |Q| ≠ 0).   (38)

By going over to the transposed matrices we obtain:

D̃ = QᵀDPᵀ.   (39)

From (38) and (39) we have

DQPᵀ⁻¹ = P⁻¹QᵀD.   (40)

Setting

U = QPᵀ⁻¹,   (41)

we rewrite (40) as follows:

DU = UᵀD.   (42)

From (42) it follows easily that

DUᵏ = (Uᵀ)ᵏD   (k = 0, 1, 2, ...)

and, in general,

DS = SᵀD,   (43)

where

S = f(U)   (44)

and f(λ) is an arbitrary polynomial in λ. Let us assume that this polynomial is chosen such that |S| ≠ 0. Then we have from (43):

D = SᵀDS⁻¹.   (45)


Substituting this expression for D in (38), we have:

D̃ = PSᵀDS⁻¹Q.   (46)

If this relation is to be a congruence transformation, the following equation must be satisfied:

(PSᵀ)ᵀ = S⁻¹Q,

which can be rewritten as

S² = QPᵀ⁻¹ = U.

Now the matrix S = f(U) satisfies this equation if we take as f(λ) the interpolation polynomial for √λ on the spectrum of U. This can be done, because the many-valued function √λ has a single-valued branch determined on the spectrum of U, since |U| ≠ 0.

The equation (46) now becomes the condition for congruence

D̃ = TᵀDT   (T = S⁻¹Q).   (47)

From this theorem and Theorem 5 we deduce:

COROLLARY: Two pencils of quadratic forms

A(x, x) + λB(x, x)  and  Ã(z, z) + λB̃(z, z)

can be carried into one another by a transformation x = Tz (|T| ≠ 0) if and only if the pencils of symmetric matrices A + λB and Ã + λB̃ have the same elementary divisors (finite and infinite) and the same minimal indices.

Note. For pencils of symmetric matrices the rows and the columns have the same minimal indices:

p = q;  ε₁ = η₁, ..., ε_p = η_p.   (48)

2. Let us raise the following question: Given two arbitrary complex quadratic forms

A(x, x) = Σ_{i,k=1}^n a_{ik} x_i x_k,   B(x, x) = Σ_{i,k=1}^n b_{ik} x_i x_k,

under what conditions can the two forms be reduced simultaneously to sums of squares

Σ_{i=1}^n a_i z_i²  and  Σ_{i=1}^n b_i z_i²   (49)

by a non-singular transformation of the variables x = Tz (|T| ≠ 0)?


Let us assume that the quadratic forms A(x, x) and B(x, x) have this property. Then the pencil of matrices A + λB is congruent to the pencil of diagonal matrices

(a₁ + λb₁, a₂ + λb₂, ..., a_n + λb_n).   (50)

Suppose that among the diagonal binomials a_i + λb_i there are precisely r (r ≤ n) that are not identically zero. Without loss of generality we can assume that

a_i + λb_i ≢ 0   (i = n − r + 1, ..., n).   (51)

Setting

A₀ + λB₀ = (a_{n−r+1} + λb_{n−r+1}, ..., a_n + λb_n),   (52)

we represent the matrix (50) in the form

(O, A₀ + λB₀).   (53)

Comparing (53) with (34) (p. 39), we see that in this case all the minimal indices are zero. Moreover, all the elementary divisors are linear. Thus we have obtained the following theorem:

THEOREM 7: Two quadratic forms A(x, x) and B(x, x) can be reduced simultaneously to sums of squares (49) by a transformation of the variables if and only if in the pencil of matrices A + λB all the elementary divisors (finite and infinite) are linear and all the minimal indices are zero.
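A classical sufficient case: when B(x, x) is positive definite, the pencil A + λB is regular with linear elementary divisors only (and, being regular, it has no minimal indices), so Theorem 7 applies; the reducing transformation is then delivered by the symmetric generalized eigenproblem. A sketch of this special case (scipy; not the general construction of the text):

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[1., 2.],
              [2., 0.]])
B = np.array([[2., 0.],
              [0., 1.]])           # positive definite

w, T = eigh(A, B)                  # eigenvectors normalized so T^T B T = E
assert np.allclose(T.T @ B @ T, np.eye(2))    # B(z,z) becomes a sum of squares
assert np.allclose(T.T @ A @ T, np.diag(w))   # A(z,z) becomes sum of a_i z_i^2
print("ok")
```

For indefinite or singular B this shortcut fails in general, and the full hypotheses of Theorem 7 have to be checked.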

In order to reduce two quadratic forms A(x, x) and B(x, x) simultaneously to some canonical form in the general case, we have to replace the pencil of matrices A + λB by a strictly equivalent 'canonical' pencil of symmetric matrices.

Suppose the pencil of symmetric matrices A + λB has the minimal indices ε₁ = ⋯ = ε_g = 0, ε_{g+1} ≠ 0, ..., ε_p ≠ 0, the infinite elementary divisors μ^{u₁}, μ^{u₂}, ..., μ^{u_s}, and the finite ones (λ + λ₁)^{c₁}, (λ + λ₂)^{c₂}, ..., (λ + λ_t)^{c_t}. Then, in the canonical form (30), g = h, p = q, and ε_{g+1} = η_{g+1}, ..., ε_p = η_p. We replace in (30) every two diagonal blocks of the form L_ε and Lᵀ_ε by the single diagonal block

( O     L_ε )
( Lᵀ_ε  O   )

and each block of the form N^(u) = E^(u) + λH^(u) by the


strictly equivalent symmetric block

Ñ^(u) = V^(u)N^(u) = ( 0  0  ...  0  1 )
                     ( 0  0  ...  1  λ )
                     ( . . . . . . . . )
                     ( 1  λ  ...  0  0 )

with

V^(u) = ( 0  0  ...  0  1 )
        ( 0  0  ...  1  0 )
        ( . . . . . . . . )
        ( 1  0  ...  0  0 ).   (54)

Moreover, instead of the regular diagonal block J + λE in (30) (J is a Jordan matrix)

J + λE = {(λ + λ₁)E^(c₁) + H^(c₁), ..., (λ + λ_t)E^(c_t) + H^(c_t)},

we take the strictly equivalent block

{Z₁^(c₁), ..., Z_t^(c_t)},   (55)

where

Zᵢ^(cᵢ) = V^(cᵢ)[(λ + λᵢ)E^(cᵢ) + H^(cᵢ)] = ( 0     ...   0     λ+λᵢ )
                                            ( 0     ...   λ+λᵢ  1    )
                                            ( . . . . . . . . . . .  )
                                            ( λ+λᵢ  1    ...    0    )   (i = 1, 2, ..., t).   (56)
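That multiplication by the flip matrix V^(u) really symmetrizes the bidiagonal blocks, as claimed in (54) and (56), is quick to verify numerically (a sketch with assumed helper names, using numpy):

```python
import numpy as np

def V(u):
    """Anti-identity matrix V^(u): ones on the secondary diagonal."""
    return np.eye(u)[::-1].copy()

def N(u, lam):
    """N^(u) = E^(u) + lam*H^(u)."""
    return np.eye(u) + lam * np.eye(u, k=1)

def Z(c, lam, lam_i):
    """Block (56): V^(c) [(lam + lam_i) E^(c) + H^(c)]."""
    return V(c) @ ((lam + lam_i) * np.eye(c) + np.eye(c, k=1))

# Flipping the rows of an upper bidiagonal block puts its entries on and just
# above the secondary diagonal, which is a symmetric pattern:
lam = 0.7
assert np.allclose(V(4) @ N(4, lam), (V(4) @ N(4, lam)).T)   # block (54)
assert np.allclose(Z(3, lam, 2.0), Z(3, lam, 2.0).T)         # block (56)
print("ok")
```

Since V^(u) is non-singular and constant, each replacement is indeed a strict equivalence.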

The pencil A + λB is strictly equivalent to the symmetric pencil

Ã + λB̃ = { O,  [O  L_{ε_{g+1}}; Lᵀ_{ε_{g+1}}  O], ..., [O  L_{ε_p}; Lᵀ_{ε_p}  O],  Ñ^(u₁), ..., Ñ^(u_s),  Z₁^(c₁), ..., Z_t^(c_t) }.   (57)

Two quadratic forms with complex coefficients A(x, x) and B(x, x) can be simultaneously reduced to the canonical forms Ã(z, z) and B̃(z, z) defined by (57) by a transformation of the variables x = Tz (|T| ≠ 0).¹⁷

¹⁷ In the Russian edition the author stated that propositions analogous to Theorems 6 and 7 hold for hermitian forms. A. I. Mal'cev has pointed out to the author that this is not the case. As regards singular pencils of hermitian forms, see [197].

  • 7. APPLICATION TO DIFFERENTIAL EQUATIONS 45

7. Application to Differential Equations

1. The results obtained will now be applied to a system of m linear differential equations of the first order in n unknown functions with constant coefficients:¹⁸

Σ_{k=1}^n ( a_{ik}x_k + b_{ik} dx_k/dt ) = f_i(t)   (i = 1, 2, ..., m),   (58)

or in matrix notation:

Ax + B dx/dt = f(t);   (59)

here¹⁹

A = ‖a_{ik}‖,  B = ‖b_{ik}‖,  x = (x₁, x₂, ..., x_n),  f = (f₁, f₂, ..., f_m).

We introduce new unknown functions z₁, z₂, ..., z_n that are connected with the old x₁, x₂, ..., x_n by a linear non-singular transformation with constant coefficients:

x = Qz   (|Q| ≠ 0).   (60)

Moreover, instead of the equations (58) we can take m arbitrary independent combinations of these equations, which is equivalent to multiplying the matrices A, B, f on the left by a square non-singular matrix P of order m. Substituting Qz for x in (59) and multiplying (59) on the left by P, we obtain:

Ãz + B̃ dz/dt = f̃(t),   (61)

where

Ã = PAQ,  B̃ = PBQ,  f̃ = Pf = (f̃₁, f̃₂, ..., f̃_m).   (62)

The matrix pencils A + λB and Ã + λB̃ are strictly equivalent:

Ã + λB̃ = P(A + λB)Q.   (63)

We choose the matrices P and Q such that the pencil Ã + λB̃ has the canonical quasi-diagonal form

¹⁸ The particular case where m = n and the system (58) is solved with respect to the derivatives has been treated in detail in Vol. I, Chapter V, §5. It is well known that a system of linear differential equations with constant coefficients of arbitrary order s can be reduced to the form (58) if all the derivatives of the unknown functions up to and including the order s − 1 are included as additional unknown functions.

¹⁹ We recall that parentheses denote column matrices. Thus, x = (x₁, x₂, ..., x_n) is the column with the elements x₁, x₂, ..., x_n.


Ã + λB̃ = {O, L_{ε_{g+1}}, ..., L_{ε_p}, Lᵀ_{η_{h+1}}, ..., Lᵀ_{η_q}, N^(u₁), ..., N^(u_s), J + λE}.   (64)

In accordance with the diagonal blocks in (64), the system of differential equations splits into ν = p − g + q − h + s + 2 separate systems of the form

O z^(1) = f^(1),   (65)

L_{ε_{g+i}}(d/dt) z^(1+i) = f^(1+i)   (i = 1, 2, ..., p − g),   (66)

Lᵀ_{η_{h+j}}(d/dt) z^(p−g+1+j) = f^(p−g+1+j)   (j = 1, 2, ..., q − h),   (67)

N^(u_k)(d/dt) z^(p−g+q−h+1+k) = f^(p−g+q−h+1+k)   (k = 1, 2, ..., s),   (68)

(J + E d/dt) z^(ν) = f^(ν),   (69)

where

z = (z^(1), z^(2), ..., z^(ν)),  f̃ = (f^(1), f^(2), ..., f^(ν)),   (70)

z^(1) = (z₁, ..., z_g),  f^(1) = (f̃₁, ..., f̃_h),  z^(2) = (z_{g+1}, ...),  f^(2) = (f̃_{h+1}, ...),  etc.,   (71)

and

A(d/dt) = A + B d/dt,  if  A(λ) = A + λB.   (72)

Thus, the integration of the system (59) in the most general case is reduced to the integration of the special systems (65)-(69) of the same type. In these systems the matrix pencil A + λB has the form O, L_ε, Lᵀ_η, N^(u), and J + λE, respectively.

1) The system (65) is consistent if and only if

f̃₁ = 0, ..., f̃_h = 0.   (73)


In that case we can take arbitrary functions of t for the unknown functions z₁, z₂, ..., z_g that form the column z^(1).

2) The system (66) is of the form

L_ε(d/dt) z = f   (74)

or, more explicitly,²⁰

dz₁/dt + z₂ = f₁(t),  dz₂/dt + z₃ = f₂(t),  ...,  dz_ε/dt + z_{ε+1} = f_ε(t).   (75)

Such a system is always consistent. If we take for z_{ε+1}(t) an arbitrary function of t, then all the remaining unknown functions z_ε, z_{ε−1}, ..., z₁ can be determined from (75) by successive quadratures.

3) The system (67) is of the form

Lᵀ_η(d/dt) z = f   (76)

or, more explicitly,²¹

dz₁/dt = f₁(t),  z₁ + dz₂/dt = f₂(t),  ...,  z_{η−1} + dz_η/dt = f_η(t),  z_η = f_{η+1}(t).   (77)

From all the equations (77) except the first we determine z_η, z_{η−1}, ..., z₁ uniquely:

z_η = f_{η+1},
z_{η−1} = f_η − df_{η+1}/dt,   (78)
. . . . . . . . . . . . . . . . .
z₁ = f₂ − df₃/dt + ⋯ + (−1)^{η−1} d^{η−1}f_{η+1}/dt^{η−1}.

Substituting this expression for z₁ into the first equation, we obtain the condition for consistency:

f₁ = df₂/dt − d²f₃/dt² + ⋯ + (−1)^{η−1} d^η f_{η+1}/dt^η.   (79)

²⁰ We have changed the indices of z and f̃ to simplify the notation. In order to return from (75) to (66) we have to replace ε by ε_{g+i} and add to each index of z the number g + ε_{g+1} + ⋯ + ε_{g+i−1} + i − 1, and to each index of f̃ the number h + ε_{g+1} + ⋯ + ε_{g+i−1}.

²¹ Here, as in the preceding case, we have changed the notation. See the preceding footnote.


4) The system (68) is of the form

N^(u)(d/dt) z = f   (80)

or, more explicitly,

z₁ + dz₂/dt = f₁,  z₂ + dz₃/dt = f₂,  ...,  z_{u−1} + dz_u/dt = f_{u−1},  z_u = f_u.   (81)

Hence we determine successively the unique solutions

z_u = f_u,
z_{u−1} = f_{u−1} − df_u/dt,   (82)
. . . . . . . . . . . . . . . . .
z₁ = f₁ − df₂/dt + ⋯ + (−1)^{u−1} d^{u−1}f_u/dt^{u−1}.
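The back-substitution (82) can be verified directly for polynomial right-hand sides: with z_k = f_k − f′_{k+1} + f″_{k+2} − ⋯ the alternating signs telescope, so every equation of (81) holds identically. A sketch (numpy polynomials; the data below are made up for the test):

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def dn(p, n):
    """n-th derivative of a polynomial."""
    for _ in range(n):
        p = p.deriv()
    return p

u = 4
f = [P([1., 2., 0., 3.]), P([0., 1., 1.]),
     P([5., -2., 0., 0., 1.]), P([2., 2.])]   # arbitrary right-hand sides

# Formula (82), 0-based: z_k = sum_{j >= k} (-1)^(j-k) * f_j^{(j-k)}
z = [sum((-1.) ** (j - k) * dn(f[j], j - k) for j in range(k, u))
     for k in range(u)]

# Equations (81): z_k + dz_{k+1}/dt = f_k for k < u, and z_u = f_u.
for k in range(u - 1):
    assert np.allclose((z[k] + z[k + 1].deriv() - f[k]).coef, 0.0)
assert np.allclose((z[u - 1] - f[u - 1]).coef, 0.0)
print("ok")
```

The same telescoping argument verifies (78) and the consistency condition (79) for the system (77).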

5) The system (69) is of the form

dz/dt + Jz = f(t).   (83)

As we have proved in Vol. I, Chapter V, §5, the general solution of such a system has the form

z = e^{−Jt} z₀ + ∫₀ᵗ e^{−J(t−τ)} f(τ) dτ;   (84)

here z₀ is a column matrix with arbitrary elements (the initial values of the unknown functions for t = 0).
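Formula (84) is easy to check numerically for a constant right-hand side f(t) ≡ f₀, in which case the integral has the closed form J⁻¹(E − e^{−Jt})f₀ provided J is invertible (the matrices below are made up for the test; scipy's `expm`):

```python
import numpy as np
from scipy.linalg import expm

J = np.array([[2., 1.],
              [0., 2.]])
z0 = np.array([1., -1.])           # arbitrary initial values
f0 = np.array([0.5, 2.0])          # constant right-hand side

def z(t):
    """Formula (84) with the integral evaluated in closed form."""
    return (expm(-J * t) @ z0
            + np.linalg.solve(J, (np.eye(2) - expm(-J * t)) @ f0))

t, h = 0.8, 1e-6
dz = (z(t + h) - z(t - h)) / (2 * h)               # central difference
assert np.allclose(dz + J @ z(t), f0, atol=1e-5)   # system (83) is satisfied
print("ok")
```

For a non-constant f the integral in (84) would be evaluated by numerical quadrature instead.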

The inverse transition from the system (61) to (59) is effected by the formulas (60) and (62), according to which each of the functions x₁, ..., x_n is a linear combination of the functions z₁, ..., z_n, and each of the functions f̃₁(t), ..., f̃_m(t) is expressed linearly (with constant coefficients) in terms of the functions f₁(t), ..., f_m(t).

2. The preceding analysis shows that: In general, for the consistency of the system (58) certain well-defined linear dependence relations (with constant coefficients) must hold among the right-hand sides of the equations and the derivatives of these right-hand sides.

If these relations are satisfied, then the general solution of the system contains both arbitrary constants and arbitrary functions linearly.

The character of the consistency conditions and the character of the solutions (in particular, the number of arbitrary constants and arbitrary functions) are determined by the minimal indices and the elementary divisors of the pencil A + λB, because the canonical form (65)-(69) of the system of differential equations depends on these minimal indices and elementary divisors.

CHAPTER XIII

MATRICES WITH NON-NEGATIVE ELEMENTS

In this chapter we shall study properties of real matrices with non-negative elements. Such matrices have important applications in the theory of probability, where they are used for the investigation of Markov chains ('stochastic matrices,' see [46]), and in the theory of small oscillations of elastic systems ('oscillation matrices,' see [17]).

1. General Properties

1. We begin with some definitions.

DEFINITION 1: A rectangular matrix A with real elements

A = ‖a_{ik}‖   (i = 1, 2, ..., m; k = 1, 2, ..., n)

is called non-negative (notation: A ≥ 0) or positive