
Constructing Rotation Matrices using Power Series

David Eberly, Geometric Tools, Redmond WA 98052
https://www.geometrictools.com/

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Created: December 11, 2007
Last Modified: December 29, 2007

Contents

1 Introduction
   1.1 Power Series of Functions
   1.2 Power Series Involving Matrices
   1.3 Reduction of the Matrix Power Series
   1.4 The Cayley-Hamilton Theorem
   1.5 Ordinary Difference Equations
   1.6 Generating Rotation Matrices from Skew-Symmetric Matrices

2 Rotation Matrices in 2D

3 Rotation Matrices in 3D

4 Rotation Matrices in 4D
   4.1 The Case δ = 0
   4.2 The Case δ ≠ 0
      4.2.1 The Case ρ > 0
      4.2.2 The Case ρ = 0
   4.3 Summary of the Formulas
   4.4 Source Code for Random Generation

5 Fixed Points, Invariant Spaces, and Direct Sums
   5.1 Fixed Points
   5.2 Invariant Spaces
   5.3 Direct Sums
   5.4 Rotations in 4D
      5.4.1 The Case (d, e, f) = (0, 0, 0)
      5.4.2 The Case (a, b, c) = (0, 0, 0)
      5.4.3 The Case (d, e, f) ≠ 0, (a, b, c) ≠ 0, and δ = 0
      5.4.4 The Case (d, e, f) ≠ 0, (a, b, c) ≠ 0, and δ ≠ 0
      5.4.5 A Summary of the Factorizations

References


1 Introduction

Although we tend to work with rotation matrices in two or three dimensions, sometimes the question arises about how to generate rotation matrices in arbitrary dimensions. This document describes a method for computing rotation matrices using power series of matrices. The approach is one you see in an undergraduate mathematics course on solving systems of linear differential equations with constant coefficients; for example, see [Bra83, Chapter 3].

1.1 Power Series of Functions

The natural exponential function is introduced in Calculus, namely, exp(x) = e^x, where the base is approximately e ≐ 2.718281828. The function may be written as a power series

$$\exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^k}{k!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!} \tag{1}$$

The power series is known to converge for any real number x. The formula extends to complex numbers z = x + iy, where x and y are real numbers and i = √−1,

$$e^z = e^{x+iy} = e^x e^{iy} = \exp(x)\left(\cos(y) + i\sin(y)\right) \tag{2}$$

The term exp(x) may be written as a power series using Equation (1). The trigonometric terms also have power series representations,

$$\sin(y) = y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots = \sum_{k=0}^{\infty} \frac{(-1)^k y^{2k+1}}{(2k+1)!} \tag{3}$$

$$\cos(y) = 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots = \sum_{k=0}^{\infty} \frac{(-1)^k y^{2k}}{(2k)!} \tag{4}$$

The latter two power series also converge for any real number y.

1.2 Power Series Involving Matrices

The power series representations extend to functions whose inputs are square matrices rather than scalars, taking us into the realm of matrix algebra; for example, see [HJ85]. That is, exp(M), cos(M), and sin(M) are power series of the square matrix M, and they converge for all M.

In particular, we are interested in exp(M) for a square matrix M. A note of caution is in order. The scalar-valued exponential function exp(x) has various properties that do not immediately extend to the matrix-valued function. For example, we know that

$$\exp(a+b) = e^{a+b} = e^a e^b = e^b e^a = e^{b+a} = \exp(b+a) \tag{5}$$

for any scalars a and b. This formula does not apply to all pairs of matrices A and B. The problem is that matrix multiplication is not commutative, so reversing the order of the terms exp(A) and exp(B) in the products generally produces different values. It is true, though, that

$$\exp(A+B) = \exp(A)\exp(B), \quad \text{when } AB = BA \tag{6}$$


That is, as long as A and B commute in the multiplication, the usual property of power-of-sum-equals-product-of-powers applies.

The power series for exp(M) of a square matrix M is formally defined as

$$\exp(M) = I + M + \frac{M^2}{2!} + \cdots + \frac{M^k}{k!} + \cdots = \sum_{k=0}^{\infty} \frac{M^k}{k!} \tag{7}$$

and converges for all M. The first term, I, is the identity matrix of the same size as M. The immediate question is how one goes about computing the power series. That is one of the goals of this document and is described next.
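Before turning to those reductions, here is a minimal standalone C++ sketch (added for illustration; it is not the author's library code) that evaluates Equation (7) by accumulating terms of the series until they are numerically negligible. The fixed 3 × 3 size, the tolerance, and the term cap are arbitrary choices.

// Evaluate exp(M) by truncating the power series (7): accumulate M^k/k! until
// the largest entry of the current term is negligible.
#include <array>
#include <cmath>
#include <cstdio>

constexpr int N = 3;
using Mat = std::array<std::array<double, N>, N>;

Mat Multiply(const Mat& A, const Mat& B)
{
    Mat C{};
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c)
            for (int k = 0; k < N; ++k)
                C[r][c] += A[r][k] * B[k][c];
    return C;
}

Mat ExpBySeries(const Mat& M, int maxTerms = 50)
{
    Mat result{}, term{};
    for (int i = 0; i < N; ++i) { result[i][i] = 1.0; term[i][i] = 1.0; }  // start with I
    for (int k = 1; k <= maxTerms; ++k)
    {
        term = Multiply(term, M);                          // term is now M^k/(k-1)!
        for (auto& row : term) for (auto& x : row) x /= k; // term is now M^k/k!
        double maxEntry = 0.0;
        for (int r = 0; r < N; ++r)
            for (int c = 0; c < N; ++c)
            {
                result[r][c] += term[r][c];
                maxEntry = std::fmax(maxEntry, std::fabs(term[r][c]));
            }
        if (maxEntry < 1e-16) break;                       // series has converged numerically
    }
    return result;
}

int main()
{
    double const theta = 0.7;
    Mat S{};                     // a 3x3 skew-symmetric matrix (z-axis rotation generator)
    S[0][1] = -theta;
    S[1][0] = theta;
    Mat R = ExpBySeries(S);
    for (int r = 0; r < N; ++r)
        std::printf("%8.5f %8.5f %8.5f\n", R[r][0], R[r][1], R[r][2]);
    // The upper-left 2x2 block should match cos/sin of theta; see Section 2.
}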

1.3 Reduction of the Matrix Power Series

Suppose that the n × n matrix M is diagonalizable; that is, suppose we may factor M = PDP^{-1}, where D = Diag(d_1, ..., d_n) is a diagonal matrix and P is an invertible matrix with inverse P^{-1}. The matrix powers are easily shown to be M^k = PD^kP^{-1}, where D^k = Diag(d_1^k, ..., d_n^k). The power series factors as

$$\exp(M) = \sum_{k=0}^{\infty} \frac{M^k}{k!} = \sum_{k=0}^{\infty} P\,\frac{D^k}{k!}\,P^{-1} = P\left(\sum_{k=0}^{\infty} \frac{D^k}{k!}\right)P^{-1} = P\,\mathrm{Diag}\!\left(\sum_{k=0}^{\infty}\frac{d_1^k}{k!},\ldots,\sum_{k=0}^{\infty}\frac{d_n^k}{k!}\right)P^{-1} = P\,\mathrm{Diag}\!\left(e^{d_1},\ldots,e^{d_n}\right)P^{-1} \tag{8}$$

The expression is valid regardless of whether P and D have real-valued or complex-valued entries. For the purpose of numerical computation, when D has a real-valued entry d_j, then exp(d_j) may be computed using any standard mathematics library. Naturally, the computation is only an approximation. When D has a complex-valued entry d_j = x_j + iy_j, then Equation (2) may be used to obtain exp(d_j) = exp(x_j)(cos(y_j) + i sin(y_j)). All of exp(x_j), cos(y_j), and sin(y_j) may be computed using any standard mathematics library.

The problem, though, is that not all matrices M are diagonalizable. A special case that arises often is a real-valued symmetric matrix M. Such matrices have only real-valued eigenvalues and a complete basis of orthonormal eigenvectors. The eigenvalues are the diagonal entries of D and the corresponding eigenvectors become the columns of P. The orthonormality of the eigenvectors guarantees that P is orthogonal, so P^{-1} = P^T (the transpose is the inverse). A more general result is discussed in [HS74, Chapter 6], which applies to all matrices. A brief summary is provided here.

A real-valued matrix S that is diagonalizable is said to be semisimple. A real-valued matrix N is nilpotent if there exists a power p ≥ 1 for which N^p = 0. Generally, a real-valued matrix M may be uniquely decomposed as M = S + N, where S is real-valued and semisimple, N is real-valued and nilpotent, and SN = NS. Because S is diagonalizable, we may factor it as S = PDP^{-1}, where D = Diag(d_1, ..., d_n) is a diagonal matrix and P is invertible. What this means regarding the exponential function is

$$\exp(M) = \exp(S+N) = \exp(S)\exp(N) = P\,\mathrm{Diag}\!\left(e^{d_1},\ldots,e^{d_n}\right)P^{-1}\sum_{k=0}^{p-1}\frac{N^k}{k!} \tag{9}$$

The key here is that the power series for exp(N) is really a finite sum because N is nilpotent. Equation (9) shows us how to compute exp(M) for any matrix M; in particular, the equation may be implemented on a computer. Notice that in the special case of a symmetric matrix M, it must be that M = S and N = 0.
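As a small worked illustration of Equation (9) (added here; it is not part of the original document), consider the 2 × 2 matrix

$$M = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix} = S + N, \qquad S = \lambda I, \qquad N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$

Here S is semisimple (already diagonal), N is nilpotent with N^2 = 0, and SN = NS. Equation (9) then gives exp(M) = e^λ (I + N), a finite computation with no infinite series left to evaluate.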


The discussion here allows for complex numbers. In particular, P and D might have complex-valued entries. Instead, we may factor the n × n matrix S = QEQ^{-1} as follows. Let S have r real-valued eigenvalues λ_1 through λ_r and c complex-valued eigenvalues α_1 + iβ_1 through α_c + iβ_c, where n = r + c. Because S is real-valued, the complex-valued eigenvalues appear in pairs, α_j + iβ_j and α_j − iβ_j; thus, c is even, say, c = 2m. We may construct a basis of vectors for IR^n that become the columns of the matrix Q and for which the matrix E has the block-diagonal form

$$E = \mathrm{Diag}\!\left(\lambda_1,\ \ldots,\ \lambda_r,\ \begin{bmatrix} \alpha_1 & \beta_1 \\ -\beta_1 & \alpha_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} \alpha_m & \beta_m \\ -\beta_m & \alpha_m \end{bmatrix}\right) \tag{10}$$

Specifically, let U_i be a real-valued eigenvector corresponding to the real-valued eigenvalue λ_i for 1 ≤ i ≤ r. Let V_j = X_j + iY_j be a complex-valued eigenvector corresponding to the complex-valued eigenvalue α_j + iβ_j for 1 ≤ j ≤ m. The numbers α and β are real-valued and the vectors X and Y have real-valued components. Because S is real-valued, when µ is a complex-valued eigenvalue with eigenvector V, the conjugate of µ is also an eigenvalue and its eigenvector is the componentwise conjugate of V. The matrix P has columns that are the eigenvectors,

$$P = \begin{bmatrix} \mathbf{U}_1 & \cdots & \mathbf{U}_r & \mathbf{V}_1 & \bar{\mathbf{V}}_1 & \cdots & \mathbf{V}_m & \bar{\mathbf{V}}_m \end{bmatrix} \tag{11}$$

To verify the factorization,

$$SP = S\begin{bmatrix} \mathbf{U}_1 & \cdots & \mathbf{U}_r & \mathbf{V}_1 & \bar{\mathbf{V}}_1 & \cdots & \mathbf{V}_m & \bar{\mathbf{V}}_m \end{bmatrix} = \begin{bmatrix} S\mathbf{U}_1 & \cdots & S\mathbf{U}_r & S\mathbf{V}_1 & S\bar{\mathbf{V}}_1 & \cdots & S\mathbf{V}_m & S\bar{\mathbf{V}}_m \end{bmatrix} = \begin{bmatrix} \lambda_1\mathbf{U}_1 & \cdots & \lambda_r\mathbf{U}_r & (\alpha_1+i\beta_1)\mathbf{V}_1 & (\alpha_1-i\beta_1)\bar{\mathbf{V}}_1 & \cdots & (\alpha_m+i\beta_m)\mathbf{V}_m & (\alpha_m-i\beta_m)\bar{\mathbf{V}}_m \end{bmatrix} = PD \tag{12}$$

where D is the diagonal matrix of eigenvalues. Applying the inverse of P to both sides of the equation produces S = PDP^{-1}.

Let µ = α + iβ be a complex-valued eigenvalue of S. Let V = X + iY be a complex-valued eigenvector for µ. We may write SV = (α + iβ)V in terms of real parts and imaginary parts,

$$S\mathbf{X} + iS\mathbf{Y} = (\alpha\mathbf{X} - \beta\mathbf{Y}) + i(\beta\mathbf{X} + \alpha\mathbf{Y}) \tag{13}$$

Equating real parts and imaginary parts,

$$S\mathbf{X} = \alpha\mathbf{X} - \beta\mathbf{Y}, \qquad S\mathbf{Y} = \beta\mathbf{X} + \alpha\mathbf{Y} \tag{14}$$


Writing this in block-matrix form,

$$S\begin{bmatrix} \mathbf{X} & \mathbf{Y} \end{bmatrix} = \begin{bmatrix} \mathbf{X} & \mathbf{Y} \end{bmatrix}\begin{bmatrix} \alpha & \beta \\ -\beta & \alpha \end{bmatrix} \tag{15}$$

Eigenvectors corresponding to two distinct eigenvalues are linearly independent. To see this, let e_1 and e_2 be distinct eigenvalues with corresponding eigenvectors W_1 and W_2, respectively. To demonstrate linear independence, we must show that

$$\mathbf{0} = c_1\mathbf{W}_1 + c_2\mathbf{W}_2 \tag{16}$$

implies c_1 = c_2 = 0. Applying S to Equation (16) leads to

$$\mathbf{0} = c_1 S\mathbf{W}_1 + c_2 S\mathbf{W}_2 = c_1 e_1\mathbf{W}_1 + c_2 e_2\mathbf{W}_2 \tag{17}$$

Multiplying Equation (16) by e_1 and subtracting from Equation (17) produces

$$\mathbf{0} = c_2(e_2 - e_1)\mathbf{W}_2 \tag{18}$$

The eigenvalues are distinct, so e_2 − e_1 ≠ 0. Eigenvectors are nonzero, so the only possibility to satisfy Equation (18) is c_2 = 0, which in turn implies 0 = c_1 W_1 and c_1 = 0.

This result implies that V and its conjugate V̄ are linearly independent vectors, and consequently X = (V + V̄)/2 and Y = (V − V̄)/(2i) are linearly independent vectors. In the matrix P of Equation (11), we may replace each pair of complex-valued eigenvectors V and V̄ by the real-valued vectors X and Y to obtain a real-valued and invertible matrix

$$Q = \begin{bmatrix} \mathbf{U}_1 & \cdots & \mathbf{U}_r & \mathbf{X}_1 & \mathbf{Y}_1 & \cdots & \mathbf{X}_m & \mathbf{Y}_m \end{bmatrix} \tag{19}$$

Using Equation (15), it may be verified that S = QEQ^{-1}.

The exponential of E has the block-diagonal form

$$\exp(E) = \mathrm{Diag}\!\left(e^{\lambda_1},\ \ldots,\ e^{\lambda_r},\ e^{\alpha_1}\begin{bmatrix} \cos\beta_1 & \sin\beta_1 \\ -\sin\beta_1 & \cos\beta_1 \end{bmatrix},\ \ldots,\ e^{\alpha_m}\begin{bmatrix} \cos\beta_m & \sin\beta_m \\ -\sin\beta_m & \cos\beta_m \end{bmatrix}\right) \tag{20}$$

This follows from

$$E_k = \begin{bmatrix} \alpha_k & \beta_k \\ -\beta_k & \alpha_k \end{bmatrix} = \alpha_k I + \beta_k J \tag{21}$$

where

$$I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad IJ = JI \tag{22}$$


Because I and J commute in their product,

$$\exp(E_k) = \exp(\alpha_k I + \beta_k J) = \exp(\alpha_k I)\exp(\beta_k J) = e^{\alpha_k}\begin{bmatrix} \cos\beta_k & \sin\beta_k \\ -\sin\beta_k & \cos\beta_k \end{bmatrix} \tag{23}$$

The matrix

$$\exp(M) = Q\exp(E)Q^{-1}\sum_{k=0}^{p-1}\frac{N^k}{k!} \tag{24}$$

is therefore computed using only real-valued matrices. The heart of the construction relies on factoring M = S + N, computing the eigenvalues of S, and computing an orthogonal basis for IR^n from S.

1.4 The Cayley-Hamilton Theorem

This is also a useful result that allows a reduction of the power series for exp(M) to a finite sum. The eigenvalues of an n × n matrix M are the roots of the characteristic polynomial

$$p(t) = \det(M - tI) = p_0 + p_1 t + \cdots + p_n t^n = \sum_{k=0}^{n} p_k t^k \tag{25}$$

where I is the identity matrix. The degree of p is n and the coefficients are p_0 through p_n = (−1)^n. The characteristic equation is p(t) = 0. The Cayley-Hamilton Theorem states that if you formally substitute M into the characteristic equation, you obtain equality,

$$0 = p(M) = p_0 I + p_1 M + \cdots + p_n M^n = \sum_{k=0}^{n} p_k M^k \tag{26}$$

Multiplying this equation by powers of M and reducing every time the largest power is a multiple of n allows a reduction of the power series for exp(M) to

$$\exp(M) = \sum_{k=0}^{\infty} \frac{M^k}{k!} = \sum_{k=0}^{n-1} c_k M^k \tag{27}$$

for some coefficients c_0 through c_{n−1}. These coefficients are necessarily functions of the characteristic coefficients p_j for 0 ≤ j ≤ n, but the relationships are at first glance complicated to determine in closed form. This leads us into the topic of ordinary difference equations.
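As a concrete illustration (added here for clarity; not part of the original document), let M be 2 × 2. Then p(t) = det(M − tI) = t^2 − tr(M) t + det(M), and the Cayley-Hamilton Theorem gives

$$M^2 = \mathrm{tr}(M)\,M - \det(M)\,I$$

so M^3 = tr(M) M^2 − det(M) M is again a combination of I and M, and inductively so is every power M^k. Substituting these reductions into the series for exp(M) collapses it to the form c_0 I + c_1 M of Equation (27) with n = 2.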

1.5 Ordinary Difference Equations

This is a topic whose theory parallels that of ordinary differential equations. It is a large topic that is not discussed in full detail here; a summary of it is provided in [Ebe03, Appendix D]. Generally, one has a sequence of numbers {x_k}_{k=0}^∞ and a relationship that determines a term in the sequence from the previous n terms:

$$x_{k+n} = f(x_k, \ldots, x_{k+n-1}), \quad k \geq 0 \tag{28}$$


This is an explicit equation for x_{k+n}. The equation is said to have degree n ≥ 1. The initial conditions are user-specified x_0 through x_{n−1}. An implicit equation is

$$F(x_k, \ldots, x_{k+n-1}, x_{k+n}) = 0, \quad k \geq 0 \tag{29}$$

and is generally more difficult to solve because the x_{k+n} term might occur in such a manner as to prevent solving for it explicitly.

The subtopic of interest here is how to solve ordinary difference equations that are linear and have constant coefficients. The general equation of this form is written with all terms on one side of the equation as if it were implicit, but this is still an explicit equation because we may solve for x_{k+n},

$$p_n x_{k+n} + p_{n-1} x_{k+n-1} + \cdots + p_1 x_{k+1} + p_0 x_k = 0 \tag{30}$$

where p_0 through p_n are constants with p_n ≠ 0. Notice that the coefficients may be used to form the characteristic polynomial described in the previous section, p(t) = p_0 + p_1 t + ⋯ + p_n t^n with p_n = (−1)^n.

The method of solution for the linear equation with constant coefficients involves computing the roots of the characteristic equation. Let t_1 through t_ℓ be the distinct roots of p(t) = 0. Let each root t_j have multiplicity m_j; necessarily, n = m_1 + ⋯ + m_ℓ. The Fundamental Theorem of Algebra states that the polynomial factors into a product,

$$p(t) = \prod_{j=1}^{\ell} (t - t_j)^{m_j} \tag{31}$$

Equation (30) has m_j linearly independent solutions corresponding to t_j, namely,

$$x_k = t_j^k, \quad x_k = k\,t_j^k, \quad \ldots, \quad x_k = k^{m_j-1} t_j^k \tag{32}$$

This is the case even when t_j is a complex-valued root. The general solution to Equation (30) is

$$x_k = \sum_{j=1}^{\ell}\left(\sum_{s=0}^{m_j-1} c_{js}\,k^s\right)t_j^k, \quad k \geq n \tag{33}$$

where the n coefficients c_{js} are determined from the initial conditions, which are the user-specified values x_0 through x_{n−1}.

As it turns out, the finite difference equations and constructions apply equally well when the x_k are matrices. In this case, the c_{js} of Equation (33) are themselves matrices.
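As a quick numerical sanity check of Equation (33) (an illustration added here; it uses the familiar Fibonacci recurrence, which does not appear in the original document), the following standalone C++ program iterates a second-order linear difference equation and compares the result with the closed form built from the two simple roots of its characteristic polynomial.

// Compare the iterated solution of x_{k+2} = x_{k+1} + x_k (x_0 = 0, x_1 = 1)
// with the closed form c0*t0^k + c1*t1^k from the roots of t^2 - t - 1 = 0,
// as in Equation (33) with simple roots (all m_j = 1).
#include <cmath>
#include <cstdio>

int main()
{
    double const t0 = (1.0 + std::sqrt(5.0)) / 2.0;   // first root
    double const t1 = (1.0 - std::sqrt(5.0)) / 2.0;   // second root
    // Fit c0, c1 to the initial conditions: c0 + c1 = 0 and c0*t0 + c1*t1 = 1.
    double const c0 = 1.0 / (t0 - t1);
    double const c1 = -c0;

    double xPrev = 0.0, xCurr = 1.0;                   // x_0 and x_1
    for (int k = 2; k <= 10; ++k)
    {
        double xNext = xCurr + xPrev;                  // iterate the recurrence
        double closed = c0 * std::pow(t0, k) + c1 * std::pow(t1, k);
        std::printf("k=%d  iterated=%g  closed form=%g\n", k, xNext, closed);
        xPrev = xCurr;
        xCurr = xNext;
    }
}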

1.6 Generating Rotation Matrices from Skew-Symmetric Matrices

Equation (6) shows how to exponentiate a sum of matrices. As noted, exp(A+B) = exp(A) exp(B) as long as AB = BA. In particular, it is the case that I = exp(0) = exp(A + (−A)) = exp(A) exp(−A), where I is the identity matrix. Thus, we have the formula for inversion of an exponential of a matrix,

$$\exp(A)^{-1} = \exp(-A) \tag{34}$$

The transpose of a sum of matrices is the sum of the transposes of the matrices, a property that also extends to the exponential power series. A consequence is

$$\exp(A)^{T} = \exp\!\left(A^{T}\right) \tag{35}$$


Now consider a skew-symmetric matrix S. Such a matrix has the property S^T = −S. Define R = exp(S). Using Equations (34) and (35),

$$R^T = \exp(S)^T = \exp\!\left(S^T\right) = \exp(-S) = \exp(S)^{-1} = R^{-1} \tag{36}$$

Because the inverse and transpose of R are the same matrix, R must be an orthogonal matrix. Although in the realm of advanced mathematics, it may be shown that in fact R is a rotation matrix. The essence of the argument is that the space of orthogonal matrices has two connected components, one for which the determinant is 1 (rotation matrices) and one for which the determinant is −1 (reflection matrices). Each component is path connected. The curve of orthogonal matrices exp(tS) for t ∈ [0, 1] is a path connecting I (the case t = 0) and R = exp(S) (the case t = 1), so R and I must have the same determinant, which is 1, and R is therefore a rotation matrix.

2 Rotation Matrices in 2D

The general skew-symmetric matrix in 2D is

$$S = \theta\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \tag{37}$$

where θ is any real-valued number. The corresponding rotation matrix is R = exp(S). The characteristic equation for S is

$$0 = p(t) = \det(S - tI) = t^2 + \theta^2 \tag{38}$$

The Cayley-Hamilton Theorem guarantees that

$$S^2 + \theta^2 I = 0 \tag{39}$$

so that S^2 = −θ^2 I. Higher powers of S are S^3 = −θ^2 S, S^4 = −θ^2 S^2 = θ^4 I, and so on. Substituting these into the power series for exp(S) and grouping together the terms involving I and S produces

$$\begin{aligned}
R = \exp(S) &= I + S + \frac{S^2}{2!} + \frac{S^3}{3!} + \frac{S^4}{4!} + \frac{S^5}{5!} + \cdots \\
&= I + S - \frac{\theta^2}{2!}I - \frac{\theta^2}{3!}S + \frac{\theta^4}{4!}I + \frac{\theta^4}{5!}S - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right)I + \left(1 - \frac{\theta^2}{3!} + \frac{\theta^4}{5!} - \cdots\right)S \\
&= \cos(\theta)\,I + \left(\frac{\sin(\theta)}{\theta}\right)S \\
&= \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
\end{aligned} \tag{40}$$
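A small standalone check of Equation (40) (added here; not the author's library code): build R = cos(θ)I + (sin(θ)/θ)S directly from the skew-symmetric matrix of Equation (37) and print it, which should reproduce the familiar 2D rotation matrix.

// Build the 2D rotation matrix from its skew-symmetric generator, Equation (40).
#include <cmath>
#include <cstdio>

int main()
{
    double const theta = 0.9;
    double const S[2][2] = {{0.0, -theta}, {theta, 0.0}};   // Equation (37)

    double R[2][2];
    double const c = std::cos(theta), s = std::sin(theta) / theta;
    for (int r = 0; r < 2; ++r)
        for (int col = 0; col < 2; ++col)
            R[r][col] = c * (r == col ? 1.0 : 0.0) + s * S[r][col];

    // Expect R = [cos(theta) -sin(theta); sin(theta) cos(theta)].
    std::printf("%g %g\n%g %g\n", R[0][0], R[0][1], R[1][0], R[1][1]);
}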

The rotation matrix may be computed using the eigendecomposition method of Equation (8). The eigenvalues of S are ±iθ, where i = √−1. Corresponding eigenvectors are (±i, 1). The matrices D, P, and P^{-1} in the decomposition are

$$D = \begin{bmatrix} i\theta & 0 \\ 0 & -i\theta \end{bmatrix}, \qquad P = \begin{bmatrix} i & -i \\ 1 & 1 \end{bmatrix}, \qquad P^{-1} = \frac{1}{2i}\begin{bmatrix} 1 & i \\ -1 & i \end{bmatrix} \tag{41}$$

It is easily verified that S = PDP^{-1}. The rotation matrix is computed to be

$$R = P\exp(D)P^{-1} = \begin{bmatrix} i & -i \\ 1 & 1 \end{bmatrix}\begin{bmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{bmatrix}\frac{1}{2i}\begin{bmatrix} 1 & i \\ -1 & i \end{bmatrix} = \begin{bmatrix} \frac{e^{i\theta}+e^{-i\theta}}{2} & -\frac{e^{i\theta}-e^{-i\theta}}{2i} \\ \frac{e^{i\theta}-e^{-i\theta}}{2i} & \frac{e^{i\theta}+e^{-i\theta}}{2} \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \tag{42}$$

where we have used the identities e^{±iθ} = cos(θ) ± i sin(θ).

The rotation matrix may also be computed using ordinary finite differences. The linear difference equation is

$$X_{k+2} + \theta^2 X_k = 0, \quad k \geq 0; \qquad X_0 = I, \quad X_1 = S \tag{43}$$

We know from the construction that S^k = X_k. Equation (33) is

$$X_k = (i\theta)^k C_0 + (-i\theta)^k C_1 \tag{44}$$

for unknown coefficient matrices C_0 and C_1. The initial conditions imply

$$I = C_0 + C_1, \qquad S = i\theta C_0 - i\theta C_1 \tag{45}$$

and have the solution

$$C_0 = \frac{I - (i/\theta)S}{2}, \qquad C_1 = \frac{I + (i/\theta)S}{2} \tag{46}$$

The solution to the linear difference equation is

$$S^k = X_k = (i\theta)^k\,\frac{I - (i/\theta)S}{2} + (-i\theta)^k\,\frac{I + (i/\theta)S}{2} \tag{47}$$

When k is even, say, k = 2p, Equation (47) reduces to

$$S^{2p} = (-1)^p\theta^{2p} I, \quad p \geq 1 \tag{48}$$


When k is odd, say, k = 2p + 1, Equation (47) reduces to

$$S^{2p+1} = (-1)^p\theta^{2p} S, \quad p \geq 1 \tag{49}$$

But these powers of S are exactly what we computed manually and substituted into the power series for exp(S) to produce Equation (40).

3 Rotation Matrices in 3D

The general skew-symmetric matrix in 3D is

$$S = \theta\begin{bmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{bmatrix} \tag{50}$$

where θ, a, b, and c are any real-valued numbers with a^2 + b^2 + c^2 = 1. The corresponding rotation matrix is R = exp(S). The characteristic equation for S is

$$0 = p(t) = \det(S - tI) = -t^3 - \theta^2 t \tag{51}$$

The Cayley-Hamilton Theorem guarantees that

$$-S^3 - \theta^2 S = 0 \tag{52}$$

so that S^3 = −θ^2 S. Higher powers of S are S^4 = −θ^2 S^2, S^5 = −θ^2 S^3 = θ^4 S, S^6 = −θ^2 S^4 = θ^4 S^2, and so on. Substituting these into the power series for exp(S) and grouping together the terms involving I, S, and S^2 produces

$$\begin{aligned}
R = \exp(S) &= I + S + \frac{S^2}{2!} + \frac{S^3}{3!} + \frac{S^4}{4!} + \frac{S^5}{5!} + \frac{S^6}{6!} + \cdots \\
&= I + S + \frac{1}{2!}S^2 - \frac{\theta^2}{3!}S - \frac{\theta^2}{4!}S^2 + \frac{\theta^4}{5!}S + \frac{\theta^4}{6!}S^2 - \cdots \\
&= I + \left(1 - \frac{\theta^2}{3!} + \frac{\theta^4}{5!} - \cdots\right)S + \left(\frac{1}{2!} - \frac{\theta^2}{4!} + \frac{\theta^4}{6!} - \cdots\right)S^2 \\
&= I + \left(\frac{\sin(\theta)}{\theta}\right)S + \left(\frac{1 - \cos(\theta)}{\theta^2}\right)S^2
\end{aligned} \tag{53}$$

If we define S̄ = S/θ, then

$$R = I + \sin(\theta)\,\bar{S} + (1 - \cos(\theta))\,\bar{S}^2 \tag{54}$$

which is Rodrigues' Formula for a rotation matrix. The angle of rotation is θ and the axis of rotation has unit-length direction (−c, b, −a).
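The following standalone C++ sketch (added here; not the author's library code) evaluates Equation (54). It parameterizes S̄ by a unit-length axis using the common cross-product convention for the skew-symmetric matrix; that convention is an assumption of this example rather than the letter ordering of Equation (50), and Equation (54) itself only requires S̄ and the angle.

// Rodrigues' Formula, Equation (54): given a unit-length axis and an angle,
// build Sbar and form R = I + sin(theta)*Sbar + (1 - cos(theta))*Sbar^2.
#include <cmath>
#include <cstdio>

int main()
{
    double const w[3] = {0.0, 0.0, 1.0};    // unit-length rotation axis
    double const theta = 0.5;               // rotation angle

    // Sbar is the skew-symmetric matrix with Sbar * v = w x v.
    double const Sbar[3][3] =
    {
        {  0.0, -w[2],  w[1]},
        { w[2],   0.0, -w[0]},
        {-w[1],  w[0],   0.0}
    };

    double Sbar2[3][3] = {};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                Sbar2[r][c] += Sbar[r][k] * Sbar[k][c];

    double R[3][3];
    double const s = std::sin(theta), omc = 1.0 - std::cos(theta);
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            R[r][c] = (r == c ? 1.0 : 0.0) + s * Sbar[r][c] + omc * Sbar2[r][c];

    for (int r = 0; r < 3; ++r)
        std::printf("%8.5f %8.5f %8.5f\n", R[r][0], R[r][1], R[r][2]);
    // For the z-axis the upper-left 2x2 block is the 2D rotation by theta.
}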

The rotation matrix may also be computed using ordinary finite differences. Multiplying the Cayley-Hamilton relation S^3 = −θ^2 S by S^k gives the linear difference equation

$$X_{k+3} + \theta^2 X_{k+1} = 0, \quad k \geq 0; \qquad X_0 = I, \quad X_1 = S, \quad X_2 = S^2 \tag{55}$$

We know from the construction that S^k = X_k. The roots of the characteristic equation are 0 and ±iθ. Equation (33) is

$$S^k = X_k = 0^k\,C_0 + (i\theta)^k C_1 + (-i\theta)^k C_2 \tag{56}$$

for unknown coefficient matrices C_0, C_1, and C_2, where 0^k is interpreted as 1 for k = 0 and 0 for k ≥ 1. The initial conditions imply

$$I = C_0 + C_1 + C_2, \qquad S = i\theta C_1 - i\theta C_2, \qquad S^2 = -\theta^2 C_1 - \theta^2 C_2 \tag{57}$$

and have the solution

$$C_0 = I + \frac{S^2}{\theta^2}, \qquad C_1 = -\frac{S^2 + i\theta S}{2\theta^2}, \qquad C_2 = -\frac{S^2 - i\theta S}{2\theta^2} \tag{58}$$

When k is odd, say, k = 2p + 1, Equation (56) reduces to the following using Equation (58),

$$S^{2p+1} = (-1)^p\theta^{2p} S, \quad p \geq 1 \tag{59}$$

When k is even, say, k = 2p + 2, Equation (56) reduces to the following using Equation (58),

$$S^{2p+2} = (-1)^p\theta^{2p} S^2, \quad p \geq 1 \tag{60}$$

But these powers of S are exactly what we computed manually and substituted into the power series for exp(S) to produce Equation (53).

4 Rotation Matrices in 4D

The general skew-symmetric matrix in 4D is

$$S = \theta\begin{bmatrix} 0 & a & b & d \\ -a & 0 & c & e \\ -b & -c & 0 & f \\ -d & -e & -f & 0 \end{bmatrix} \tag{61}$$

where θ, a, b, c, d, e, and f are any real-valued numbers with a^2 + b^2 + c^2 + d^2 + e^2 + f^2 = 1. The corresponding rotation matrix is R = exp(S). The characteristic equation for S is

$$0 = p(t) = \det(S - tI) = t^4 + \theta^2 t^2 + \theta^4(af - be + cd)^2 = t^4 + \theta^2 t^2 + \theta^4\delta^2 \tag{62}$$

where δ = af − be + cd. The Cayley-Hamilton Theorem guarantees that

$$S^4 + \theta^2 S^2 + \theta^4\delta^2 I = 0 \tag{63}$$

Computing closed-form expressions for powers of S is simple, much like previous dimensions, when δ = 0. It is more difficult when δ ≠ 0. Let us consider the cases separately. In both cases, we assume that θ ≠ 0; otherwise S = 0 and the rotation matrix is the identity matrix.


4.1 The Case δ = 0

Suppose that δ = af − be + cd = 0; then S^4 + θ^2 S^2 = 0. The closed-form expressions for powers of S are easy to generate. Specifically,

$$S^{2p+2} = (-1)^p\theta^{2p} S^2, \qquad S^{2p+3} = (-1)^p\theta^{2p} S^3, \qquad p \geq 1 \tag{64}$$

This leads to the power series reduction

$$R = \exp(S) = I + S + \left(\frac{1 - \cos(\theta)}{\theta^2}\right)S^2 + \left(\frac{\theta - \sin(\theta)}{\theta^3}\right)S^3 \tag{65}$$

Further reduction is possible depending on the entries of S. For example, if d = e = f = 0, we effectively have the 3D rotation in (x, y, z) embedded in 4D; the w-component is unchanged by the rotation. The matrix S really satisfies S^3 + θ^2 S = 0, in which case we may replace S^3 = −θ^2 S in Equation (65) and obtain Equation (53). The fact that S is a "root" of a 3rd-degree polynomial when the characteristic polynomial is 4th-degree is related to the linear algebraic concept of minimal polynomial.

Another reduction occurs, for example, when b = c = d = e = f = 0. The matrix S satisfies S^2 + θ^2 I = 0 and Equation (65) reduces to Equation (40).
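The case δ = 0 is not covered by the sample code in Section 4.4, so here is a small standalone sketch (added for illustration; the particular entries are an arbitrary choice satisfying δ = 0) that evaluates Equation (65) and checks that the result is orthogonal.

// Evaluate Equation (65) for one skew-symmetric S with delta = a*f - b*e + c*d = 0
// and verify that the resulting R is orthogonal.
#include <cmath>
#include <cstdio>

using Mat4 = double[4][4];

void Multiply(Mat4 A, Mat4 B, Mat4 C)
{
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
        {
            C[r][c] = 0.0;
            for (int k = 0; k < 4; ++k) C[r][c] += A[r][k] * B[k][c];
        }
}

int main()
{
    // (a,b,c,d,e,f) = (1,0,0,0,1,0)/sqrt(2) gives delta = 0 and unit sum of squares.
    double const theta = 1.2, s = theta / std::sqrt(2.0);
    Mat4 S =
    {
        { 0.0,    s,  0.0,  0.0},
        {  -s,  0.0,  0.0,    s},
        { 0.0,  0.0,  0.0,  0.0},
        { 0.0,   -s,  0.0,  0.0}
    };
    Mat4 S2, S3;
    Multiply(S, S, S2);
    Multiply(S, S2, S3);

    double const k2 = (1.0 - std::cos(theta)) / (theta * theta);
    double const k3 = (theta - std::sin(theta)) / (theta * theta * theta);
    Mat4 R;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            R[r][c] = (r == c ? 1.0 : 0.0) + S[r][c] + k2 * S2[r][c] + k3 * S3[r][c];

    // Sanity check: R^T * R should be the identity.
    double maxError = 0.0;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
        {
            double dot = 0.0;
            for (int k = 0; k < 4; ++k) dot += R[k][r] * R[k][c];
            maxError = std::fmax(maxError, std::fabs(dot - (r == c ? 1.0 : 0.0)));
        }
    std::printf("max |R^T R - I| entry = %g\n", maxError);
}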

4.2 The Case δ 6= 0

Suppose that δ = af − be + cd ≠ 0. The characteristic polynomial t^4 + θ^2 t^2 + θ^4 δ^2 is a quadratic polynomial in t^2. We may use the quadratic formula to obtain its roots,

$$t^2 = \frac{-\theta^2 \pm \sqrt{\theta^4 - 4\theta^4\delta^2}}{2} = -\theta^2\left(\frac{1 \pm \sqrt{1 - 4\delta^2}}{2}\right) \tag{66}$$

Notice that

$$1 - 4\delta^2 = (a^2+b^2+c^2+d^2+e^2+f^2) - 4(af-be+cd)^2 = \left[(a-f)^2 + (b+e)^2 + (c-d)^2\right]\left[(a+f)^2 + (b-e)^2 + (c+d)^2\right] = \rho^2 \geq 0 \tag{67}$$

where

$$\rho = \sqrt{\left[(a-f)^2 + (b+e)^2 + (c-d)^2\right]\left[(a+f)^2 + (b-e)^2 + (c+d)^2\right]} \tag{68}$$

Thus, the right-hand side of Equation (66) is real-valued. Moreover, 1 − 4δ^2 < 1, so the right-hand side of Equation (66) consists of two negative real numbers. Taking the square roots produces the t-roots for the characteristic equation, all having zero real part,

$$t = \pm i\theta\sqrt{\frac{1 - \rho}{2}} =: \pm i\alpha, \qquad \pm i\theta\sqrt{\frac{1 + \rho}{2}} =: \pm i\beta \tag{69}$$

The values α and β are defined by these equations, both values being positive real numbers with α ≤ β.


4.2.1 The Case ρ > 0

Using the linear difference equation approach, the powers of S are

$$S^k = (i\alpha)^k C_0 + (-i\alpha)^k C_1 + (i\beta)^k C_2 + (-i\beta)^k C_3 \tag{70}$$

where the C-matrices are determined by the initial conditions

$$\begin{aligned}
I &= C_0 + C_1 + C_2 + C_3 \\
S &= i\alpha C_0 - i\alpha C_1 + i\beta C_2 - i\beta C_3 \\
S^2 &= -\alpha^2 C_0 - \alpha^2 C_1 - \beta^2 C_2 - \beta^2 C_3 \\
S^3 &= -i\alpha^3 C_0 + i\alpha^3 C_1 - i\beta^3 C_2 + i\beta^3 C_3
\end{aligned} \tag{71}$$

The solution is

$$\begin{aligned}
C_0 &= \frac{\alpha\beta^3 I - i\beta^3 S + \alpha\beta S^2 - i\beta S^3}{2\alpha\beta(\beta^2-\alpha^2)} \\
C_1 &= \frac{\alpha\beta^3 I + i\beta^3 S + \alpha\beta S^2 + i\beta S^3}{2\alpha\beta(\beta^2-\alpha^2)} \\
C_2 &= \frac{-\alpha^3\beta I + i\alpha^3 S - \alpha\beta S^2 + i\alpha S^3}{2\alpha\beta(\beta^2-\alpha^2)} \\
C_3 &= \frac{-\alpha^3\beta I - i\alpha^3 S - \alpha\beta S^2 - i\alpha S^3}{2\alpha\beta(\beta^2-\alpha^2)}
\end{aligned} \tag{72}$$

We may use Equation (70) to compute four consecutive powers of S, namely,

$$\begin{aligned}
S^{4p} &= \frac{\alpha^{4p}(S^2+\beta^2 I) - \beta^{4p}(S^2+\alpha^2 I)}{\beta^2-\alpha^2} \\
S^{4p+1} &= \frac{\alpha^{4p}(S^3+\beta^2 S) - \beta^{4p}(S^3+\alpha^2 S)}{\beta^2-\alpha^2} \\
S^{4p+2} &= -\frac{\alpha^{4p+2}(S^2+\beta^2 I) - \beta^{4p+2}(S^2+\alpha^2 I)}{\beta^2-\alpha^2} \\
S^{4p+3} &= -\frac{\alpha^{4p+2}(S^3+\beta^2 S) - \beta^{4p+2}(S^3+\alpha^2 S)}{\beta^2-\alpha^2}
\end{aligned} \tag{73}$$

for p ≥ 0. The exponential of S factors to

$$\begin{aligned}
\exp(S) &= \sum_{k=0}^{\infty} \frac{S^k}{k!} = \sum_{p=0}^{\infty} \frac{S^{4p}}{(4p)!} + \sum_{p=0}^{\infty} \frac{S^{4p+1}}{(4p+1)!} + \sum_{p=0}^{\infty} \frac{S^{4p+2}}{(4p+2)!} + \sum_{p=0}^{\infty} \frac{S^{4p+3}}{(4p+3)!} \\
&= \frac{1}{\beta^2-\alpha^2}\Bigg[\left(\sum_{p=0}^{\infty}\frac{\alpha^{4p}}{(4p)!}\right)(S^2+\beta^2 I) - \left(\sum_{p=0}^{\infty}\frac{\beta^{4p}}{(4p)!}\right)(S^2+\alpha^2 I) \\
&\qquad + \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p}}{(4p+1)!}\right)(S^3+\beta^2 S) - \left(\sum_{p=0}^{\infty}\frac{\beta^{4p}}{(4p+1)!}\right)(S^3+\alpha^2 S) \\
&\qquad - \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p+2}}{(4p+2)!}\right)(S^2+\beta^2 I) + \left(\sum_{p=0}^{\infty}\frac{\beta^{4p+2}}{(4p+2)!}\right)(S^2+\alpha^2 I) \\
&\qquad - \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p+2}}{(4p+3)!}\right)(S^3+\beta^2 S) + \left(\sum_{p=0}^{\infty}\frac{\beta^{4p+2}}{(4p+3)!}\right)(S^3+\alpha^2 S)\Bigg]
\end{aligned} \tag{74}$$

This expansion appears to be quite complicated, but the following identities allow us to simplify the expression. Each equation below defines the function f_j(x),

$$\begin{aligned}
f_0(x) &= \sum_{p=0}^{\infty}\frac{x^{4p}}{(4p)!} = \tfrac{1}{4}\left(e^x + e^{-x}\right) + \tfrac{1}{2}\cos(x) \\
f_1(x) &= \sum_{p=0}^{\infty}\frac{x^{4p+1}}{(4p+1)!} = \tfrac{1}{4}\left(e^x - e^{-x}\right) + \tfrac{1}{2}\sin(x) \\
f_2(x) &= \sum_{p=0}^{\infty}\frac{x^{4p+2}}{(4p+2)!} = \tfrac{1}{4}\left(e^x + e^{-x}\right) - \tfrac{1}{2}\cos(x) \\
f_3(x) &= \sum_{p=0}^{\infty}\frac{x^{4p+3}}{(4p+3)!} = \tfrac{1}{4}\left(e^x - e^{-x}\right) - \tfrac{1}{2}\sin(x)
\end{aligned} \tag{75}$$


Notice that f_0(x) = f_1'(x), f_1(x) = f_2'(x), f_2(x) = f_3'(x), and f_3(x) = f_0'(x). It is also easy to verify that exp(x) = f_0(x) + f_1(x) + f_2(x) + f_3(x). Equation (74) becomes

$$\begin{aligned}
R = \exp(S) &= \frac{1}{\beta^2-\alpha^2}\Bigg[f_0(\alpha)(S^2+\beta^2 I) - f_0(\beta)(S^2+\alpha^2 I) + \frac{f_1(\alpha)}{\alpha}(S^3+\beta^2 S) - \frac{f_1(\beta)}{\beta}(S^3+\alpha^2 S) \\
&\qquad - f_2(\alpha)(S^2+\beta^2 I) + f_2(\beta)(S^2+\alpha^2 I) - \frac{f_3(\alpha)}{\alpha}(S^3+\beta^2 S) + \frac{f_3(\beta)}{\beta}(S^3+\alpha^2 S)\Bigg] \\
&= \left(\frac{\beta^2\cos(\alpha) - \alpha^2\cos(\beta)}{\beta^2-\alpha^2}\right)I + \left(\frac{\beta^2\sin(\alpha)/\alpha - \alpha^2\sin(\beta)/\beta}{\beta^2-\alpha^2}\right)S \\
&\qquad + \left(\frac{\cos(\alpha) - \cos(\beta)}{\beta^2-\alpha^2}\right)S^2 + \left(\frac{\sin(\alpha)/\alpha - \sin(\beta)/\beta}{\beta^2-\alpha^2}\right)S^3
\end{aligned} \tag{76}$$

4.2.2 The Case ρ = 0

Let ρ = 0. Based on Equation (67), the only time this can happen is when (a, b, c) = ±(f, −e, d). This condition implies a^2 + b^2 + c^2 = 1/2 and δ^2 = 1/4. The t-roots are ±iθ/√2, each one having multiplicity 2. Define α = θ/√2. Using the linear difference equation approach, the powers of S are

$$S^k = (i\alpha)^k C_0 + (-i\alpha)^k C_1 + k(i\alpha)^k C_2 + k(-i\alpha)^k C_3 \tag{77}$$

where the C-matrices are determined by the initial conditions

$$\begin{aligned}
I &= C_0 + C_1 \\
S &= i\alpha(C_0 - C_1 + C_2 - C_3) \\
S^2 &= -\alpha^2(C_0 + C_1 + 2C_2 + 2C_3) \\
S^3 &= -i\alpha^3(C_0 - C_1 + 3C_2 - 3C_3)
\end{aligned} \tag{78}$$

The solution is

$$\begin{aligned}
C_0 &= \frac{I}{2} - \frac{3iS}{2\sqrt{2}\,\theta} - \frac{iS^3}{\sqrt{2}\,\theta^3}, \qquad &
C_1 &= \frac{I}{2} + \frac{3iS}{2\sqrt{2}\,\theta} + \frac{iS^3}{\sqrt{2}\,\theta^3}, \\
C_2 &= -\frac{I}{4} + \frac{iS}{2\sqrt{2}\,\theta} - \frac{S^2}{2\theta^2} + \frac{iS^3}{\sqrt{2}\,\theta^3}, \qquad &
C_3 &= -\frac{I}{4} - \frac{iS}{2\sqrt{2}\,\theta} - \frac{S^2}{2\theta^2} - \frac{iS^3}{\sqrt{2}\,\theta^3}
\end{aligned} \tag{79}$$

We may use Equation (77) to compute four consecutive powers of S, namely,

$$\begin{aligned}
S^{4p} &= \alpha^{4p} I - 4p\,\alpha^{4p}\left(\frac{I}{2} + \frac{S^2}{2\alpha^2}\right) \\
S^{4p+1} &= \alpha^{4p+1}\frac{S}{\alpha} - 4p\,\alpha^{4p+1}\left(\frac{S}{2\alpha} + \frac{S^3}{2\alpha^3}\right) \\
S^{4p+2} &= \alpha^{4p+2}\frac{S^2}{\alpha^2} + 4p\,\alpha^{4p+2}\left(\frac{I}{2} + \frac{S^2}{2\alpha^2}\right) \\
S^{4p+3} &= \alpha^{4p+3}\frac{S^3}{\alpha^3} + 4p\,\alpha^{4p+3}\left(\frac{S}{2\alpha} + \frac{S^3}{2\alpha^3}\right)
\end{aligned} \tag{80}$$


for p ≥ 0. Define S̄ = S/α. The exponential of S factors to

$$\begin{aligned}
\exp(S) &= \sum_{k=0}^{\infty}\frac{S^k}{k!} = \sum_{p=0}^{\infty}\frac{S^{4p}}{(4p)!} + \sum_{p=0}^{\infty}\frac{S^{4p+1}}{(4p+1)!} + \sum_{p=0}^{\infty}\frac{S^{4p+2}}{(4p+2)!} + \sum_{p=0}^{\infty}\frac{S^{4p+3}}{(4p+3)!} \\
&= \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p}}{(4p)!}\right)I - \left(\sum_{p=0}^{\infty}\frac{4p\,\alpha^{4p}}{(4p)!}\right)\left(\frac{I}{2} + \frac{\bar{S}^2}{2}\right) \\
&\quad + \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p+1}}{(4p+1)!}\right)\bar{S} - \left(\sum_{p=0}^{\infty}\frac{4p\,\alpha^{4p+1}}{(4p+1)!}\right)\left(\frac{\bar{S}}{2} + \frac{\bar{S}^3}{2}\right) \\
&\quad + \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p+2}}{(4p+2)!}\right)\bar{S}^2 + \left(\sum_{p=0}^{\infty}\frac{4p\,\alpha^{4p+2}}{(4p+2)!}\right)\left(\frac{I}{2} + \frac{\bar{S}^2}{2}\right) \\
&\quad + \left(\sum_{p=0}^{\infty}\frac{\alpha^{4p+3}}{(4p+3)!}\right)\bar{S}^3 + \left(\sum_{p=0}^{\infty}\frac{4p\,\alpha^{4p+3}}{(4p+3)!}\right)\left(\frac{\bar{S}}{2} + \frac{\bar{S}^3}{2}\right)
\end{aligned} \tag{81}$$

We may simplify this using the definitions for f_j(x) and the following identities,

$$\begin{aligned}
g_0(x) &= \sum_{p=0}^{\infty}\frac{4p\,x^{4p}}{(4p)!} = x f_3(x) \\
g_1(x) &= \sum_{p=0}^{\infty}\frac{4p\,x^{4p+1}}{(4p+1)!} = x f_0(x) - f_1(x) \\
g_2(x) &= \sum_{p=0}^{\infty}\frac{4p\,x^{4p+2}}{(4p+2)!} = x f_1(x) - 2f_2(x) \\
g_3(x) &= \sum_{p=0}^{\infty}\frac{4p\,x^{4p+3}}{(4p+3)!} = x f_2(x) - 3f_3(x)
\end{aligned} \tag{82}$$

Notice that g_0(x) = g_1'(x), g_1(x) = g_2'(x), and g_2(x) = g_3'(x). Equation (81) becomes

$$\begin{aligned}
R = \exp(S) &= \left(f_0(\alpha) - \frac{g_0(\alpha)}{2} + \frac{g_2(\alpha)}{2}\right)I + \left(f_1(\alpha) - \frac{g_1(\alpha)}{2} + \frac{g_3(\alpha)}{2}\right)\bar{S} \\
&\quad + \left(f_2(\alpha) - \frac{g_0(\alpha)}{2} + \frac{g_2(\alpha)}{2}\right)\bar{S}^2 + \left(f_3(\alpha) - \frac{g_1(\alpha)}{2} + \frac{g_3(\alpha)}{2}\right)\bar{S}^3 \\
&= \left(\frac{2\cos(\alpha) + \alpha\sin(\alpha)}{2}\right)I + \left(\frac{3\sin(\alpha) - \alpha\cos(\alpha)}{2\alpha}\right)S \\
&\quad + \left(\frac{\sin(\alpha)}{2\alpha}\right)S^2 + \left(\frac{\sin(\alpha) - \alpha\cos(\alpha)}{2\alpha^3}\right)S^3
\end{aligned} \tag{83}$$

4.3 Summary of the Formulas

We start with the skew-symmetric matrix

$$S = \theta\begin{bmatrix} 0 & a & b & d \\ -a & 0 & c & e \\ -b & -c & 0 & f \\ -d & -e & -f & 0 \end{bmatrix} \tag{84}$$

where a^2 + b^2 + c^2 + d^2 + e^2 + f^2 = 1. Define

$$\delta = af - be + cd, \qquad \rho = \sqrt{1 - 4\delta^2}, \qquad \alpha = \theta\sqrt{\frac{1-\rho}{2}}, \qquad \beta = \theta\sqrt{\frac{1+\rho}{2}} \tag{85}$$

Equation (76) really is the most general formula for the rotation matrix R corresponding to a skew-symmetric matrix S. To repeat that formula,

$$\begin{aligned}
R &= \left(\frac{\beta^2\cos(\alpha) - \alpha^2\cos(\beta)}{\beta^2-\alpha^2}\right)I + \left(\frac{\beta^2\sin(\alpha)/\alpha - \alpha^2\sin(\beta)/\beta}{\beta^2-\alpha^2}\right)S \\
&\quad + \left(\frac{\cos(\alpha) - \cos(\beta)}{\beta^2-\alpha^2}\right)S^2 + \left(\frac{\sin(\alpha)/\alpha - \sin(\beta)/\beta}{\beta^2-\alpha^2}\right)S^3
\end{aligned} \tag{86}$$

This equation was constructed for when β > α > 0. However, when α = 0, Equation (86) reduces to Equation (65), where it is necessary that β = θ,

$$R = I + S + \left(\frac{1 - \cos(\beta)}{\beta^2}\right)S^2 + \left(\frac{\beta - \sin(\beta)}{\beta^3}\right)S^3 \tag{87}$$

The evaluation of sin(α)/α at α = 0 is done in the limiting sense, lim_{α→0} sin(α)/α = 1. Equation (83) was for the case β = α but may be obtained also from Equation (86) in the limiting sense as β → α. l'Hôpital's Rule may be applied to the coefficients of I, S, S^2, and S^3 to obtain the coefficients in Equation (83),

$$R = \left(\frac{2\cos(\alpha) + \alpha\sin(\alpha)}{2}\right)I + \left(\frac{3\sin(\alpha) - \alpha\cos(\alpha)}{2\alpha}\right)S + \left(\frac{\sin(\alpha)}{2\alpha}\right)S^2 + \left(\frac{\sin(\alpha) - \alpha\cos(\alpha)}{2\alpha^3}\right)S^3 \tag{88}$$

4.4 Source Code for Random Generation

Here is some sample code to illustrate Equation (86) when β ≠ α.

void RandomRotation4D_BetaNotEqualAlpha ()
{
    double a = Mathd::SymmetricRandom();  // number is in [-1,1)
    double b = Mathd::SymmetricRandom();  // number is in [-1,1)
    double c = Mathd::SymmetricRandom();  // number is in [-1,1)
    double d = Mathd::SymmetricRandom();  // number is in [-1,1)
    double e = Mathd::SymmetricRandom();  // number is in [-1,1)
    double f = Mathd::SymmetricRandom();  // number is in [-1,1)
    Matrix4d S(
         0.0,    a,    b,    d,
          -a,  0.0,    c,    e,
          -b,   -c,  0.0,    f,
          -d,   -e,   -f,  0.0
    );

    double theta = Mathd::Sqrt(a*a + b*b + c*c + d*d + e*e + f*f);
    double invLength = 1.0/theta;
    a *= invLength;
    b *= invLength;
    c *= invLength;
    d *= invLength;
    e *= invLength;
    f *= invLength;

    double delta = a*f - b*e + c*d;
    double delta2 = delta*delta;
    double p = Mathd::Sqrt(1.0 - 4.0*delta2);
    double alpha = theta*Mathd::Sqrt(0.5*(1.0 - p));
    double beta = theta*Mathd::Sqrt(0.5*(1.0 + p));
    double invDenom = 1.0/(beta*beta - alpha*alpha);
    double cosa = Mathd::Cos(alpha);
    double sina = Mathd::Sin(alpha);
    double cosb = Mathd::Cos(beta);
    double sinb = Mathd::Sin(beta);

    double k0 = (beta*beta*cosa - alpha*alpha*cosb)*invDenom;
    double k1 = (beta*beta*sina/alpha - alpha*alpha*sinb/beta)*invDenom;
    double k2 = (cosa - cosb)*invDenom;
    double k3 = (sina/alpha - sinb/beta)*invDenom;

    Matrix4d I = Matrix4d::IDENTITY;
    Matrix4d S2 = S*S;
    Matrix4d S3 = S*S2;
    Matrix4d R = k0*I + k1*S + k2*S2 + k3*S3;  // The random rotation matrix.

    // Sanity checks.
    Matrix4d S4 = S*S3;
    double theta2 = theta*theta;
    double theta4 = theta2*theta2;
    Matrix4d zero0 = S4 + theta2*S2 + (theta4*delta2)*I;  // = the zero matrix
    double one = R.Determinant();                         // = 1
    Matrix4d zero1 = R.TransposeTimes(R) - I;             // = the zero matrix
}

Here is some sample code to illustrate the case when β = α.

void RandomRotation4D_BetaEqualAlpha ()
{
    double a = Mathd::SymmetricRandom();  // number is in [-1,1)
    double b = Mathd::SymmetricRandom();  // number is in [-1,1)
    double c = Mathd::SymmetricRandom();  // number is in [-1,1)
    double mult = Mathd::Sqrt(0.5)/Mathd::Sqrt(a*a + b*b + c*c);
    a *= mult;
    b *= mult;
    c *= mult;
    double d = c;
    double e = -b;
    double f = a;
    double theta = Mathd::SymmetricRandom();
    Matrix4d S(
         0.0,    a,    b,    d,
          -a,  0.0,    c,    e,
          -b,   -c,  0.0,    f,
          -d,   -e,   -f,  0.0
    );
    S *= theta;

    double lensqr = a*a + b*b + c*c + d*d + e*e + f*f;  // = 1
    double delta = a*f - b*e + c*d;                     // = 1/2
    double delta2 = delta*delta;                        // = 1/4
    double discr = 1.0 - 4.0*delta2;                    // = 0
    double p = Mathd::Sqrt(Mathd::FAbs(discr));         // = 0
    double alpha = theta*Mathd::Sqrt(0.5);
    double cosa = Mathd::Cos(alpha);
    double sina = Mathd::Sin(alpha);

    double k0 = (2.0*cosa + alpha*sina)/2.0;
    double k1 = (3.0*sina - alpha*cosa)/(2.0*alpha);
    double k2 = sina/(2.0*alpha);
    double k3 = (sina - alpha*cosa)/(2.0*alpha*alpha*alpha);

    Matrix4d I = Matrix4d::IDENTITY;
    Matrix4d S2 = S*S;
    Matrix4d S3 = S*S2;
    Matrix4d R = k0*I + k1*S + k2*S2 + k3*S3;  // The random rotation matrix.

    // Sanity checks.
    Matrix4d S4 = S*S3;
    double theta2 = theta*theta;
    double theta4 = theta2*theta2;
    Matrix4d zero0 = S4 + theta2*S2 + (theta4*delta2)*I;  // = the zero matrix
    double one = R.Determinant();                         // = 1
    Matrix4d zero1 = R.TransposeTimes(R) - I;             // = the zero matrix
}


5 Fixed Points, Invariant Spaces, and Direct Sums

A rotation in IR^2 is sometimes described as "rotation about a point", where the point is the origin, a 0-dimensional quantity living in a 2-dimensional space. A rotation in IR^3 is sometimes described as "rotation about an axis", where the axis is a line, a 1-dimensional quantity in a 3-dimensional space. Sometimes you will hear someone attempt to extend these geometric descriptions to a rotation in IR^4 by saying that you have a "rotation about a plane", the plane being a 2-dimensional quantity living in a 4-dimensional space. This description is not meaningful, and in fact is not technically correct. It is better to describe the rotations in terms of fixed points and invariant spaces.

In the following discussion, I assume that a rotation (rotation matrix) is not the identity (identity matrix).

5.1 Fixed Points

Let F(X) be a vector field; that is, F : IR^n → IR^n, which states that F is a function that maps n-tuples to n-tuples.

If F is a linear function, then it may be represented by a matrix M; that is, F(X) = MX. This representation is assumed to be with respect to the standard Euclidean basis for IR^n, which is the set of vectors {e_k}_{k=1}^n for which e_k has a 1 in component k of the vector and all other components are 0. In IR^2, the basis is {(1, 0), (0, 1)}. In IR^3, the basis is {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. In IR^4, the basis is {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)}. Other matrix representations are possible when you choose different bases.

A point P is said to be a fixed point for the function when F(P) = P. For a linear function, MP = P.

The only fixed point for rotation in IR^2 is the origin 0. If R is the 2 × 2 matrix that represents the rotation, then R0 = 0. For all vectors X ≠ 0, RX ≠ X.

A rotation in IR^3 has infinitely many fixed points, all lying on the axis of rotation. If the axis is the line containing the origin and having direction U, then a point on the axis is tU for some scalar t. If R is the 3 × 3 matrix that represents the rotation about the axis, then R(tU) = tU.

5.2 Invariant Spaces

Let V ⊆ IR^n be a vector subspace. The trivial subspaces are {0}, the set consisting of only the origin, and IR^n, the entire set of n-tuples. In 2D, the nontrivial subspaces are lines containing the origin. In 3D, the nontrivial subspaces are lines and planes containing the origin.

Let F(X) be a linear function from IR^n to IR^n. Define F(V) to be the set of all n-tuples that are mapped to by n-tuples in V; that is, Y ∈ F(V) means that Y = F(X) for some X ∈ V.

V is said to be an invariant space relative to F when F(V) ⊆ V; that is, any vector in V is transformed by F to a vector in V. It is not required that a nonzero vector in V be mapped to itself, so an invariant space might not have any nonzero fixed points.

The trivial subspaces are always invariant under linear transformations. If F(X) = MX for an n × n matrix M, then M0 = 0, in which case 0 is a fixed point and the set {0} is invariant. Every vector X ∈ IR^n is mapped to a vector MX ∈ IR^n, so IR^n is invariant. The more interesting question is whether there are nontrivial invariant subspaces.

The only invariant subspaces for a rotation in 2D are the trivial ones. A line containing the origin is a nontrivial subspace, but a rotation causes a line to rotate to some other line, so the original line is not invariant.

A rotation in 3D has two nontrivial invariant subspaces, the axis of rotation and the plane perpendicular to it. For example, consider the canonical rotation matrix

$$R = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{89}$$

The axis of rotation is the z-axis. Points in the xy-plane are rotated to other points in the xy-plane. Thus, the z-axis is an invariant subspace and the xy-plane is an invariant subspace.

5.3 Direct Sums

Let V_1 through V_m be nontrivial subspaces of IR^n. If every vector X ∈ IR^n can be represented uniquely as

$$\mathbf{X} = \sum_{i=1}^{m} \mathbf{X}_i, \qquad \mathbf{X}_i \in V_i \tag{90}$$

then IR^n is said to be a direct sum of the subspaces, denoted by

$$\mathrm{IR}^n = V_1 \oplus \cdots \oplus V_m = \bigoplus_{i=1}^{m} V_i \tag{91}$$

For example, if V_1 is the set of points in the xy-plane and V_2 is the set of points on the z-axis, then IR^3 = V_1 ⊕ V_2. A vector is represented as (x, y, z) = (x, y, 0) + (0, 0, z), where (x, y, 0) ∈ V_1 and (0, 0, z) ∈ V_2.

Let V_1 through V_m be nontrivial subspaces of IR^n. Let F_i : V_i → V_i be linear transformations on the subspaces; that is, V_i is invariant with respect to F_i. Let F : IR^n → IR^n be a linear transformation for which F(X) = F_i(X) whenever X ∈ V_i. F is said to be a direct sum of the linear functions, denoted by

$$\mathbf{F} = \mathbf{F}_1 \oplus \cdots \oplus \mathbf{F}_m = \bigoplus_{i=1}^{m} \mathbf{F}_i \tag{92}$$

Let the dimension of V_i be denoted by d_i = dim(V_i). Necessarily, n = d_1 + ⋯ + d_m. If F_i(X) = M_i X for matrices M_i, where M_i is a d_i × d_i matrix, then F(X) = MX where M is an n × n matrix and M = Diag(M_1, ..., M_m).

Using our example of a 3D rotation about the z-axis, V_1 is the set of points in the xy-plane, V_2 is the set of points on the z-axis, and IR^3 = V_1 ⊕ V_2. We may choose

$$M_1 = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \qquad M_2 = \begin{bmatrix} 1 \end{bmatrix} \tag{93}$$


in which case the 3D rotation matrix R is

$$R = M_1 \oplus M_2 = \mathrm{Diag}(M_1, M_2) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{94}$$

More generally, let the axis of rotation have unit-length direction C. Let A and B be vectors that span the plane perpendicular to the axis, but choose them so that {A, B, C} is an orthonormal set (vectors are unit-length and mutually perpendicular). The invariant subspaces are

$$V_1 = \{x\mathbf{A} + y\mathbf{B} : (x, y) \in \mathrm{IR}^2\}, \qquad V_2 = \{z\mathbf{C} : z \in \mathrm{IR}\} \tag{95}$$

and {A, B, C} is a basis for IR^3. The coordinates for this basis are X = [x y z]^T and the rotation matrix relative to this basis is R of Equation (94). The rotated vector is X' = RX. To see the representation in Cartesian coordinates, define the matrix P = [A B C] whose columns are the specified vectors. Any vector Y ∈ IR^3 may be represented as

$$\mathbf{Y} = P\mathbf{X} \tag{96}$$

for some vector X. The inverse of P is just its transpose, so X = P^T Y. The rotated vector is

$$\mathbf{Y}' = P\mathbf{X}' = PR\mathbf{X} = PRP^{T}\mathbf{Y} \tag{97}$$

Thus, the rotation matrix in Cartesian coordinates is

$$R' = PRP^{T} \tag{98}$$
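A small standalone C++ sketch of Equation (98) (added here; not from the original document): choose an orthonormal set {A, B, C} with C the rotation axis, form the block-diagonal R of Equation (94), and compute R' = P R P^T. The particular axis and basis below are arbitrary choices.

// Rotation about an arbitrary unit-length axis C via R' = P * R * P^T.
#include <cmath>
#include <cstdio>

int main()
{
    double const theta = 0.8;
    double const s2 = 1.0/std::sqrt(2.0), s3 = 1.0/std::sqrt(3.0), s6 = 1.0/std::sqrt(6.0);
    double const A[3] = { s2, -s2, 0.0 };       // spans the plane perpendicular to C
    double const B[3] = { s6,  s6, -2.0*s6 };   // completes {A,B,C} to an orthonormal set
    double const C[3] = { s3,  s3,  s3 };       // rotation axis

    double P[3][3];                              // columns of P are A, B, C
    for (int r = 0; r < 3; ++r) { P[r][0] = A[r]; P[r][1] = B[r]; P[r][2] = C[r]; }

    // R = Diag(M1, M2) with M1 the 2D rotation and M2 = [1], Equation (94).
    double const cs = std::cos(theta), sn = std::sin(theta);
    double const R[3][3] = {{cs, -sn, 0.0}, {sn, cs, 0.0}, {0.0, 0.0, 1.0}};

    double PR[3][3] = {}, Rprime[3][3] = {};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                PR[r][c] += P[r][k] * R[k][c];
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                Rprime[r][c] += PR[r][k] * P[c][k];   // multiply by P^T

    for (int r = 0; r < 3; ++r)
        std::printf("%8.5f %8.5f %8.5f\n", Rprime[r][0], Rprime[r][1], Rprime[r][2]);

    // The axis is a fixed point of the rotation: R'*C should equal C.
    double RC[3] = {};
    for (int r = 0; r < 3; ++r)
        for (int k = 0; k < 3; ++k)
            RC[r] += Rprime[r][k] * C[k];
    std::printf("R'*C = (%g, %g, %g)\n", RC[0], RC[1], RC[2]);
}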

The question now is how these concepts extend to 4D rotations. The next section describes this.

5.4 Rotations in 4D

We found that the skew-symmetric matrix S of Equation (61) has eigenvalues ±iαθ and ±iβθ, where α and β are real numbers with 0 ≤ α ≤ β. The factorization is

$$S = P\begin{bmatrix} 0 & \alpha\theta & 0 & 0 \\ -\alpha\theta & 0 & 0 & 0 \\ 0 & 0 & 0 & \beta\theta \\ 0 & 0 & -\beta\theta & 0 \end{bmatrix}P^{T} \tag{99}$$

for some orthogonal matrix P. The corresponding rotation matrix is

$$R = P\begin{bmatrix} \cos(\alpha\theta) & \sin(\alpha\theta) & 0 & 0 \\ -\sin(\alpha\theta) & \cos(\alpha\theta) & 0 & 0 \\ 0 & 0 & \cos(\beta\theta) & \sin(\beta\theta) \\ 0 & 0 & -\sin(\beta\theta) & \cos(\beta\theta) \end{bmatrix}P^{T} \tag{100}$$


The problem is to determine what P is. This requires determining the invariant subspaces of IR^4 relative to the rotation R or, equivalently, relative to the skew-symmetric S. There will be two nontrivial invariant subspaces, each of dimension 2. Let V_1 = span(U_1, U_2), which will be the invariant subspace that corresponds to α, and V_2 = span(U_3, U_4), which will be the invariant subspace that corresponds to β, where the four spanning vectors form an orthonormal set. P may be constructed as the orthogonal matrix whose columns are these four vectors,

$$P = \begin{bmatrix} \mathbf{U}_1 & \mathbf{U}_2 & \mathbf{U}_3 & \mathbf{U}_4 \end{bmatrix} \tag{101}$$

Recall that α = √((1 − ρ)/2) and β = √((1 + ρ)/2) for some ρ ∈ [0, 1]. Also, δ = af − be + cd, δ^2 = α^2β^2, and 1 − 4δ^2 = ρ^2. Finally, we have always assumed that a^2 + b^2 + c^2 + d^2 + e^2 + f^2 = 1.

5.4.1 The Case (d, e, f) = (0, 0, 0)

The condition (d, e, f) = (0, 0, 0) reduces S to the 3D case that we already understand. The skew-symmetric matrix is

$$S = \begin{bmatrix} 0 & a & b & 0 \\ -a & 0 & c & 0 \\ -b & -c & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{102}$$

Observe that δ = 0, α = 0, β = 1, and a^2 + b^2 + c^2 = 1. By observation,

$$\mathbf{U}_1 = \begin{bmatrix} c \\ -b \\ a \\ 0 \end{bmatrix}, \qquad \mathbf{U}_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \tag{103}$$

It is easily verified that SU1 = −0U2 = 0 and SU2 = 0U1 = 0.

When (b, c) ≠ (0, 0), we may choose

$$\mathbf{U}_3 = \frac{1}{\sqrt{b^2+c^2}}\begin{bmatrix} b \\ c \\ 0 \\ 0 \end{bmatrix} \tag{104}$$

This unit-length vector is perpendicular to U_1 and U_2. Let e_k be the vector with a 1 in component k and 0 in all other components. We may then compute

$$\mathbf{U}_4 = \det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ c & -b & a \\ \frac{b}{\sqrt{b^2+c^2}} & \frac{c}{\sqrt{b^2+c^2}} & 0 \end{bmatrix} + 0\,\mathbf{e}_4 = \frac{1}{\sqrt{b^2+c^2}}\begin{bmatrix} -ac \\ ab \\ b^2+c^2 \\ 0 \end{bmatrix} \tag{105}$$


When (b, c) = (0, 0), it must be that a = 1, in which case we may choose

$$\mathbf{U}_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad \mathbf{U}_4 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \tag{106}$$

so that P = [U_1 U_2 U_3 U_4] is a column permutation of the identity matrix, as listed later in Equation (133).

In either case for (b, c), it is easily verified that SU3 = −1U4 and SU4 = 1U3.

5.4.2 The Case (a, b, c) = (0, 0, 0)

The condition (a, b, c) = (0, 0, 0) implies that the skew-symmetric matrix is

$$S = \begin{bmatrix} 0 & 0 & 0 & d \\ 0 & 0 & 0 & e \\ 0 & 0 & 0 & f \\ -d & -e & -f & 0 \end{bmatrix} \tag{107}$$

Observe that δ = 0, α = 0, β = 1, and d^2 + e^2 + f^2 = 1. By observation,

$$\mathbf{U}_3 = \begin{bmatrix} d \\ e \\ f \\ 0 \end{bmatrix}, \qquad \mathbf{U}_4 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \tag{108}$$

It is easily verified that SU3 = −1U4 and SU4 = 1U3.

When (d, e) ≠ (0, 0), we may choose

$$\mathbf{U}_1 = \frac{1}{\sqrt{d^2+e^2}}\begin{bmatrix} e \\ -d \\ 0 \\ 0 \end{bmatrix} \tag{109}$$

This unit-length vector is perpendicular to U_3 and U_4. Let e_k be the vector with a 1 in component k and 0 in all other components. We may then compute

$$\mathbf{U}_2 = \det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ d & e & f \\ \frac{e}{\sqrt{d^2+e^2}} & \frac{-d}{\sqrt{d^2+e^2}} & 0 \end{bmatrix} + 0\,\mathbf{e}_4 = \frac{1}{\sqrt{d^2+e^2}}\begin{bmatrix} df \\ ef \\ -d^2-e^2 \\ 0 \end{bmatrix} \tag{110}$$


When (d, e) = (0, 0), it must be that f = 1, in which case we may choose

$$\mathbf{U}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad \mathbf{U}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \tag{111}$$

so that P is the identity matrix.

5.4.3 The Case (d, e, f) ≠ 0, (a, b, c) ≠ 0, and δ = 0

Consider the case 0 = δ = af − be + cd = (c, −b, a) · (d, e, f); then α = 0 and β = 1. Let e_k be the vector with a 1 in component k and 0 in all other components. We may formally construct an eigenvector for β = 1 using the determinant,

$$\begin{aligned}
\mathbf{V} &= \det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 & \mathbf{e}_4 \\ -i & a & b & d \\ -a & -i & c & e \\ -b & -c & -i & f \end{bmatrix} \\
&= [(-d) + (bf+ae)i]\,\mathbf{e}_1 + [(-e) + (cf-ad)i]\,\mathbf{e}_2 + [(-f) - (bd+ce)i]\,\mathbf{e}_3 + [(a^2+b^2+c^2-1)i]\,\mathbf{e}_4
\end{aligned} \tag{112}$$

Using the results of Section 1.3, V = X + iY with

$$\mathbf{X} = \begin{bmatrix} -d \\ -e \\ -f \\ 0 \end{bmatrix}, \qquad \mathbf{Y} = \begin{bmatrix} bf+ae \\ cf-ad \\ -bd-ce \\ a^2+b^2+c^2-1 \end{bmatrix} \tag{113}$$

The vectors are orthogonal and have the same length. To see this, we know that SX = −βθY and SY = βθX. First,

$$\mathbf{X}^T S\mathbf{X} = -\theta\,\mathbf{X}^T\mathbf{Y} \tag{114}$$

The left-hand side is a scalar, so its transpose is just the scalar itself,

$$\mathbf{X}^T S\mathbf{X} = (\mathbf{X}^T S\mathbf{X})^T = \mathbf{X}^T S^T\mathbf{X} = -\mathbf{X}^T S\mathbf{X} \tag{115}$$

The only way a scalar and its negative can be the same is when the scalar is zero. Thus, X^T S X = 0, which implies X^T Y = 0, so the vectors are orthogonal. Second,

$$\theta\,\mathbf{X}^T\mathbf{X} = \mathbf{X}^T S\mathbf{Y} = (\mathbf{X}^T S\mathbf{Y})^T = \mathbf{Y}^T S^T\mathbf{X} = -\mathbf{Y}^T S\mathbf{X} = \theta\,\mathbf{Y}^T\mathbf{Y} \tag{116}$$

which implies |X| = |Y|.


Two real-valued eigenvectors for ±i that are orthogonal and unit-length are the normalized −X and −Y,

$$\mathbf{U}_3 = \frac{1}{\sqrt{d^2+e^2+f^2}}\begin{bmatrix} d \\ e \\ f \\ 0 \end{bmatrix}, \qquad \mathbf{U}_4 = \frac{1}{\sqrt{d^2+e^2+f^2}}\begin{bmatrix} -ae-bf \\ ad-cf \\ bd+ce \\ d^2+e^2+f^2 \end{bmatrix} \tag{117}$$

where we have used a^2 + b^2 + c^2 + d^2 + e^2 + f^2 = 1.

The condition δ = 0 allows us to choose

$$\mathbf{U}_1 = \frac{1}{\sqrt{a^2+b^2+c^2}}\begin{bmatrix} c \\ -b \\ a \\ 0 \end{bmatrix} \tag{118}$$

It is easily verified that U_1 is perpendicular to both U_3 and U_4. We may choose the final vector using

$$\mathbf{U}_2 = \frac{-1}{(d^2+e^2+f^2)\sqrt{a^2+b^2+c^2}}\det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 & \mathbf{e}_4 \\ d & e & f & 0 \\ -ae-bf & ad-cf & bd+ce & d^2+e^2+f^2 \\ c & -b & a & 0 \end{bmatrix} = \frac{1}{\sqrt{a^2+b^2+c^2}}\begin{bmatrix} ae+bf \\ -ad+cf \\ -bd-ce \\ a^2+b^2+c^2 \end{bmatrix} \tag{119}$$

5.4.4 The Case (d, e, f) ≠ 0, (a, b, c) ≠ 0, and δ ≠ 0

In our construction when the rotation is not the identity matrix, we know that β > 0. The eigenvalues iβθ and −iβθ each have 1-dimensional eigenspaces. The eigenvectors for iβθ must satisfy the equation (S − iβθI)V = 0. Because the eigenspace is 1-dimensional, the matrix S − iβθI has rank 3. Equivalently, the matrix S/θ − iβI has rank 3. Let e_k be the vector with a 1 in component k and 0 in all other components.


We may formally construct an eigenvector using the determinant,

$$\begin{aligned}
\mathbf{V} &= \det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 & \mathbf{e}_4 \\ -i\beta & a & b & d \\ -a & -i\beta & c & e \\ -b & -c & -i\beta & f \end{bmatrix} \\
&= [(c\delta - d\beta^2) + \beta(bf+ae)i]\,\mathbf{e}_1 + [(-b\delta - e\beta^2) + \beta(cf-ad)i]\,\mathbf{e}_2 + [(a\delta - f\beta^2) - \beta(bd+ce)i]\,\mathbf{e}_3 + [\beta(a^2+b^2+c^2-\beta^2)i]\,\mathbf{e}_4
\end{aligned} \tag{120}$$

Using the results of Section 1.3, V = X + iY with

$$\mathbf{X} = \begin{bmatrix} c\delta - d\beta^2 \\ -b\delta - e\beta^2 \\ a\delta - f\beta^2 \\ 0 \end{bmatrix}, \qquad \mathbf{Y} = \beta\begin{bmatrix} bf+ae \\ cf-ad \\ -bd-ce \\ a^2+b^2+c^2-\beta^2 \end{bmatrix} \tag{121}$$

The vectors are orthogonal and have the same length. To see this, we know that SX = −βθY and SY = βθX. First,

$$\mathbf{X}^T S\mathbf{X} = -\beta\theta\,\mathbf{X}^T\mathbf{Y} \tag{122}$$

The left-hand side is a scalar, so its transpose is just the scalar itself,

$$\mathbf{X}^T S\mathbf{X} = (\mathbf{X}^T S\mathbf{X})^T = \mathbf{X}^T S^T\mathbf{X} = -\mathbf{X}^T S\mathbf{X} \tag{123}$$

The only way a scalar and its negative can be the same is when the scalar is zero. Thus, X^T S X = 0, which implies X^T Y = 0, so the vectors are orthogonal. Second,

$$\beta\theta\,\mathbf{X}^T\mathbf{X} = \mathbf{X}^T S\mathbf{Y} = (\mathbf{X}^T S\mathbf{Y})^T = \mathbf{Y}^T S^T\mathbf{X} = -\mathbf{Y}^T S\mathbf{X} = \beta\theta\,\mathbf{Y}^T\mathbf{Y} \tag{124}$$

which implies |X| = |Y|. We may choose

$$\mathbf{U}_3 = \frac{\mathbf{X}}{|\mathbf{X}|}, \qquad \mathbf{U}_4 = \frac{\mathbf{Y}}{|\mathbf{Y}|} \tag{125}$$

We will construct vectors U1 and U2 for α next.

The Case α < β. The same construction applies to α as it did to β, producing

$$\mathbf{X}' = \begin{bmatrix} c\delta - d\alpha^2 \\ -b\delta - e\alpha^2 \\ a\delta - f\alpha^2 \\ 0 \end{bmatrix}, \qquad \mathbf{Y}' = \alpha\begin{bmatrix} bf+ae \\ cf-ad \\ -bd-ce \\ a^2+b^2+c^2-\alpha^2 \end{bmatrix} \tag{126}$$

and

$$\mathbf{U}_1 = \frac{\mathbf{X}'}{|\mathbf{X}'|}, \qquad \mathbf{U}_2 = \frac{\mathbf{Y}'}{|\mathbf{Y}'|} \tag{127}$$


The Case α = β. When α = β = 1/√2, it must be that ρ = 0, δ = σ/2 for σ ∈ {−1, +1}, (d, e, f) = σ(c, −b, a), and a^2 + b^2 + c^2 = 1/2 = d^2 + e^2 + f^2. The skew-symmetric matrix is

$$S = \theta\begin{bmatrix} 0 & a & b & \sigma c \\ -a & 0 & c & -\sigma b \\ -b & -c & 0 & \sigma a \\ -\sigma c & \sigma b & -\sigma a & 0 \end{bmatrix} \tag{128}$$

Observe that S^2 + (θ^2/2)I = 0. Define S̄ = S/θ. When solving for eigenvectors for iα = i/√2, we solve (S̄ − (i/√2)I)V = 0. Because the eigenspace is 2-dimensional, the first two rows of S̄ − (i/√2)I are linearly independent and must be orthogonal to the eigenvectors of i/√2. However, this means that the first two rows are in the orthogonal complement of the eigenspace for i/√2, so these rows are linearly independent eigenvectors for −i/√2.

Consider the first row of S̄ − (i/√2)I. The real and imaginary parts are the candidates we want for producing U_3 and U_4. Specifically, the normalized real and imaginary parts are

$$\mathbf{U}_3 = \sqrt{2}\begin{bmatrix} 0 \\ a \\ b \\ \sigma c \end{bmatrix}, \qquad \mathbf{U}_4 = \begin{bmatrix} -1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \tag{129}$$

A similar argument applied to (−b, −c, −i/√2, σa) leads to

$$\mathbf{U}_1 = \sqrt{2}\begin{bmatrix} -b \\ -c \\ 0 \\ \sigma a \end{bmatrix}, \qquad \mathbf{U}_2 = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix} \tag{130}$$

5.4.5 A Summary of the Factorizations

In all cases, we have assumed a^2 + b^2 + c^2 + d^2 + e^2 + f^2 = 1 so that S ≠ 0. The factorization is

$$S = P\begin{bmatrix} 0 & \alpha & 0 & 0 \\ -\alpha & 0 & 0 & 0 \\ 0 & 0 & 0 & \beta \\ 0 & 0 & -\beta & 0 \end{bmatrix}P^{T} \tag{131}$$


The case (d, e, f) = (0, 0, 0) and (b, c) ≠ (0, 0):

$$S = \begin{bmatrix} 0 & a & b & 0 \\ -a & 0 & c & 0 \\ -b & -c & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} c & 0 & \frac{b}{\sqrt{b^2+c^2}} & \frac{-ac}{\sqrt{b^2+c^2}} \\ -b & 0 & \frac{c}{\sqrt{b^2+c^2}} & \frac{ab}{\sqrt{b^2+c^2}} \\ a & 0 & 0 & \sqrt{b^2+c^2} \\ 0 & 1 & 0 & 0 \end{bmatrix} \tag{132}$$

The case (d, e, f) = (0, 0, 0) and (b, c) = (0, 0):

$$S = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \tag{133}$$

The case (a, b, c) = (0, 0, 0) and (d, e) ≠ (0, 0):

$$S = \begin{bmatrix} 0 & 0 & 0 & d \\ 0 & 0 & 0 & e \\ 0 & 0 & 0 & f \\ -d & -e & -f & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} \frac{e}{\sqrt{d^2+e^2}} & \frac{df}{\sqrt{d^2+e^2}} & d & 0 \\ \frac{-d}{\sqrt{d^2+e^2}} & \frac{ef}{\sqrt{d^2+e^2}} & e & 0 \\ 0 & -\sqrt{d^2+e^2} & f & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{134}$$

The case (a, b, c) = (0, 0, 0) and (d, e) = (0, 0):

$$S = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{135}$$

The case (a, b, c) ≠ (0, 0, 0), (d, e, f) ≠ (0, 0, 0), and δ = 0:

$$P = \begin{bmatrix}
\frac{c}{\sqrt{a^2+b^2+c^2}} & \frac{ae+bf}{\sqrt{a^2+b^2+c^2}} & \frac{d}{\sqrt{d^2+e^2+f^2}} & \frac{-ae-bf}{\sqrt{d^2+e^2+f^2}} \\
\frac{-b}{\sqrt{a^2+b^2+c^2}} & \frac{-ad+cf}{\sqrt{a^2+b^2+c^2}} & \frac{e}{\sqrt{d^2+e^2+f^2}} & \frac{ad-cf}{\sqrt{d^2+e^2+f^2}} \\
\frac{a}{\sqrt{a^2+b^2+c^2}} & \frac{-bd-ce}{\sqrt{a^2+b^2+c^2}} & \frac{f}{\sqrt{d^2+e^2+f^2}} & \frac{bd+ce}{\sqrt{d^2+e^2+f^2}} \\
0 & \sqrt{a^2+b^2+c^2} & 0 & \sqrt{d^2+e^2+f^2}
\end{bmatrix} \tag{136}$$


The case (a, b, c) ≠ (0, 0, 0), (d, e, f) ≠ (0, 0, 0), δ ≠ 0, α < β:

$$P = \begin{bmatrix}
\frac{c\delta - d\alpha^2}{\ell} & \frac{\alpha(bf+ae)}{\ell} & \frac{c\delta - d\beta^2}{m} & \frac{\beta(bf+ae)}{m} \\
\frac{-b\delta - e\alpha^2}{\ell} & \frac{\alpha(cf-ad)}{\ell} & \frac{-b\delta - e\beta^2}{m} & \frac{\beta(cf-ad)}{m} \\
\frac{a\delta - f\alpha^2}{\ell} & \frac{\alpha(-bd-ce)}{\ell} & \frac{a\delta - f\beta^2}{m} & \frac{\beta(-bd-ce)}{m} \\
0 & \frac{\alpha(a^2+b^2+c^2-\alpha^2)}{\ell} & 0 & \frac{\beta(a^2+b^2+c^2-\beta^2)}{m}
\end{bmatrix} \tag{137}$$

where ℓ = √((cδ − dα^2)^2 + (bδ + eα^2)^2 + (aδ − fα^2)^2) and m = √((cδ − dβ^2)^2 + (bδ + eβ^2)^2 + (aδ − fβ^2)^2).

The case (a, b, c) ≠ (0, 0, 0), (d, e, f) ≠ (0, 0, 0), δ ≠ 0, α = β:

$$S = \begin{bmatrix} 0 & a & b & \sigma c \\ -a & 0 & c & -\sigma b \\ -b & -c & 0 & \sigma a \\ -\sigma c & \sigma b & -\sigma a & 0 \end{bmatrix} \tag{138}$$

and

$$P = \begin{bmatrix} -b\sqrt{2} & 0 & 0 & -1 \\ -c\sqrt{2} & 0 & a\sqrt{2} & 0 \\ 0 & -1 & b\sqrt{2} & 0 \\ \sigma a\sqrt{2} & 0 & \sigma c\sqrt{2} & 0 \end{bmatrix} \tag{139}$$

References

[Bra83] Martin Braun. Differential Equations and Their Applications. Springer-Verlag, New York, NY, 3rd edition, 1983.

[Ebe03] David Eberly. Game Physics. Morgan Kaufmann, San Francisco, CA, 2003.

[HJ85] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, England, 1985.

[HS74] Morris W. Hirsch and Stephen Smale. Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, Orlando, FL, 1974.
