Relativity and Covariant Electromagnetism
Ismael Rodrigues Silva
Queen Mary, University of London
August 2014
Preface
This work is part of my exchange programme in the United Kingdom, written during the summer of 2014, on my year abroad at Queen Mary, University of London. The topics covered are brief introductions to Special Relativity and to the covariant formulation of Electromagnetism, including the consequences of the postulates of Special Relativity, and a step-by-step explanation of Tensor Calculus and Relativistic Mechanics.
Acknowledgements
To my family, who have never stopped believing in me, especially my mother Tania. To my advisor at Queen Mary, University of London, Dr. Alston Misquitta, and to my advisor and counselor at Universidade Federal de Santa Catarina, Dr. Marco Kneipp. To my friends, in particular Augusto, Luciano, Madlene, Deborah, Antonio and Maique, who, from Brazil, have been supporting me at all times. To my friends who helped me through my quick journey in the United Kingdom, Fernanda, Cos, Vasily, Vladimir, Ivan, Elena, Marieta, Sophia, Kristina and Manetou. To my sponsor, CNPq. To Dr. Brian Wecht from Queen Mary, who not only taught me Statistical Physics, but also what a lecturer must be like.
1 The Theory of Relativity
1.1 Historical Background
The basis on which Einstein built the special theory of relativity was the fact that Maxwell's equations predict that the speed of propagation of electromagnetic waves is a universal constant, independent of the motion of the source or of the detector of the waves. The two postulates of special relativity, formulated by Einstein in 1905, are:
1. The laws of physics are the same in all inertial frames of reference;
2. The speed of light in free space has the same value c in all inertial frames of reference.
An inertial frame of reference is a frame in which a freely moving body proceeds with constant velocity, that is, a frame in which Newton's first law of motion holds or, in other words, in which the velocity of any particle remains constant unless there is a net force acting on it. If a system moves with constant velocity with respect to an inertial reference system, then it is also inertial.
Ordinary mechanics assumes that the propagation of interactions between material particles is instantaneous. Experiments show, however, that there is no instantaneous interaction in nature: there is a finite maximum speed of propagation of interactions, which implies that motions of bodies with greater speed are impossible, for if such a motion could occur, then by means of it one could realise an interaction with a speed exceeding the maximum possible speed of propagation of interactions. From the second postulate, it follows that this maximum speed is the same in all inertial systems of reference. This universal constant, which is also the speed of light in free space, is designated by c and is exactly given by[1]
$$ c = 2.99792458 \times 10^{8}\ \mathrm{m/s}. \tag{1} $$
The mechanics based on the principle of relativity stated above is said to be relativistic. If the speeds involved are much less than c, the mechanics is called classical or Newtonian. Time is absolute in classical mechanics, so there is one time for all reference frames, which makes simultaneity an absolute concept. This contradicts special relativity, though: if we applied the classical law of combination of velocities to the propagation of interactions, then the speed of propagation would be different in different inertial frames of reference. Since time is not absolute, events that are simultaneous in one frame may not be simultaneous in other frames.
The principle of relativity thus introduces drastic and fundamental changes in basic physical concepts. The notions of space and time which we have are only approximations, due to the fact that the speeds with which we deal daily are very small compared to the speed of light.
1.2 Intervals
An event is described by the place where it occurred and the time when it occurred. It is useful to use a four-dimensional space, whose spatial axes are x, y, z and whose temporal axis is ct. In this space, events are points $(ct_1, x_1, y_1, z_1)$ called world points, and to each particle there corresponds a line, called its world line.[2]
[1] Originally, one metre was intended to be one ten-millionth of the distance from the Earth's equator to the North Pole, but since 1983 it has been defined as the length of the path travelled by light in vacuum during a time interval of 1/299,792,458 of a second.
[2] It is easy to show that to a particle in uniform rectilinear motion there corresponds a straight world line.
Consider two inertial reference systems K and K′, with axes (ct, x, y, z) and (ct′, x′, y′, z′) respectively, moving relative to each other with constant velocity. Suppose that the frames coincide at t = t′ = 0, and consider a flash of light emanating from their common origin at the instant they coincide. Therefore, remembering that the distance travelled by the wave is given by the product of its speed and the interval of time, the spherical wave front described in K by
$$ x^2 + y^2 + z^2 = (ct)^2 \tag{2} $$
will be described in K′ by
$$ x'^2 + y'^2 + z'^2 = (ct')^2. \tag{3} $$
In other words,
$$ c^2t^2 - x^2 - y^2 - z^2 = 0 \iff c^2t'^2 - x'^2 - y'^2 - z'^2 = 0. \tag{4} $$
Homogeneity of space and time and isotropy of space require that the relationship between (ct, x, y, z) and (ct′, x′, y′, z′) is linear. In fact, a general linear relation can be used to find the equations for the transformations of the coordinates, but this will be done later in a different, simpler way.
Relation (4) motivates us to define the interval $s_{12}$ between two events as the scalar
$$ s_{12}^2 = c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2, \tag{5} $$
where $(ct_1, x_1, y_1, z_1)$ are the coordinates of the first event and $(ct_2, x_2, y_2, z_2)$ are the coordinates of the second event. The interval can be regarded as the distance between two world points in our four-dimensional space. If the events are infinitely close to each other, the infinitesimal interval ds between them is given by[3]
$$ ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{6} $$
Expression (5) allows us to have either $s_{12}^2 = 0$, $s_{12}^2 > 0$ or $s_{12}^2 < 0$. If
$$ c^2(t_2 - t_1)^2 > (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{7} $$
then $s_{12}^2 > 0$, so the interval $s_{12}$ is real and is said to be timelike, and there exists a coordinate system in which the two events occur at the same point in space. If
$$ c^2(t_2 - t_1)^2 < (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{8} $$
then $s_{12}^2 < 0$, so $s_{12}$ is imaginary and is said to be spacelike, and there exists a coordinate system in which the two events occur simultaneously. Finally, if
$$ c^2(t_2 - t_1)^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{9} $$
then the interval is equal to zero and is said to be null or lightlike.
The equivalence in (4) implies that, if the interval in K is null, then so is the interval in K′. In other words, it is invariant in this case. It turns out that the interval, which is a scalar, is always invariant, as we shall see later using the concept of four-vectors. Until we get there, assume the interval is invariant.
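The classification of intervals can be sketched numerically. The following is only a minimal illustration and not part of the original derivation: the names `interval_squared` and `classify_interval` are hypothetical, events are assumed to be given as tuples (t, x, y, z), and the comparison with zero is exact, so it is reliable only for inputs that are exactly representable as floating-point numbers.

```python
C = 299_792_458.0  # speed of light in m/s, as in equation (1)


def interval_squared(event1, event2, c=C):
    """Squared interval s12^2 of equation (5); each event is a tuple (t, x, y, z)."""
    t1, x1, y1, z1 = event1
    t2, x2, y2, z2 = event2
    return (c * (t2 - t1)) ** 2 - (x2 - x1) ** 2 - (y2 - y1) ** 2 - (z2 - z1) ** 2


def classify_interval(event1, event2, c=C):
    """Label the interval as timelike, spacelike or lightlike, following (7)-(9)."""
    s2 = interval_squared(event1, event2, c)
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"


origin = (0.0, 0.0, 0.0, 0.0)
# Same place, later time.
print(classify_interval(origin, (1.0, 0.0, 0.0, 0.0)))
# Same time, different place.
print(classify_interval(origin, (0.0, 1.0, 0.0, 0.0)))
# A point on the light cone, in units where c = 1 (an assumption of this example).
print(classify_interval(origin, (1.0, 1.0, 0.0, 0.0), c=1.0))
```

Working in units with c = 1 in the last call keeps the arithmetic exact, which is why the lightlike case can be detected with a plain equality test here.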
[3] This geometry was introduced by H. Minkowski and is called pseudo-Euclidean, and the four-dimensional space mentioned is called Minkowski space.
1.3 Proper time
The proper time of an object is defined as the time read by a clock moving with this object. The proper time interval between two events is therefore the interval of time measured in a reference frame in which the two events occur at the same point in space. Let us use the Greek letter τ to denote the proper time.
Consider the same reference systems K and K′ moving relative to each other with constant velocity v, and suppose there is a clock at rest in K′. The infinitesimal interval in K is given by
$$ ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{10} $$
On the other hand, the clock is at rest in K′, so that dx′ = dy′ = dz′ = 0, and the time measured is the proper time. Therefore, since the interval is invariant, we also have
$$ ds^2 = c^2d\tau^2, \tag{11} $$
and so
$$ c^2d\tau^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 \tag{12} $$
or
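$$ d\tau = dt\sqrt{1 - \frac{dx^2 + dy^2 + dz^2}{c^2dt^2}} = dt\sqrt{1 - \frac{v^2}{c^2}}, \tag{13} $$
since
$$ \frac{dx^2 + dy^2 + dz^2}{dt^2} = v^2, \tag{14} $$
where $v = |\mathbf{v}|$ is the relative speed between the reference systems K and K′. Let us now define the velocity coefficient
$$ \boldsymbol{\beta} \equiv \frac{\mathbf{v}}{c} \tag{15} $$
and the Lorentz factor or Lorentz term
$$ \gamma \equiv \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \beta^2}}, \tag{16} $$
where $\beta = |\boldsymbol{\beta}| = v/c$. This way, equation (13) can be written as
$$ d\tau = \frac{1}{\gamma}\,dt, \tag{17} $$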
and we have the important relation[4]
$$ \frac{d\tau}{dt} = \frac{1}{\gamma}. \tag{18} $$
[4] Mathematically, just compare equation (17) with the equation $d\tau = \frac{d\tau}{dt}\,dt$ for the differential dτ.
Supposing v is constant, one can integrate (18) and obtain the time interval indicated by the clock moving in K:
$$ \int_{t_1}^{t_2} \frac{d\tau}{dt}\,dt = \tau(t_2) - \tau(t_1) \equiv \tau_2 - \tau_1 = \int_{t_1}^{t_2} \frac{1}{\gamma}\,dt = \frac{1}{\gamma}(t_2 - t_1) \tag{19} $$
or
$$ \Delta\tau = \frac{1}{\gamma}\Delta t \le \Delta t, \tag{20} $$
since v is always less than c, so that 0 < 1/γ ≤ 1. Therefore, we conclude that the proper time interval of a moving object is never greater than the corresponding interval in the rest system. In other words, moving clocks run slow.
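As a numerical illustration of (16) and (20), the sketch below evaluates the Lorentz factor and the proper time interval for a given speed. The function names `lorentz_factor` and `proper_time_interval` are hypothetical, and the example works in units where c = 1, which is an assumption made only to keep the numbers simple.

```python
import math

C = 299_792_458.0  # speed of light in m/s, as in equation (1)


def lorentz_factor(v, c=C):
    """Lorentz factor gamma of equation (16) for a speed v with |v| < c."""
    if abs(v) >= c:
        raise ValueError("the speed must be smaller than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def proper_time_interval(dt, v, c=C):
    """Proper time interval of equation (20) for a clock moving with constant speed v."""
    return dt / lorentz_factor(v, c)


# A clock moving at 0.6 c (units with c = 1) during 10 time units of the frame K.
gamma = lorentz_factor(0.6, c=1.0)
dtau = proper_time_interval(10.0, 0.6, c=1.0)
print(gamma, dtau)
```

For v = 0.6c the factor is γ = 1/0.8, so the moving clock registers 8 time units while 10 elapse in K, up to floating-point rounding.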
According to (11), we also have dτ = ds/c, so the time interval read by the clock in K′ is also given by
$$ \tau_2 - \tau_1 = \frac{1}{c}\int_a^b ds, \tag{21} $$
taken along the world line of the clock. But, since the clock at rest always indicates a greater time interval than the moving one, we conclude that
$$ \int_a^b ds \tag{22} $$
has its maximum value if it is taken along the straight world line joining the points a and b.
1.4 The Lorentz Transformation
We now wish to derive the formulae for the transformation of coordinates from one inertial system to another. Consider the same inertial frames K and K′ with axes ct, x, y, z and ct′, x′, y′, z′ respectively, and suppose that the axes x and x′ are coincident. Let v be the speed of K′ relative to K, and suppose that the origins of the two systems coincide at times t = t′ = 0. We will define this situation as a boost in the x-direction. According to classical mechanics, for the boost described we would have
$$ x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{23} $$
or, in matrix form,
$$ \begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -v & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \tag{24} $$
which is called the Galilean transformation and is clearly inconsistent with the principle of relativity, since it does not leave the interval invariant.
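Since the interval can be regarded as the distance between two world points in our four-dimensional space, the transformation we seek must be expressible mathematically as a rotation in this space. Let us consider a rotation in the tx plane, so that $c^2t^2 - x^2$ must be invariant. In the most general case, we have
$$ ct' = ct\cosh\zeta - x\sinh\zeta, \quad x' = -ct\sinh\zeta + x\cosh\zeta, \tag{25} $$
where ζ is called the rapidity.[5] In matrix notation, this is written as
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cosh\zeta & -\sinh\zeta & 0 & 0 \\ -\sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{26} $$
in analogy with a rotation about the z-axis,
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{27} $$
so that ζ can be interpreted as a four-dimensional angle of rotation in the tx plane.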
We now wish to determine ζ, which depends on v. But that is simple: just consider the motion, in K, of the origin of K′. We have then x′ = 0, so the second equation in (25) gives us
$$ ct\sinh\zeta = x\cosh\zeta \tag{28} $$
or
$$ x = (c\tanh\zeta)\,t, \tag{29} $$
so that
$$ v = \frac{dx}{dt} = c\tanh\zeta \tag{30} $$
or
$$ \tanh\zeta = \frac{v}{c} = \beta. \tag{31} $$
One can now easily find
$$ \sinh\zeta = \frac{\beta}{\sqrt{1 - \beta^2}}, \quad \cosh\zeta = \frac{1}{\sqrt{1 - \beta^2}}, \tag{32} $$
or simply
$$ \sinh\zeta = \gamma\beta, \quad \cosh\zeta = \gamma. \tag{33} $$
[5] One can easily check, using the identity $\cosh^2\zeta - \sinh^2\zeta = 1$, that (25) preserves the equation $c^2t^2 - x^2 = c^2t'^2 - x'^2$.
Using this result in (25) we find our transformation of coordinates, which is called the Lorentz transformation:[6]
$$ ct' = \gamma(ct - \beta x), \quad x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z, \tag{34} $$
or, in terms of v,
$$ t' = \frac{t - \frac{vx}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad x' = \frac{x - vt}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad y' = y, \quad z' = z. \tag{35} $$
Note that, if v > c, then x′ and t′ are imaginary, which is physically meaningless. If v ≪ c, we recover the equations of classical mechanics, which also happens when one supposes c → ∞. The formulae expressing the coordinates from K as functions of the ones from K′, called the inverse transformation,[7] are obtained from (35) simply by changing v to −v and exchanging the primed and unprimed coordinates:
$$ t = \frac{t' + \frac{vx'}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad x = \frac{x' + vt'}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad y = y', \quad z = z'. \tag{36} $$
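In matrix notation, we can write the Lorentz transformation in (34) as
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{37} $$
and for the inverse transformation
$$ \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix}. \tag{38} $$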
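The matrices in (37) and (38) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses plain Python lists instead of a linear algebra library, the helper names `boost_x`, `apply` and `interval` are hypothetical, and the event coordinates (ct, x, y, z) are arbitrary values chosen for the example.

```python
import math


def boost_x(beta):
    """Boost matrix of equation (37) for a boost in the x-direction with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, vec):
    """Multiply a 4x4 matrix by a column vector given as a list."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def interval(vec):
    """Squared interval between the origin and the event (ct, x, y, z)."""
    ct, x, y, z = vec
    return ct ** 2 - x ** 2 - y ** 2 - z ** 2


event = [5.0, 3.0, 1.0, -2.0]                    # (ct, x, y, z) in K, arbitrary example values
event_prime = apply(boost_x(0.6), event)         # direct transformation (37)
event_back = apply(boost_x(-0.6), event_prime)   # inverse transformation (38): beta -> -beta

print(interval(event), interval(event_prime))    # the two values agree up to rounding
print(event_back)                                # recovers the original event up to rounding
```

Changing the sign of β in `boost_x` reproduces the inverse matrix in (38), so applying both boosts in succession returns the original coordinates.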
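The transformation matrix is often called the Lorentz matrix or boost matrix. If the boost is in the y-direction or z-direction, we would have, respectively,[8]
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & 0 & -\beta\gamma & 0 \\ 0 & 1 & 0 & 0 \\ -\beta\gamma & 0 & \gamma & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \tag{39} $$
and
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{40} $$
[6] The Lorentz transformation is in accordance with special relativity, but was derived before special relativity. We will refer to the transformation in (34), in which the coordinates from K′ are functions of the ones from K, as the direct transformation.
[7] Some authors define (36) as the direct transformation.
[8] In both cases, v, and consequently β, change sign for the inverse transformation.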
1.5 Length Contraction and Time Dilation
In analogy with the definition of the proper time, the proper length of an object is its length in a reference system in which the object is at rest. The proper length between two events is the length measured in a reference frame in which the two events occur simultaneously.
Consider a boost in the x-direction, and suppose there is a rod at rest in K′ with ends at points $x'_1$ and $x'_2 > x'_1$. The length of the rod in K′ is then $L_0 = x'_2 - x'_1$, which is the proper length. In K, we have $L = x_2 - x_1$. Using the Lorentz transformation, we find
$$ x'_1 = \gamma(x_1 - vt_1), \quad x'_2 = \gamma(x_2 - vt_2), \tag{41} $$
so that
$$ L_0 = x'_2 - x'_1 = \gamma(x_2 - vt_2 - x_1 + vt_1) \tag{42} $$
or
$$ L = \frac{L_0}{\gamma}, \tag{43} $$
since $t_2 = t_1$ in K, because the positions of the two ends of the rod must be measured simultaneously. This means that the greatest length of the rod is measured in the system in which it is at rest, and the length decreases in a system in which it moves with speed v. This is called length contraction or Lorentz-Fitzgerald contraction.
Similarly, suppose once more there is a clock at rest in K′. The proper time interval in K′ is then given by $\Delta\tau = \tau_2 - \tau_1$. In K, the time interval is given by $\Delta t = t_2 - t_1$. Using the inverse transformation,[9]
$$ t_1 = \gamma\left(\tau_1 + \frac{vx'_1}{c^2}\right), \quad t_2 = \gamma\left(\tau_2 + \frac{vx'_2}{c^2}\right), \tag{44} $$
which implies
$$ \Delta t = t_2 - t_1 = \gamma\left(\tau_2 + \frac{vx'_2}{c^2} - \tau_1 - \frac{vx'_1}{c^2}\right) \tag{45} $$
or
$$ \Delta t = \gamma\Delta\tau, \tag{46} $$
since $x'_1 = x'_2$ in K′, for it was assumed that the clock is at rest there. This is called time dilation, since the time interval measured in the frame in which the clock moves is greater than the one measured in the rest frame of the clock. Note that (46) agrees with the result found in (20).
[9] One may use the direct transformation, remembering that, in this case, $x_2 - x_1 = v(t_2 - t_1)$ in K.
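Equations (43) and (46) can be illustrated with a short numerical sketch. The function names `contracted_length` and `dilated_time` are hypothetical, and the speed is given directly as β = v/c, so no value of c is needed; the numbers are arbitrary example values.

```python
import math


def gamma_factor(beta):
    """Lorentz factor for a given beta = v/c, with |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def contracted_length(proper_length, beta):
    """Length measured in K for a rod of proper length L0 at rest in K', equation (43)."""
    return proper_length / gamma_factor(beta)


def dilated_time(proper_time, beta):
    """Time interval measured in K for a proper time interval read in K', equation (46)."""
    return gamma_factor(beta) * proper_time


beta = 0.8                              # example speed: 80% of the speed of light
print(contracted_length(1.0, beta))     # a rod of unit proper length appears shorter in K
print(dilated_time(3.0, beta))          # 3 units of proper time last longer in K
```

With β = 0.8 the Lorentz factor is 1/0.6, so a unit rod measures about 0.6 units in K and 3 units of proper time correspond to about 5 units of time in K.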
1.6 Transformation of Velocity
The result of two consecutive Lorentz transformations depends, in general, on their order, just like the result of two rotations about different axes depends on the order in which they are carried out.
Consider a boost in the x-direction, letting v be the velocity of K′ with respect to K, and consider a particle moving in K with velocity
$$ \mathbf{u} = (u_x, u_y, u_z) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right). \tag{47} $$
In K′, we have
$$ \mathbf{u}' = (u'_x, u'_y, u'_z) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right). \tag{48} $$
Using the Lorentz transformation, we obtain
$$ dx' = \gamma(dx - v\,dt), \quad dy' = dy, \quad dz' = dz, \quad dt' = \gamma\left(dt - \frac{v\,dx}{c^2}\right), \tag{49} $$
where $v = |\mathbf{v}|$. Dividing the first three equations by the fourth, we get
$$ u'_x = \frac{u_x - v}{1 - \frac{u_x v}{c^2}}, \quad u'_y = \frac{u_y}{\gamma\left(1 - \frac{u_x v}{c^2}\right)}, \quad u'_z = \frac{u_z}{\gamma\left(1 - \frac{u_x v}{c^2}\right)}, \tag{50} $$
which are the formulae for the transformation of velocity. The inverse transformation is obtained by changing v to −v. Note that, setting c → ∞ or v ≪ c, we have the classical transformation of velocity
$$ u'_x = u_x - v, \quad u'_y = u_y, \quad u'_z = u_z, \tag{51} $$
which is obtained by differentiating the first three equations in (23) with respect to t.
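The transformation of velocity in (50) can be sketched as a small function. This is an illustrative sketch only: the name `transform_velocity` is hypothetical, the boost is assumed to be in the x-direction as in the text, and the default c = 1 is a choice of units made for the example.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Velocity u' in K' from the velocity u = (ux, uy, uz) in K, equation (50).

    v is the speed of K' relative to K along the x-axis.
    """
    ux, uy, uz = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - ux * v / c ** 2
    return ((ux - v) / denom, uy / (gamma * denom), uz / (gamma * denom))


def speed(u):
    """Magnitude of a three-dimensional velocity."""
    return math.sqrt(sum(component ** 2 for component in u))


# A light signal along x keeps the speed c in K'.
print(transform_velocity((1.0, 0.0, 0.0), 0.5))
# A light signal along y changes direction in K', but its speed is still c.
print(speed(transform_velocity((0.0, 1.0, 0.0), 0.6)))
# The inverse transformation (v -> -v) recovers the original velocity.
u = (0.3, 0.2, -0.1)
print(transform_velocity(transform_velocity(u, 0.6), -0.6))
```

The first two calls illustrate the second postulate: a signal moving at c in K also moves at c in K′, whatever its direction.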
1.7 Lorentz Transformation in 3 Dimensions
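For a boost in an arbitrary direction with velocity v, it is convenient to decompose the spatial column vector r = (x, y, z) into components perpendicular and parallel to v,
$$ \mathbf{r} = \mathbf{r}_\perp + \mathbf{r}_\parallel, \tag{52} $$
so that
$$ \mathbf{r}\cdot\mathbf{v} = \mathbf{r}_\perp\cdot\mathbf{v} + \mathbf{r}_\parallel\cdot\mathbf{v} = r_\parallel v. \tag{53} $$
This way, only the time and the component $\mathbf{r}_\parallel$ will transform, so, according to (35),
$$ t' = \gamma\left(t - \frac{r_\parallel v}{c^2}\right), \quad \mathbf{r}' = \mathbf{r}_\perp + \gamma\left(\mathbf{r}_\parallel - \mathbf{v}t\right). \tag{54} $$
By substituting $\mathbf{r}_\perp = \mathbf{r} - \mathbf{r}_\parallel$ into the above expression for $\mathbf{r}'$, we get
$$ \mathbf{r}' = \mathbf{r} + (\gamma - 1)\,\mathbf{r}_\parallel - \gamma\mathbf{v}t. \tag{55} $$
Since $\mathbf{r}_\parallel$ and $\mathbf{v}$ are parallel, we have[10]
$$ \mathbf{r}_\parallel = r_\parallel\frac{\mathbf{v}}{v} = \left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v}, \tag{56} $$
and substituting this into the expression for $\mathbf{r}'$ gives
$$ \mathbf{r}' = \mathbf{r} + (\gamma - 1)\left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v} - \gamma\mathbf{v}t. \tag{57} $$
Factoring $\mathbf{v}$ in the above expression and substituting $r_\parallel v = \mathbf{r}\cdot\mathbf{v}$ in the expression for t′ in (54), our transformation becomes
$$ t' = \gamma\left(t - \frac{\mathbf{r}\cdot\mathbf{v}}{c^2}\right), \quad \mathbf{r}' = \mathbf{r} + \left(\frac{\gamma - 1}{v^2}\,\mathbf{r}\cdot\mathbf{v} - \gamma t\right)\mathbf{v}, \tag{58} $$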
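which is the Lorentz transformation in 3 dimensions. We now wish to find the transformation matrix. If we define
$$ \boldsymbol{\beta} = \frac{\mathbf{v}}{c} \equiv \begin{pmatrix} \beta_x \\ \beta_y \\ \beta_z \end{pmatrix} = \frac{1}{c}\begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} \tag{59} $$
and its transpose
$$ \boldsymbol{\beta}^T = \frac{\mathbf{v}^T}{c} = \begin{pmatrix} \beta_x & \beta_y & \beta_z \end{pmatrix} = \frac{1}{c}\begin{pmatrix} v_x & v_y & v_z \end{pmatrix}, \tag{60} $$
then we can rewrite (58) as
$$ ct' = \gamma ct - \gamma\boldsymbol{\beta}^T\mathbf{r}, \quad \mathbf{r}' = -\gamma\boldsymbol{\beta}ct + \left(I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2}\right)\mathbf{r}, \tag{61} $$
where I is the 3 × 3 identity matrix, such that $I\mathbf{r} = \mathbf{r}$, and $\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2$. In block matrix form, this can be written as
$$ \begin{pmatrix} ct' \\ \mathbf{r}' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\boldsymbol{\beta}^T \\ -\gamma\boldsymbol{\beta} & I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2} \end{pmatrix} \begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}, \tag{62} $$
or, if we define
$$ \alpha_{ij} = (\gamma - 1)\frac{\beta_i\beta_j}{\beta^2}, \tag{63} $$
for i, j = x, y, z, then in a more explicit form we have
$$ \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\ -\gamma\beta_x & 1 + \alpha_{xx} & \alpha_{xy} & \alpha_{xz} \\ -\gamma\beta_y & \alpha_{yx} & 1 + \alpha_{yy} & \alpha_{yz} \\ -\gamma\beta_z & \alpha_{zx} & \alpha_{zy} & 1 + \alpha_{zz} \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{64} $$
Note that this is a transformation between two frames whose axes are parallel and whose origins coincide. The most general Lorentz transformation also contains a rotation of the three axes, since the composition of two boosts in different directions is, in general, not a pure boost, but a boost followed by a rotation.
[10] Geometrically and algebraically, $\mathbf{v}/v$ is a dimensionless unit vector pointing in the same direction as $\mathbf{r}_\parallel$, and $r_\parallel = \mathbf{r}\cdot\mathbf{v}/v$ is the projection of $\mathbf{r}$ onto the direction of $\mathbf{v}$.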
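The matrix in (64) and the remark about the composition of boosts can be explored numerically. The sketch below is only an illustration: `boost_matrix`, `matmul` and `is_symmetric` are hypothetical helper names, plain Python lists are used instead of a linear algebra library, and symmetry of the matrix is used as a simple signature of a pure boost, since the matrix in (64) is symmetric.

```python
import math


def boost_matrix(beta_vec):
    """Boost matrix of equation (64) for beta_vec = (beta_x, beta_y, beta_z)."""
    b2 = sum(b * b for b in beta_vec)
    if b2 == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - b2)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = gamma
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -gamma * beta_vec[i]
        for j in range(3):
            alpha_ij = (gamma - 1.0) * beta_vec[i] * beta_vec[j] / b2  # equation (63)
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + alpha_ij
    return m


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def is_symmetric(m, tol=1e-12):
    """True if the matrix equals its transpose within the tolerance."""
    return all(abs(m[i][j] - m[j][i]) <= tol for i in range(4) for j in range(4))


single = boost_matrix((0.3, 0.4, 0.0))
# A boost along x followed by a boost along y.
composed = matmul(boost_matrix((0.0, 0.5, 0.0)), boost_matrix((0.5, 0.0, 0.0)))

print(is_symmetric(single))    # a pure boost has a symmetric matrix
print(is_symmetric(composed))  # the composition is not symmetric, hence not a pure boost
```

For β along the x-axis the function reduces to the matrix in (37), which gives a quick consistency check of (64).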
2 Tensor Calculus
2.1 Four-vectors
The coordinates (ct, x, y, z) of an event can be considered as the components of a four-dimensional radius vector. We shall use the following notation:
$$ x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z. \tag{65} $$
Note that the quantity
$$ (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2, \tag{66} $$
which is the interval, does not change under Lorentz transformations. From now on, Greek letters will take on the values 0, 1, 2, 3 and Latin letters will take on the values 1, 2, 3. This way, the components of our four-dimensional vector can be denoted by $x^\mu$, µ = 0, 1, 2, 3, and they transform according to the system of equations
$$ \begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}. \tag{67} $$
A contravariant four-vector $V^\mu$ is, by definition, a set of four quantities $V^0, V^1, V^2, V^3$, which transform like the components of $x^\mu$ under transformations of the four-dimensional coordinate system. Its components will transform, therefore, according to the system
$$ \begin{pmatrix} V'^0 \\ V'^1 \\ V'^2 \\ V'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix}. \tag{68} $$
Components with index 0 are called time components, while the ones with index 1, 2 or 3 are called space components. Contravariant four-vectors are always written with superscripts. A four-vector $V_\mu$, written with subscripts, which will be defined later, is said to be covariant. The components of these two kinds of four-vectors are related by
$$ V_0 = V^0, \quad V_1 = -V^1, \quad V_2 = -V^2, \quad V_3 = -V^3. \tag{69} $$
In matrix form, the column vector $V^\mu$ is[11]
$$ V^\mu = \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix}_{4\times 1}, \tag{70} $$
and, on the other hand,
$$ V_\mu = \begin{pmatrix} V_0 & V_1 & V_2 & V_3 \end{pmatrix}_{1\times 4} = \begin{pmatrix} V^0 & -V^1 & -V^2 & -V^3 \end{pmatrix}_{1\times 4}. \tag{71} $$
The square magnitude of a four-vector, in comparison with (66), is given by
$$ (V^0)^2 - (V^1)^2 - (V^2)^2 - (V^3)^2, \tag{72} $$
which, according to (69), can be written as
$$ V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 = \sum_{\mu=0}^{3} V_\mu V^\mu. \tag{73} $$
[11] $V^\mu$ and $V_\mu$ will be used to indicate either column and row vectors, respectively, or sets of four components. To remember the matrix form of each four-vector, use the mnemonic "upper indices go up to down; lower indices go left to right".
From now on, we will use the Einstein summation convention, in which one sums over any repeated index (also called a summation index or dummy index), one occurrence being always contravariant and the other covariant, and omits the summation sign, remembering that Greek letters run from 0 to 3 and Latin letters from 1 to 3. This way, we have
$$ V_\mu V^\mu = V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 \tag{74} $$
as the expression for the square magnitude of a four-vector. Analogously, the Lorentz scalar product of two different four-vectors is given by[12]
$$ V_\mu U^\mu = V_0U^0 + V_1U^1 + V_2U^2 + V_3U^3, \tag{75} $$
which is invariant under rotations of the four-dimensional coordinate system. Just like the interval between two events, the square magnitude of a four-vector can be positive (timelike vectors), negative (spacelike) or zero (null or lightlike). In particular, the interval, which can be written as
$$ ds^2 = dx_\mu dx^\mu = dx_0dx^0 + dx_1dx^1 + dx_2dx^2 + dx_3dx^3, \tag{76} $$
is invariant, as stated before.
The three space components of the four-vector $V^\mu$ form the three-dimensional vector $\mathbf{V}$, so we will use the notation
$$ V^\mu \equiv (V^0, \mathbf{V}), \quad V_\mu = (V_0, -\mathbf{V}) = (V^0, -\mathbf{V}), \tag{77} $$
so that the square magnitude of $V^\mu$ may be given by
$$ V_\mu V^\mu = (V^0)^2 - \mathbf{V}^2. \tag{78} $$
[12] Note that the expression $V^\mu U_\mu$ is equal to $V_\mu U^\mu$ when there is a sum over µ, but only the latter gives this sum when it comes to matrix multiplication.
In particular, we have
$$ x^\mu = (ct, \mathbf{r}), \quad x_\mu = (ct, -\mathbf{r}), \quad s^2 = x_\mu x^\mu = c^2t^2 - \mathbf{r}^2, \tag{79} $$
where
$$ \mathbf{r} = (x, y, z) = (x^1, x^2, x^3). \tag{80} $$
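Let us now rewrite the Lorentz transformation in (34) as
$$ x'^0 = \gamma x^0 - \beta\gamma x^1, \quad x'^1 = -\beta\gamma x^0 + \gamma x^1, \quad x'^2 = x^2, \quad x'^3 = x^3, \tag{81} $$
in order to note that
$$ \frac{\partial x'^0}{\partial x^0} = \gamma, \quad \frac{\partial x'^0}{\partial x^1} = -\beta\gamma, \quad \ldots, \tag{82} $$
so that $\partial x'^\mu/\partial x^\nu$ are the entries of our transformation matrix.[13] We define
$$ \Lambda^\mu{}_\nu \equiv \frac{\partial x'^\mu}{\partial x^\nu}, \tag{83} $$
and hence we can write
$$ \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{84} $$
The Lorentz transformation for the four-dimensional radius vector, as in (81), can now be written as
$$ x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu, \tag{85} $$
or simply
$$ x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{86} $$
using Einstein's convention. In conclusion, by definition, a contravariant four-vector is a set of four quantities $V^\mu$ which transform according to[14]
$$ V'^\mu = \Lambda^\mu{}_\nu V^\nu. \tag{87} $$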
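[13] It is also important to note that
$$ \frac{\partial x^\mu}{\partial x^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}. $$
It turns out that the quantity on the right-hand side is a very special kind of four-dimensional tensor, which will be defined later.
[14] Equation (87) makes sense either as matrix multiplication or as a system of equations when µ, ν = 0, 1, 2, 3.
Remember now that, if $\varphi = \varphi(x^1, \ldots, x^n)$ is a scalar function, then the differential of φ is given by
$$ d\varphi = \sum_{\mu=1}^{n} \frac{\partial\varphi}{\partial x^\mu}dx^\mu = \frac{\partial\varphi}{\partial x^\mu}dx^\mu = \partial_\mu\varphi\,dx^\mu, \tag{88} $$
where
$$ \partial_\mu\varphi \equiv \frac{\partial\varphi}{\partial x^\mu}. \tag{89} $$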
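This partial derivative transforms as some sort of vector, but not as a contravariant one. From the chain rule, we know that
$$ \partial'_\mu\varphi = \frac{\partial\varphi}{\partial x'^\mu} = \frac{\partial\varphi}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\mu} = \partial_\nu\varphi\,\frac{\partial x^\nu}{\partial x'^\mu}. \tag{90} $$
This new transformation is, by definition, the transformation of a covariant four-vector $V_\mu$:
$$ V'_\mu = V_\nu\frac{\partial x^\nu}{\partial x'^\mu}. \tag{91} $$
In the case of the Lorentz transformation, $\partial x^\nu/\partial x'^\mu$ are the components of the inverse transformation matrix, and we write
$$ \frac{\partial x^\nu}{\partial x'^\mu} = \left(\Lambda^{-1}\right)^\nu{}_\mu, \tag{92} $$
so that the transformation of a covariant four-vector can be written as[15]
$$ V'_\mu = V_\nu\left(\Lambda^{-1}\right)^\nu{}_\mu. \tag{93} $$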
2.2 Four-tensors
Either kind of vector is an example of a more general object called a tensor, which has a linear transformation rule. The simplest kind of tensor S is the one unchanged under transformation, that is, S′ = S, and this is the characteristic of a scalar. Rank is defined as the number of indices carried. This way, scalars are tensors of rank 0 and vectors are tensors of rank 1.
A four-dimensional tensor of the second rank $V^{\mu\nu}$, also called a four-tensor, is a set of 4 × 4 = 16 quantities which transform like the products of components of two four-vectors under coordinate transformations. It is worth recalling that the transformation of a contravariant four-vector is given by
$$ V'^\mu = \Lambda^\mu{}_\nu V^\nu, \tag{94} $$
and a covariant one transforms like
$$ V'_\mu = V_\nu\left(\Lambda^{-1}\right)^\nu{}_\mu, \tag{95} $$
and therefore, by definition, our tensor of rank 2 transforms like
$$ V'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}. \tag{96} $$
[15] Note that it makes sense to write $V'_\mu = \left(\Lambda^{-1}\right)^\nu{}_\mu V_\nu$ as a relation between the components, but not as matrix multiplication.
The components of a second-rank tensor, however, can be written as $V^{\mu\nu}$ (contravariant), $V_{\mu\nu}$ (covariant) or $V^\mu{}_\nu$ (mixed). Therefore, the contravariant one transforms like (96), and the covariant and mixed ones transform, respectively, like
$$ V'_{\mu\nu} = V_{\alpha\beta}\left(\Lambda^{-1}\right)^\alpha{}_\mu\left(\Lambda^{-1}\right)^\beta{}_\nu \tag{97} $$
and
$$ V'^\mu{}_\nu = \Lambda^\mu{}_\alpha V^\alpha{}_\beta\left(\Lambda^{-1}\right)^\beta{}_\nu. \tag{98} $$
Raising or lowering a space index (1, 2, 3) changes the sign of the component, and raising or lowering the time index (0) does not change the sign, so that
$$ V_0{}^i = V^{0i}, \quad V^i{}_j = -V^{ij}, \quad V_{ij} = V^{ij}, \quad \ldots, \tag{99} $$
where i, j = 1, 2, 3. If
$$ V^{\mu\nu} = V^{\nu\mu}, \tag{100} $$
then the tensor $V^{\mu\nu}$ is called symmetric. Similarly, a tensor is called antisymmetric or skew-symmetric if[16]
$$ V^{\mu\nu} = -V^{\nu\mu}. \tag{101} $$
Clearly, the diagonal components $V^{\mu\mu}$ (no sum here) of an antisymmetric tensor are zero, since $V^{\mu\mu} = -V^{\mu\mu}$. For a symmetric mixed tensor, we have $V^\mu{}_\nu = V_\nu{}^\mu$, so that we will simply write $V^\mu_\nu$.
From a mixed tensor $V^\mu{}_\nu$, we can form a scalar by performing an operation called contraction:
$$ V^\mu{}_\mu = V^0{}_0 + V^1{}_1 + V^2{}_2 + V^3{}_3. \tag{102} $$
This scalar is called the trace of the tensor. Note that the formation of a scalar product of two vectors is also a contraction operation.
We define four-tensors of higher rank similarly. For example, a fourth-rank mixed tensor $V^{\mu\nu}{}_{\alpha\beta}$ is a set of $4^4 = 256$ quantities which transform according to
$$ V'^{\mu\nu}{}_{\rho\sigma} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}{}_{\gamma\delta}\left(\Lambda^{-1}\right)^\gamma{}_\rho\left(\Lambda^{-1}\right)^\delta{}_\sigma. \tag{103} $$
From a tensor with at least one contravariant and one covariant index, one can perform a contraction as before, and each contraction will decrease the rank of the tensor by 2. For instance, examples of contractions of the fourth-rank tensor $V^{\mu\nu}{}_{\alpha\beta}$ are the second-rank tensors $V^{\mu\nu}{}_{\mu\beta}$ and $V^{\mu\beta}{}_{\alpha\beta}$, or even the scalar $V^{\mu\nu}{}_{\mu\nu}$.
In a tensor equation, the two sides must contain identical free indices of the same type (contravariant or covariant). For example, $V^\mu{}_\nu = U^\mu W_\nu$ makes sense, while $V^\mu = U_\mu$ does not. The repeated indices may be replaced by any other letter of the same alphabet, Greek or Latin (and remember that the two occurrences are of different types). For example, $V^{\mu\nu}{}_{\mu\nu} = V^{\nu\mu}{}_{\nu\mu} = V^{\alpha\beta}{}_{\alpha\beta}$, while $V_\mu U^\mu$ and $V_i U^i$ are completely different expressions.
[16] The definition is the same if $V_{\mu\nu} = \pm V_{\nu\mu}$, $V^\mu{}_\nu = \pm V_\nu{}^\mu$, etc. Note that the matrices associated with these tensors are symmetric/antisymmetric themselves.
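The statement that a contraction produces a scalar can be checked on a simple case. The sketch below is an illustration under stated assumptions: the mixed tensor is built as the product $T^\mu{}_\nu = U^\mu W_\nu$ of two arbitrary example vectors, the boost is taken along x, the mixed tensor is transformed as in (98), and the helper names `boost_x`, `matmul` and `trace` are hypothetical.

```python
import math


def boost_x(beta):
    """Matrix of the Lorentz transformation for a boost along x; beta -> -beta gives its inverse."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def trace(m):
    """Contraction of a mixed second-rank tensor, as in equation (102)."""
    return sum(m[i][i] for i in range(4))


U = [2.0, 1.0, 0.0, 3.0]    # contravariant components U^mu (example values)
W = [1.0, -1.0, 2.0, 0.5]   # covariant components W_nu (example values)
T = [[u * w for w in W] for u in U]   # mixed tensor T^mu_nu = U^mu W_nu

lam = boost_x(0.6)
lam_inv = boost_x(-0.6)
T_prime = matmul(matmul(lam, T), lam_inv)   # transformation of a mixed tensor, as in (98)

print(trace(T), trace(T_prime))   # the two traces agree up to rounding
```

The row index of each list of lists plays the role of the upper index and the column index that of the lower index, which is what allows (98) to be written as an ordinary matrix product here.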
2.3 Special Tensors
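Let us define the unit four-tensor $\delta^\mu_\nu$, also known as the Kronecker delta, as
$$ \delta^\nu_\mu = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}. \tag{104} $$
The matrix form of $\delta^\mu_\nu$ is the identity matrix,
$$ \delta^\mu_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{105} $$
and now we are able to state that
$$ \frac{\partial x^\mu}{\partial x^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu. \tag{106} $$
Also, remembering that repeated indices are summed, one should note that
$$ \delta^\mu_\nu V^\nu = V^\mu \tag{107} $$
and
$$ \delta^\mu_\nu V_\mu = V_\nu, \tag{108} $$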
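so the transformation law for $\delta^\mu_\nu$ will be
$$ \delta'^\mu_\nu = \Lambda^\mu{}_\alpha\,\delta^\alpha_\beta\left(\Lambda^{-1}\right)^\beta{}_\nu = \Lambda^\mu{}_\alpha\left(\Lambda^{-1}\right)^\alpha{}_\nu = \delta^\mu_\nu, \tag{109} $$
since
$$ \Lambda^\mu{}_\alpha\left(\Lambda^{-1}\right)^\alpha{}_\nu = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu, \tag{110} $$
and so $\delta'^\mu_\nu = \delta^\mu_\nu$, and it is therefore an invariant tensor.
By raising one index or lowering the other in $\delta^\nu_\mu$, we can define the metric tensors $g^{\mu\nu} \equiv \delta^{\mu\nu}$ and $g_{\mu\nu} \equiv \delta_{\mu\nu}$. Considering the relations in (99), it is easy to see that
$$ g^{\mu\nu} = g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{111} $$
We have then[17]
$$ g_{\mu\nu}V^\nu = V_\mu, \quad g^{\mu\nu}V_\nu = V^\mu, \tag{112} $$
so that the metric tensor $g_{\mu\nu}$ can be used to lower an index and $g^{\mu\nu}$ can be used to raise an index.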
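The use of the metric tensor to lower an index, and the invariance of the Lorentz scalar product, can be sketched as follows. This is only an illustration: the helper names `lower_index`, `scalar_product`, `boost_x` and `apply` are hypothetical, the vectors are arbitrary example values, and the boost is taken along x.

```python
import math

# Metric tensor of equation (111); numerically g with upper and lower indices coincide.
METRIC = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def lower_index(v):
    """Covariant components V_mu = g_{mu nu} V^nu, equation (112)."""
    return [sum(METRIC[mu][nu] * v[nu] for nu in range(4)) for mu in range(4)]


def scalar_product(v, u):
    """Lorentz scalar product V_mu U^mu of two contravariant four-vectors."""
    return sum(v_low * u_up for v_low, u_up in zip(lower_index(v), u))


def boost_x(beta):
    """Matrix of the Lorentz transformation for a boost along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, vec):
    """Multiply a 4x4 matrix by a column vector given as a list."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


V = [3.0, 1.0, 2.0, 0.0]    # example contravariant four-vector
U = [2.0, -1.0, 0.5, 4.0]   # example contravariant four-vector
lam = boost_x(0.6)

print(lower_index(V))                                   # space components change sign
print(scalar_product(V, U))                             # value in K
print(scalar_product(apply(lam, V), apply(lam, U)))     # same value in K', up to rounding
```

Writing the scalar product through `lower_index` mirrors the index notation: one index is lowered with the metric and then contracted with the upper index of the other vector.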
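[17] The equations $g_{\mu\nu}V^\nu = V_\mu$ and $g^{\mu\nu}V_\nu = V^\mu$ do not make sense as matrix multiplication, only as a system of equations relating the components.
The completely antisymmetric unit tensor of fourth rank, $\varepsilon^{\mu\nu\rho\sigma}$, is the tensor whose components change sign under the interchange of any pair of indices, and whose nonzero components are ±1. Since $\varepsilon^{\mu\nu\rho\sigma}$ is antisymmetric, it vanishes if two indices are the same. We set
$$ \varepsilon^{0123} = +1, \quad \varepsilon_{0123} = -1, \tag{113} $$
and all the nonvanishing components can be brought to the arrangement 0, 1, 2, 3 by an even or odd number of transpositions. Since there are 4! = 24 nonvanishing components, we have
$$ \varepsilon^{\mu\nu\rho\sigma}\varepsilon_{\mu\nu\rho\sigma} = -24. \tag{114} $$
Strictly speaking, $\varepsilon^{\mu\nu\rho\sigma}$ is not a tensor, but rather a pseudotensor: if we change the sign of one or three of the coordinates, then the components $\varepsilon^{\mu\nu\rho\sigma}$ do not change, whereas some of the components of a tensor should change sign; on the other hand, with respect to rotations of the coordinate system, the quantities $\varepsilon^{\mu\nu\rho\sigma}$ behave like the components of a tensor.
The products $\varepsilon^{\mu\nu\rho\sigma}\varepsilon_{\alpha\beta\gamma\delta}$ form a tensor of rank 8, which is a true tensor. We can contract one or more pairs of indices and obtain tensors of rank 6, 4, 2, 0 (a tensor of rank 0 is a scalar). Since all these tensors have the same form in all coordinate systems, their components must be expressed as combinations of products of components of the unit tensor $\delta^\nu_\mu$.
The following equation and its particular cases, which will not be proved here, can be found by starting from the symmetries that the quantities must possess under permutation of indices:
$$ \varepsilon^{\mu\nu\rho\sigma}\varepsilon_{\alpha\beta\gamma\delta} = -\begin{vmatrix} \delta^\mu_\alpha & \delta^\mu_\beta & \delta^\mu_\gamma & \delta^\mu_\delta \\ \delta^\nu_\alpha & \delta^\nu_\beta & \delta^\nu_\gamma & \delta^\nu_\delta \\ \delta^\rho_\alpha & \delta^\rho_\beta & \delta^\rho_\gamma & \delta^\rho_\delta \\ \delta^\sigma_\alpha & \delta^\sigma_\beta & \delta^\sigma_\gamma & \delta^\sigma_\delta \end{vmatrix}. \tag{115} $$
In particular,
$$ \varepsilon^{\mu\nu\rho\sigma}\varepsilon_{\alpha\beta\rho\sigma} = -2\left(\delta^\mu_\alpha\delta^\nu_\beta - \delta^\mu_\beta\delta^\nu_\alpha\right), \quad \varepsilon^{\mu\nu\rho\sigma}\varepsilon_{\alpha\nu\rho\sigma} = -6\,\delta^\mu_\alpha. \tag{116} $$
Also, the product $\varepsilon_{ijk}\varepsilon_{lmn}$, which is a true three-dimensional tensor of rank 6, is given by
$$ \varepsilon_{ijk}\varepsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix}, \tag{117} $$
so, in particular, we have
$$ \varepsilon_{ijk}\varepsilon_{lmk} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}, \quad \varepsilon_{ijk}\varepsilon_{ljk} = 2\delta_{il}, \quad \varepsilon_{ijk}\varepsilon_{ijk} = 6. \tag{118} $$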
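The contractions in (114) and (118) can be verified by brute force. In the sketch below, which is only an illustration, the hypothetical function `levi_civita` returns the sign of the permutation of its arguments (or zero if an index repeats); the four-dimensional symbol with lower indices is taken as minus the one with upper indices, following (113), and in three dimensions the index position is assumed not to matter.

```python
from itertools import product


def levi_civita(*indices):
    """Sign of the permutation given by the indices, or 0 if any index is repeated."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    # Each inversion (a pair out of order) contributes one factor of -1.
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:
                sign = -sign
    return sign


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


# Equation (114): upper-index symbol times lower-index symbol (which has the opposite sign).
full_contraction_4d = sum(
    levi_civita(*idx) * (-levi_civita(*idx)) for idx in product(range(4), repeat=4)
)

# Last relation in (118).
full_contraction_3d = sum(levi_civita(*idx) ** 2 for idx in product(range(3), repeat=3))

# First relation in (118), checked for every choice of the free indices.
identity_holds = all(
    sum(levi_civita(i, j, k) * levi_civita(l, m, k) for k in range(3))
    == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
    for i, j, l, m in product(range(3), repeat=4)
)

print(full_contraction_4d, full_contraction_3d, identity_holds)
```

Counting inversions is one simple way to obtain the parity of a permutation; any other parity algorithm would serve equally well here.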
2.4 Differentiation
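We define the four-vector operator $\partial_\mu$ as
$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left(\frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{119} $$
Using previous notation, we can write
$$ \partial_\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right) \tag{120} $$
and
$$ \partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right), \tag{121} $$
where
$$ \nabla \equiv \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right). \tag{122} $$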
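Let φ be a scalar function. The four-gradient of φ is the four-vector given by
$$ \partial_\mu\varphi = \left(\frac{1}{c}\frac{\partial\varphi}{\partial t}, \nabla\varphi\right). \tag{123} $$
Using this definition, the differential of the scalar φ, which is given by
$$ d\varphi = \frac{\partial\varphi}{\partial x^\mu}dx^\mu, \tag{124} $$
is a scalar, given by the Lorentz scalar product of two four-vectors. Now let $V^\mu = (V^0, V^1, V^2, V^3) = (V^0, \mathbf{V})$ be a four-vector; then
$$ \partial_\mu V^\mu = \frac{1}{c}\frac{\partial V^0}{\partial t} + \frac{\partial V^1}{\partial x} + \frac{\partial V^2}{\partial y} + \frac{\partial V^3}{\partial z} = \frac{1}{c}\frac{\partial V^0}{\partial t} + \nabla\cdot\mathbf{V} = \partial^\mu V_\mu. \tag{125} $$
In particular, the operator $\Box = \partial_\mu\partial^\mu = \partial^\mu\partial_\mu$ is given by
$$ \Box \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2, \tag{126} $$
also known as the D'Alembertian.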
3 Relativistic Mechanics
3.1 Four-velocity and Four-acceleration
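The ordinary three-dimensional velocity is given by
$$ \mathbf{v} = \frac{d\mathbf{r}}{dt} \tag{127} $$
or
$$ v^i = \frac{dx^i}{dt}, \quad i = 1, 2, 3. \tag{128} $$
This is not a four-vector, but since $dx^\mu$ is a four-vector and the quantity dτ (not dt) is a scalar, we can define the four-velocity
$$ U^\mu = \frac{dx^\mu}{d\tau}. \tag{129} $$
From the chain rule, it follows that
$$ U^\mu = \frac{dx^\mu}{dt}\frac{dt}{d\tau}. \tag{130} $$
Since dt/dτ = γ, we have
$$ U^\mu = \gamma\frac{dx^\mu}{dt}, \tag{131} $$
but since $x^\mu = (ct, \mathbf{r})$, we have
$$ \frac{dx^\mu}{dt} = \frac{d}{dt}(ct, \mathbf{r}) = (c, \mathbf{v}), \tag{132} $$
so that
$$ U^\mu = \left(U^0, \mathbf{U}\right) = (\gamma c, \gamma\mathbf{v}) \tag{133} $$
and therefore
$$ U_\mu = \left(U^0, -\mathbf{U}\right) = (\gamma c, -\gamma\mathbf{v}). \tag{134} $$
The contraction $U_\mu U^\mu$ must be a scalar. In fact,
$$ U_\mu U^\mu = \gamma^2c^2 - \gamma^2v^2 = \gamma^2c^2\left(1 - \frac{v^2}{c^2}\right) = \gamma^2c^2\frac{1}{\gamma^2}, \tag{135} $$
or simply
$$ U_\mu U^\mu = c^2. \tag{136} $$
Geometrically, $U^\mu$ is a four-vector tangent to the world line of the particle. In a similar way, one can define the four-acceleration as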
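$$ A^\mu = \frac{d^2x^\mu}{d\tau^2} = \frac{dU^\mu}{d\tau} \tag{137} $$
and, analogously,
$$ A^\mu = \frac{dU^\mu}{dt}\frac{dt}{d\tau} = \gamma\frac{d}{dt}(\gamma c, \gamma\mathbf{v}) = \left(\gamma c\frac{d\gamma}{dt}, \gamma\frac{d\gamma}{dt}\mathbf{v} + \gamma^2\mathbf{a}\right) \tag{138} $$
or
$$ A^\mu = \left(\gamma\dot{\gamma}c, \gamma\dot{\gamma}\mathbf{v} + \gamma^2\mathbf{a}\right), \tag{139} $$
where $\mathbf{a} = d\mathbf{v}/dt$ is the ordinary three-dimensional acceleration of the particle and the dot denotes differentiation with respect to t. One may evaluate $\dot{\gamma}$ and find
$$ \dot{\gamma} = \frac{d}{dt}\left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}} = \frac{\mathbf{a}\cdot\mathbf{v}}{c^2}\gamma^3, \tag{140} $$
so that
$$ A^\mu = \left(\gamma^4\frac{\mathbf{a}\cdot\mathbf{v}}{c}, \gamma^4\frac{\mathbf{a}\cdot\mathbf{v}}{c^2}\mathbf{v} + \gamma^2\mathbf{a}\right). \tag{141} $$
Finally, differentiating (136) with respect to τ, we find
$$ U_\mu A^\mu = 0, \tag{142} $$
and this means that the four-velocity and four-acceleration are mutually perpendicular in our four-dimensional space.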
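Relations (136) and (142) can be checked numerically from the explicit components in (133) and (141). The sketch below is an illustration only: the function names are hypothetical, the velocity and acceleration are arbitrary example values, and units with c = 1 are assumed.

```python
import math


def minkowski_dot(a, b):
    """Lorentz scalar product of two contravariant four-vectors."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def gamma_of(v, c=1.0):
    """Lorentz factor for a three-dimensional velocity v."""
    v2 = sum(component ** 2 for component in v)
    return 1.0 / math.sqrt(1.0 - v2 / c ** 2)


def four_velocity(v, c=1.0):
    """Four-velocity (gamma c, gamma v) of equation (133)."""
    gamma = gamma_of(v, c)
    return [gamma * c] + [gamma * component for component in v]


def four_acceleration(v, a, c=1.0):
    """Four-acceleration of equation (141) for velocity v and acceleration a."""
    gamma = gamma_of(v, c)
    a_dot_v = sum(ai * vi for ai, vi in zip(a, v))
    time_part = gamma ** 4 * a_dot_v / c
    space_part = [gamma ** 4 * a_dot_v / c ** 2 * vi + gamma ** 2 * ai for vi, ai in zip(v, a)]
    return [time_part] + space_part


v = (0.3, 0.4, 0.0)      # example velocity, |v| = 0.5 in units of c
a = (0.1, -0.2, 0.05)    # example acceleration
U = four_velocity(v)
A = four_acceleration(v, a)

print(minkowski_dot(U, U))   # equals c^2 = 1 up to rounding, equation (136)
print(minkowski_dot(U, A))   # vanishes up to rounding, equation (142)
```

The orthogonality holds for any choice of v and a with |v| < c, not only for the values used here, since it follows from differentiating (136).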
3.2 Principle of Least Action
The principle of least action asserts that for each mechanical system there exists an integral S, called the action, which has a minimum value for the actual motion, so that the variation δS is zero. This integral must be invariant under Lorentz transformations, since it must not depend on the choice of reference system, and so it must be the integral of a scalar. Furthermore, this scalar is proportional to ds, since this is the only scalar that one can construct for a free particle. The action is then
$$ S = -\alpha\int_a^b ds, \tag{143} $$
where the integral is along the world line of the particle between two events a and b, and α is some constant which must be positive, since $\int_a^b ds$ has its maximum value along a straight world line. If we represent the action as
$$ S = \int_{t_1}^{t_2} L\,dt, \tag{144} $$
where L is the Lagrange function of the mechanical system, then using theresults in (11) and (18) we can write
S = −∫ t2
t1
αc
γdt, (145)
and comparing with (144), the Lagrangian of the free particle is
L = −αcγ
= −αc√
1− v2/c2. (146)
The constant α characterises the particle, but in classical mechanics each particleis characterized by its mass m. If we try to find a relation between α and m, thenwe should note that if c→∞ we must have the classical expression L = mv2/2.We can then expand L in powers of v/c,
L = −αc+αv2
2c, (147)
and note that constant terms do not affect the equation of motion, so that −αccan be omitted. Comparing with L = mv2/2, we have
α = mc, (148)
so that
S = −mc∫ b
a
ds (149)
and
L = −mc2
γ. (150)
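As a quick numerical illustration of the low-velocity limit used to fix α = mc, one can compare the relativistic Lagrangian, with the constant −mc² removed, against mv²/2. This is only a sketch; the function names are hypothetical and the chosen units are arbitrary.

```python
import math


def free_lagrangian(m, v, c):
    """Relativistic free-particle Lagrangian L = -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * c**2 * math.sqrt(1.0 - (v / c)**2)


def kinetic_part(m, v, c):
    """L with the constant term -m c^2 removed; tends to m v^2 / 2 for v << c."""
    return free_lagrangian(m, v, c) + m * c**2
```

For v/c of order 10⁻³ the relative difference between `kinetic_part` and mv²/2 is of order (v/c)², which is the size of the first neglected term in the expansion.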
3.3 Four-momentum and Energy
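The three components of the momentum of a particle are given by the derivatives of L with respect to the corresponding components of v. In other words,
$$p_i = \frac{\partial L}{\partial v_i} \tag{151}$$
or, knowing that $L = -mc^2/\gamma = -mc^2\sqrt{1 - v^2/c^2}$,
$$\mathbf{p} = \frac{\partial L}{\partial\mathbf{v}} = -mc^2\left(\frac{1}{2}\right)\left(1 - \frac{v^2}{c^2}\right)^{-1/2}\left(-\frac{2\mathbf{v}}{c^2}\right), \tag{152}$$
or
$$\mathbf{p} = \gamma m\mathbf{v}. \tag{153}$$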
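One should note that if $v \ll c$ or $c \to \infty$ then $\gamma \approx 1$, so that (153) reduces to the classical expression p = mv. Also, if $v \to c$ then $|\mathbf{p}| \to \infty$. The force acting on the particle is given by $d\mathbf{p}/dt$. If one supposes that the force is directed perpendicular to v, so that $v^2$ (and hence γ) is constant, then
$$\frac{d\mathbf{p}}{dt} = \gamma m\frac{d\mathbf{v}}{dt}. \tag{154}$$
On the other hand, if the force is parallel to v, then the velocity changes only in magnitude, so that the unit vector $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ is constant. If we write
$$\mathbf{p} = \frac{mv}{\sqrt{1 - v^2/c^2}}\,\hat{\mathbf{v}}, \tag{155}$$
then, for a force parallel to v, we have
$$\frac{d\mathbf{p}}{dt} = m\frac{d}{dt}\left(\frac{v}{\sqrt{1 - v^2/c^2}}\right)\hat{\mathbf{v}} = m\left[\frac{1}{\sqrt{1 - v^2/c^2}}\frac{dv}{dt} + v\left(-\frac{1}{2}\right)\left(1 - \frac{v^2}{c^2}\right)^{-3/2}\left(-\frac{2v}{c^2}\frac{dv}{dt}\right)\right]\hat{\mathbf{v}} \tag{156}$$
or simply
$$\frac{d\mathbf{p}}{dt} = m\frac{dv}{dt}\left[\gamma + \frac{v^2}{c^2}\gamma^3\right]\hat{\mathbf{v}} = \gamma^3 m\mathbf{a}\left[\frac{1}{\gamma^2} + \frac{v^2}{c^2}\right] = \gamma^3 m\mathbf{a}, \tag{157}$$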
and this means that the ratio of force to acceleration is different in the two cases. The energy E of the particle is the quantity
$$E = \mathbf{p}\cdot\mathbf{v} - L, \tag{158}$$
and using the expressions for L and p, we find
$$E = \gamma mv^2 + \frac{mc^2}{\gamma} = \gamma mc^2\left(\frac{v^2}{c^2} + \frac{1}{\gamma^2}\right), \tag{159}$$
or simply
$$E = \gamma mc^2. \tag{160}$$
This expression shows that if v = 0 then the energy of the free particle is $E = mc^2$, which is defined as the rest energy. Also, for small velocities $v/c \ll 1$ one can expand the expression for the energy and find
$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}} \approx mc^2 + \frac{mv^2}{2}, \tag{161}$$
and this result was expected, since the term $mv^2/2$ is the classical expression for the kinetic energy of the particle. Squaring now equations (153) and (160), we have, respectively, $p^2 = \gamma^2m^2v^2$ and $E^2 = \gamma^2m^2c^4$. Subtracting, $E^2/c^2 - p^2 = \gamma^2m^2(c^2 - v^2) = m^2c^2$, which gives the relation between the energy and the momentum of the particle,
$$\frac{E^2}{c^2} = p^2 + m^2c^2. \tag{162}$$
The energy expressed as a function of the momentum is called the Hamiltonian H of the system, so that in our case
$$H = \sqrt{p^2c^2 + m^2c^4}. \tag{163}$$
Note that, if $p \ll mc$, then the Hamiltonian is approximately given by
$$H \approx mc^2 + \frac{p^2}{2m}, \tag{164}$$
which, except for the rest energy, is the classical expression for the Hamiltonian. Knowing now that $\mathbf{p} = \gamma m\mathbf{v}$ and $E = \gamma mc^2$, we find the relation between the energy, velocity and momentum of the particle,
$$\mathbf{p} = \frac{E}{c^2}\mathbf{v}. \tag{165}$$
From the equations for the momentum and energy, if v = c then both of them are infinite, so that a particle with mass different from zero cannot move with velocity v = c. On the other hand, from the expression relating the momentum and energy above, particles of zero mass can exist, and for such particles we have
$$p = \frac{E}{c}. \tag{166}$$
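The relations between energy, momentum and velocity obtained in this section can be checked with a few lines of code. This is a minimal sketch for one-dimensional motion; the function names are illustrative assumptions.

```python
import math


def momentum(m, v, c):
    """p = gamma m v, equation (153), for motion along one axis."""
    return m * v / math.sqrt(1.0 - (v / c)**2)


def energy(m, v, c):
    """E = gamma m c^2, equation (160)."""
    return m * c**2 / math.sqrt(1.0 - (v / c)**2)


def hamiltonian(m, p, c):
    """H = sqrt(p^2 c^2 + m^2 c^4), equation (163)."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)
```

Evaluating `hamiltonian(m, momentum(m, v, c), c)` should reproduce `energy(m, v, c)`, and `energy(m, v, c) * v / c**2` should reproduce the momentum, in agreement with (162) and (165).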
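In four-dimensional form, according to the principle of least action we have
$$\delta S = -mc\,\delta\int_a^b ds = -mc\,\delta\int_a^b\sqrt{dx_\mu dx^\mu} = 0 \tag{167}$$
since $ds^2 = dx_\mu dx^\mu$. In other words,
$$\delta S = -mc\int_a^b\frac{dx_\mu\,\delta dx^\mu}{ds} = -mc\int_a^b U_\mu\,d\delta x^\mu, \tag{168}$$
where in this derivation $U_\mu = dx_\mu/ds$. Integrating by parts, we get
$$\delta S = -mc\,U_\mu\delta x^\mu\Big|_a^b + mc\int_a^b\delta x^\mu\frac{dU_\mu}{ds}\,ds. \tag{169}$$
The first term of this equation is zero since $(\delta x^\mu)_a = (\delta x^\mu)_b = 0$, so that we have
$$\delta S = mc\int_a^b\delta x^\mu\frac{dU_\mu}{ds}\,ds = 0, \tag{170}$$
and hence
$$\frac{dU_\mu}{ds} = 0. \tag{171}$$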
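Now, let us consider the point a as fixed, so that $(\delta x^\mu)_a = 0$, and the point b as variable, so that we find
$$\delta S = -mc\,U_\mu\delta x^\mu, \tag{172}$$
where $\delta x^\mu$ stands for $(\delta x^\mu)_b$. The momentum four-vector or four-momentum is the four-vector given by
$$P_\mu = -\frac{\partial S}{\partial x^\mu}. \tag{173}$$
From classical mechanics, we know that $p_i = \partial S/\partial x^i$ are the three components of the momentum vector. Also, the derivative $-\partial S/\partial t$ is the energy E of the particle. Remembering that $x^0 = ct$, we can now write
$$P_\mu = (E/c, -\mathbf{p}) \tag{174}$$
and then
$$P^\mu = (E/c, \mathbf{p}). \tag{175}$$
Since $E = \gamma mc^2$ and $\mathbf{p} = \gamma m\mathbf{v}$, one can note that this can also be written as
$$P^\mu = mU^\mu, \tag{176}$$
where $U^\mu = (\gamma c, \gamma\mathbf{v})$ is the four-velocity (133) of the particle. The expression $P^\mu P_\mu$ must be a scalar. In fact, we have
$$P^\mu P_\mu = m^2U^\mu U_\mu = m^2c^2. \tag{177}$$
The force four-vector or four-force is, by analogy, defined as the derivative
$$F^\mu = \frac{dP^\mu}{ds} = mc\frac{dU^\mu}{ds}, \tag{178}$$
where, as in (168), $U^\mu = dx^\mu/ds$ is the dimensionless four-velocity; its components satisfy $F_\mu U^\mu = 0$. In terms of the three-dimensional force $\mathbf{f} = d\mathbf{p}/dt$, this can be written as
$$F^\mu = \left(\frac{\gamma\,\mathbf{f}\cdot\mathbf{v}}{c^2},\ \frac{\gamma\,\mathbf{f}}{c}\right), \tag{179}$$
where the time component is related to the work done by the force.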
4 Charges in Electromagnetic Fields
4.1 Four-potential of a Field
In an electromagnetic field, the action of a particle consists of the free-particle action $S = -mc\int_a^b ds$ plus a term describing the interaction of the particle with the field, determined by the charge of the particle. The properties of the field are characterised by the four-potential $A^\mu$. The components of this four-vector are functions of the spatial coordinates and time. The space components of $A^\mu$ form the three-dimensional vector A, called the vector potential, and the time component is denoted as $A^0 \equiv \phi$, called the scalar potential. This way, we have
$$A^\mu = (\phi, \mathbf{A}), \qquad A_\mu = (\phi, -\mathbf{A}). \tag{180}$$
The components of $A_\mu$ appear in the action function in the term
$$-\frac{q}{c}\int_a^b A_\mu dx^\mu, \tag{181}$$
where q is the charge of the particle. This way, the action has the form
$$S = \int_a^b\left(-mc\,ds - \frac{q}{c}A_\mu dx^\mu\right) = \int_a^b\left(-mc\,ds + \frac{q}{c}\mathbf{A}\cdot d\mathbf{r} - q\phi\,dt\right), \tag{182}$$
using the expression for $A_\mu$ above and the infinitesimal four-dimensional radius vector $dx^\mu = (c\,dt, d\mathbf{r})$. Substituting now $d\mathbf{r} = \mathbf{v}\,dt$ and $ds = c\,d\tau = c\,dt/\gamma$, we can change the integral above to an integration over t and obtain
$$S = \int_{t_1}^{t_2}\left(-\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi\right)dt, \tag{183}$$
and so the Lagrangian for a charge in an electromagnetic field is given by
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{184}$$
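One can now find the components of the generalised momentum of the particle,
$$P_i = \frac{\partial L}{\partial v_i} = -mc^2\left(\frac{1}{2}\right)\left(1 - \frac{v^2}{c^2}\right)^{-1/2}\left(-\frac{2v_i}{c^2}\right) + \frac{q}{c}A_i, \tag{185}$$
or, in other words,
$$\mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} = \mathbf{p} + \frac{q}{c}\mathbf{A}. \tag{186}$$
The Hamiltonian function can be found using the expression $H = \mathbf{v}\cdot\partial L/\partial\mathbf{v} - L$, so that for a particle in a field we have
$$H = \mathbf{v}\cdot\left(\gamma m\mathbf{v} + \frac{q}{c}\mathbf{A}\right) + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi = \gamma mv^2 + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi \tag{187}$$
or simply
$$H = \gamma mc^2 + q\phi, \tag{188}$$
since $\gamma mv^2 + mc^2/\gamma = \gamma mc^2(1/\gamma^2 + v^2/c^2) = \gamma mc^2$. One can now express H as a function of the generalised momentum P and find
$$H = \sqrt{m^2c^4 + c^2\left(\mathbf{P} - \frac{q}{c}\mathbf{A}\right)^2} + q\phi. \tag{189}$$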
4.2 Equations of Motion of a Charge in a Field
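The equations of motion of a particle with small charge q can be found using the Lagrange equations
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) - \frac{\partial L}{\partial\mathbf{r}} = 0, \tag{190}$$
where
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{191}$$
As seen before, we have
$$\frac{\partial L}{\partial\mathbf{v}} = \mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} \tag{192}$$
and furthermore
$$\frac{\partial L}{\partial\mathbf{r}} = \nabla L = \frac{q}{c}\nabla(\mathbf{A}\cdot\mathbf{v}) - q\nabla\phi. \tag{193}$$
Using for A and v the identity $\nabla(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\mathbf{b} + (\mathbf{b}\cdot\nabla)\mathbf{a} + \mathbf{b}\times(\nabla\times\mathbf{a}) + \mathbf{a}\times(\nabla\times\mathbf{b})$, valid for arbitrary vectors a and b, we find
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}\left[(\mathbf{A}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{v})\right] - q\nabla\phi. \tag{194}$$
Now, note that v is held constant in this differentiation, so the terms containing derivatives of v vanish and we simply have
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) - q\nabla\phi. \tag{195}$$
This way, the Lagrange equation in (190) becomes
$$\frac{d}{dt}\left(\mathbf{p} + \frac{q}{c}\mathbf{A}\right) - \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} - \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) + q\nabla\phi = 0. \tag{196}$$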
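The components of the vector potential A are functions of the spatial coordinates and time, so that
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + \frac{\partial\mathbf{A}}{\partial\mathbf{r}}d\mathbf{r}, \tag{197}$$
or
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\mathbf{A}, \tag{198}$$
which gives
$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}. \tag{199}$$
Finally, equation (196) gives us
$$\frac{d\mathbf{p}}{dt} = -\frac{q}{c}\frac{\partial\mathbf{A}}{\partial t} - q\nabla\phi + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}). \tag{200}$$
The derivative of the momentum with respect to time, on the left hand side of the above equation, is the force exerted on the charge in an electromagnetic field. The terms $-(q/c)\,\partial\mathbf{A}/\partial t$ and $-q\nabla\phi$ do not depend on v. We denote this force per unit charge as the electric field intensity E, so that
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi. \tag{201}$$
The term $(q/c)\,\mathbf{v}\times(\nabla\times\mathbf{A})$ depends on the velocity, being proportional and perpendicular to it. The vector that multiplies $\mathbf{v}/c$ in this cross product, per unit charge, is called the magnetic field intensity B, so that
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{202}$$
Using these definitions, we can write the equation of motion of a charge in an electromagnetic field as
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{203}$$
whose right hand side is called the Lorentz force.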
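The Lorentz force in the Gaussian units used here is straightforward to evaluate for given fields. The sketch below uses plain tuples for three-vectors; the helper names `cross` and `lorentz_force` are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, E, v, B, c):
    """dp/dt = q E + (q/c) v x B, equation (203), in Gaussian units."""
    vxB = cross(v, B)
    return tuple(q * Ei + (q / c) * wi for Ei, wi in zip(E, vxB))
```

With E = 0 the returned force is always perpendicular to v, which is why a pure magnetic field does no work on the charge.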
4.3 Gauge Invariance
To one and the same field, there may correspond different potentials. Let us try to add to the components of the potential $A_\mu$ the quantity $-\partial_\mu\chi$, where $\chi = \chi(t, x, y, z)$ is an arbitrary function. This way, our new potential four-vector is
$$A'_\mu = A_\mu - \partial_\mu\chi. \tag{204}$$
With this change, there appears in the action integral the term
$$\frac{q}{c}\frac{\partial\chi}{\partial x^\mu}dx^\mu = d\left(\frac{q}{c}\chi\right), \tag{205}$$
which is a total differential and hence has no effect on the equations of motion. Using the vector and scalar potentials, the transformation in (204) is the same as
$$\mathbf{A}' = \mathbf{A} + \nabla\chi, \qquad \phi' = \phi - \frac{1}{c}\frac{\partial\chi}{\partial t}. \tag{206}$$
Under this transformation the fields $\mathbf{E} = -(1/c)\,\partial\mathbf{A}/\partial t - \nabla\phi$ and $\mathbf{B} = \nabla\times\mathbf{A}$ do not change: B is unchanged because $\nabla\times(\nabla\chi) = 0$ for any well behaved scalar field χ, and in E the extra terms $-(1/c)\,\partial(\nabla\chi)/\partial t$ and $+(1/c)\,\nabla(\partial\chi/\partial t)$ cancel. This way, the potentials are not uniquely defined. The transformation in (206) is called a gauge transformation. As an example, it is always possible to choose the potentials so that the scalar potential φ is zero.
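Gauge invariance of E and B can be illustrated numerically by computing the fields from the potentials with central finite differences, before and after a transformation of the form (206). This is a rough sketch under stated assumptions: potentials are passed as Python functions of (t, r), the step H is fixed by hand, and all names are hypothetical.

```python
import math

H = 1e-4  # finite-difference step (an assumption; adjust for other potentials)


def d_dt(f, t, r):
    """Central difference of a scalar function f(t, r) with respect to t."""
    return (f(t + H, r) - f(t - H, r)) / (2 * H)


def d_dx(f, t, r, i):
    """Central difference of f(t, r) with respect to the i-th coordinate."""
    rp, rm = list(r), list(r)
    rp[i] += H
    rm[i] -= H
    return (f(t, tuple(rp)) - f(t, tuple(rm))) / (2 * H)


def fields(phi, A, t, r, c=1.0):
    """Return (E, B) with E = -(1/c) dA/dt - grad(phi) and B = curl(A)."""
    comp = [lambda t, r, k=k: A(t, r)[k] for k in range(3)]
    E = tuple(-d_dt(comp[k], t, r) / c - d_dx(phi, t, r, k) for k in range(3))
    B = (d_dx(comp[2], t, r, 1) - d_dx(comp[1], t, r, 2),
         d_dx(comp[0], t, r, 2) - d_dx(comp[2], t, r, 0),
         d_dx(comp[1], t, r, 0) - d_dx(comp[0], t, r, 1))
    return E, B


def gauge_transform(phi, A, chi, c=1.0):
    """Return (phi', A') with A' = A + grad(chi), phi' = phi - (1/c) dchi/dt."""
    def phi_new(t, r):
        return phi(t, r) - d_dt(chi, t, r) / c

    def A_new(t, r):
        a = A(t, r)
        return tuple(a[k] + d_dx(chi, t, r, k) for k in range(3))

    return phi_new, A_new
```

Calling `fields` on the original and on the transformed potentials at the same point should give the same E and B up to discretisation error, for any smooth choice of χ.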
4.4 Constant Electromagnetic Field
An electromagnetic field is said to be constant when it does not depend on time. In this case we have $\mathbf{E} = -\nabla\phi$ and $\mathbf{B} = \nabla\times\mathbf{A}$, so that a constant electric field is determined only by φ and a constant magnetic field is determined only by A. Let us now determine the energy of a charge in a constant electromagnetic field. First of all, if the field is constant, then the Lagrangian also does not depend on time explicitly, and in this case the energy is conserved and coincides with the Hamiltonian. For a charge q in an electromagnetic field, we have
$$E = \gamma mc^2 + q\phi, \tag{207}$$
so the presence of the field adds to the energy the term qφ, which is the potential energy of the charge in the field. The magnetic field does not affect the energy of the charge, since the vector potential A does not appear in the above expression for E. If the field intensities are the same at all points in space, then the field is called uniform. If the electric field is uniform, the scalar potential can be expressed as
$$\phi = -\mathbf{E}\cdot\mathbf{r} \tag{208}$$
since constant E implies $\nabla(\mathbf{E}\cdot\mathbf{r}) = (\mathbf{E}\cdot\nabla)\mathbf{r} = \mathbf{E}$. On the other hand, the vector potential of a uniform magnetic field can be expressed as
$$\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}, \tag{209}$$
since constant B implies $\nabla\times(\mathbf{B}\times\mathbf{r}) = \mathbf{B}(\nabla\cdot\mathbf{r}) - (\mathbf{B}\cdot\nabla)\mathbf{r} = 3\mathbf{B} - \mathbf{B} = 2\mathbf{B}$.
4.5 Motions in Constant Uniform Electromagnetic Field
The first kind of motion we consider is the motion in a constant uniform electric field. Suppose there is a charge q in a uniform constant electric field E, whose direction we take along the x-axis, and let the motion take place in the xy-plane. Now, we know that the equation of motion is
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{210}$$
so that in our case, with B = 0, the equation of motion becomes only
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E}, \tag{211}$$
which is a set of two equations, $dp_x/dt = qE$ and $dp_y/dt = 0$. Solving these differential equations we have, respectively,
$$p_x = qEt, \qquad p_y = p_0, \tag{212}$$
where the time reference point has been chosen at the moment when $p_x = 0$, and $p_0$ is the momentum of the particle at that moment. The kinetic energy (the energy without the potential energy in the field), which is given by $E_k = \sqrt{p^2c^2 + m^2c^4}$, will be, in our case,
$$E_k = \sqrt{p_0^2c^2 + (cqEt)^2 + m^2c^4} = \sqrt{E_0^2 + (cqEt)^2}, \tag{213}$$
where
$$E_0 = \sqrt{p_0^2c^2 + m^2c^4} \tag{214}$$
is the energy at time t = 0. Since the velocity of the particle is given by
$$\mathbf{v} = \frac{\mathbf{p}c^2}{E_k}, \tag{215}$$
we have
$$v_x = \frac{dx}{dt} = \frac{p_xc^2}{E_k} = \frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{216}$$
and hence, setting the constant of integration to zero,
$$x = \int\frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}\,dt = \frac{1}{qE}\sqrt{E_0^2 + (cqEt)^2}. \tag{217}$$
On the other hand, we have
$$v_y = \frac{dy}{dt} = \frac{p_yc^2}{E_k} = \frac{p_0c^2}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{218}$$
so that
$$y = \frac{p_0c}{qE}\sinh^{-1}\left(\frac{cqEt}{E_0}\right). \tag{219}$$
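The closed forms (217) and (219) can be compared with a direct numerical integration of the velocities (216) and (218). The sketch below uses a simple midpoint rule; the parameter values in any call and the function names are assumptions chosen for illustration.

```python
import math


def rest_frame_energy(p0, m, c):
    """E_0 = sqrt(p0^2 c^2 + m^2 c^4), equation (214)."""
    return math.sqrt(p0**2 * c**2 + m**2 * c**4)


def velocity(t, q, E, p0, m, c):
    """(v_x, v_y) from equations (216) and (218)."""
    Ek = math.sqrt(rest_frame_energy(p0, m, c)**2 + (c * q * E * t)**2)
    return (q * E * t * c**2 / Ek, p0 * c**2 / Ek)


def position_closed_form(t, q, E, p0, m, c):
    """(x, y) from equations (217) and (219)."""
    E0 = rest_frame_energy(p0, m, c)
    x = math.sqrt(E0**2 + (c * q * E * t)**2) / (q * E)
    y = p0 * c / (q * E) * math.asinh(c * q * E * t / E0)
    return (x, y)


def displacement_numeric(t, q, E, p0, m, c, steps=20000):
    """Midpoint-rule integral of the velocity from 0 to t."""
    dt = t / steps
    x = y = 0.0
    for k in range(steps):
        vx, vy = velocity((k + 0.5) * dt, q, E, p0, m, c)
        x += vx * dt
        y += vy * dt
    return (x, y)
```

Because (217) fixes the constant of integration so that x(0) = E₀/(qE), the numerical displacement should be compared with `position_closed_form(t, ...)` minus its value at t = 0.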
Now, from the above equation we have
$$t = \frac{E_0}{cqE}\sinh\left(\frac{qEy}{p_0c}\right), \tag{220}$$
and substituting this in equation (217), and using $1 + \sinh^2 u = \cosh^2 u$, gives us
$$x = \frac{E_0}{qE}\cosh\left(\frac{qEy}{p_0c}\right), \tag{221}$$
which is a catenary curve. The second kind of motion we shall deal with is the motion in a constant uniform magnetic field. Consider the charge q in a uniform magnetic field B, taken to be in the direction of the z-axis. The equation of motion given by $d\mathbf{p}/dt = q\mathbf{E} + (q/c)\mathbf{v}\times\mathbf{B}$ simply becomes
$$\frac{d\mathbf{p}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}. \tag{222}$$
Since $\mathbf{p} = E\mathbf{v}/c^2$, and the energy E is constant because the magnetic force does no work, the above equation can be written as
$$\frac{E}{c^2}\frac{d\mathbf{v}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{223}$$
which is a set of three equations
$$\frac{dv_x}{dt} = \omega v_y, \qquad \frac{dv_y}{dt} = -\omega v_x, \qquad \frac{dv_z}{dt} = 0, \tag{224}$$
where $\omega = qcB/E$. Multiplying the equation for $v_y$ by i and adding it to the equation for $v_x$ gives us
$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y), \tag{225}$$
which is a first order differential equation whose solution is
$$v_x + iv_y = ae^{-i\omega t}, \tag{226}$$
where a is a complex constant. If we set $a = v_re^{-i\theta}$, with $v_r$ and θ real, then the above equation becomes
$$v_x + iv_y = v_re^{-i(\omega t + \theta)}, \tag{227}$$
which can be easily separated into real and imaginary parts, giving
$$v_x = v_r\cos(\omega t + \theta), \qquad v_y = -v_r\sin(\omega t + \theta). \tag{228}$$
Squaring both equations for $v_x$ and $v_y$ and adding them gives us
$$v_r^2 = v_x^2 + v_y^2, \tag{229}$$
and this means that the speed of the particle in the xy-plane remains constant. Integrating now the equations for $v_x = dx/dt$ and $v_y = dy/dt$, we have
$$x = x_0 + \frac{v_r}{\omega}\sin(\omega t + \theta), \qquad y = y_0 + \frac{v_r}{\omega}\cos(\omega t + \theta). \tag{230}$$
Also, $dv_z/dt = 0$ gives us $v_z = v_{0z}$ = constant and hence $z = z_0 + v_{0z}t$. These three equations for x, y, z combined show us that the charge moves along a helix of radius $v_r/\omega$ having its axis along the direction of the magnetic field B. In particular, if $v_{0z} = 0$ then the charge moves along a circle in the plane perpendicular to the field.
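The helical solution can be checked against the equations of motion (224) with finite differences. This is a minimal sketch; the function names and the finite-difference step used in any check are assumptions.

```python
import math


def cyclotron_frequency(q, B, energy, c):
    """omega = q c B / E, as defined after equation (224)."""
    return q * c * B / energy


def velocity(t, v_r, omega, theta, v0z):
    """(v_x, v_y, v_z) from equation (228), with constant v_z."""
    return (v_r * math.cos(omega * t + theta),
            -v_r * math.sin(omega * t + theta),
            v0z)


def position(t, r0, v_r, omega, theta, v0z):
    """(x, y, z) from equation (230), with z = z0 + v0z t."""
    x0, y0, z0 = r0
    return (x0 + v_r / omega * math.sin(omega * t + theta),
            y0 + v_r / omega * math.cos(omega * t + theta),
            z0 + v0z * t)
```

Differentiating `position` numerically should reproduce `velocity`, and differentiating `velocity` should reproduce the right hand sides of (224); the distance of the particle from the axis through (x₀, y₀) stays equal to v_r/ω.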
4.6 The Electromagnetic Field Tensor
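In four-dimensional notation, the principle of least action states that
$$\delta S = \delta\int_a^b\left(-mc\,ds - \frac{q}{c}A_\mu dx^\mu\right) = 0. \tag{231}$$
Using the fact that $ds = \sqrt{dx_\mu dx^\mu}$, we have then
$$\delta S = -\int_a^b\left(mc\frac{dx_\mu\,d\delta x^\mu}{ds} + \frac{q}{c}A_\mu\,d\delta x^\mu + \frac{q}{c}\delta A_\mu\,dx^\mu\right) = 0. \tag{232}$$
Using now $U_\mu = dx_\mu/ds$ and integrating the first two terms by parts gives us
$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\delta x^\mu\,dA_\mu - \frac{q}{c}\delta A_\mu\,dx^\mu\right) - \left[\left(mcU_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\right]_a^b = 0. \tag{233}$$
Now, note that
$$\left[\left(mcU_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\right]_a^b = 0 \tag{234}$$
since the integral is varied with fixed coordinate values at the limits. Also, we have
$$\delta A_\mu = \frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu \tag{235}$$
and
$$dA_\mu = \frac{\partial A_\mu}{\partial x^\nu}dx^\nu, \tag{236}$$
and hence the expression for δS becomes
$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}dx^\nu\delta x^\mu - \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu dx^\mu\right) = 0. \tag{237}$$
Now, let us use the fact that $dU_\mu = (dU_\mu/ds)\,ds$ and $dx^\mu = U^\mu ds$, and also the fact that summed indices can be renamed (we exchange μ and ν in the last term), so that we can write
$$\delta S = \int_a^b\left[mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu\right]\delta x^\mu\,ds = 0. \tag{238}$$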
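Since $\delta x^\mu$ is arbitrary, the integrand must be equal to zero, so
$$mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu = 0, \tag{239}$$
or
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu. \tag{240}$$
We define then the electromagnetic field tensor $F_{\mu\nu}$ as
$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \tag{241}$$
so that we can write the four-dimensional equation of motion as
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}F_{\mu\nu}U^\nu. \tag{242}$$
Setting μ = i = 1, 2, 3 in the above equation, we have the equation of motion
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{243}$$
while setting μ = 0 gives us the known equation
$$\frac{dE_k}{dt} = q\mathbf{E}\cdot\mathbf{v}. \tag{244}$$
In matrix notation, the electromagnetic field tensor is
$$F_{\mu\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix} \tag{245}$$
in covariant form, and
$$F^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} \tag{246}$$
in contravariant form. Note that all the diagonal components are zero, as expected since $F_{\mu\nu}$ is clearly antisymmetric.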
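The two matrix forms are related by raising both indices with the metric diag(+1, −1, −1, −1), which flips the sign of the components that mix time and space. The following sketch builds both matrices from E and B and checks this relation; the function names are illustrative assumptions.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)


def covariant_field_tensor(E, B):
    """F_{mu nu} as in equation (245)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, -Bz, By],
            [-Ey, Bz, 0.0, -Bx],
            [-Ez, -By, Bx, 0.0]]


def contravariant_field_tensor(E, B):
    """F^{mu nu} as in equation (246)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]


def raise_indices(F_cov):
    """F^{mu nu} = eta^{mu alpha} eta^{nu beta} F_{alpha beta} for a diagonal metric."""
    return [[ETA[m] * ETA[n] * F_cov[m][n] for n in range(4)] for m in range(4)]


def full_contraction(F_cov, F_con):
    """The scalar F_{mu nu} F^{mu nu}."""
    return sum(F_cov[m][n] * F_con[m][n] for m in range(4) for n in range(4))
```

For any E and B, `raise_indices(covariant_field_tensor(E, B))` should coincide with `contravariant_field_tensor(E, B)`, and the full contraction evaluates to 2(B² − E²).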
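One could now note that $F^{\mu\nu}$ transforms in each index as a four-vector. Expressing the components of this tensor in terms of the components of the electric and the magnetic field, the formulas for the transformation to a system moving with velocity v along the x-axis are
$$E'_x = E_x, \qquad E'_y = \gamma\left(E_y - \frac{v}{c}B_z\right), \qquad E'_z = \gamma\left(E_z + \frac{v}{c}B_y\right) \tag{247}$$
and
$$B'_x = B_x, \qquad B'_y = \gamma\left(B_y + \frac{v}{c}E_z\right), \qquad B'_z = \gamma\left(B_z - \frac{v}{c}E_y\right). \tag{248}$$
As a particular case, if $v \ll c$ then
$$E'_x = E_x, \qquad E'_y = E_y - \frac{v}{c}B_z, \qquad E'_z = E_z + \frac{v}{c}B_y \tag{249}$$
and
$$B'_x = B_x, \qquad B'_y = B_y + \frac{v}{c}E_z, \qquad B'_z = B_z - \frac{v}{c}E_y, \tag{250}$$
which can be written as
$$\mathbf{E}' = \mathbf{E} + \frac{1}{c}\mathbf{v}\times\mathbf{B} \tag{251}$$
and
$$\mathbf{B}' = \mathbf{B} + \frac{1}{c}\mathbf{E}\times\mathbf{v}. \tag{252}$$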
4.7 Invariants of the Field
We can form scalars from the electric and magnetic field intensities. The first invariant quantity one can form is
$$F_{\mu\nu}F^{\mu\nu}, \tag{253}$$
which can be easily computed remembering that $F_{\mu\nu} = (\mathbf{E}, \mathbf{B})$ and $F^{\mu\nu} = (-\mathbf{E}, \mathbf{B})$, giving
$$F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2). \tag{254}$$
The second quantity we can form is, up to a numerical factor,
$$\varepsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta} \propto \mathbf{E}\cdot\mathbf{B}. \tag{255}$$
The equation $\mathbf{E}\cdot\mathbf{B}$ = constant means that if the electric and the magnetic fields are perpendicular in one system, that is, $\mathbf{E}\cdot\mathbf{B} = 0$, then they are perpendicular in any other system. The equation $B^2 - E^2$ = constant implies that if E < B (or E > B) in one system, then E < B (or E > B) in any other system. Also, one can note that if $\mathbf{E}\cdot\mathbf{B} = 0$ and $E \neq B$, then we can always find a reference system in which E = 0 or B = 0; in other words, the field is purely magnetic or purely electric.
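Both invariants can be checked by transforming the contravariant field tensor with a Lorentz boost along x, $F'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta F^{\alpha\beta}$, and reading off the new fields. This is a sketch under the conventions of (246); the function names and the use of β = v/c as input are assumptions.

```python
import math


def field_tensor(E, B):
    """Contravariant F^{mu nu} built from E and B as in equation (246)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]


def boost_x(beta):
    """Lorentz matrix for a frame moving with velocity beta * c along x."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def boosted_fields(E, B, beta):
    """Transform F^{mu nu} index by index and extract (E', B')."""
    F, L = field_tensor(E, B), boost_x(beta)
    Fp = [[sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
           for n in range(4)] for m in range(4)]
    E_new = (Fp[1][0], Fp[2][0], Fp[3][0])
    B_new = (Fp[3][2], Fp[1][3], Fp[2][1])
    return E_new, B_new


def invariants(E, B):
    """Return (B^2 - E^2, E . B)."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in B)
    return (b2 - e2, sum(x * y for x, y in zip(E, B)))
```

For any |β| < 1, `invariants` evaluated on the boosted fields should agree with its value on the original fields, while the individual components of E and B mix.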
5 The Electromagnetic Field Equations
5.1 The First Pair of Maxwell’s Equations
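We already know that
$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi, \tag{256}$$
so, using the fact that $\nabla\times(\nabla f) = 0$ for any scalar function f, we have
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial(\nabla\times\mathbf{A})}{\partial t} - \nabla\times(\nabla\phi) = -\frac{1}{c}\frac{\partial(\nabla\times\mathbf{A})}{\partial t} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \tag{257}$$
and now using the fact that $\nabla\cdot(\nabla\times\mathbf{V}) = 0$ for any vector field V gives us
$$\nabla\cdot\mathbf{B} = \nabla\cdot(\nabla\times\mathbf{A}) = 0, \tag{258}$$
and these equations are the first pair of Maxwell's equations, which are homogeneous. In four-dimensional notation, let us first remember that
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{259}$$
so that
$$\partial_\rho F_{\mu\nu} + \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} = \partial_\rho\partial_\mu A_\nu - \partial_\rho\partial_\nu A_\mu + \partial_\mu\partial_\nu A_\rho - \partial_\mu\partial_\rho A_\nu + \partial_\nu\partial_\rho A_\mu - \partial_\nu\partial_\mu A_\rho = 0 \tag{260}$$
since the derivatives commute. The quantity $\partial_\rho F_{\mu\nu} + \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu}$ is antisymmetric in all three indices, so the only components that do not vanish identically are those with μ, ν and ρ all different. They form a set of four equations, which are $\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}$ and $\nabla\cdot\mathbf{B} = 0$.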
5.2 The Four-dimensional Current Vector and Equation of Continuity
The charge density $\rho = \rho(x, y, z, t)$ is defined so that ρ dV is the charge contained in the volume dV, which allows us to treat charge as a quantity continuously distributed in space. Charges, however, are pointlike, so that we can write
$$\rho(\mathbf{r}) = \sum_a q_a\,\delta(\mathbf{r} - \mathbf{r}_a), \tag{261}$$
where $\mathbf{r}_a$ is the radius vector of the charge $q_a$. Multiplying now the quantity dQ = ρ dV by $dx^\mu$ gives us
$$dQ\,dx^\mu = \rho\,dV\,dx^\mu = \rho\,dV\,dt\,\frac{dx^\mu}{dt}. \tag{262}$$
Now, note that $dQ\,dx^\mu$ is a four-vector, since dQ is a scalar, and hence the quantity $\rho\,dV\,dt\,dx^\mu/dt$ must also be a four-vector. Since the quantity dV dt is a scalar, we conclude that the quantity $\rho\,dx^\mu/dt$ is a four-vector. We define this vector as $J^\mu$, which is called the current four-vector or four-current, and hence
$$J^\mu = \rho\frac{dx^\mu}{dt}. \tag{263}$$
We can now evaluate
$$J^0 = \rho\frac{dx^0}{dt} = \rho\frac{d(ct)}{dt} = c\rho \tag{264}$$
and
$$J^i = \rho\frac{dx^i}{dt} = \rho v^i = j^i, \tag{265}$$
where $\mathbf{j} = \rho\mathbf{v}$ is the current density vector. This way, we can write
$$J^\mu = (c\rho, \mathbf{j}). \tag{266}$$
Finally, the total charge, which is equal to $\int\rho\,dV$, can also be written in four-dimensional form as
$$\int\rho\,dV = \frac{1}{c}\int J^0\,dV = \frac{1}{c}\int J^\mu\,dS_\mu, \tag{267}$$
taken over the four-dimensional hyperplane perpendicular to the $x^0$-axis.
Let us now consider the change with time of the total charge, that is, the quantity
$$\frac{\partial}{\partial t}\int\rho\,dV. \tag{268}$$
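First, one should note that the quantity of charge which passes in unit time through the element dS of the surface bounding our volume is given by $\rho\mathbf{v}\cdot d\mathbf{S}$, where v is the velocity of the charge where dS is located. The quantity $\rho\mathbf{v}\cdot d\mathbf{S}$ is positive if charge leaves the volume and negative otherwise. Remembering that $\mathbf{j} = \rho\mathbf{v}$, we can write then
$$\frac{\partial}{\partial t}\int\rho\,dV = -\oint\mathbf{j}\cdot d\mathbf{S}, \tag{269}$$
where the integral on the right hand side extends over the whole boundary of the volume. This is called the equation of continuity, in integral form. Applying now Gauss' theorem on the right hand side gives us
$$\oint\mathbf{j}\cdot d\mathbf{S} = \int(\nabla\cdot\mathbf{j})\,dV, \tag{270}$$
and hence we have
$$\frac{\partial}{\partial t}\int\rho\,dV = -\int(\nabla\cdot\mathbf{j})\,dV \tag{271}$$
or
$$\int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j}\right)dV = 0, \tag{272}$$
which allows us to write
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{273}$$
since the volume of integration is arbitrary. This is the equation of continuity in differential form. Let us write now the above equation in the form
$$\frac{1}{c}\frac{\partial(c\rho)}{\partial t} + \frac{\partial J^1}{\partial x^1} + \frac{\partial J^2}{\partial x^2} + \frac{\partial J^3}{\partial x^3} = 0, \tag{274}$$
and now note that $c\rho = J^0$ and $\partial/\partial(ct) = \partial/\partial x^0$, so that we can write
$$\partial_\mu J^\mu = 0, \tag{275}$$
which is the equation of continuity in four-dimensional form.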
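As a simple illustration of the continuity equation, consider a one-dimensional charge profile that moves rigidly with velocity v, so that j = ρv. The Gaussian profile below is an arbitrary assumption chosen only for the example, as are the function names.

```python
import math


def rho(x, t, v):
    """Charge density of a rigidly moving Gaussian blob (illustrative choice)."""
    return math.exp(-(x - v * t)**2)


def current(x, t, v):
    """Current density j = rho v for the moving blob."""
    return rho(x, t, v) * v


def continuity_residual(x, t, v, h=1e-5):
    """Central-difference estimate of d(rho)/dt + d(j)/dx; should be near zero."""
    drho_dt = (rho(x, t + h, v) - rho(x, t - h, v)) / (2 * h)
    dj_dx = (current(x + h, t, v) - current(x - h, t, v)) / (2 * h)
    return drho_dt + dj_dx
```

Any profile of the form ρ(x − vt) gives a vanishing residual, since the charge is only transported and neither created nor destroyed.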
5.3 The Action Function for the Electromagnetic Field
Considering the electromagnetic field and the particles, the action function consists of three parts,
$$S = S_f + S_m + S_{mf}, \tag{276}$$
where $S_f$ depends on the properties of the field itself, in the absence of charges, $S_m$ depends only on the properties of the particles, and $S_{mf}$ depends on the interaction between the particles and the field. This way, if there are many particles, the total action $S_m$ is the sum of the actions for each free particle:
$$S_m = -\sum mc\int ds. \tag{277}$$
On the other hand, for a system of particles we will also have
$$S_{mf} = -\sum\frac{q}{c}\int A_\mu dx^\mu. \tag{278}$$
Now, we wish to establish the form of the action $S_f$. In order to do that, let us use the fact that the electromagnetic field satisfies the principle of superposition, which asserts that the field produced by a system of charges is the result of a simple composition of the fields produced by each of the particles individually. Linear differential equations have this property: any linear combination of solutions is also a solution. For the field equations to be linear, under the integral sign of $S_f$ there must stand an expression quadratic in the field, and $S_f$ must be the integral of some function of the field tensor $F_{\mu\nu}$. In order to have a scalar as the action, the quantity we look for is $F_{\mu\nu}F^{\mu\nu}$, so that
$$S_f = a\int F_{\mu\nu}F^{\mu\nu}\,dx\,dy\,dz\,dt, \tag{279}$$
where a is a constant which depends on the choice of units. In the Gaussian system of units, we have a = −1/16π. If we define now
$$d\Omega = c\,dt\,dx\,dy\,dz, \tag{280}$$
we have then
$$S_f = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega, \tag{281}$$
or, using the fact that $F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2)$,
$$S_f = \frac{1}{8\pi}\int(E^2 - B^2)\,dV\,dt, \tag{282}$$
and hence the total action for the field and particles is
$$S = S_f + S_m + S_{mf} = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum mc\int ds - \sum\frac{q}{c}\int A_\mu dx^\mu. \tag{283}$$
5.4 The Second Pair of Maxwell’s Equations
In the expression (283) for the action function, we can introduce the current four-vector. In place of the point charges q, let us introduce the continuous distribution of charge with density ρ, and write the interaction term as
$$-\frac{1}{c}\int\rho A_\mu dx^\mu\,dV, \tag{284}$$
replacing the sum by an integral. We can rewrite this as
$$-\frac{1}{c}\int\rho\frac{dx^\mu}{dt}A_\mu\,dV\,dt, \tag{285}$$
or simply
$$-\frac{1}{c^2}\int A_\mu J^\mu\,d\Omega. \tag{286}$$
The action then will be of the form
$$S = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum mc\int ds - \frac{1}{c^2}\int A_\mu J^\mu\,d\Omega. \tag{287}$$
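Now, we vary only the potentials, taking the motion of the charges as given, so that the variation of the term $-\sum mc\int ds$ is zero, and so, using the fact that $F^{\mu\nu}\delta F_{\mu\nu} = F_{\mu\nu}\delta F^{\mu\nu}$, we have
$$\delta S = -\int\left(\frac{1}{8\pi c}F^{\mu\nu}\delta F_{\mu\nu} + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{288}$$
Using now $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ gives us
$$\delta S = -\int\left(\frac{1}{8\pi c}F^{\mu\nu}\partial_\mu\delta A_\nu - \frac{1}{8\pi c}F^{\mu\nu}\partial_\nu\delta A_\mu + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{289}$$
We can now interchange the indices μ and ν in the expression $F^{\mu\nu}\partial_\mu\delta A_\nu$ and then replace $F^{\nu\mu}$ by $-F^{\mu\nu}$, which gives us
$$\delta S = -\int\left(-\frac{1}{4\pi c}F^{\mu\nu}\partial_\nu\delta A_\mu + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{290}$$
Integrating by parts the first term of this integral, remembering that the limits of integration are at infinity, where the field is zero, gives us the expression
$$\delta S = -\int\left(\frac{1}{4\pi c}\partial_\nu F^{\mu\nu} + \frac{1}{c^2}J^\mu\right)\delta A_\mu\,d\Omega = 0, \tag{291}$$
or simply
$$-\frac{1}{c}\int\left(\frac{1}{4\pi}\partial_\nu F^{\mu\nu} + \frac{1}{c}J^\mu\right)\delta A_\mu\,d\Omega = 0. \tag{292}$$
Since $\delta A_\mu$ is arbitrary, its coefficient must be zero, so that we have
$$\partial_\nu F^{\mu\nu} = -\frac{4\pi}{c}J^\mu, \tag{293}$$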
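which is a set of four equations. If we set μ = 1 then we have the equation
$$\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} - \frac{1}{c}\frac{\partial E_x}{\partial t} = \frac{4\pi}{c}j_x, \tag{294}$$
and similarly if μ = 2, 3, so that, in vector form, we have
$$\nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}. \tag{295}$$
On the other hand, if μ = 0, we have
$$\nabla\cdot\mathbf{E} = 4\pi\rho, \tag{296}$$
and these equations are called the second pair of Maxwell's equations, or the inhomogeneous pair. It is easy to obtain the continuity equation from the Maxwell equations. Taking the divergence of equation (295) gives us
$$\nabla\cdot(\nabla\times\mathbf{B}) = \frac{1}{c}\frac{\partial(\nabla\cdot\mathbf{E})}{\partial t} + \frac{4\pi}{c}(\nabla\cdot\mathbf{j}). \tag{297}$$
Since $\nabla\cdot(\nabla\times\mathbf{B}) = 0$ and $\nabla\cdot\mathbf{E} = 4\pi\rho$, we have then
$$0 = \frac{1}{c}\frac{\partial(4\pi\rho)}{\partial t} + \frac{4\pi}{c}(\nabla\cdot\mathbf{j}), \tag{298}$$
or simply
$$\nabla\cdot\mathbf{j} + \frac{\partial\rho}{\partial t} = 0, \tag{299}$$
which is the equation of continuity.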
5.5 Energy Density
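Let us consider the equations
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}, \tag{300}$$
multiply them respectively by B and E, and subtract the second from the first, getting
$$\mathbf{B}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{B}) = -\frac{1}{c}\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t} - \frac{1}{c}\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} - \frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E}, \tag{301}$$
and, using the formula $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$, we can write
$$\nabla\cdot(\mathbf{E}\times\mathbf{B}) = -\frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E} - \frac{1}{2c}\frac{\partial}{\partial t}(E^2 + B^2), \tag{302}$$
or simply
$$\frac{\partial W}{\partial t} = -\mathbf{j}\cdot\mathbf{E} - \nabla\cdot\mathbf{S}, \tag{303}$$
where the Poynting vector S is defined as
$$\mathbf{S} = \frac{c}{4\pi}\mathbf{E}\times\mathbf{B} \tag{304}$$
and the energy density W as
$$W = \frac{E^2 + B^2}{8\pi}, \tag{305}$$
which is the energy per unit volume of the field.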
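For concreteness, the energy density and Poynting vector can be evaluated for given field values in Gaussian units. The sketch below is illustrative only; the function names are assumptions, and the plane-wave remark in the usage note relies on the standard vacuum result that E and B are perpendicular and equal in magnitude.

```python
import math


def energy_density(E, B):
    """W = (E^2 + B^2) / (8 pi), equation (305)."""
    e2 = sum(x * x for x in E)
    b2 = sum(x * x for x in B)
    return (e2 + b2) / (8 * math.pi)


def poynting_vector(E, B, c):
    """S = (c / 4 pi) E x B, equation (304)."""
    k = c / (4 * math.pi)
    return (k * (E[1] * B[2] - E[2] * B[1]),
            k * (E[2] * B[0] - E[0] * B[2]),
            k * (E[0] * B[1] - E[1] * B[0]))
```

For a vacuum plane wave travelling along x, with E along y and B along z of equal magnitude, the result satisfies |S| = cW, meaning that the field energy is transported at the speed of light.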
5.6 Energy-momentum tensor of the Electromagnetic Field
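Consider a system whose action integral is
$$S = \int\Lambda(q, \partial_\mu q)\,dV\,dt = \frac{1}{c}\int\Lambda\,d\Omega, \tag{306}$$
where Λ is a function of the quantities q, which describe the state of the system, and of their first derivatives. The integral $\int\Lambda\,dV$ is the Lagrangian of the system, so that Λ can be interpreted as the Lagrangian density. The equations of motion are obtained by varying S:
$$\delta S = \frac{1}{c}\int\left(\frac{\partial\Lambda}{\partial q}\delta q + \frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta(\partial_\mu q)\right)d\Omega \tag{307}$$
or
$$\delta S = \frac{1}{c}\int\left[\frac{\partial\Lambda}{\partial q}\delta q + \partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta q\right) - \delta q\,\partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)}\right]d\Omega = 0. \tag{308}$$
The term
$$\int\partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta q\right)d\Omega \tag{309}$$
vanishes, since by Gauss' theorem it reduces to an integral over the boundary, where δq = 0. Since δq is arbitrary, the equation of motion is then
$$\frac{\partial\Lambda}{\partial q} - \partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)} = 0. \tag{310}$$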
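Write now
$$\frac{\partial\Lambda}{\partial x^\mu} = \frac{\partial\Lambda}{\partial q}\frac{\partial q}{\partial x^\mu} + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\nu q)}{\partial x^\mu}, \tag{311}$$
and, using the equation of motion and the fact that $\partial_\nu\partial_\mu q = \partial_\mu\partial_\nu q$, we have
$$\frac{\partial\Lambda}{\partial x^\mu} = \frac{\partial}{\partial x^\nu}\left(\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right)\partial_\mu q + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\mu q)}{\partial x^\nu} = \frac{\partial}{\partial x^\nu}\left(\partial_\mu q\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right). \tag{312}$$
Also, we can write
$$\frac{\partial\Lambda}{\partial x^\mu} = \delta^\nu_\mu\frac{\partial\Lambda}{\partial x^\nu}, \tag{313}$$
and hence, if we define
$$T^\nu_\mu = \partial_\mu q\frac{\partial\Lambda}{\partial(\partial_\nu q)} - \delta^\nu_\mu\Lambda, \tag{314}$$
then we can write
$$\frac{\partial T^\nu_\mu}{\partial x^\nu} = 0. \tag{315}$$
We wish to apply now these relations to the electromagnetic field. First of all, remember that for the electromagnetic field we have
$$\Lambda = -\frac{1}{16\pi}F_{\nu\rho}F^{\nu\rho}, \tag{316}$$
so, using relation (314) with the potentials $A_\rho$ playing the role of q, the tensor of the electromagnetic field is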
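$$T^\nu_\mu = \frac{\partial A_\rho}{\partial x^\mu}\frac{\partial\Lambda}{\partial\left(\partial A_\rho/\partial x^\nu\right)} - \delta^\nu_\mu\Lambda, \tag{317}$$
which gives, after finding from the variation δΛ that $\partial\Lambda/\partial(\partial A_\rho/\partial x^\nu) = -F^{\nu\rho}/4\pi$,
$$T^\nu_\mu = -\frac{1}{4\pi}\frac{\partial A_\rho}{\partial x^\mu}F^{\nu\rho} + \frac{1}{16\pi}\delta^\nu_\mu F_{\rho\sigma}F^{\rho\sigma}, \tag{318}$$
or, in contravariant form,
$$T^{\mu\nu} = -\frac{1}{4\pi}\frac{\partial A^\rho}{\partial x_\mu}F^\nu{}_\rho + \frac{1}{16\pi}g^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}, \tag{319}$$
which is not, however, a symmetric tensor. To symmetrise it we add the quantity
$$\frac{1}{4\pi}\frac{\partial A^\mu}{\partial x_\rho}F^\nu{}_\rho = \frac{1}{4\pi}\frac{\partial}{\partial x_\rho}\left(A^\mu F^\nu{}_\rho\right), \tag{320}$$
where the equality holds in the absence of charges, when $\partial_\rho F^{\nu\rho} = 0$. Since $\partial A^\rho/\partial x_\mu - \partial A^\mu/\partial x_\rho = F^{\mu\rho}$, we have the final expression for the energy-momentum tensor of the electromagnetic field,
$$T^{\mu\nu} = \frac{1}{4\pi}\left(-F^{\mu\rho}F^\nu{}_\rho + \frac{1}{4}g^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}\right), \tag{321}$$
which is symmetric and whose trace vanishes, $T^\mu{}_\mu = 0$.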
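The symmetric energy-momentum tensor can be evaluated directly from E and B and compared with the energy density and Poynting vector of Section 5.5. This is a sketch assuming the contravariant field tensor (246) and the metric diag(+1, −1, −1, −1); the function names are hypothetical.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)


def field_tensor(E, B):
    """Contravariant F^{mu nu} built from E and B as in equation (246)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]


def energy_momentum_tensor(E, B):
    """T^{mu nu} = (1/4pi)(-F^{mu rho} F^{nu}_{rho} + (1/4) g^{mu nu} F F)."""
    F = field_tensor(E, B)
    # Scalar F_{rho sigma} F^{rho sigma}, lowering indices with the diagonal metric.
    scalar = sum(ETA[r] * ETA[s] * F[r][s]**2 for r in range(4) for s in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            # F^{m rho} F^{n}{}_{rho}: the second index of the second factor is lowered.
            contraction = sum(ETA[r] * F[m][r] * F[n][r] for r in range(4))
            g = ETA[m] if m == n else 0.0
            T[m][n] = (-contraction + 0.25 * g * scalar) / (4 * math.pi)
    return T


def trace(T):
    """T^mu_mu = g_{mu nu} T^{mu nu} for the diagonal metric."""
    return sum(ETA[m] * T[m][m] for m in range(4))
```

With these conventions, T⁰⁰ reproduces the energy density (E² + B²)/8π and T⁰ⁱ reproduces the components of E × B/4π, that is, the Poynting vector divided by c.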