
Course : B.Sc. (Hons.) Physics

Sem-VI, March-2020

Paper Code/Name: 32227626/Classical Dynamics

Unit 3: Relativity

Pranav Kumar

Kindly read the following sections from the chapter attached and do examples therein.

1. Lectures for the dates 16.03.2020 to 23.03.2020

(a) Section 11.1, 11.1.1, 11.2, 11.4, 11.5 from reference [2] .

(b) Section 13.3, 13.4, subsection 13.4.1, 13.4.2, 13.4.3 from reference [1]

2. Lectures for the dates 24.03.2020 to 31.03.2020

(a) Section 11.6,11.7,11.9 from reference [2].

(b) Section 13.6 (consider t as a zeroth coordinate only, i.e. x₀ = ct), subsection 13.6.1, 13.6.2 from reference [1]

Kindly read the text and communicate to me if you have any problem in understanding the topic(s). I would suggest all of you to make a group on the “ZOOM Cloud Meeting” app and discuss together. I will be available on ZOOM as per the timetable.

Or you can share common problems, and I shall try to present the topics in further detail. Thank you for the cooperation.

References

[1] Tai L. Chow, Classical Mechanics, CRC Press (2013). Page No. 2−49

[2] David Morin, Introduction to Classical Mechanics. Page No. 50−79


13 Theory of Special Relativity

We now examine the modification of the structure of classical mechanics brought about by the special theory of relativity. We do not intend to present a comprehensive discussion of the special theory of relativity; only its essential parts are outlined in the following section. Our main interest is to see how to incorporate the special theory of relativity into the framework of classical mechanics.

13.1 HISTORICAL ORIGIN OF SPECIAL THEORY OF RELATIVITY

Before Einstein, the concepts of space and time were those described by Galileo and Newton. In any unaccelerated frame of reference, called an inertial reference frame, Newton’s laws of motion are valid, especially the first law, which states that free objects maintain a state of uniform motion. Time was assumed to have an absolute or universal nature in the sense that any two inertial observers who have synchronized their clocks will always agree on the time of any event. An event is any happening that can be given space and time coordinates.

The Galilean transformation asserts that any one inertial frame is as good as any other for describing the laws of classical mechanics. However, physicists of the 19th century were not able to grant the same freedom to electromagnetic theory, which did not seem to be Galilean invariant. It is worthwhile to spend some time examining this inconsistency of electromagnetism and Galilean relativity. Classical electromagnetism is summarized in Maxwell’s four differential equations, which have the form (in Gaussian units in which the electric and magnetic field vectors have the same dimensions)

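\[
\nabla \times \vec{E} = -\frac{1}{c}\frac{\partial \vec{B}}{\partial t}, \qquad
\nabla \cdot \vec{B} = 0,
\]
\[
\nabla \times \vec{H} = \frac{1}{c}\frac{\partial \vec{D}}{\partial t} + \frac{4\pi}{c}\,\vec{j}, \qquad
\nabla \cdot \vec{D} = 4\pi\rho,
\]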

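where c is the ratio of electromagnetic and electrostatic units of charge. If we consider only empty space, we have \(\vec{D} = \vec{E}\), \(\vec{B} = \vec{H}\), and in the absence of charges and currents, we find

\[
\nabla \times \vec{E} = -\frac{1}{c}\frac{\partial \vec{H}}{\partial t}, \qquad
\nabla \cdot \vec{H} = 0, \qquad
\nabla \times \vec{H} = \frac{1}{c}\frac{\partial \vec{E}}{\partial t}, \qquad
\nabla \cdot \vec{E} = 0.
\tag{13.1}
\]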

Hence,

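\[
\nabla \times (\nabla \times \vec{H}) = \nabla(\nabla \cdot \vec{H}) - \nabla^2 \vec{H}
= \frac{1}{c}\frac{\partial}{\partial t}(\nabla \times \vec{E})
= -\frac{1}{c^2}\frac{\partial^2 \vec{H}}{\partial t^2}
\]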

so that the auxiliary magnetic vector \(\vec{H}\) satisfies the wave equation

\[
\nabla^2 \vec{H} = \frac{1}{c^2}\frac{\partial^2 \vec{H}}{\partial t^2}
\tag{13.2}
\]

with a similar equation for \(\vec{E}\). It follows that disturbances in the fields propagate with velocity c. This suggests the identification of light and electromagnetic radiation, and such identification gives a very satisfactory explanation of optical phenomena. But the wave equation, Equation 13.2, contains no reference to the velocity of the source of the light, and this naturally suggests that the velocity of the light must be independent of the velocity of its source. This is in agreement with observations in astronomy. For example, there exist certain binary stars consisting of two stars moving in orbits about their common center of mass. At one point in the orbit, one star will be traveling toward the Earth, and the other, directly away. If the center of mass is at a distance d from the Earth, the light will reach the Earth at a time of order d/c after it has been emitted, where c is the velocity of light. We will show, in a moment, that for any small increase v in c, we have a change in the time of arrival by an amount given by

Δt = −dv/c².

This change would produce apparent irregularities in the motion of such stars. No such irregulari-ties have been observed, and we are, therefore, to conclude that the velocity of light is independent of the velocity of the source.

To derive the simple result of change in arrival time, let us assume, for the sake of simplicity, that there are two stars, A and B, of equal mass moving on opposite sides of a circular orbit around their common center of mass C. Let us consider an observer P, at a very great distance d, from the center of the orbit of the stars, so that the angle θ subtended by their orbits is always very small (Figure 13.1). We consider only the light rays that are emitted in such a way as eventually to reach the point P (the speed of light always being c relative to the emitting star). Let us begin with those rays that reach P after being emitted at the time t1 when both stars are in the line of sight, that is, at A and B. The time taken for light from the nearer star, A, to reach P will be

\[
t_{A1} = \frac{d - a}{c}
\]

where a is the radius of the orbit, and that for light from the farther star, B, will be

\[
t_{B1} = \frac{d + a}{c}.
\]

At the time t2 when the two stars are at A′ and B′ (the diameter of the orbit joining them is perpendicular to the line of sight), the light from A′, which is receding from P, will have a velocity of c − v, and that from B′, which is approaching P, will have a velocity of c + v (where v is the orbital speed of the stars). The times taken for this light to reach P will be

FIGURE 13.1 Motion of a binary star supports the fact that the velocity of light is independent of the velocity of the source. (Labels: stars A and B, center of mass C, observer P at distance d, subtended angle θ.)


\[
t_{A2} = \frac{d}{c - v}, \qquad t_{B2} = \frac{d}{c + v}.
\]

We now compute the time difference for the same star at two different positions, say, star A.

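\[
\Delta t = t_{A2} - t_{A1} = \frac{d}{c - v} - \frac{d - a}{c}
\cong \frac{a}{c} + \frac{d}{c}\,\frac{v}{c} \approx \frac{dv}{c^2}
\]

because d is of astronomical order, and the radius of the orbit a is negligibly small in comparison with d. Star A is receding at the second position, so the speed of its light is lowered by v and its arrival is delayed. For the approaching star, the speed is raised by v, and the same steps give Δt ≈ −dv/c², which is the result quoted earlier.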
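As a rough numerical check of this estimate, the sketch below compares the exact difference of travel times with the leading-order term dv/c². The distance, orbit radius, and orbital speed are hypothetical values chosen only for illustration; they do not come from the text.

```python
# Illustrative check of the arrival-time shift for star A.
# All numerical inputs below are assumptions, not data from the text.
C = 2.998e8  # speed of light, m/s


def arrival_time_shift(d, a, v, c=C):
    """Exact t_A2 - t_A1 if the light speed depended on the source velocity."""
    t_a1 = (d - a) / c   # star A on the line of sight, nearer to P
    t_a2 = d / (c - v)   # star A receding, light assumed to travel at c - v
    return t_a2 - t_a1


def arrival_time_shift_approx(d, v, c=C):
    """Leading-order magnitude d*v/c**2, valid for v << c and a << d."""
    return d * v / c**2


LIGHT_YEAR = 9.461e15        # metres
d = 100 * LIGHT_YEAR         # hypothetical distance to the binary
a = 1.5e11                   # hypothetical orbit radius, m
v = 3.0e4                    # hypothetical orbital speed, m/s

exact = arrival_time_shift(d, a, v)
approx = arrival_time_shift_approx(d, v)
print(f"exact shift  : {exact:.4e} s")
print(f"approx shift : {approx:.4e} s")
```

With these assumed numbers the shift is of the order of days, which is why a source-dependent light speed would visibly distort the apparent orbit.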

Let us return to the finding that the velocity of light is independent of the velocity of the source. This independence poses the problem of the frame of reference with respect to which c is to be measured. In the theory of sound, a similar problem arises, but there, it is easily resolved because the speed is to be measured relative to the still air. In the 19th century, it seemed reasonable to give a similar answer in the case of light. Because of the works of Young and Fresnel, light was viewed as a mechanical wave, so its propagation required a physical medium that was called the ether. Because light can travel through space, it was assumed that ether fills all of space, and the velocity of light must be measured with respect to the stationary unobserved ether. This would have the great advantage of linking the hitherto separated theories of mechanics and electromagnetism. A difficulty, at once, arises. As mechanics holds in every one of the inertial frames, the Maxwell equations should then hold in every one of the inertial frames. It is easy to see that this cannot be so. Let us apply the Galilean transformation

x′ = x − vt, y′ = y, z′ = z, t′ = t

to Maxwell’s equations. For example, the first component of the equation

\[
\nabla \times \vec{H} = \frac{1}{c}\frac{\partial \vec{E}}{\partial t}
\]

becomes

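\[
\frac{\partial H'_z}{\partial y'} - \frac{\partial H'_y}{\partial z'}
= \frac{1}{c}\left(\frac{\partial E'_x}{\partial t'} - v\,\frac{\partial E'_x}{\partial x'}\right).
\tag{13.3}
\]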

However, there is no other equation in the set linking ∂E′x/∂t′ and ∂E′x/∂x′, so it is impossible to transform the right-hand side, by transformation of the field vectors, in such a way that the transformed equation would read

\[
\frac{\partial H'_z}{\partial y'} - \frac{\partial H'_y}{\partial z'}
= \frac{1}{c}\frac{\partial E'_x}{\partial t'}.
\]

Alternatively, we can use a much simpler argument. Consider that a source at rest in an inertial frame S emits a light wave, which travels out as a spherical wave at a constant speed. But, observed in a frame S′ that moves uniformly with respect to S, the light wave is no longer spherical, and the speed of light is also different.
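The simpler argument can be made concrete with a few lines of arithmetic. The sketch below applies the Galilean transformation to two events on a light pulse moving along +x in S and computes the pulse speed seen in S′. The frame speed is a hypothetical value chosen for illustration.

```python
def galilean(x, t, v):
    """Galilean transformation along x: returns (x', t')."""
    return x - v * t, t


C = 2.998e8   # speed of light in S, m/s
v = 3.0e4     # hypothetical speed of S' relative to S, m/s

# Two events on the wavefront of a pulse emitted at x = 0, t = 0 in S.
t_a, t_b = 1.0, 2.0
xa_p, ta_p = galilean(C * t_a, t_a, v)
xb_p, tb_p = galilean(C * t_b, t_b, v)

# Speed of the same pulse as measured in S'.
speed_in_s_prime = (xb_p - xa_p) / (tb_p - ta_p)
print(speed_in_s_prime, C - v)
```

The two printed numbers agree: under the Galilean rule the pulse moves at c − v along +x in S′ (and at c + v along −x), so the wavefront is not a sphere centered on the origin of S′.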

Therefore, for electromagnetic phenomena, inertial frames of reference are not equivalent under the Galilean transformation. A number of attempts were proposed to resolve this conflict. These include the following:


1. The Maxwell equations are wrong and need to be modified to obey the Galilean transformation.

2. There is a preferred frame of reference: that of stationary unobserved ether. The Maxwell equations require modification in other inertial frames of reference.

3. The Maxwell equations are correct and have the same form in all inertial reference frames. There is some transformation other than the Galilean transformation that makes both electromagnetic and mechanical equations transform in an invariant way.

As we know now, the third proposal is the correct one, but it was not accepted without resistance. The first attempt was abandoned rather quickly. When it was tried, the new terms that make the Maxwell equations Galilean invariant led to predictions of new phenomena that did not exist when tested experimentally. So the attempt was abandoned. The second attempt was ruled out only after extensive experiments. In the following section, we will review a critical experiment that eventually led us to give up the hypothetical ether, the Michelson–Morley experiment.

Before we proceed to describe the Michelson–Morley experiment, let us make a minor observation: The ether has to be assumed to have contradictory mechanical properties; it is the softest and also the hardest substance. It must be the softest because all material bodies can pass through it without any resistance from the ether. Otherwise, for example, the Earth would have slowed down and fallen into the sun during the billions of years of its traveling around the sun. On the other hand, ether must be harder than any material because light (ether vibration) travels with such a high speed that its elastic constant must be the highest of all known materials. Surprisingly, such contradictions did not prevent physicists of the 19th century from clinging to their belief in the hypothetical ether.

13.2 MICHELSON–MORLEY EXPERIMENT

The existence of ether and the law of velocity addition (according to the Galilean transformation) suggest that it should be possible to detect some variation of the speed of light as emitted by some terrestrial source. As the Earth travels through space at 30 km/s in an almost circular orbit around the sun, it is bound to have some relative velocity with respect to the ether. If this relative velocity is added to that of the light emitted from the source, then light emitted simultaneously in two perpendicular directions should be traveling at different speeds, corresponding to the two relative velocities of the light with respect to the ether.

In one of the most famous and important experiments in physics, Michelson set out, in 1887, to detect this variation in the velocity of the propagation of light. Michelson’s ingenious way of doing this depends on the phenomenon of interference of light to determine whether the time taken for light to pass over two equal paths at right angles was different or not. He designed and constructed an interferometer, schematically shown in Figure 13.2. The interferometer is essentially composed of a light source S, a half-silvered glass plate A, and two mirrors B and C, all mounted on a rigid base. The two mirrors B and C are placed at equal distances L from the plate A. Light from S enters A and splits into two beams. One goes to mirror B, which reflects it back; the other beam goes to mirror C, also to be reflected back. On arriving back at A, the two reflected beams are recombined as two superposed beams, D and F, as indicated. If the time taken for light from A to B and back equals the time from A to C and back, the two beams D and F will be in phase and will reinforce each other. If the two times differ slightly, the two beams will be slightly out of phase, and they produce an interference pattern. A typical interference pattern is sketched in Figure 13.3.

We now calculate the two times to see whether they are the same or not. First, calculate the time required for the light to go from A to B and back. If the line AB is parallel to the Earth’s motion in its orbit, and if the Earth is moving at a speed u and the speed of light in the ether is c, the time is

\[
t_1 = \frac{L_{AB}}{c - u} + \frac{L_{AB}}{c + u}
= \frac{2L_{AB}}{c\,[1 - (u/c)^2]}
\approx \frac{2L_{AB}}{c}\left(1 + \frac{u^2}{c^2}\right)
\tag{13.4}
\]


where (c − u) is the upstream speed of light with respect to the apparatus, and (c + u) is the downstream speed.

Our next calculation is of the time t2 for the light to go from A to C. We note that while light goes from A to C, the mirror C moves to the right relative to the ether through a distance d = ut2 to the position C′; at the same time, the light travels a distance ct2 along AC′. For this right triangle, we have

\[
(ct_2)^2 = L_{AC}^2 + (ut_2)^2
\]

FIGURE 13.2 Schematic diagram of the Michelson–Morley experiment. (Labels: light source, collimating lens, half-silvered mirror A, mirrors B and C at distance L from A, displaced mirror position C′ at distance d from C, recombined beams D and F, telescope.)

FIGURE 13.3 Sketch of a typical interference pattern.


from which we obtain

\[
t_2 = \frac{L_{AC}}{\sqrt{c^2 - u^2}}.
\]

Similarly, as the light is returning to the half-silvered plate, the plate moves to the right to the position B′. The total path length for the return trip is the same, as can be seen from the symmetry of Figure 13.2. Therefore, the return time is also the same, and the total time for light to go from A to C and back is then 2t2, which we denote by t3:

\[
t_3 = \frac{2L_{AC}}{\sqrt{c^2 - u^2}}
= \frac{2L_{AC}}{c\,(1 - u^2/c^2)^{1/2}}
\approx \frac{2L_{AC}}{c}\left(1 + \frac{u^2}{2c^2}\right).
\tag{13.5}
\]

In Equations 13.4 and 13.5, the first factors are the same and represent the time that would be taken if the apparatus were at rest relative to the ether. The second factors represent the modifications in the times caused by the motion of the apparatus. Now, the time difference Δt is

\[
\Delta t = t_3 - t_1 = \frac{2(L_{AC} - L_{AB})}{c} + \frac{L_{AC}}{c}\,\beta^2 - \frac{2L_{AB}}{c}\,\beta^2
\tag{13.6}
\]

where β = u/c.

It is most likely that we cannot make LAB = LAC = L exactly. In that case, we can rotate the apparatus by 90°, so that AC is in the line of motion, and AB is perpendicular to the motion. Any small difference in length becomes unimportant. Now, we have

\[
\Delta t' = t_3' - t_1' = \frac{2(L_{AC} - L_{AB})}{c} + \frac{2L_{AC}}{c}\,\beta^2 - \frac{L_{AB}}{c}\,\beta^2.
\tag{13.7}
\]

Thus,

\[
\Delta t' - \Delta t = \frac{(L_{AB} + L_{AC})}{c}\,\beta^2.
\tag{13.8}
\]

This difference yields a shift in the interference pattern across the crosshairs of the viewing telescope. If the optical path difference between the beams changes by one λ (wavelength), for example, there will be a shift of one fringe. If δ represents the number of fringes moving past the crosshairs as the pattern shifts, then

\[
\delta = \frac{c\,(\Delta t' - \Delta t)}{\lambda}
= \frac{(L_{AB} + L_{AC})\,\beta^2}{\lambda}
= \frac{\beta^2}{\lambda/(L_{AB} + L_{AC})}.
\tag{13.9}
\]

In the Michelson–Morley experiment of 1887, the effective length L was 11 m; sodium light of λ = 5.9 × 10⁻⁵ cm was used. The orbital speed of the Earth is 3 × 10⁴ m/s, so β = 10⁻⁴. From Equation 13.9, the expected shift would be about 4/10 of a fringe:

\[
\delta = \frac{22\ \text{m} \times (10^{-4})^2}{5.9 \times 10^{-7}\ \text{m}} = 0.37.
\tag{13.10}
\]
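The estimate can be reproduced without the small-β expansion. The sketch below evaluates the exact ether-theory travel times of Equations 13.4 and 13.5 before and after the 90° rotation and compares the resulting fringe shift with Equation 13.9. The function names are illustrative, and c is rounded to 3 × 10⁸ m/s so that β = 10⁻⁴ as in the text.

```python
import math


def round_trip_parallel(L, u, c):
    """Round-trip time along the arm parallel to the motion (exact form of Eq. 13.4)."""
    return L / (c - u) + L / (c + u)


def round_trip_perpendicular(L, u, c):
    """Round-trip time along the arm perpendicular to the motion (exact form of Eq. 13.5)."""
    return 2 * L / math.sqrt(c**2 - u**2)


def fringe_shift_exact(L_ab, L_ac, u, c, wavelength):
    """Fringes crossing the crosshairs when the apparatus is rotated by 90 degrees."""
    # Before rotation: AB parallel to the motion, AC perpendicular (t3 - t1).
    dt = round_trip_perpendicular(L_ac, u, c) - round_trip_parallel(L_ab, u, c)
    # After rotation: AC parallel to the motion, AB perpendicular (t3' - t1').
    dt_rot = round_trip_parallel(L_ac, u, c) - round_trip_perpendicular(L_ab, u, c)
    return c * (dt_rot - dt) / wavelength


def fringe_shift_approx(L_ab, L_ac, u, c, wavelength):
    """Leading-order result, Eq. 13.9."""
    return (L_ab + L_ac) * (u / c) ** 2 / wavelength


c = 3.0e8             # m/s, rounded so that beta = 1e-4
u = 3.0e4             # orbital speed of the Earth, m/s
L = 11.0              # effective arm length, m
wavelength = 5.9e-7   # sodium light, m

delta_exact = fringe_shift_exact(L, L, u, c, wavelength)
delta_approx = fringe_shift_approx(L, L, u, c, wavelength)
print(f"exact: {delta_exact:.3f}   approx: {delta_approx:.3f}")
```

Both values round to 0.37, so the second-order expansion is more than adequate at β = 10⁻⁴.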


A shift of 0.005 fringe can be detected by the Michelson–Morley interferometer. However, no fringe shift in the interference pattern was observed. Thus, no effect at all resulting from the Earth’s motion through the ether was found. This null result was very puzzling and most disturbing at the time. How could it be? It was suggested, including by Michelson, that the ether might be dragged along by the Earth, eliminating or reducing the ether wind in the laboratory. This is hard to square with the picture of the ether as an all-pervasive, frictionless medium. The ether’s status as an absolute reference frame was also gone forever. Many attempts to save the ether failed (Resnick and Halliday 1985); we just mention one here, namely, the contraction hypothesis.

George Francis FitzGerald pointed out in 1889 that a contraction of bodies along the direction of their motion through the ether by a factor \((1 - u^2/c^2)^{1/2}\) would give the null result. Because Equation 13.4 must then be multiplied by the contraction factor \((1 - u^2/c^2)^{1/2}\), Δt in Equation 13.6 reduces to zero for equal arms. The magnitude of this time difference is completely unaffected by rotation of the apparatus through 90°.
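A minimal numerical sketch of the contraction hypothesis follows. It shortens only the arm parallel to the motion by the factor (1 − u²/c²)^{1/2} and checks that the two round-trip times then coincide for arms of equal rest length. The function names and sample speeds are illustrative assumptions.

```python
import math


def round_trip_parallel_contracted(L, u, c):
    """Round trip along the parallel arm after contracting it by sqrt(1 - u^2/c^2)."""
    L_contracted = L * math.sqrt(1 - (u / c) ** 2)
    return L_contracted / (c - u) + L_contracted / (c + u)


def round_trip_perpendicular(L, u, c):
    """Round trip along the perpendicular arm (no contraction), exact form of Eq. 13.5."""
    return 2 * L / math.sqrt(c**2 - u**2)


c = 3.0e8
L = 11.0
for u in (3.0e4, 1.5e8):  # the Earth's orbital speed and a hypothetical u = c/2
    t_par = round_trip_parallel_contracted(L, u, c)
    t_perp = round_trip_perpendicular(L, u, c)
    print(u, math.isclose(t_par, t_perp, rel_tol=1e-12))
```

The equality holds at any u < c, not only to second order in β, because 2L(1 − β²)^{1/2}/[c(1 − β²)] is identically 2L/[c(1 − β²)^{1/2}].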

Lorentz obtained a contraction of this sort in his theory of electrons. He found that the field equations of electron theory remain unchanged if a contraction by the factor \((1 - u^2/c^2)^{1/2}\) takes place, provided also that a new measure of time is used in a uniformly moving system. The outcome of the Lorentz theory is that an observer will observe the same phenomena, no matter whether he or she is at rest in the ether or moving through it with uniform velocity. Thus, different observers are equally unable to tell whether they are at rest or moving in the ether. This means that for optical phenomena, just as for mechanics, ether is unobservable.

Poincaré offered another line of approach to the problem. He suggested that the result of the Michelson–Morley experiment was a manifestation of a general principle that absolute motion cannot be detected by laboratory experiments of any kind, and the laws of nature must be the same in all inertial frames.

13.3 POSTULATES OF SPECIAL THEORY OF RELATIVITY

Einstein realized the full implications of the Michelson–Morley experiment, the Lorentz theory, and Poincaré’s principle of relativity. Instead of trying to patch up the accumulating difficulties and contradictions connected with the notion of ether, Einstein rejected the ether idea as unnecessary or unsuitable for the description of the physical world and returned to the pre-ether idea of a completely empty space. Along with the exit of ether from the stage of physics, the notion of absolute motion through space is also gone. The Michelson–Morley experiment proves unequivocally that no such special frame of reference exists. All frames of reference in uniform relative motion are equivalent for mechanical motions and also for electromagnetic phenomena. Einstein extended this as a fundamental postulate, now known as the principle of relativity. Furthermore, he argued that the velocity of light predicted by electromagnetic theory must be a universal constant, the same for all observers. He took an epoch-making step in 1905 and developed the special theory of relativity from these two basic postulates (assumptions), which are rephrased as follows:

1. The principle of relativity: the laws of physics are the same in all inertial frames. No preferred inertial frame exists.

Two people standing in the aisle of an airplane going 700 km/h can play catch exactly as they would on the ground. On that airplane, if you drop a heavy and a light ball together, they fall at the same rate as they would if you dropped them on the ground, and they hit the cabin floor at the same time. When you are moving uniformly, you do not experience the physical sensation of motion. Experiments indicate that the principle of relativity also applies to electromagnetism; it is very general. That radios and tape recorders work the same on an airplane as they do in the house is a simple example.

2. The principle of the constancy of the velocity of light: the velocity of light in empty space is the same in all inertial frames and is independent of the motion of the emitting body.


According to Einstein, sometime in 1896, after he entered the Zurich Polytechnic Institute to begin his education as a physicist, he asked himself the question of what would happen if he could catch up to a light ray, that is, move at the speed of light. Maxwell’s theory says that light is a wave of electric and magnetic fields that moves like a water wave through space. But if you could catch up to one of Maxwell’s light waves the way a surfboard rider catches an ocean wave for a ride, then the light wave would not be moving relative to you, but instead would be standing still. The light wave would then be a standing wave of electric and magnetic fields that is not allowed if Maxwell’s theory is right. So, he reasoned, there must be something wrong with the assumption that you can catch a light wave as you can catch a water wave. This idea was a seed from which the fundamental postulate of the constancy of the speed of light and the special theory of relativity grew nine years later.

All the seemingly very strange results of special relativity came from the special nature of the speed of light. Once we understand this, everything else in relativity makes sense. So we take a brief look at this special nature of the speed of light. The speed of light is very great, 3 × 10⁸ m/s. However, the bizarre fact of the speed of light is that it is independent of the motion of the observer or the source emitting the light. Michelson hoped to determine the absolute speed of the Earth through ether by measuring the difference in times required for light to travel across equal distances that are at right angles to each other. What did he observe? No difference in travel times for the two perpendicular light beams. It was as if the Earth were absolutely stationary. The conclusion is that the speed of light does not depend on the motion of the observer. We saw earlier that the speed of light is also independent of the speed of the source. The special nature of the speed of light is not something we would expect from common sense. The same common sense once objected to the idea that the Earth is round. Hence, common sense is not always right.

13.3.1 TIME IS NOT ABSOLUTE

The constancy of the velocity of light puts an end to the notion of absolute time. We know that Newtonian mechanics abolished the notion of absolute space but not of absolute time. Now, time is also not absolute anymore. Because all inertial observers must agree on how fast light travels but not how far light travels, space is not absolute. Now, time taken is the distance light has traveled divided by the speed of light; thus, they must disagree over the time that the journey took. And time lost its universal nature. In fact, we shall see later that moving clocks run slow. This is known as time dilation.

13.4 LORENTZ TRANSFORMATIONS

Because the Galilean transformations are inconsistent with Einstein’s postulate of the constancy of the speed of light, we must modify them in such a way that the new transformation will incorporate Einstein’s two postulates and make both mechanical and electromagnetic equations transform in an invariant way.

To this end, we consider two inertial frames S and S′. For simplicity, let the corresponding axes of the two frames be parallel with frame S′ moving at a constant velocity u relative to S along the x-axis. The apparatuses used to measure distances and times in the two frames are assumed identical, and both clocks are adjusted to read zero at the moment the two origins coincide. Figure 13.4 represents the viewpoint of observers in S.

Suppose that an event occurred in frame S at the coordinates (x, y, z, t) that is observed at (x′, y′, z′, t′) in frame S′. Because of the homogeneity of space and time, we expect the transformation relationships between the coordinates (x, y, z, t) and (x′, y′, z′, t′) to be linear; otherwise, there would not be a simple one-to-one relationship between events in S and S′ frames. For instance, a nonlinear transformation would predict acceleration in one system even if the velocity were constant in the other, obviously an unacceptable property for a transformation between inertial systems.

We consider the transverse dimensions first. Because the relative motion of the coordinate systems occurs only along the x-axis, we expect the linear relationships to be of the forms y′ = k1y and z′ = k2z. The symmetry requires that y = k1y′ and z = k2z′. These can both be true only if k1 = 1 and k2 = 1. Therefore, for the transverse direction, we have

y′ = y and z′ = z. (13.11)

These relationships are the same as in Galilean transformations.

For the longitudinal dimension, we expect that the relationship between x and x′ must involve some change in the time coordinate, so we consider the most general linear relationship

x′ = ax + bt. (13.12)

Now the origin O′, where x′ = 0, corresponds to x = ut. Substituting these into Equation 13.12, we have

0 = aut + bt

from which we obtain

b = –au

and Equation 13.12 simplifies to

x′ = a(x − ut). (13.13)

By symmetry, we also have

x = a(x′+ ut′). (13.14)

Now we apply Einstein’s second postulate of the constancy of the speed of light. If a pulse of light is sent out from the origin O of frame S at t = 0, its position along the x-axis later is given by x = ct, and its position along the x′-axis is x′ = ct′. Putting these in Equations 13.13 and 13.14, we obtain

ct′ = a(c − u)t and ct = a(c + u)t′.

FIGURE 13.4 Relative motion of two coordinate systems. (Labels: frames S and S′ with origins O and O′, axes x, y, z and x′, y′, z′, and times t and t′.)


From these, we obtain

\[
\frac{t}{t'} = \frac{c}{a(c - u)} \quad\text{and}\quad \frac{t}{t'} = \frac{a(c + u)}{c}.
\]

Therefore,

\[
\frac{c}{a(c - u)} = \frac{a(c + u)}{c}.
\]

Solving for a,

\[
a = \frac{1}{\sqrt{1 - (u/c)^2}}.
\]
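The algebra behind this step can be written out as follows. Taking the positive root is an assumption made here so that x′ reduces to x when u → 0.

```latex
\begin{aligned}
\frac{c}{a(c-u)} = \frac{a(c+u)}{c}
&\;\Longrightarrow\; c^2 = a^2 (c-u)(c+u) = a^2\,(c^2 - u^2) \\
&\;\Longrightarrow\; a^2 = \frac{c^2}{c^2 - u^2} = \frac{1}{1 - u^2/c^2},
\qquad a = \frac{1}{\sqrt{1 - u^2/c^2}} .
\end{aligned}
```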

Then,

\[
b = -au = -\frac{u}{\sqrt{1 - (u/c)^2}}.
\]

Substituting these into Equations 13.13 and 13.14 gives

\[
x' = \frac{x - ut}{\sqrt{1 - \beta^2}}
\tag{13.14a}
\]

and

\[
x = \frac{x' + ut'}{\sqrt{1 - \beta^2}}
\tag{13.14b}
\]

where β = u/c. Eliminating either x or x′ from Equations 13.14a and 13.14b, we obtain

\[
t' = \frac{t - ux/c^2}{\sqrt{1 - \beta^2}}
\tag{13.14c}
\]

and

\[
t = \frac{t' + ux'/c^2}{\sqrt{1 - \beta^2}}.
\tag{13.14d}
\]
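For readers who want the elimination spelled out, one possible route to Equation 13.14c is sketched below. The shorthand γ ≡ 1/√(1 − β²) is introduced here only for compactness.

```latex
\begin{aligned}
x &= \gamma\,(x' + u t'), \qquad x' = \gamma\,(x - u t)
  && \text{(Eqs. 13.14b and 13.14a)} \\
u t' &= \frac{x}{\gamma} - x' = \frac{x}{\gamma} - \gamma x + \gamma u t \\
t' &= \gamma t - \frac{\gamma}{u}\left(1 - \frac{1}{\gamma^2}\right) x
    = \gamma t - \frac{\gamma \beta^2}{u}\, x
    = \gamma\left(t - \frac{u x}{c^2}\right)
\end{aligned}
```

The last line uses 1 − 1/γ² = β² and β²/u = u/c². Equation 13.14d follows in the same way by eliminating x instead of x′, or simply by exchanging primed and unprimed quantities and replacing u with −u.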

Combining all of these results, we obtain the Lorentz transformations:

\[
\begin{aligned}
x' &= \gamma\,(x - ut), & x &= \gamma\,(x' + ut'), \\
y' &= y, & y &= y', \\
z' &= z, & z &= z', \\
t' &= \gamma\,(t - ux/c^2), & t &= \gamma\,(t' + ux'/c^2),
\end{aligned}
\tag{13.15}
\]


where

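\[
\gamma = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta = u/c
\tag{13.16}
\]

is the Lorentz factor.

If β ≪ 1, then γ ≅ 1, and Equation 13.15 reduces to the Galilean transformations. Thus, the Galilean transformation is a first approximation to the Lorentz transformations for β ≪ 1.

When the velocity, \(\vec{u}\), of S′ relative to S is in some arbitrary direction, Equation 13.15 can be given a more general form in terms of the components of \(\vec{r}\) and \(\vec{r}\,'\) perpendicular and parallel to \(\vec{u}\):

\[
\begin{aligned}
\vec{r}\,'_{\parallel} &= \gamma\,(\vec{r}_{\parallel} - \vec{u}\,t), & \vec{r}_{\parallel} &= \gamma\,(\vec{r}\,'_{\parallel} + \vec{u}\,t'), \\
\vec{r}\,'_{\perp} &= \vec{r}_{\perp}, & \vec{r}_{\perp} &= \vec{r}\,'_{\perp}, \\
t' &= \gamma\,(t - \vec{u}\cdot\vec{r}/c^2), & t &= \gamma\,(t' + \vec{u}\cdot\vec{r}\,'/c^2).
\end{aligned}
\tag{13.17}
\]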

The Lorentz transformations are valid for all types of physical phenomena at all speeds. As a consequence of this, all physical laws must be invariant under a Lorentz transformation.
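A short numerical sketch can make the properties of Equation 13.15 tangible. It implements the boost along x and its inverse, then checks three things: the round trip S → S′ → S returns the original event, a light pulse has speed c in both frames, and the combination (ct)² − x² is unchanged. The sample event and the relative speed 0.6c are arbitrary illustrative choices.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def gamma(u, c=C):
    """Lorentz factor, Eq. 13.16."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def lorentz(x, t, u, c=C):
    """Eq. 13.15, S -> S': returns (x', t')."""
    g = gamma(u, c)
    return g * (x - u * t), g * (t - u * x / c**2)


def lorentz_inverse(xp, tp, u, c=C):
    """Eq. 13.15, S' -> S: returns (x, t)."""
    g = gamma(u, c)
    return g * (xp + u * tp), g * (tp + u * xp / c**2)


u = 0.6 * C            # illustrative relative speed
x, t = 4.0e8, 2.0      # illustrative event in S (metres, seconds)

xp, tp = lorentz(x, t, u)
x_back, t_back = lorentz_inverse(xp, tp, u)

# An event on a light pulse emitted from the common origin at t = 0.
t_light = 1.5
xl_p, tl_p = lorentz(C * t_light, t_light, u)
light_speed_in_s_prime = xl_p / tl_p

interval_s = (C * t) ** 2 - x**2
interval_s_prime = (C * tp) ** 2 - xp**2

print(x_back, t_back)
print(light_speed_in_s_prime)
print(interval_s, interval_s_prime)
```

Replacing `lorentz` with the Galilean rule x′ = x − ut, t′ = t would fail the second and third checks, which is the inconsistency the new transformation was built to remove.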

The Lorentz transformations that are based on Einstein’s postulates contain a new philosophy of space and time measurements. We now examine the various properties of these new transformations. In the following discussion, we still use Figure 13.4.

13.4.1 RELATIVITY OF SIMULTANEITY, CAUSALITY

Two events that happen at the same time but not necessarily at the same place are called “simultaneous.” Now, consider two events in S′ that occur at (x1′, t1′) and (x2′, t2′), and they would appear in frame S at (x1, t1) and (x2, t2). The Lorentz transformations give

\[
t_2 - t_1 = \gamma\left[(t_2' - t_1') + \frac{u\,(x_2' - x_1')}{c^2}\right].
\tag{13.18}
\]

It is easy to observe the following:

(1) If the two events take place simultaneously in S′, then t2′ − t1′ = 0. But the events do not occur simultaneously in the S frame, for there is a finite time lapse:

\[
\Delta t = t_2 - t_1 = \gamma\,\frac{u\,(x_2' - x_1')}{c^2}.
\]

Thus, two spatially separated events that are simultaneous in S′ would not be measured to be simultaneous in S. In other words, the simultaneity of spatially separated events is not an absolute property as it was assumed to be in Newtonian mechanics.

Moreover, depending on the sign of (x2′ − x1′), the time interval Δt can be positive or negative, that is, in the frame S, the “first” event in S′ can take place earlier or later than the “second” one. The sole exception is the case when the two events coincide in both place and time in S′; then they also occur at the same place and at the same time in frame S.

(2) Suppose that t2′ > t1′. The order of events is not reversed in time in frame S if Δt = t2 − t1 > 0, that is, if

\[
(t_2' - t_1') + \frac{u\,(x_2' - x_1')}{c^2} > 0.
\]

This is guaranteed whenever

\[
\frac{|x_2' - x_1'|}{t_2' - t_1'} < \frac{c^2}{u},
\]

which will be true as long as

\[
\frac{|x_2' - x_1'|}{t_2' - t_1'} < c,
\]

because u < c implies c < c²/u.

Thus, the order of events will remain unchanged if no signal can be transmitted with a speed greater than c, the speed of light.

The preceding discussion also illustrates clearly that the theory of relativity is incompatible with the notion of action at a distance.
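The two observations above can be illustrated numerically with Equation 13.18. In the sketch below the relative speed 0.5c and the event separations are hypothetical values; the function name is also an illustrative choice.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def gamma(u, c=C):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def delta_t_in_S(dt_prime, dx_prime, u, c=C):
    """Eq. 13.18: time separation in S of two events separated by (dx', dt') in S'."""
    return gamma(u, c) * (dt_prime + u * dx_prime / c**2)


u = 0.5 * C  # illustrative speed of S' relative to S

# (1) Simultaneous in S' but spatially separated: not simultaneous in S,
#     and the sign depends on the sign of x2' - x1'.
dt_right = delta_t_in_S(0.0, 3.0e5, u)
dt_left = delta_t_in_S(0.0, -3.0e5, u)

# (2) Events that a signal slower than c could connect keep their time order.
dt_prime = 1.0e-3
dt_causal = delta_t_in_S(dt_prime, -0.9 * C * dt_prime, u)

# Events separated by more than c * dt' can have their order reversed.
dt_spacelike = delta_t_in_S(dt_prime, -3.0 * C * dt_prime, u)

print(dt_right, dt_left)
print(dt_causal, dt_spacelike)
```

Only the last case, which no signal travelling at or below c could connect, shows a reversed order in S, so cause and effect are never interchanged.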

13.4.2 TIME DILATION, RELATIVITY OF CO-LOCALITY

Two events that happen at the same place but not necessarily at the same time are called co-local. Now consider two co-local events in S′ taking place at times t1′ and t2′ but at the same place. For simplicity, consider this to be on the x′-axis so that y′ = z′ = 0. These two events would appear in frame S at (x1, t1) and (x2, t2). According to the Lorentz transformations, we have

\[
\Delta x = \frac{u\,\Delta t'}{\sqrt{1 - \beta^2}} = \gamma u\,\Delta t', \qquad
\Delta t = \frac{\Delta t'}{\sqrt{1 - \beta^2}} = \gamma\,\Delta t',
\tag{13.19}
\]

where β = u/c, Δt′ = t2′ − t1′, and so forth. It is easy to observe the following:

(1) The two events that happened at the same place in S′ do not occur at the same place in S, and so t1 and t2 must be measured by spatially separated synchronized clocks. Einstein’s prescription for synchronizing two stationary separated clocks is to send a light signal from clock 1 at a time t1 (measured by clock 1) and reflected back from clock 2 at a time t2 (measured by clock 2). If the reflected light returns to clock 1 at a time t3 (measured by clock 1), then clocks 1 and 2 are synchronous if t2 − t1 = t3 − t2, that is, if the time measured for light to go one way is equal to the time measured for light to go in the opposite direction.

(2) The time interval between two events taking place at the same point in an inertial reference frame is measured by a single clock at that point, and it is called the proper time interval between the two events. In the second equation of Equation 13.19, Δt′ = t2′ − t1′ is the proper time interval between the events in S′. Because γ ≥ 1, the time interval Δt = t2 − t1 in S is longer than Δt′; this is called time dilation, often described by the statement that “moving clocks run slow” (as seen by the stationary observer in S). This apparent asymmetry between S and S′ in time is a result of the asymmetric nature of the time measurement.

From the above discussion, we see that proper time interval is the minimum time interval that any observer can measure between two events. Note that in Equation 13.19, Δt′ is the proper time interval.

In addition to the common types of clock with which we are all familiar, there are atomic and nuclear processes that can be, and are being, used for measuring time intervals. Among them are the emission and absorption of radiation and the decay of subatomic particles. These particles are usually in motion, and a measurement of a time interval, particularly when v ≈ c, will be greatly influenced by their velocity. One such process is the decay of muons (μ), which has become a classic demonstration of time dilation. The muon is an unstable particle that spontaneously decays into an electron and two neutrinos:

$$\mu^{\pm} \to e^{\pm} + \nu + \bar{\nu}$$

where e⁻ stands for electron, e⁺ for positron, ν for neutrino, and ν̄ for antineutrino. The decay of the muon is typical of radioactive decay processes: if there are N(0) muons at t = 0, the number at time t is

$$N(t) = N(0)\,e^{-t/\tau}$$

where τ, the mean lifetime, is 2.15 × 10⁻⁶ s. Muons can be observed by stopping them in dense absorbers and detecting the decay electron, which comes off with an energy of approximately 40 MeV.

Muons are formed in abundance when high-energy cosmic ray protons enter the Earth’s upper atmosphere. The protons lose energy rapidly, and at the altitude of a typical mountaintop, 2000 m, there are few left. But the muons penetrate far through the Earth’s atmosphere and may reach the ground. The muons descend through the atmosphere with a velocity close to c. The minimum time to descend 2000 m is approximately

$$T = \frac{2 \times 10^{3}\ \text{m}}{3 \times 10^{8}\ \text{m/s}} \approx 7 \times 10^{-6}\ \text{s}.$$

This is more than 3 times the mean lifetime, τ, of the muon.

The experiment consists of comparing the flux of muons at the mountaintop with the flux at sea level:

$$\frac{\text{flux at sea level}}{\text{flux at mountaintop}} = e^{-T/\tau'}.$$

Here τ′ is different from τ, which is the mean lifetime to decay of a muon at rest. When the muon moves rapidly in the atmosphere, the lifetime τ′ observed is increased by time dilation according to the relationship

$$\tau' = \frac{\tau}{\sqrt{1-\beta^2}} = \gamma\tau.$$

The measured flux ratio is 0.7. To account for this measured ratio, we require γ = 10. This was found to be the case: by measuring the energy of the muons, γ was determined to be 10 within the experimental error.
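The flux argument can be checked numerically. The Python sketch below uses the values quoted in the text (τ = 2.15 × 10⁻⁶ s, a 2000 m descent, γ = 10) and assumes the muons travel at essentially c; the function name `surviving_fraction` is an illustrative choice, not part of the source.

```python
import math

C = 3.0e8      # speed of light in m/s (rounded, as in the text)
TAU = 2.15e-6  # muon mean lifetime at rest in s (value quoted in the text)

def surviving_fraction(distance_m, gamma, speed=C, tau=TAU):
    """Fraction of muons surviving the descent: exp(-T / (gamma * tau))."""
    travel_time = distance_m / speed   # T, measured in the Earth frame
    dilated_lifetime = gamma * tau     # tau' = gamma * tau
    return math.exp(-travel_time / dilated_lifetime)

if __name__ == "__main__":
    print(surviving_fraction(2000.0, gamma=1.0))   # no time dilation: a few percent
    print(surviving_fraction(2000.0, gamma=10.0))  # gamma = 10: roughly 0.7
```

Without time dilation only a few percent of the muons would survive the descent, whereas γ = 10 gives a ratio close to the measured 0.7.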

Time dilation between observers in uniform relative motion is not an artifact of the clock we choose to construct. It is a very real thing. All processes, including atomic and biological processes, slow down in moving systems.

13.4.3 LENGTH CONTRACTION

Consider a rod of length L₀ in the S′ frame in which the rod lies at rest along the x′-axis: L₀ = x′₂ − x′₁. L₀ is the proper length of the rod measured in the rod’s rest frame S′. Now, the rod is moving lengthwise with velocity u relative to the S frame. An observer in the S frame makes a simultaneous measurement of the two ends of the rod. The Lorentz transformations give

$$x'_1 = \gamma\,[x_1 - u t_1], \qquad x'_2 = \gamma\,[x_2 - u t_2],$$

from which we get

$$x'_2 - x'_1 = \gamma\,[x_2 - x_1] - \gamma u\,(t_2 - t_1).$$

Let L(u) = x₂ − x₁, the length of the rod moving with speed u relative to the observer in S, and remember that t₁ = t₂ and γ = 1/√(1 − β²). The result then becomes

$$L(u) = \sqrt{1-\beta^2}\;L_0. \tag{13.20}$$

Thus, the length of a body moving with velocity u relative to an observer is measured to be shorter by a factor of √(1 − β²) in the direction of motion relative to the observer.

Because all inertial frames of reference are equally valid, if L′ = γL, does not the expression L = γL′ have to be equally true? The answer is no. The reason is that the measurement was not carried out in the same way in the two frames of reference. The two events of marking the positions of the two ends of the rod were simultaneous in the S frame but not simultaneous in the S′ frame. This difference gives the asymmetry of the result. Length L′ was equal to length L₀ only because the rod was at rest in the S′ frame. As a general expression, Δx′ = γΔx is not true. The full expression relating distances in two frames of reference is Δx′ = γ(Δx − uΔt). The symmetrical inverse relationship is Δx = γ(Δx′ + uΔt′). In the case that was considered earlier, Δt = 0, so Δx′ = γΔx; but Δt′ ≠ 0, so Δx ≠ γΔx′.

A body of proper volume V₀ can be divided into thin rods parallel to u. Each one of these rods is reduced in length by a factor √(1 − β²) so that the volume of the moving body measured by an observer in S is V = √(1 − β²) V₀.

Example 13.1

A ruler of length L0 lies in the x′y′-plane of its rest system and makes an angle θ0 with the x′-axis. What is the length and orientation of the ruler in the observer’s system with respect to which the ruler moves to the right with velocity u?

Solution:

Let the ends of the ruler be designated by A and B. In the system S′ in which the ruler lies, A and B have the following coordinates:

$$(x'_A, y'_A) = (0, 0), \qquad (x'_B, y'_B) = (L_0\cos\theta_0,\; L_0\sin\theta_0).$$

Using x′ = γ(x − ut) and y′ = y, we obtain the coordinates of A and B in the observer’s (S) system at time t:

$$x'_A = 0 = \gamma\,(x_A - ut), \qquad y'_A = y_A,$$

$$x'_B = L_0\cos\theta_0 = \gamma\,(x_B - ut), \qquad y'_B = L_0\sin\theta_0 = y_B.$$

From these two equations, we find

$$x_B - x_A = \gamma^{-1} L_0\cos\theta_0, \qquad y_B - y_A = L_0\sin\theta_0.$$

The length L of the ruler in the system S is

$$L = \sqrt{(x_B - x_A)^2 + (y_B - y_A)^2} = L_0\sqrt{1 - \beta^2\cos^2\theta_0},$$

which indicates that the moving ruler is contracted. And the angle θ that the ruler makes with the x-axis is

$$\theta = \tan^{-1}\!\left(\frac{y_B - y_A}{x_B - x_A}\right) = \tan^{-1}\!\left(\gamma\tan\theta_0\right).$$
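The result of Example 13.1 can be checked numerically. The Python sketch below builds the ruler's components in S directly (contracted along x, unchanged along y) and returns its length and angle; the function name `moving_ruler` and the sample values (θ₀ = 30°, β = 0.8) are assumptions chosen for illustration.

```python
import math

def moving_ruler(rest_length, theta0, beta):
    """Return (length, angle) in S for a ruler at angle theta0 (radians) in its rest frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    dx = rest_length * math.cos(theta0) / gamma  # contracted along the motion
    dy = rest_length * math.sin(theta0)          # unchanged transverse to the motion
    return math.hypot(dx, dy), math.atan2(dy, dx)

if __name__ == "__main__":
    length, angle = moving_ruler(1.0, math.radians(30.0), beta=0.8)
    print(length, math.degrees(angle))
```

For these sample values the length agrees with L₀√(1 − β² cos² θ₀) and the angle with tan⁻¹(γ tan θ₀): the ruler is shorter than L₀ and steeper than 30°.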

The moving ruler is both contracted and rotated.

Length contraction opens the possibility of space travel. The nearest star, besides the Sun, is Alpha Centauri, which is about 4.3 light years away: light from Alpha Centauri takes 4.3 years to reach us. Even if a spaceship could travel at the speed of light, it would take 4.3 years to reach Alpha Centauri. This is certainly true from the point of view of an observer on Earth. But from the point of view of the crew of the spaceship, the distance between the Earth and Alpha Centauri is shortened by a factor 1/γ = √(1 − β²), where β = v/c and v is the speed of the spaceship. If v is, say, 0.99c, then 1/γ = 0.14, and the distance appears to be only 14% of the value as seen from the Earth. The crew, therefore, deduces that light from Alpha Centauri takes only 0.14 × 4.3 = 0.6 year to reach Earth. But the crew sees Alpha Centauri coming toward it with a speed of 0.99c and expects to get there in 0.60/0.99 = 0.606 year without having to suffer a long tedious journey. But, in practice, the power requirements to launch a spaceship near the speed of light are prohibitive.
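The arithmetic of the Alpha Centauri trip can be reproduced with a few lines of Python. This is a minimal sketch in units where c = 1 light year per year; the function name `trip_in_ship_frame` is hypothetical.

```python
import math

def trip_in_ship_frame(distance_ly, beta):
    """Return (contracted distance in light years, travel time in years) for the crew."""
    contraction = math.sqrt(1.0 - beta**2)  # the factor sqrt(1 - beta^2)
    contracted = distance_ly * contraction
    return contracted, contracted / beta    # time = distance / speed, with c = 1 ly/yr

if __name__ == "__main__":
    print(trip_in_ship_frame(4.3, 0.99))
```

Carried through without intermediate rounding, the distance comes out near 0.607 light year and the travel time near 0.613 year, slightly above the rounded figures quoted in the text.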

13.4.4 VISUAL APPARENT SHAPE OF RAPIDLY MOVING OBJECT

An interesting consequence of the length contraction is the visual apparent shape of a rapidly moving object. This was shown first by Terrell (1959). The act of seeing involves the simultaneous reception of light from the different parts of the object. In order for light from different parts of an object to reach the eye or a camera at the same time, light from different parts of the object must be emitted at different times to compensate for the different distances the light must travel. Thus, taking a picture of a moving object or looking at it does not give a valid impression of its shape. Interestingly, the distortion that makes the Lorentz contraction seem to disappear instead makes an object seem to rotate by an angle θ = sin⁻¹(u/c) as long as the angle subtended by the object at the camera is small. If the object moves in another direction or if the angle it subtends at the camera is not small, the apparent distortion becomes quite complex.

Figure 13.5 shows a cube of side l moving with a uniform velocity u with respect to an observer some distance away; the side AB is perpendicular to the line of sight of the observer. In order for light from corners A and D to reach the observer at the same instant, light from D, which must travel a distance l farther than from A, must have been emitted when D was at position E. The length DE is equal to (l/c)u = lβ. The length of the side AB is foreshortened by the Lorentz contraction to l√(1 − β²). The net result corresponds to the view that the observer would have if the cube were rotated through an angle sin⁻¹β. The cube is not distorted; it undergoes an apparent rotation. Similarly, a moving sphere will not become an ellipsoid; it still appears as a sphere. An interesting discussion of apparent rotations at high velocity is given by Weisskopf (1960).

13.4.5 RELATIVISTIC VELOCITY ADDITION

Another very important kinematic consequence of the Lorentz transformation is that the Galilean transformation for velocity is no longer valid. The new and more complicated transformation for velocities can be deduced easily. By definition, the components of velocity in S and S′ are given by, respectively,

$$v_x = \frac{dx}{dt} = \frac{x_2 - x_1}{t_2 - t_1} \quad\text{and}\quad v_y = \frac{dy}{dt} = \frac{y_2 - y_1}{t_2 - t_1},$$

$$v'_x = \frac{dx'}{dt'} = \frac{x'_2 - x'_1}{t'_2 - t'_1} \quad\text{and}\quad v'_y = \frac{dy'}{dt'} = \frac{y'_2 - y'_1}{t'_2 - t'_1},$$

and so on.

Applying the Lorentz transformations to x₁ and x₂ and then taking the difference, we get

$$dx = \frac{dx' + u\,dt'}{\sqrt{1-\beta^2}}, \qquad \beta = \frac{u}{c}.$$

Similarly,

$$dx' = \frac{dx - u\,dt}{\sqrt{1-\beta^2}}.$$

Do the same for the time intervals dt and dt′:

$$dt = \frac{dt' + u\,dx'/c^2}{\sqrt{1-\beta^2}}$$

and

$$dt' = \frac{dt - u\,dx/c^2}{\sqrt{1-\beta^2}}.$$

FIGURE 13.5 Visual apparent shape of a rapidly moving object. (a) A cube moving perpendicular to an observer’s line of sight; (b) the observer views the cube as rotated through an angle sin⁻¹β.

From these, we obtain

$$\frac{dx}{dt} = \frac{dx' + u\,dt'}{dt' + u\,dx'/c^2}.$$

Dividing both the numerator and the denominator of the right-hand side by dt′ yields the right transformation equation for the x component of the velocity:

$$v_x = \frac{v'_x + u}{1 + u v'_x/c^2}. \tag{13.21a}$$

Similarly, we can find the transverse components:

$$v_y = \frac{v'_y\sqrt{1-\beta^2}}{1 + u v'_x/c^2} = \frac{v'_y}{\gamma\,(1 + u v'_x/c^2)} \tag{13.21b}$$

$$v_z = \frac{v'_z\sqrt{1-\beta^2}}{1 + u v'_x/c^2} = \frac{v'_z}{\gamma\,(1 + u v'_x/c^2)}. \tag{13.21c}$$

In these formulas, γ = 1/√(1 − β²) as before. We note that the transverse velocity components depend on the x-component. For v ≪ c, we obtain the Galilean result $v_x = v'_x + u$.
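Equations 13.21a to 13.21c can be written as a short Python function. This is a sketch under the conventions of the text (S′ moves at speed u along the x-axis of S); the function names `to_frame_s` and `to_frame_s_prime` and the units with c = 1 are assumptions for illustration.

```python
import math

def to_frame_s(v_prime, u, c=1.0):
    """Transform (vx', vy', vz') measured in S' into S, where S' moves at +u along x."""
    vxp, vyp, vzp = v_prime
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    denom = 1.0 + u * vxp / c**2
    return ((vxp + u) / denom, vyp / (gamma * denom), vzp / (gamma * denom))

def to_frame_s_prime(v, u, c=1.0):
    """Inverse transformation: the same formulas with u replaced by -u."""
    return to_frame_s(v, -u, c)

if __name__ == "__main__":
    v = to_frame_s((0.3, 0.4, 0.0), u=0.5)
    print(v)                            # velocity seen in S
    print(to_frame_s_prime(v, u=0.5))   # recovers approximately (0.3, 0.4, 0.0)
```

The round trip illustrates that switching the sign of u inverts the transformation. A light ray sent along y′ in S′ also keeps speed c in S, even though its direction changes.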

Solving explicitly or merely switching the sign of u would yield $(v'_x, v'_y, v'_z)$ in terms of $(v_x, v_y, v_z)$.

It follows from the velocity transformation formulas that the value of an angle is relative and changes in transition from one reference frame to another. For an object in the S frame moving in the xy-plane with velocity v that makes an angle θ with the x-axis, we have

tan θ = vy/vx, vx = v cos θ, vy = v sin θ.

In the S′ frame, we have

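$$\tan\theta' = \frac{v'_y}{v'_x} = \frac{v\sin\theta}{\gamma\,(v\cos\theta - u)} \tag{13.22}$$

where

$$\gamma = \frac{1}{\sqrt{1-\beta^2}} \quad\text{and}\quad \beta = \frac{u}{c}.$$

As an application, consider the case of star light, that is, v = c; then,

$$\tan\theta' = \frac{\sin\theta}{\gamma\,(\cos\theta - u/c)}.$$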


Let θ = π/2, θ′ = π/2 − φ (Figure 13.6); then,

$$\tan\varphi = -\frac{u/c}{\sqrt{1-\beta^2}}, \qquad \sin\varphi = -\frac{u}{c},$$

which is the star aberration formula; to see a star overhead, tilt the telescope at angle φ.
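The aberration formula can be verified numerically from the light-ray case of Equation 13.22. In the sketch below the function names are hypothetical, and `atan2` is used (an implementation choice, not from the source) so that θ′ lands in the correct quadrant when cos θ − u/c is negative.

```python
import math

def apparent_angle(theta, beta):
    """Angle theta' of a light ray in S': tan(theta') = sin(theta) / (gamma (cos(theta) - beta))."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return math.atan2(math.sin(theta), gamma * (math.cos(theta) - beta))

def tilt_angle(beta):
    """Tilt phi = pi/2 - theta' for a ray with theta = pi/2 in S."""
    return math.pi / 2 - apparent_angle(math.pi / 2, beta)

if __name__ == "__main__":
    beta = 1.0e-4  # assumed illustrative value, roughly the Earth's orbital speed over c
    phi = tilt_angle(beta)
    print(math.sin(phi), math.degrees(phi) * 3600.0)  # sin(phi) and phi in arcseconds
```

For any β the computed sin φ equals −β, in agreement with the formula above; for β of order 10⁻⁴ the tilt is about 20 arcseconds in magnitude.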

Example 13.2: The Velocity of Light Is the Limiting Speed

A bullet is fired in the forward direction from a moving platform whose speed is u, as shown in Figure 13.7. The muzzle velocity of the bullet is v′₁. What is the velocity of the bullet relative to the ground?

Solution:

The S′ system is with the moving platform, and the S system is attached to the ground. The muzzle velocity of the bullet v′₁ is measured relative to the gun. The velocity of the bullet relative to the ground v₁ is

$$v_1 = \frac{v'_1 + u}{1 + u v'_1/c^2}.$$

FIGURE 13.6 Aberration. The angles of a light ray with the x-axis and the x′-axis are different.

FIGURE 13.7 Velocity of light is the same for all observers.

Now, let us take a hypothetical case: u = 0.9c and v′₁ = 0.9c. Then,

$$v_1 = \frac{v'_1 + u}{1 + u v'_1/c^2} = \frac{0.9c + 0.9c}{1 + (0.9c)(0.9c)/c^2} = 0.9945c < c$$

whereas, according to the Galilean transformation, v₁ = v′₁ + u = 1.8c. In the limiting case, if we take both u and v′₁ to be c, then

$$v_1 = \frac{v'_1 + u}{1 + u v'_1/c^2} = \frac{c + c}{1 + c \cdot c/c^2} = c$$

which agrees with the postulate originally built into the Lorentz transformations: the velocity of light is the same for all observers. Thus, the relativistic transformation of velocities ensures that we cannot exceed the velocity of light by changing reference frames.
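The numbers in Example 13.2 are easy to reproduce. The Python sketch below implements the collinear case of Equation 13.21a in units where c = 1; the function name `add_collinear` is an illustrative choice.

```python
def add_collinear(v_prime, u, c=1.0):
    """Relativistic sum of two collinear velocities (Equation 13.21a)."""
    return (v_prime + u) / (1.0 + u * v_prime / c**2)

if __name__ == "__main__":
    print(add_collinear(0.9, 0.9))  # about 0.9945, in units of c
    print(add_collinear(1.0, 1.0))  # exactly 1.0
    print(add_collinear(1.0, 0.5))  # light stays at c from any platform
```

Whatever speeds below c are combined, the result stays below c, and combining c with any platform speed returns c.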

13.5 DOPPLER EFFECT

The Doppler effect occurs for light and sound. It is a shift in frequency resulting from the motion of the source or the observer. Knowledge of the motion of distant receding galaxies comes from studies of the Doppler shift of their spectral lines. The Doppler effect is also used for satellite tracking and radar speed traps. We examine the Doppler effect in light only.

Consider a source of light or radio waves moving with respect to an observer or a receiver, at a speed u and at an angle θ with respect to the line between the source and the observer (Figure 13.8). The light source flashes with a period τ₀ in its rest frame (the S′ frame in which the source is at rest). The corresponding frequency is ν₀ = 1/τ₀, and the wavelength is λ₀ = c/ν₀ = cτ₀.

While the source is going through one oscillation, the time that elapses in the rest frame of the observer (the S frame) is τ = γτ₀ because of time dilation, where γ = 1/√(1 − β²) and β = u/c. The emitted wave travels at speed c, and therefore, its front moves a distance γτ₀c; the source moves toward the observer with a speed u cos θ, and so covers a distance γτ₀u cos θ. Then, the distance D that separates the fronts of the successive waves (the wavelength) is

D = γτ0c − γτ0ucos θ,

that is,

λ = γτ0c − γτ0ucos θ = γτ0c[1 − (u/c)cos θ],

but cτ0 = λ0; we can rewrite the last expression as

$$\lambda = \lambda_0\,\frac{1 - \beta\cos\theta}{\sqrt{1-\beta^2}}. \tag{13.23}$$

In terms of frequency, this Doppler effect formula becomes

$$\nu = \nu_0\,\frac{\sqrt{1-\beta^2}}{1 - \beta\cos\theta}. \tag{13.24}$$

FIGURE 13.8 Doppler effect.

Here, ν is the frequency at the observer, and θ is the angle measured in the rest frame of the observer. If the source is moving directly toward the observer, then θ = 0 and cos θ = 1. Equation 13.24 reduces to

$$\nu = \nu_0\,\frac{\sqrt{1-\beta^2}}{1-\beta} = \nu_0\sqrt{\frac{1+\beta}{1-\beta}}. \tag{13.24a}$$

For a source moving directly away from the observer, cos θ = −1, and Equation 13.24 reduces to

$$\nu = \nu_0\,\frac{\sqrt{1-\beta^2}}{1+\beta} = \nu_0\sqrt{\frac{1-\beta}{1+\beta}}. \tag{13.24b}$$

At θ = π/2, that is, the source moving at right angles to the direction toward the observer, Equation 13.24 reduces to

$$\nu = \nu_0\sqrt{1-\beta^2}. \tag{13.24c}$$

This transverse Doppler effect is a result of time dilation.
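The three special cases follow from the general formula, which the Python sketch below evaluates directly. The function name `doppler_frequency` and the sample value β = 0.6 are assumptions for illustration.

```python
import math

def doppler_frequency(nu0, beta, theta):
    """Observed frequency for a source moving at angle theta to the line of sight (Equation 13.24)."""
    return nu0 * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta))

if __name__ == "__main__":
    beta = 0.6
    print(doppler_frequency(1.0, beta, 0.0))          # approaching: sqrt((1+beta)/(1-beta))
    print(doppler_frequency(1.0, beta, math.pi))      # receding: sqrt((1-beta)/(1+beta))
    print(doppler_frequency(1.0, beta, math.pi / 2))  # transverse: sqrt(1 - beta^2)
```

With β = 0.6 the three ratios ν/ν₀ are 2, 0.5 and 0.8 up to rounding; the transverse value is just the time-dilation factor.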

13.6 RELATIVISTIC SPACE–TIME (MINKOWSKI SPACE)

As we have seen previously, there is no absolute standard for the measurement of time or of space; the relative motion of observers affects both kinds of measurement. Lorentz transformations treat (x, y, z) and t as equivalent variables. In 1907, Hermann Minkowski proposed that the three dimensions of space and the dimension of time should be treated together as four dimensions of space–time. Minkowski said, “Henceforth space by itself and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” He called the four dimensions of space–time “the world space” (also known as the Minkowski space); an event is a point in a space–time diagram, and the path of an individual particle is a “world line” as shown in Figure 13.9. Clearly, we cannot draw in four dimensions, but we can draw easily a diagram in which we measure time in vertical direction and distance in horizontal direction. In Figure 13.9, dots are events. Line A represents a stationary particle, line B a particle moving with constant velocity, and line C an accelerating particle (starting from rest). The space–time diagram is a valuable aid in visualizing many relativity problems. We will describe one such graph in the last section as it is not our main concern.

By analogy with the three-dimensional case, the coordinates of an event (ct, x, y, z) can be considered as the components of a four-dimensional radius vector, or a radius four-vector (for short) in a four-dimensional Minkowski space. The square of the length of the radius four-vector for any event E is a variable quantity:

s² = c²t² − (x² + y² + z²). (13.25)

This quadratic expression is commonly known as an interval, and we denote it by s²; it is an invariant, independent of the frame used to measure these coordinates. To see it, let us calculate the value of x² − c²t² (for simplicity) in S frame in terms of x′ and t′ in S′ frame:

$$\begin{aligned}
x^2 - c^2t^2 &= \gamma^2 (x' + ut')^2 - c^2\gamma^2 \left(t' + ux'/c^2\right)^2 \\
&= \gamma^2\left\{ -(c^2 - u^2)\,t'^2 + 2ux't' - 2ux't' + x'^2\,(1 - u^2/c^2) \right\} \\
&= x'^2 - c^2t'^2.
\end{aligned}$$
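The invariance can also be confirmed numerically for a sample event. The following Python sketch applies the transformation x = γ(x′ + ut′), t = γ(t′ + ux′/c²) used in the derivation and compares x² − c²t² in the two frames; the function names and the sample numbers are assumptions for illustration, in units where c = 1.

```python
import math

def boost(x_prime, t_prime, u, c=1.0):
    """Coordinates (x, t) in S of an event with coordinates (x', t') in S'."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    x = gamma * (x_prime + u * t_prime)
    t = gamma * (t_prime + u * x_prime / c**2)
    return x, t

def interval(x, t, c=1.0):
    """The quantity x^2 - c^2 t^2 used in the text."""
    return x**2 - (c * t) ** 2

if __name__ == "__main__":
    x, t = boost(2.0, 5.0, u=0.6)
    print(interval(2.0, 5.0), interval(x, t))  # the two values agree
```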

As an interval is an invariant quantity, we may use it to classify the possible events in space–time. In particular, we may divide events into three types: those for which the invariant label is positive, zero, or negative. Events for which the interval is positive are called time-like with respect to the origin because they are of the class containing those with x = 0 and t ≠ 0, which are just changes in time of a clock at the origin of space. Those for which the invariant is negative are called space-like with respect to the origin because that includes events with t = 0 and x ≠ 0, which are just spatially separated but simultaneous with an event at the origin of space–time. The final class of events with the interval being zero is called light-like with respect to the origin because a ray of light can pass to or from the origin of space–time to them. This division of space–time into three regions is shown in Figure 13.10, the light cone of a two-dimensional space and a one-dimensional time continuum:

(1) c²t² − x² > 0: time-like with respect to the origin

(2) c²t² − x² < 0: space-like with respect to the origin

(3) c²t² − x² = 0: light-like with respect to the origin
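The classification above amounts to checking the sign of c²t² − x². A minimal Python sketch follows; the function name `classify_event` and the optional tolerance argument are hypothetical additions for illustration.

```python
def classify_event(ct, x, tol=0.0):
    """Classify an event relative to the origin by the sign of (ct)^2 - x^2."""
    s2 = ct**2 - x**2
    if s2 > tol:
        return "time-like"
    if s2 < -tol:
        return "space-like"
    return "light-like"

if __name__ == "__main__":
    print(classify_event(2.0, 1.0))  # time-like
    print(classify_event(1.0, 2.0))  # space-like
    print(classify_event(1.0, 1.0))  # light-like
```

Because the interval is invariant, the label returned for an event does not depend on the inertial frame in which ct and x are measured.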

Such a classification of events can also be considered from the point of view of causality.

It is now a common practice to treat t as a zeroth or a fourth coordinate:

x0 = ct, x1 = x, x2 = y, x3 = z

or

x1 = x, x2 = y, x3 = z, x4 = ix0.

FIGURE 13.9 In the Minkowski space, the path of an individual particle is a world line. (Axes: time, vertical; space (distance), horizontal. World lines A, B, and C are shown.)

Then the Lorentz transformations take on the form

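$$\begin{aligned}
x'_1 &= \gamma\,(x_1 + i\beta x_4) \\
x'_2 &= x_2 \\
x'_3 &= x_3 \\
x'_4 &= \gamma\,(-i\beta x_1 + x_4)
\end{aligned}$$

or

$$\begin{aligned}
x'_0 &= \gamma\,(x_0 - \beta x_1) \\
x'_1 &= \gamma\,(-\beta x_0 + x_1) \\
x'_2 &= x_2 \\
x'_3 &= x_3.
\end{aligned} \tag{13.26}$$

In matrix form, we have

$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \\ x'_4 \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & i\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -i\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}$$

or

$$\begin{pmatrix} x'_0 \\ x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}. \tag{13.27}$$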

It is customary to use Greek indices (μ and ν, etc.) to label four-dimensional variables and Latin indices (i and j, etc.) to label three-dimensional variables.

FIGURE 13.10 Space–time diagram of a three-dimensional world, showing the light cone. (Axes: x, y, and t. Regions: time-like future, time-like past, space-like, and light-like.)

The Lorentz transformations can be distilled into a single equation:

$$x'_\mu = \sum_{\nu=1}^{4} L_\mu{}^{\nu}\,x_\nu = L_\mu{}^{\nu}\,x_\nu, \qquad \mu = 1, 2, 3, 4, \tag{13.28}$$

where $L_\mu{}^{\nu}$ is the Lorentz transformation matrix in Equation 13.27. The summation sign is eliminated by Einstein’s summation convention: repeated indices appearing once in the lower and once in the upper position are automatically summed over. However, indices repeated in the lower part or upper part alone are not summed over.

If Equation 13.28 reminds us of the orthogonal rotations, it is no accident. The general Lorentz transformations can indeed be interpreted as an orthogonal rotation of axes in Minkowski space. Note that the xt-submatrix of the Lorentz matrix in Equation 13.27 is

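$$\begin{pmatrix} \gamma & i\beta\gamma \\ -i\beta\gamma & \gamma \end{pmatrix}$$

which is to be compared with the xy-submatrix of the two-dimensional rotation about the z-axis:

$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$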

Upon identification of matrix elements cos θ = γ and sin θ = iβγ, we see that the rotation angle θ (for the rotation in the xt-plane) is purely imaginary.

Some books prefer to use a real angle of rotation φ, defining φ = −iθ. Then note that

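$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} = \frac{e^{-\varphi} + e^{\varphi}}{2} = \cosh\varphi$$

$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} = \frac{i\,[e^{\varphi} - e^{-\varphi}]}{2} = i\sinh\varphi$$

and the submatrix becomes

$$\begin{pmatrix} \cosh\varphi & i\sinh\varphi \\ -i\sinh\varphi & \cosh\varphi \end{pmatrix}.$$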

We should note that the mathematical form of Minkowski space looks exactly like a Euclidean space; however, it is not physically so because of its complex nature as compared to the real nature of the Euclidean space.
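The “rotation” interpretation can be made concrete with complex arithmetic. The Python sketch below builds the xt-submatrix in the x₄ = ict convention and checks that its transpose times itself is the identity, as for an ordinary rotation matrix. The function names are hypothetical, and the relation tanh φ = β used in `rapidity` is an inference from cosh φ = γ and sinh φ = βγ rather than a formula stated in the text.

```python
import math

def lorentz_submatrix(beta):
    """xt-submatrix acting on (x1, x4), with x4 = i c t."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, 1j * beta * gamma],
            [-1j * beta * gamma, gamma]]

def transpose_times_matrix(m):
    """Compute M^T M for a 2x2 matrix (plain transpose, no complex conjugation)."""
    return [[sum(m[k][i] * m[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rapidity(beta):
    """Real angle phi with cosh(phi) = gamma and sinh(phi) = beta * gamma."""
    return math.atanh(beta)

if __name__ == "__main__":
    print(transpose_times_matrix(lorentz_submatrix(0.6)))  # close to the identity matrix
    print(math.cosh(rapidity(0.6)))                        # equals gamma for beta = 0.6
```

The orthogonality holds only with the plain transpose; with a conjugate transpose the imaginary entries would not cancel, which reflects the complex nature of this representation.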

13.6.1 FOUR-VELOCITY AND FOUR-ACCELERATION

How do we define four vectors of velocity and acceleration? It is evident that the set of the four quantities $dx_\mu/dt$ does not have the properties of a four-vector because dt is not an invariant. But we know that the proper time dτ is an invariant. Although observers in different frames may disagree about the time interval between events because each is using his or her own time axis, all agree on the value of the time interval that would be observed in the frame moving with the particle. The components of the four-velocity are therefore defined as

$$u_\mu = \frac{dx_\mu}{d\tau}. \tag{13.29}$$

The second equation of Equation 13.19 relates the proper time dτ (was dt′ there) to the time dt read by a clock in frame S relative to which the object (S′ frame) moves at a constant u:

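$$d\tau = dt\,\sqrt{1-\beta^2}.$$

We can rewrite $u_\mu$ completely in terms of quantities observed in frame S as

$$u_\mu = \frac{dx_\mu}{d\tau} = \frac{1}{\sqrt{1-\beta^2}}\,\frac{dx_\mu}{dt} = \gamma\,\frac{dx_\mu}{dt} \tag{13.30}$$

where

$$\gamma = 1/\sqrt{1-\beta^2}.$$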

In terms of the ordinary velocity components v1, v2, v3, we have

uμ = (γc, γvi), i = 1, 2, 3. (13.31)

The length of four-velocity must be invariant. We can verify this easily:

$$\sum_{\mu=1}^{4} (u_\mu)^2 = -c^2. \tag{13.32}$$

Similarly, a four-acceleration is defined as

$$w_\mu = \frac{d^2x_\mu}{d\tau^2} = \frac{du_\mu}{d\tau}. \tag{13.33}$$

Now differentiating Equation 13.32 with respect to τ, we obtain

wμuμ = 0. (13.34)

That is, the four vectors of velocity and acceleration are mutually perpendicular.
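Equation 13.32 can be checked with complex numbers. The sketch below assumes the x₄ = ict convention, so that the fourth component of the four-velocity is iγc; this choice, the function names, and the sample velocity are assumptions for illustration, in units where c = 1.

```python
import math

def four_velocity(v, c=1.0):
    """Components (u1, u2, u3, u4), with the imaginary fourth component u4 = i*gamma*c."""
    speed_sq = sum(vi**2 for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return [gamma * vi for vi in v] + [1j * gamma * c]

def norm_squared(u):
    """Sum of the squared components (no complex conjugation)."""
    return sum(ui**2 for ui in u)

if __name__ == "__main__":
    print(norm_squared(four_velocity((0.3, 0.4, 0.0))))  # close to -1, that is, -c^2
```

The result is −c² for any ordinary velocity below c. Since this sum is constant along the world line, its derivative with respect to τ vanishes, which is the content of Equation 13.34.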

13.6.2 FOUR-ENERGY AND FOUR-MOMENTUM VECTORS

Now, it is obvious that Newtonian dynamics cannot hold totally. How do we know what to retain and what to discard? The answer is found in the generalizations that grew from the laws of motion but transcend them in their universality. These are the conservation of momentum and energy.

Thus, we now generalize the definitions of momentum and energy so that, in the absence of external forces, the momentum and energy of a system of particles are conserved. In Newtonian mechanics, the momentum $\vec{p}$ of a particle is defined as $m\vec{v}$, the product of the particle’s inertial mass and its velocity. A plausible generalization of this definition is to use the four-velocity $u_\mu$ and an invariant scalar m₀ that truly characterizes the inertial mass of the particle and define the momentum four-vector (four-momentum, for short) $P_\mu$ as

$$P_\mu = m_0 u_\mu. \tag{13.35}$$

To ensure that the “mass” of the particle is truly a characteristic of the particle, this mass must be that measured in the frame of reference in which the particle is at rest. Thus, the mass of the particle must be its proper mass. We customarily call this mass the rest mass of the particle and denote it by m0. Using Equation 13.31, we write Pμ in component form:

$$P_j = \frac{m_0 v_j}{\sqrt{1-\beta^2}} = \gamma m_0 v_j, \quad j = 1, 2, 3; \qquad P_0 = \frac{m_0 c}{\sqrt{1-\beta^2}} = \gamma m_0 c. \tag{13.36}$$

We see that as β = v/c → 0, the first three components of the four-momentum $P_\mu$ reduce to $m_0 v_j$, the components of the ordinary momentum. This indicates that Equation 13.35 appears to be a reasonable generalization.

Let us write the time component $P_0$ as

$$P_0 = \frac{m_0 c}{\sqrt{1-\beta^2}} = \frac{E}{c}. \tag{13.37}$$

Now, what is the meaning of the quantity E? For low velocities, the quantity E reduces to

$$E = \frac{m_0c^2}{\sqrt{1-\beta^2}} \cong m_0c^2 + \frac{1}{2}m_0v^2.$$

The second term on the right-hand side is the ordinary kinetic energy of the particle; the first term can be interpreted as the rest energy of the particle (it is an energy the particle has even when it is at rest), which must contain all forms of internal energy of the object, including heat energy, internal potential energy of various kinds, or rotational energy if any. Hence, we can call the quantity E the total energy of the particle (moving at speed v).

We now write the four-momentum as

$$P_\mu = \left(\frac{E}{c},\; P_j\right). \tag{13.38}$$

The length of the four-momentum must be invariant, just as the length of the velocity four-vector is invariant under the Lorentz transformations. We can show this easily:

$$\sum_\mu P_\mu P_\mu = \sum_\mu (m_0 u_\mu)(m_0 u_\mu) = -m_0^2c^2. \tag{13.39}$$

But Equation 13.38 gives

$$\sum_\mu P_\mu P_\mu = P^2 - \frac{E^2}{c^2}.$$

Combining this with Equation 13.39, we arrive at the relationship

$$P^2 - \frac{E^2}{c^2} = -m_0^2c^2$$

or

$$E^2 - P^2c^2 = m_0^2c^4. \tag{13.40}$$

The total energy E and the momentum $P_\mu$ of a moving body are different when measured with respect to different reference frames. But the combination P² − E²/c² has the same value for all frames of reference, namely, −m₀²c². This relationship is very useful. Another very useful relationship follows from Equation 13.37, which gives γm₀ = E/c²; combining this with the first equation of Equation 13.36 gives $\vec{P} = \vec{v}\,(E/c^2)$.
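Both relationships can be checked for a sample particle. The following Python sketch computes E and P for one-dimensional motion in units where c = 1; the function name `energy_momentum` and the sample values (m₀ = 2, v = 0.6c) are assumptions for illustration.

```python
import math

def energy_momentum(m0, v, c=1.0):
    """Return (E, P) for rest mass m0 and speed v along one axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m0 * c**2, gamma * m0 * v

if __name__ == "__main__":
    E, P = energy_momentum(2.0, 0.6)
    print(E**2 - P**2)    # equals m0^2 c^4 (here close to 4)
    print(P, 0.6 * E)     # P = v E / c^2
```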

The relativistic momentum, however, is not quite the familiar form found in general physics because its spatial components contain the Lorentz factor γ. We can bring it into the old sense, and the traditional practice was to introduce a “relativistic mass” m:

$$m = \gamma m_0 = \frac{m_0}{\sqrt{1-\beta^2}}.$$

With this introduction of m, Pj takes the old form: Pj = mvj. However, some feel that the introduction of the relativistic mass is not a purely methodological issue; it often causes misunderstanding and vague interpretations of relativistic mechanics. So they prefer to include the factor γ with vj forming the proper four-velocity component uj and treating the mass as simply the invariant parameter m0. For details, see Okun (1989).

13.6.3 PARTICLES OF ZERO REST MASS

A surprising consequence of the relativistic energy–momentum generalization is the possibility of “massless” particles, which possess momentum and energy but no rest mass. From the expression for the energy and momentum of a particle,

$$E = \frac{m_0c^2}{\sqrt{1-v^2/c^2}}, \qquad \vec{P} = \frac{m_0\vec{v}}{\sqrt{1-v^2/c^2}}, \tag{13.41}$$

we can define a particle of zero rest mass possessing finite energy and momentum. To this purpose, we allow v → c in some inertial system S and m₀ → 0 in such a way that

$$\frac{m_0}{\sqrt{1-v^2/c^2}} = \chi \tag{13.42}$$

remains constant. Then, Equation 13.41 takes the simple form

$$E = \chi c^2, \qquad \vec{P} = \chi c\,\hat{e},$$

where ê is a unit vector in the direction of motion of the particle. Eliminating χ from the last two equations, we obtain

E = Pc (13.43)

which is consistent with Equation 13.40, E² − P²c² = m₀²c⁴, with m₀ = 0.

Now as $(E/c, \vec{P})$ is a four-vector, (χc, χcê) is also a four-vector, the energy and momentum four-vector of a zero rest-mass particle in frame S and in any other inertial frame such as S′. It can be shown that the transformation of the energy and momentum four-vector (χc, χcê) of a zero rest-mass particle is identical with that of a light wave, provided χ is made proportional to the frequency ν. Thus, if we associate a zero rest-mass particle with a light wave in one inertial frame, this association holds in all other inertial frames. The ratio of the energy of the particle to the frequency has the dimensions of action (or angular momentum). This suggests that we can write this association by the following equations:

E = hν and P = χc = hν/c

where h is Planck’s constant. This massless particle of light is called a photon. Einstein introduced it, out of concern with the photoelectric effect and consideration of Planck’s quantum hypothesis, in his pioneering paper on the photoelectric effect, published a few months before his work on special relativity.

13.7 EQUIVALENCE OF MASS AND ENERGY

We have learned that Einstein’s theory of special relativity drastically revised our concepts about space and time. So we must follow Einstein and rethink the notions of mass, energy, and other important quantities. The equivalence of mass and energy is the best-known relationship Einstein gave in his theory of special relativity in 1905:

E = mc² (13.44)

where E is the energy, m is the mass, and c is the speed of light.

We can get this general idea of the equivalence of mass and energy from the consideration of electromagnetic theory. An electromagnetic field possesses energy E and momentum p, and there is a simple relationship between E and p:

p = E/c.

Thus, if an object emits light in one direction with momentum p, in order to conserve momentum, the object itself must recoil with a momentum −p. If we stick to the definition of momentum as p = mv, we may associate a “mass” with a flash of light:

$$m = \frac{p}{v} = \frac{p}{c} = \frac{E}{c^2}$$

which leads to the famous formula

E = mc².

This mass is not merely a mathematical fiction. Let us consider a simple thought experiment. Refer to Figure 13.11; there is an observer S in S frame and an observer S′ and an atom at rest in the

Page 29: Course : B.Sc. (Hons.) Physics Sem-VI, March-2020

454 Classical Mechanics

S′ frame. The atom emits two flashes of light (photons) of equal energy, traveling back-to-back and normal to the direction of the frames’ relative motion. Figure 13.8 depicts what S and S′ observe. Before the lights are emitted, the atom is at rest in S′ frame and moves rightward with speed v as observed by observer S in the S system.

We now analyze the emission process, first from the point of view of S′ and then from the point of view of S. In the S′ frame, the sum of the momenta of the two flashes is zero, and the atom remains at rest after the emission. As observed by observer S, the situation is quite different. The two flashes of light move along diagonal directions. Each flash’s component of velocity parallel to the atom’s velocity is v. Thus, with each flash carrying momentum E/c, the momentum of the two flashes of light parallel to the atom’s velocity is

$$ 2\,\frac{E}{c}\,\frac{v}{c}. $$

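We next study energy and momentum changes for the atom as observed by S. The atom’s energy decreases by 2E:

$$ \Delta E_{\text{atom}} = -2E $$

and its momentum change is

$$ \Delta p_{\text{atom}} = -2\,\frac{E}{c}\,\frac{v}{c} = \Delta E_{\text{atom}}\,\frac{v}{c^2}. $$

Now, the definition of momentum for the atom ($p_{\text{atom}} = m_{\text{atom}} v$), together with the fact that the atom's velocity v is unchanged, implies that

$$ \Delta p_{\text{atom}} = \Delta m_{\text{atom}} \cdot v. $$

Comparing the two expressions for the change in the atom’s momentum, we obtain

$$ \Delta E_{\text{atom}}\,\frac{v}{c^2} = \Delta m_{\text{atom}} \cdot v $$

which gives

$$ \Delta E_{\text{atom}} = \Delta m_{\text{atom}} \cdot c^2. $$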

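[Figure 13.11: two panels, “Before the photons are emitted” and “After the photons have been emitted”, each drawn as observed by S′ and as observed by S; the atom moves with speed v in S and the flashes travel at speed c.]

FIGURE 13.11 Thought experiment to show the “equivalence” of mass and energy.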


In short,

ΔE = Δm · c2.

Einstein also provided a thought experiment some time ago. An emitter and an absorber of light are firmly attached to the ends of a box of mass M and length L (Figure 13.12). The box is initially stationary but is free to move. If the emitter sends a short light pulse of energy ΔE and momentum ΔE/c toward the right, the box will recoil toward the left by a small distance Δx with momentum px = −ΔE/c and velocity vx, where vx is given by

$$ v_x = \frac{p_x}{M} = -\frac{\Delta E}{Mc}. $$

The light pulse reaches the right end of the box approximately in time Δt = L/c and is absorbed. The small recoil distance is then given by

$$ \Delta x = v_x\,\Delta t = -\frac{\Delta E\,L}{Mc^2}. $$

Now, the center of mass of the system cannot move by purely internal changes, and there are no external forces. It must be that the transport of energy ΔE from the left end of the box to the right end is accompanied by the transport of mass Δm, so the change in the position of the center of mass of the box (denoted by δx) vanishes. The condition for this is

δx = 0 = ΔmL + MΔx.

FIGURE 13.12 Einstein’s thought experiment. (a) A box of length L with an emitter; (b) after the emitter sends out a light pulse, the box has recoiled to the left by a distance Δx.

From this, we find

$$ \Delta m = -\frac{M\,\Delta x}{L} = \frac{M}{L}\,\frac{\Delta E\,L}{Mc^2} = \frac{\Delta E}{c^2} $$

or

$$ \Delta E = \Delta m \cdot c^2. $$

We should not confuse the notions of equivalence and identity. The energy and mass are different physical characteristics of particles; here, the word “equivalence” establishes only their proportionality to each other. This is similar to the relationship between the gravitational mass and inertial mass of a body: the two masses are indissolubly connected with each other and proportional to each other but, at the same time, have different characteristics. The equivalence of mass and energy has been beautifully verified by experiments in which matter is annihilated and converted totally into energy. For example, when an electron and a positron, each with a rest mass $m_0$, come together at rest, they annihilate, and two gamma rays emerge, each with the measured energy of $m_0c^2$.

Based on Einstein’s mass–energy relationship $E = mc^2$, we can show that the mass of a particle depends on its velocity. Let a force F act on a particle of momentum mv. Then,

$$ F\,dt = d(mv). \tag{13.45} $$

If there is no loss of energy by radiation resulting from acceleration, then the amount of energy transferred in dt is

$$ dE = c^2\,dm. $$

This is put equal to the work done by the force F to give

$$ Fv\,dt = c^2\,dm. $$

Combining this with Equation 13.45, we have

$$ v\,d(mv) = c^2\,dm. $$

Multiply this by m:

$$ mv\,d(mv) = c^2 m\,dm. $$

Integrating, we have

$$ (mv)^2 = c^2 m^2 + K $$

where K is the integration constant. Now, $m = m_0$ when v = 0; we find $K = -c^2 m_0^2$, and

$$ m^2 v^2 = c^2 (m^2 - m_0^2). $$

$m_0$ is known as the rest mass of the particle. Solving for m, we obtain the so-called relativistic mass of the particle:

$$ m = \frac{m_0}{\sqrt{1 - (v/c)^2}}. \tag{13.46} $$


It is now easy to see that a material body cannot have a velocity greater than the velocity of light. If we try to accelerate the body, then as its velocity approaches the velocity of light, its mass becomes larger and larger, and it becomes increasingly difficult to accelerate it further. In fact, because the mass m becomes infinite when v = c, we can never accelerate the body up to the speed of light.
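The growth of the relativistic mass in Equation 13.46 can be tabulated numerically. The following is a minimal sketch, not part of the original text: the function names and the finite-difference step are illustrative assumptions, and the second helper merely checks the differential relation v d(mv) = c² dm used in the derivation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def relativistic_mass(m0, v):
    """Relativistic mass m = m0 / sqrt(1 - (v/c)^2), Equation 13.46."""
    if abs(v) >= C:
        raise ValueError("a body with rest mass must have |v| < c")
    return m0 / math.sqrt(1.0 - (v / C) ** 2)


def differential_relation(m0, v, dv=1.0e3):
    """Return (v * d(mv)/dv, c^2 * dm/dv) from central finite differences.

    The two numbers should agree if v d(mv) = c^2 dm holds.
    The step dv (in m/s) is an arbitrary illustrative choice.
    """
    m_plus = relativistic_mass(m0, v + dv)
    m_minus = relativistic_mass(m0, v - dv)
    d_mv = (m_plus * (v + dv) - m_minus * (v - dv)) / (2.0 * dv)
    d_m = (m_plus - m_minus) / (2.0 * dv)
    return v * d_mv, C ** 2 * d_m


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9, 0.99, 0.999):
        ratio = relativistic_mass(1.0, beta * C)
        print(f"v = {beta:5.3f} c  ->  m/m0 = {ratio:8.3f}")
```

The ratio m/m0 grows without bound as v approaches c, which is the numerical counterpart of the statement that the body can never be accelerated up to the speed of light.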

Example 13.3: Variation of Mass with Velocity

For the sake of simplicity, we shall consider the case of a central inelastic collision between two particles of equal mass whose velocities, as viewed in the S′ system, are w and −w and are parallel to the x-axis. The center of mass of this two-particle system will remain fixed; that is, its velocity will be zero at all times, including the instant of collision. In the S system, the masses of the two particles will not be the same; let us call the masses m1 and m2, and their velocities u1 and –u2, respectively. At the instant of collision, the two particles are at rest in the S′ system but have a common velocity v in the S system, the velocity of S′ relative to S (Figure 13.13). Applying the conservation of momentum in the S system, we obtain

m1u1 + (−m2u2) = (m1 + m2)v.

Solving for m1/m2,

m1/m2 = (1 + u2/v)/(u1/v – 1).

Now,

u1 = (v + w)/(1 + vw/c2), u2 = (−v + w)/(1 – vw/c2).

[Figure 13.13: the two particles before the collision (velocities w and −w in S′, u1 and −u2 in S) and after the collision (at rest in S′, moving with velocity v in S).]

FIGURE 13.13 Collision of two particles in frames S and S′.


Substituting u1 and u2 into the expression for m1/m2, we obtain

$$ \frac{m_1}{m_2} = \frac{1 + vw/c^2}{1 - vw/c^2}. $$

What is desired is a relationship between the mass of a particle and its velocity in the same S system. This can be done by expressing $vw/c^2$ in terms of u1 and u2. Let us first compute $(1 - u_1^2/c^2)$ and $(1 - u_2^2/c^2)$:

$$ 1 - \frac{u_1^2}{c^2} = 1 - \frac{1}{c^2}\left(\frac{v + w}{1 + vw/c^2}\right)^2 = \frac{(1 - v^2/c^2)(1 - w^2/c^2)}{(1 + vw/c^2)^2}. $$

Similarly, we have

$$ 1 - \frac{u_2^2}{c^2} = \frac{(1 - v^2/c^2)(1 - w^2/c^2)}{(1 - vw/c^2)^2}. $$

From these two equations, we find

$$ \frac{1 - u_2^2/c^2}{1 - u_1^2/c^2} = \frac{(1 + vw/c^2)^2}{(1 - vw/c^2)^2} $$

or

$$ \frac{1 + vw/c^2}{1 - vw/c^2} = \frac{(1 - u_2^2/c^2)^{1/2}}{(1 - u_1^2/c^2)^{1/2}}. $$

Accordingly, the ratio m1/m2 becomes

$$ \frac{m_1}{m_2} = \frac{1 + vw/c^2}{1 - vw/c^2} = \frac{(1 - u_2^2/c^2)^{1/2}}{(1 - u_1^2/c^2)^{1/2}}. $$

Suppose we now consider the special case of a collision between these two particles when one of them, say, particle 2, has a velocity u2 = 0. Then the last equation reduces to

$$ m_1 = \frac{m_0}{\sqrt{1 - (u_1/c)^2}} $$

where $m_0$ is the mass of particle 2 when it is at rest. It is also the rest mass of particle 1 when it is at rest. Therefore, in general, if $m_0$ is the rest mass of a particle, its mass m when its velocity is v will be given by

$$ m = \frac{m_0}{\sqrt{1 - (v/c)^2}}. \tag{13.47} $$

High-energy physicists do not use relativistic mass, and they prefer the quantity momentum as it is measurable.
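The algebra of Example 13.3 can be spot-checked numerically. The sketch below is an illustration only, in units where c = 1; the function names are hypothetical. It computes m1/m2 once from momentum conservation in S and once from the square-root factors, so the two results can be compared for any choice of v and w.

```python
import math


def add_velocities(v, w):
    """Relativistic addition of collinear velocities, with c = 1."""
    return (v + w) / (1.0 + v * w)


def ratio_from_momentum(v, w):
    """m1/m2 from m1*u1 - m2*u2 = (m1 + m2)*v in the S system.

    Particle 1 moves at +w in S', particle 2 at -w in S'.
    Requires w != 0 so that u1 differs from v.
    """
    u1 = add_velocities(v, w)      # velocity of particle 1 in S
    u2 = -add_velocities(v, -w)    # particle 2 has velocity -u2 in S
    return (u2 + v) / (u1 - v), u1, u2


def ratio_from_speeds(u1, u2):
    """m1/m2 = sqrt(1 - u2^2) / sqrt(1 - u1^2), the result of the example."""
    return math.sqrt(1.0 - u2 ** 2) / math.sqrt(1.0 - u1 ** 2)


if __name__ == "__main__":
    v, w = 0.3, 0.5
    ratio, u1, u2 = ratio_from_momentum(v, w)
    print(ratio, ratio_from_speeds(u1, u2), (1 + v * w) / (1 - v * w))
```

All three printed numbers should coincide up to rounding, which is the content of the chain of equalities leading to Equation 13.47.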


13.8 CONSERVATION LAWS OF ENERGY AND MOMENTUM

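It is now clear that the linear momentum and energy of a particle should not be regarded as different entities but simply as two aspects of the same attributes of the particle because they appear as separate components of the same four-vector $P_\mu$, which transforms according to Equation 13.28:

$$ P'_\mu = L_{\mu\nu} P_\nu $$

in matrix form

$$ \begin{pmatrix} P'_1 \\ P'_2 \\ P'_3 \\ P'_4 \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & i\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -i\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \\ P_4 \end{pmatrix} = \begin{pmatrix} \gamma(P_1 + i\beta P_4) \\ P_2 \\ P_3 \\ \gamma(-i\beta P_1 + P_4) \end{pmatrix}. $$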

Thus,

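$$ P'_1 = \gamma(P_1 + i\beta P_4) = \gamma(P_1 - \beta P_0), \qquad P'_2 = P_2, \qquad P'_3 = P_3 \tag{13.48a} $$

$$ P'_4 = \gamma(-i\beta P_1 + P_4) \quad \text{or} \quad P'_0 = \gamma(P_0 - \beta P_1) \tag{13.48b} $$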

which show clearly that what appears as energy in one frame appears as momentum in another frame, and vice versa.
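The mixing of energy and momentum under Equation 13.48 can be illustrated with real components (P0, P1, P2, P3), where P0 = E/c. This is a minimal sketch under that assumption (it avoids the imaginary fourth component P4 = iP0 used in the text); the function names are hypothetical.

```python
import math


def boost_along_x(p, beta):
    """Apply Equation 13.48 to p = (P0, P1, P2, P3) with P0 = E/c.

    beta is the velocity of S' relative to S in units of c.
    """
    p0, p1, p2, p3 = p
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (p0 - beta * p1), gamma * (p1 - beta * p0), p2, p3)


def invariant_square(p):
    """P0^2 - P1^2 - P2^2 - P3^2, equal to (m0 c)^2 in every frame."""
    p0, p1, p2, p3 = p
    return p0 ** 2 - p1 ** 2 - p2 ** 2 - p3 ** 2


if __name__ == "__main__":
    at_rest = (1.0, 0.0, 0.0, 0.0)          # m0 c = 1, pure "energy" in S
    moving = boost_along_x(at_rest, 0.6)    # acquires momentum in S'
    print(moving, invariant_square(at_rest), invariant_square(moving))
```

A particle with no momentum in S acquires a nonzero P1 in S′, while the invariant combination stays fixed.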

So far, we have not discussed explicitly the conservation laws. Because linear momentum and energy are not regarded as different entities but as two aspects of the same attributes of an object, it is no longer adequate to consider linear momentum and energy separately. A natural relativistic generalization of the conservation laws of momentum and energy would be the conservation of the four-momentum. Consequently, the conservation of energy becomes one part of the law of conservation of four-momentum. This is exactly what has been found to be correct experimentally, and in addition, this generalized conservation law of four-momentum holds for a system of particles even when the number of particles and their rest energies are different in the initial and final states. It should be emphasized that what we mean by energy E is the total energy of an object. It consists of rest energy, which contains all forms of internal energy of the body, and kinetic energy. The rest energies and kinetic energies need not be individually conserved, but their sum must be. For example, in an inelastic collision, kinetic energy may be converted into some form of internal energy or vice versa; accordingly, the rest energy of the object may change.

Energy and momentum conservation go together in special relativity; we cannot have one without the other. This may seem a bit puzzling to some readers, for in classical mechanics, the conservation laws of energy and momentum are on different footing. That is because energy and momentum are regarded there as different entities. Moreover, classical mechanics does not talk about rest energy at all.

13.9 GENERALIZATION OF NEWTON’S EQUATION OF MOTION

A natural relativistic generalization of Newton’s equations of motion follows:

$$ K_\mu = \frac{dP_\mu}{d\tau} = \frac{d(m_0 u_\mu)}{d\tau} = m_0 \frac{d^2 x_\mu}{d\tau^2} \tag{13.49} $$

where Kμ is the four-force vector; it is also called the Minkowski force.


Using Equations 13.19 and 13.36, we obtain the three components for μ = 1, 2, 3:

$$ K_j \sqrt{1 - \beta^2} = \frac{d}{dt}\left(\frac{m_0 v_j}{\sqrt{1 - \beta^2}}\right) = F_j \tag{13.50} $$

and $K_4 = iK_0$, and $K_0$ is given by

$$ K_0 = \frac{dP_0}{d\tau} = \frac{1}{c}\frac{dE}{d\tau} = \frac{1}{c\sqrt{1 - \beta^2}}\frac{dE}{dt}. \tag{13.51} $$

Thus, $K_0$ is proportional to the time rate of change of the energy. In practical calculations, we usually do not need to deal with the four-force but prefer to use just the three components given by Equation 13.50. In vector form, we have

$$ \vec{F} = \frac{d}{dt}\left(\frac{m_0 \vec{v}}{\sqrt{1 - \beta^2}}\right). \tag{13.52} $$

This relativistic equation of motion reduces to the Newtonian form $\vec{F} = d\vec{P}/dt$, provided we use the relativistic momentum $\vec{P}$ given by the first equation of Equation 13.36.

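We can show that the dot product of the four-force $K_\mu$ with the four-velocity $u_\mu$ vanishes:

$$ \sum_\mu K_\mu u_\mu = \sum_\mu \frac{d(m_0 u_\mu)}{d\tau}\, u_\mu = \frac{d}{d\tau}\left(\frac{m_0}{2} \sum_\mu u_\mu u_\mu\right) = \frac{d}{d\tau}\left(-\frac{m_0 c^2}{2}\right) = 0. \tag{13.53} $$

But the dot product can also be written as

$$ \sum_{\mu=1}^{4} K_\mu u_\mu = \frac{\vec{F}\cdot\vec{v}}{1 - \beta^2} + K_4 u_4 = \frac{\vec{F}\cdot\vec{v}}{1 - \beta^2} - K_0 u_0 \quad (K_4 = iK_0,\; u_4 = iu_0) $$

$$ \phantom{\sum_{\mu=1}^{4} K_\mu u_\mu} = \frac{\vec{F}\cdot\vec{v}}{1 - \beta^2} - \frac{cK_0}{\sqrt{1 - \beta^2}}. $$

Combining this with Equation 13.53, we find

$$ \frac{\vec{F}\cdot\vec{v}}{1 - \beta^2} - \frac{cK_0}{\sqrt{1 - \beta^2}} = 0 $$

from which we obtain

$$ K_0 = \frac{\vec{F}\cdot\vec{v}}{c\sqrt{1 - \beta^2}}. \tag{13.54} $$

Now,

$$ K_0 = \frac{dP_0}{d\tau} = \frac{1}{\sqrt{1 - \beta^2}}\,\frac{d}{dt}\left(\frac{m_0 c}{\sqrt{1 - \beta^2}}\right). $$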


Combining this with Equation 13.54 gives a very useful relationship:

$$ \frac{d}{dt}\left(\frac{m_0 c^2}{\sqrt{1 - \beta^2}}\right) = \vec{F}\cdot\vec{v}. \tag{13.55} $$

13.9.1 FORCE TRANSFORMATION

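Consider a particle of rest mass $m_0$ having velocity $\vec{v}\,'$ and momentum $\vec{P}\,'$ relative to frame S′ and having velocity $\vec{v}$ and momentum $\vec{P}$ relative to frame S. Then, the force acting on the particle measured in S′ is, from Equation 13.50,

$$ F'_x = \frac{dP'_x}{dt'} \tag{13.56} $$

where

$$ F'_x = F'_1, \qquad P'_x = P'_1 = \frac{m_0 v'_x}{\sqrt{1 - \beta'^2}} \quad (v'_x = v'_1). $$

Now, Equation 13.48 gives, with some modifications in notation,

$$ P'_x = \gamma\left(P_x - \frac{u}{c^2}E\right), \qquad P'_y = P_y, \qquad P'_z = P_z, \qquad E' = \gamma(E - uP_x) $$

where u is the velocity of S′ relative to S along the x-axis and $\gamma = 1/\sqrt{1 - u^2/c^2}$. Hence, Equation 13.56 becomes, after making use of the first transformation relationship,

$$ F'_x = \gamma\,\frac{d}{dt'}\left(P_x - uE/c^2\right). \tag{13.57} $$

But

$$ \frac{d}{dt'} = \frac{dt}{dt'}\,\frac{d}{dt} $$

and

$$ \frac{dt'}{dt} = \gamma\,\frac{d}{dt}\left(t - \frac{ux}{c^2}\right) = \gamma\left(1 - uv_x/c^2\right). $$

Substituting these into Equation 13.57,

$$ F'_x = \frac{\gamma}{\gamma(1 - uv_x/c^2)}\,\frac{d}{dt}\left(P_x - uE/c^2\right) = \frac{1}{1 - uv_x/c^2}\left(F_x - \frac{u}{c^2}\frac{dE}{dt}\right). $$

Using Equations 13.55 and 13.36, we have

$$ \frac{dE}{dt} = \vec{F}\cdot\vec{v}. $$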


Hence,

$$ F'_x = \frac{1}{1 - uv_x/c^2}\left(F_x - \frac{u}{c^2}\,\vec{F}\cdot\vec{v}\right). $$

Now,

$$ \vec{F}\cdot\vec{v} = F_x v_x + F_y v_y + F_z v_z. $$

Hence,

$$ F'_x = F_x - \frac{uv_y}{c^2(1 - uv_x/c^2)}\,F_y - \frac{uv_z}{c^2(1 - uv_x/c^2)}\,F_z. \tag{13.58} $$

Next, we consider

$$ F'_y = \frac{dP'_y}{dt'} = \frac{dP_y}{dt'} = \frac{dP_y}{dt}\,\frac{dt}{dt'} \quad (P'_y = P_y). $$

But

$$ \frac{dt}{dt'} = \frac{1}{\gamma(1 - uv_x/c^2)}. $$

Hence,

$$ F'_y = \frac{F_y}{\gamma(1 - uv_x/c^2)}. \tag{13.59} $$

Similarly, because

$$ P'_z = P_z $$

we have

$$ F'_z = \frac{F_z}{\gamma(1 - uv_x/c^2)}. \tag{13.60} $$

The inverse transformations are

$$ F_x = F'_x + \frac{uv'_y}{c^2(1 + uv'_x/c^2)}\,F'_y + \frac{uv'_z}{c^2(1 + uv'_x/c^2)}\,F'_z \tag{13.61} $$

$$ F_y = \frac{F'_y}{\gamma(1 + uv'_x/c^2)} \tag{13.62} $$

$$ F_z = \frac{F'_z}{\gamma(1 + uv'_x/c^2)} \tag{13.63} $$
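Equations 13.58 to 13.63 can be exercised numerically. The sketch below is illustrative only: it assumes units with c = 1 by default, uses hypothetical function names, and relies on the standard relativistic velocity transformation (not derived in this section) to obtain the primed velocity needed by the inverse formulas.

```python
import math


def transform_velocity(v, u, c=1.0):
    """Velocity of the particle in S', for S' moving at u along x (standard result)."""
    vx, vy, vz = v
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    d = 1.0 - u * vx / c ** 2
    return ((vx - u) / d, vy / (gamma * d), vz / (gamma * d))


def transform_force(force, v, u, c=1.0):
    """Force in S' from the force and particle velocity in S (Eqs. 13.58-13.60)."""
    fx, fy, fz = force
    vx, vy, vz = v
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    d = 1.0 - u * vx / c ** 2
    fx_prime = fx - u * (vy * fy + vz * fz) / (c ** 2 * d)
    return (fx_prime, fy / (gamma * d), fz / (gamma * d))


if __name__ == "__main__":
    force, v, u = (1.0, 2.0, 3.0), (0.2, 0.3, -0.1), 0.5
    force_prime = transform_force(force, v, u)
    v_prime = transform_velocity(v, u)
    # The inverse (Eqs. 13.61-13.63) is the same map with u -> -u and primed inputs.
    print(force_prime, transform_force(force_prime, v_prime, -u))
```

Applying the transformation with u and then with −u (using the primed velocity) should return the original force components, which is what Equations 13.61 to 13.63 express.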



13.10 RELATIVISTIC LAGRANGIAN AND HAMILTONIAN FUNCTIONS

As in nonrelativistic mechanics, equations of motion can be written in generalized coordinates in the form of Lagrange’s or Hamilton’s equations. To do this, we must first find the Lagrangian or Hamiltonian. This is a relatively easy task for a free particle. Because the action integral for a free particle must be invariant under Lorentz transformations, it follows that the action integral must be taken over a scalar, and the latter must have the form of a differential of the first order. The only scalar of this kind that can be associated with a free particle is a quantity proportional to the interval ds. So for a free particle, the action must have the form

$$ S = \alpha \int_a^b ds \tag{13.64} $$

where α is some constant characterizing the particle, and the integral is along the world line of the particle between two world points (i.e., between two particular events of the arrival of the particle at the initial position and at the final position at definite times $t_1$ and $t_2$). Now $d\tau = ds/c = dt\sqrt{1 - \beta^2}$, and the action integral (Equation 13.64) can be rewritten as an integral with respect to the time:

$$ S = \int_{t_1}^{t_2} \alpha c \sqrt{1 - v^2/c^2}\; dt = \int_{t_1}^{t_2} L\, dt $$

where v is the velocity of the particle. We arrive at the conclusion that the Lagrangian for a free relativistic particle is

$$ L = \alpha c \sqrt{1 - v^2/c^2}. \tag{13.65} $$

In the limit v ≪ c, our expression for L must reduce to the Newtonian expression $L = \tfrac{1}{2} m_0 v^2$. To carry out this transition, we expand Equation 13.65 in powers of v/c. Ignoring the terms of higher orders, we obtain

$$ L = \alpha c \sqrt{1 - v^2/c^2} \cong \alpha c - \frac{\alpha v^2}{2c}. $$

We may discard the constant term, which does not affect the equation of motion. Consequently, in the Newtonian approximation, $L = -\alpha v^2/2c$, and a comparison with the Newtonian expression shows that $\alpha = -m_0 c$. We have thus established the form of the Lagrangian for a free particle:

$$ L = -m_0 c^2 \sqrt{1 - v^2/c^2}. \tag{13.66} $$

The next problem is to extend the free particle Lagrangian so that it includes the effects of external forces that act on the particle. If the forces acting on the particle were conservative forces independent of velocity, a suitable Lagrangian for such a particle would be

$$ L = -m_0 c^2 \sqrt{1 - v^2/c^2} - V \tag{13.67} $$

where V is the potential energy of the particle, depending on position only. Note that the Lagrangian is no longer L = T − V. That this is the correct Lagrangian can be shown by demonstrating that the Lagrange’s equation resulting from it agrees with Equation 13.50. We shall leave this as homework for the reader.

Having established the Lagrangian, we can now find the Hamiltonian H. If the Lagrangian L does not depend explicitly on time, then, by definition,

$$ H = \sum_j \dot{q}_j p_j - L = \frac{m_0 v^2}{\sqrt{1 - v^2/c^2}} + m_0 c^2 \sqrt{1 - v^2/c^2} + V $$

which, on collecting terms, reduces to

$$ H = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}} + V = T + V. \tag{13.68} $$

The Hamiltonian H is seen again to be the total energy. The relativistic Hamiltonian can be expressed in terms of the momentum of the particle. By means of Equation 13.36, we have

$$ p^2 = p_1^2 + p_2^2 + p_3^2 = m_0^2 v^2 \left(1 - v^2/c^2\right)^{-1}. $$

Then a simple calculation gives

$$ H = \sqrt{p^2 c^2 + m_0^2 c^4} + V. \tag{13.69} $$

When p ≫ m0c, the Hamiltonian of a free particle attains the simple form

H ≅ pc.

A motion with such large momentum, for which the above approximation is valid, is called an ultra-relativistic motion. It is clear that for particles with zero rest mass, the expression is valid.
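The equivalence of the velocity form and the momentum form of the free-particle Hamiltonian, and the ultra-relativistic limit H ≅ pc, can be checked with a few lines of code. This is a minimal sketch in units where c = 1 by default; the function names are illustrative assumptions.

```python
import math


def momentum(m0, v, c=1.0):
    """Relativistic momentum p = m0 v / sqrt(1 - v^2/c^2)."""
    return m0 * v / math.sqrt(1.0 - (v / c) ** 2)


def energy_from_velocity(m0, v, potential=0.0, c=1.0):
    """H = m0 c^2 / sqrt(1 - v^2/c^2) + V, Equation 13.68."""
    return m0 * c ** 2 / math.sqrt(1.0 - (v / c) ** 2) + potential


def hamiltonian(p, m0, potential=0.0, c=1.0):
    """H = sqrt(p^2 c^2 + m0^2 c^4) + V, Equation 13.69."""
    return math.sqrt(p ** 2 * c ** 2 + m0 ** 2 * c ** 4) + potential


if __name__ == "__main__":
    m0, v = 1.0, 0.6
    p = momentum(m0, v)
    print(energy_from_velocity(m0, v), hamiltonian(p, m0))
    for big_p in (1.0, 10.0, 1000.0):
        # The ratio H / (p c) tends to 1 when p >> m0 c.
        print(big_p, hamiltonian(big_p, m0) / big_p)
```

The ratio printed in the loop approaches 1 as p grows, illustrating the ultra-relativistic approximation; for a particle with zero rest mass the ratio is exactly 1.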

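Example 13.4

As an application of the Lagrangian Equation 13.67, we consider a particle of rest mass $m_0$ moving under a central force −dV/dr = −V′. As in the nonrelativistic case, the orbit is in a plane, and so we employ plane polar coordinates (r, θ). Then L is

$$ L = -m_0 c^2 \left[1 - \frac{\dot{r}^2 + r^2\dot{\theta}^2}{c^2}\right]^{1/2} - V(r) \tag{13.70} $$

from which we find

$$ \frac{\partial L}{\partial \dot{r}} = \gamma m_0 \dot{r}, \qquad \frac{\partial L}{\partial r} = \gamma m_0 r \dot{\theta}^2 - \frac{\partial V}{\partial r}, \qquad \frac{\partial L}{\partial \dot{\theta}} = \gamma m_0 r^2 \dot{\theta}, \qquad \frac{\partial L}{\partial \theta} = 0. $$

The Lagrange’s equation for coordinate r gives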


$$ \frac{d(\gamma \dot{r})}{dt} - \gamma r \dot{\theta}^2 + \frac{V'}{m_0} = 0 \tag{13.71} $$

where

$$ \gamma = \left[1 - \frac{\dot{r}^2 + r^2\dot{\theta}^2}{c^2}\right]^{-1/2}. \tag{13.72} $$

Because θ is cyclic, the Lagrange’s equation for coordinate θ yields an integral of the motion:

$$ \frac{\partial L}{\partial \dot{\theta}} = \gamma m_0 r^2 \dot{\theta} = A \;(\text{constant}). \tag{13.73} $$

This is the relativistic law of area, from which we have

$$ \dot{\theta} = \frac{A}{\gamma m_0 r^2}. \tag{13.74} $$

Then, we can write

$$ \frac{d}{dt} = \frac{d\theta}{dt}\,\frac{d}{d\theta} = \dot{\theta}\,\frac{d}{d\theta} $$

and so forth. In terms of the new variable y = 1/r, Equation 13.71 becomes

$$ \frac{d^2 y}{d\theta^2} + y - \frac{\gamma m_0 V'}{A^2 y^2} = 0. \tag{13.75} $$

It is desirable to eliminate γ from Equation 13.75. To this end, we seek help from the energy integral Equation 13.68, which can be written as

$$ H = \gamma m_0 c^2 + V $$

from which we have

$$ \gamma = \frac{H - V}{m_0 c^2}. $$

Substituting this into Equation 13.75, we obtain the differential equation for the general central orbit:

$$ \frac{d^2 y}{d\theta^2} + y - \frac{(H - V)V'}{c^2 A^2 y^2} = 0. \tag{13.76} $$

If V(r) = −k/r = −ky (the inverse square law of force; k > 0 for an attractive force), then $V' = ky^2$ and Equation 13.76 reduces to the form

$$ \frac{d^2 y}{d\theta^2} + y(1 - a) - C = 0 \tag{13.77} $$


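where

$$ a = \frac{k^2}{c^2 A^2}, \qquad C = \frac{kH}{c^2 A^2}. $$

With an appropriate choice of initial conditions, its solution is

$$ r = \frac{1}{y} = \frac{l}{1 + \varepsilon \cos \eta\theta} \tag{13.78} $$

with

$$ \eta^2 = 1 - a, \qquad l = (1 - a)/C. \tag{13.79} $$

The apses of the orbit are

$$ r_{\min} = \frac{l}{1 + \varepsilon}, \quad \text{at } \eta\theta = 0, 2\pi, 4\pi, \ldots \tag{13.80} $$

and

$$ r_{\max} = \frac{l}{1 - \varepsilon}, \quad \text{at } \eta\theta = \pi, 3\pi, \ldots \tag{13.81} $$

The angle between successive apse lines is given by π/η:

$$ \frac{\pi}{\eta} = \frac{\pi}{\sqrt{1 - a}} \tag{13.82} $$

which reduces to $\cong \pi\,(1 + k^2/2c^2A^2)$ for the case where a is small. Thus, the perihelion of the orbit advances, as shown in Figure 13.14, where the circles have radii $r_{\max}$ and $r_{\min}$.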


FIGURE 13.14 Perihelion advance due to the change of mass with velocity Δ = k2/m2c2A2.
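The size of the apsidal advance can be explored numerically from Equation 13.82. The sketch below is an illustration under stated assumptions: the parameter a = k²/c²A² is passed in directly as a dimensionless number, the function names are hypothetical, and the small-a formula is simply the first-order Taylor expansion of π/√(1 − a).

```python
import math


def apsidal_angle(a):
    """Exact angle pi/eta between successive apse lines, with eta^2 = 1 - a."""
    if not 0.0 <= a < 1.0:
        raise ValueError("the orbit formula assumes 0 <= a < 1")
    return math.pi / math.sqrt(1.0 - a)


def apsidal_angle_small_a(a):
    """First-order expansion pi * (1 + a/2), valid when a is small."""
    return math.pi * (1.0 + a / 2.0)


if __name__ == "__main__":
    for a in (1e-8, 1e-4, 1e-2, 0.2):
        exact = apsidal_angle(a)
        approx = apsidal_angle_small_a(a)
        # exact - pi is the advance accumulated between r_min and r_max.
        print(f"a = {a:8.1e}  exact - pi = {exact - math.pi:.3e}  error of expansion = {exact - approx:.3e}")
```

For a = 0 the angle is exactly π and the orbit closes, as in the Newtonian Kepler problem; any positive a makes the apse line rotate.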


13.11 RELATIVISTIC KINEMATICS OF COLLISIONS

The subject of collisions is of considerable interest in experimental high-energy physics. Although the interactions (forces) between elementary particles are non-classical, as long as the particles involved in a reaction are outside the region of mutual interaction, their mean motion can be described by classical mechanics. The fundamental principles involved in the analysis of collisions are the conservation laws of momentum and energy. Because energy and momentum form a four-momentum vector, the conservation equations can be written as one four-vector equation. We shall see that it is easy to work with this four-vector equation.

We shall restrict our discussion to two-particle collisions that can be illustrated symbolically by the reaction

A + B → C + D + E + ….

This is a generalization of the two-body collision considered in Chapter 9; it allows for the possibility of two particles A and B colliding and producing a group of particles C, D, E, and so on. Conservation of the energy–momentum four-vector gives

$$ \vec{p}_A + \vec{p}_B = \vec{q}_C + \vec{q}_D + \vec{q}_E + \cdots $$

where the $\vec{p}$’s denote the four-momenta before the collision, and the $\vec{q}$’s denote the four-momenta after the collision.

The problem is that we know the four-vectors for A and B; we are given some information about the four-vectors of C, D, and so forth, and we have to find the unknown momenta and energies. The solution to such a problem can be found by the use of a particular technique. A simple example will illustrate the flavor of this technique. For simplicity and clarity, we consider the case where only two particles, which may or may not be identical, are produced after a collision. Suppose we are told nothing about the four-momentum vector of particle D, but we are asked about the dynamic state of particle C. The technique is to rearrange the equation for the conservation of the four-momentum vectors so that the four-momentum vector for the particle that we are not interested in (i.e., particle D) stands alone on one side of the equation:

$$ \vec{q}_D = \vec{p}_A + \vec{p}_B - \vec{q}_C. $$

Taking the scalar product of each side with itself and using the result that the length of the four-momentum vector squared is an invariant and equal to (rest mass × c)², we obtain

$$ (m_D c)^2 = (m_A c)^2 + (m_B c)^2 + (m_C c)^2 + 2\,\vec{p}_A\cdot\vec{p}_B - 2\,\vec{p}_A\cdot\vec{q}_C - 2\,\vec{p}_B\cdot\vec{q}_C $$

in which the only four-vectors remaining are those we know or the one we wish to find, that is, $\vec{p}_A$, $\vec{p}_B$, and $\vec{q}_C$. Let us illustrate this technique by looking at a few specific examples.

Because our examples involve photons, let us first take a brief review of the nature of photons. As shown a little earlier, a consequence of the relativistic energy–momentum relationship is the possibility of “massless” particles that possess energy and momentum but no rest mass: E = pc. Furthermore, massless particles must travel at the speed of light.

Photons interact electromagnetically with electrons and other charged particles. Neutrinos and gravitons (not yet detected) are two other possible massless particles.

Example 13.5: Compton Scattering

As a simple example, we consider the Compton scattering, where an electron at rest scatters an incident photon (Figure 13.15). Given the incident photon energy, what is the energy of the scattered photon?


Solution:

We denote the incident photon by γ1 and the scattered photon by γ2. Because we are not interested in the electron after the scattering, the energy–momentum four-vector conservation equation is written as

$$ \vec{q}_e = \vec{p}_e + \vec{p}_{\gamma 1} - \vec{q}_{\gamma 2}. \tag{13.83} $$

Taking the scalar product of each side with itself and using the invariance of the square of the length and the special properties of the photon four-vector, we obtain

$$ m^2c^2 = m^2c^2 + 2\,\frac{E_{\gamma 1}}{c}\,\frac{mc^2}{c} - 2\,\frac{E_{\gamma 1}}{c}\,\frac{E_{\gamma 2}}{c} + 2\,\vec{p}_{\gamma 1}\cdot\vec{q}_{\gamma 2} - 2\,\frac{mc^2}{c}\,\frac{E_{\gamma 2}}{c} \tag{13.84} $$

where $E_{\gamma 1}$ and $E_{\gamma 2}$ are the energies of the incident photon and the scattered photon, respectively, and the dot product on the right involves the photons' three-momenta. If φ is the angle between the scattered photon and the incident photon, then

$$ \vec{p}_{\gamma 1}\cdot\vec{q}_{\gamma 2} = p_{\gamma 1}\, q_{\gamma 2} \cos\varphi = \frac{E_{\gamma 1}}{c}\,\frac{E_{\gamma 2}}{c}\cos\varphi. \tag{13.85} $$

Solving Equations 13.84 and 13.85 for $E_{\gamma 2}$, we find

$$ E_{\gamma 2} = \frac{E_{\gamma 1}}{1 + (E_{\gamma 1}/mc^2)(1 - \cos\varphi)}. \tag{13.86} $$

For a photon E = hc/λ, so Equation 13.86 can be rewritten as

$$ \lambda_2 = \lambda_1\left[1 + \frac{h}{mc\,\lambda_1}(1 - \cos\varphi)\right] $$

from which we obtain

$$ \lambda_2 - \lambda_1 = \lambda_c\,(1 - \cos\varphi) \tag{13.87} $$

where λc = h/mc is known as the Compton wavelength of the electron. Here, we have treated photons as quantum particles. In classical physics, the radiation would be treated as a wave, and consequently, there would be no change in wavelength after scattering.
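Equations 13.86 and 13.87 are easy to evaluate for concrete numbers. The following sketch is illustrative only; the function names are hypothetical and the constants are the standard SI values of h and c together with the CODATA 2018 electron mass.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck constant, J s
C = 299_792_458.0              # speed of light, m/s
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg


def scattered_energy(e1, phi, m=M_ELECTRON):
    """Energy of the scattered photon, Equation 13.86 (energies in joules)."""
    return e1 / (1.0 + (e1 / (m * C ** 2)) * (1.0 - math.cos(phi)))


def wavelength_shift(phi, m=M_ELECTRON):
    """lambda_2 - lambda_1 = (h / m c) (1 - cos phi), Equation 13.87."""
    return H_PLANCK / (m * C) * (1.0 - math.cos(phi))


if __name__ == "__main__":
    e1 = M_ELECTRON * C ** 2   # incident photon with energy m c^2 (about 511 keV)
    for degrees in (0, 45, 90, 180):
        phi = math.radians(degrees)
        e2 = scattered_energy(e1, phi)
        print(f"phi = {degrees:3d} deg  E2/E1 = {e2 / e1:.4f}  shift = {wavelength_shift(phi):.4e} m")
```

The shift depends only on the scattering angle, not on the incident energy, whereas the fractional energy loss grows with the incident energy.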

The electron was assumed free and at rest. For sufficiently high photon energies, this is a good approximation for electrons in the outer shells of light atoms. If the binding energy of the electron is comparable to the photon energy, momentum and energy can be transferred to the atom as a whole, and the photon can be completely absorbed; there would be no scattering.

[Figure 13.15: incident photon γ1, electron e− of mass m, and scattered photon γ2 leaving at angle φ.]

FIGURE 13.15 Compton scattering.


The special theory of relativity was not widely accepted in the 1920s, partly because of the radical nature of its space–time concepts but also because of a lack of experimental evidence. The result of Compton scattering left little doubt that relativistic dynamics is valid.

Example 13.6: Electron–Positron Pair Annihilation

Positrons, which are antiparticles of electrons, can be found in nature. They are detected in cosmic radiation and as products of radioactivity from a few radioactive elements. However, they cannot live free very long because of their interaction with electrons. When positrons and electrons are in near proximity, they annihilate each other to produce two photons:

e+ + e− → γ + γ.

The reaction is illustrated in Figure 13.16, where the positron is at rest. Conservation of the energy–momentum four-vector gives

$$ \vec{p}_+ + \vec{p}_- = \vec{q}_{\gamma 1} + \vec{q}_{\gamma 2}. \tag{13.88} $$

Of course, there is no difference between the two photons, so let us label them 1 and 2. If we want to know what the energy of photon 1 is as a function of the angle between photon 1 and the incoming electron, we rearrange Equation 13.88 in the following form:

$$ \vec{q}_{\gamma 2} = \vec{p}_+ + \vec{p}_- - \vec{q}_{\gamma 1} $$

and then take the square of both sides with the following result:

$$ \vec{q}_{\gamma 2}^{\,2} = \vec{q}_{\gamma 2}\cdot\vec{q}_{\gamma 2} = \vec{p}_+\cdot\vec{p}_+ + \vec{p}_-\cdot\vec{p}_- + \vec{q}_{\gamma 1}\cdot\vec{q}_{\gamma 1} + 2\,\vec{p}_+\cdot\vec{p}_- - 2\,\vec{p}_+\cdot\vec{q}_{\gamma 1} - 2\,\vec{p}_-\cdot\vec{q}_{\gamma 1}. \tag{13.89} $$

Equation 13.89 can be simplified with the following facts:

1. The length of the energy–momentum four-vector is an invariant; E² = p²c² + m²c⁴.
2. The photon has no rest mass.
3. The energy and momentum of a photon satisfy the simple relation E = pc.
4. p+ = 0 (the positron is initially at rest), and E+ = mc².

With these aids, Equation 13.89 reduces to

$$ 0 = m^2c^2 + m^2c^2 + 2\,\frac{E_-}{c}\,\frac{mc^2}{c} - 2\,\frac{mc^2}{c}\,\frac{E_{\gamma 1}}{c} - 2\,\frac{E_-}{c}\,\frac{E_{\gamma 1}}{c} + 2\,\vec{p}_-\cdot\vec{q}_{\gamma 1} $$

where the last dot product involves the three-momenta.

[Figure 13.16: incoming electron e− and positron e+ at rest; photons γ1 and γ2 leave at angles φ1 and φ2.]

FIGURE 13.16 Electron–positron pair annihilation.


If φ1 is the angle between the direction of the propagation of photon 1 and the direction of the incoming electron, then

$$ 2\,\vec{p}_-\cdot\vec{q}_{\gamma 1} = 2\,p_-\, q_{\gamma 1}\cos\varphi_1 = 2\,p_- E_{\gamma 1}\cos\varphi_1 / c. $$

We finally obtain

$$ E_{\gamma 1} = \frac{mc^2\,(E_- + mc^2)}{E_- + mc^2 - c\,p_- \cos\varphi_1}. $$

For fixed values of $E_-$ and $p_-$, the photon energy will be at a maximum when it is emitted in the forward direction (i.e., φ1 = 0, cos φ1 = 1) and is at a minimum when it is emitted in the backward direction.

The results just given should apply equally well to photon 2. If photon 1 is emitted in the forward direction with the maximum energy, then, because of conservation of energy and momentum and because the initial momentum is in the forward direction, photon 2 must be emitted in the backward direction with minimum energy.
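The statement about the forward and backward photons can be verified directly from the formula for the photon energy. The sketch below is a minimal illustration in units where m = c = 1 by default; the function name and the sample electron energy are assumptions made for the example.

```python
import math


def photon_energy(e_minus, phi1, m=1.0, c=1.0):
    """Energy of photon 1 emitted at angle phi1 to the incoming electron.

    e_minus is the total energy of the electron; the positron is at rest.
    """
    p_minus = math.sqrt(e_minus ** 2 - (m * c ** 2) ** 2) / c
    total = e_minus + m * c ** 2
    return m * c ** 2 * total / (total - c * p_minus * math.cos(phi1))


if __name__ == "__main__":
    e_minus = 3.0                                # electron total energy, units of m c^2
    forward = photon_energy(e_minus, 0.0)        # maximum photon energy
    backward = photon_energy(e_minus, math.pi)   # minimum photon energy
    p_minus = math.sqrt(e_minus ** 2 - 1.0)
    # Energy conservation: forward + backward equals E_- + m c^2.
    # Momentum conservation: (forward - backward) / c equals p_-.
    print(forward + backward, e_minus + 1.0)
    print(forward - backward, p_minus)
```

The two printed pairs agree, confirming that a forward photon of maximum energy is accompanied by a backward photon of minimum energy.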

13.12 COLLISION THRESHOLD ENERGIES

The majority of collision experiments in high-energy physics produce one, two, or more particles of the same or different types as the initial ones; the total mass produced is often greater than the mass of the particles producing the interaction. How can this reaction occur? It can happen because some of the incoming kinetic energy is converted to mass. This leads to the concept of the threshold energy for a reaction (i.e., the minimum kinetic energy of the incoming particle so that the reaction will just occur).

Consider a simple reaction

1 + 2 → 3 + 4

where particle 2 is initially at rest in the laboratory frame, and incident particle 1 has a total energy $E_1$ and momentum $\vec{p}_1$. We use unprimed quantities for the laboratory frame and primed quantities for the center-of-mass frame that is moving with a velocity v relative to the laboratory frame along the x-axis.

The length of the energy–momentum four-vector (E/c, $\vec{p}$) is an invariant:

$$ E^2/c^2 - p^2 = M^2c^2. $$

For any system of particles, the total energy and total momentum also form a four-vector, and its length is an invariant. Now the total energy in the laboratory frame is

$$ E_0 = M_2c^2 + E_1 $$

and in the center-of-mass frame, the total momentum is zero. The invariance of the energy–momentum four-vector gives us

$$ E_0^2/c^2 - p_1^2 = E_0'^{\,2}/c^2. $$

Thus, we obtain

$$ E_0'^{\,2} = E_0^2 - p_1^2c^2 = (M_2c^2 + E_1)^2 - (E_1^2 - M_1^2c^4) = M_1^2c^4 + M_2^2c^4 + 2M_2E_1c^2. $$


Solving for $E_1$,

$$ E_1 = \frac{E_0'^{\,2} - M_1^2c^4 - M_2^2c^4}{2M_2c^2}. \tag{13.90} $$

The center-of-mass frame is moving with a velocity v relative to the laboratory frame along the x-axis. The Lorentz transformation then gives

$$ E_0 = \gamma E_0' $$

where

$$ \gamma = \frac{1}{\sqrt{1 - \beta^2}} \quad \text{and} \quad \beta = \frac{v}{c} = \frac{cp_1}{E_1 + M_2c^2}. \tag{13.91} $$

Before we proceed further, let us digress for a moment to derive the last expression. For a particle with velocity v, we have

p = γMv = γMcβ

and

E = γMc2 = c(p2 + M 2c2)1/2

where the last step is from the invariance of the energy–momentum four-vector. From the last expressions, we obtain

$$ \beta = \frac{cp}{E} $$

which can be applied to the two-particle system and yields the second expression in Equation 13.91.

We now return to our main problem. The energy released, denoted by Q, of the reaction is

Q = (M1 + M2 – M3 – M4)c2.

If Q is positive, the reaction can proceed for all values of $E_1$. But if Q is negative, the reaction has a threshold; that is, there is a minimum value of $E_1$, denoted by $E_1^{\text{th}}$, for which the reaction can occur. At threshold,

$$ E_0' = (M_3 + M_4)c^2 $$

and then, Equation 13.90 gives

$$ E_1^{\text{th}} = \frac{\left[(M_3 + M_4)^2 - M_1^2 - M_2^2\right]c^2}{2M_2} $$

or

$$ T_1^{\text{th}} = \frac{\left[(M_3 + M_4)^2 - (M_1 + M_2)^2\right]c^2}{2M_2} $$

where $T_1^{\text{th}} = E_1^{\text{th}} - M_1c^2$ is the threshold kinetic energy of the incident particle $M_1$.
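The threshold formulas can be evaluated for any set of masses. The following is a minimal sketch in units where c = 1 by default; the function names are illustrative, and the masses in the example are hypothetical values chosen only to make the arithmetic transparent.

```python
import math


def threshold_total_energy(m1, m2, m3, m4, c=1.0):
    """Threshold total energy of particle 1, with particle 2 at rest."""
    return ((m3 + m4) ** 2 - m1 ** 2 - m2 ** 2) * c ** 2 / (2.0 * m2)


def threshold_kinetic_energy(m1, m2, m3, m4, c=1.0):
    """Threshold kinetic energy of particle 1, with particle 2 at rest."""
    return ((m3 + m4) ** 2 - (m1 + m2) ** 2) * c ** 2 / (2.0 * m2)


def center_of_mass_energy(e1, m1, m2, c=1.0):
    """E0' from E0'^2 = M1^2 c^4 + M2^2 c^4 + 2 M2 E1 c^2."""
    return math.sqrt(m1 ** 2 * c ** 4 + m2 ** 2 * c ** 4 + 2.0 * m2 * e1 * c ** 2)


if __name__ == "__main__":
    # Hypothetical masses: two unit-mass particles producing total final mass 4.
    m1, m2, m3, m4 = 1.0, 1.0, 2.0, 2.0
    e_th = threshold_total_energy(m1, m2, m3, m4)
    t_th = threshold_kinetic_energy(m1, m2, m3, m4)
    print(e_th, t_th, center_of_mass_energy(e_th, m1, m2))
```

At threshold the center-of-mass energy returned by the last function equals (M3 + M4)c², and the kinetic energy differs from the total energy by the rest energy M1c².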

PROBLEMS

1. Two inertial frames S and S′ have their respective x-axes parallel with S′ moving with constant velocity v along the positive x-direction of the system S. A rod makes an angle of 30° with respect to the x′-axis. What is the value of v if the rod makes an angle of 45° with respect to the x-axis?

2. The wave equation

$$ \nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0 $$

is satisfied in a vacuum for any component of $\vec{E}$ or $\vec{B}$, and it is a consequence of Maxwell’s equations. Show that this wave equation is invariant under a Lorentz transformation. For simplicity, you can restrict yourself to the case in which the wave propagates along the x-axis.

3. A light signal is emitted by observer O at exactly the moment when observer O′ passes him with velocity $\vec{v}$ (relative to O). Show that the velocity of light in any direction is the same as measured by O and O′.

4. A track star on Earth runs the 100-m dash in 10.0 s, as measured on Earth. What is his or her time as measured by a clock on a spaceship receding from the Earth with speed 0.99c?

5. A spaceship moves away from the Earth with constant speed v = c/2. The astronaut observes that a rod-like external probe is 6 m long and makes an angle of 45° with the ship’s line of motion. To an observer on Earth, how long is the probe and what angle does it make with the line of motion?

6. A spaceship moving away from the Earth at a velocity v1 = 0.75c with respect to the Earth launches a rocket with a velocity v2 = 0.75c in the direction away from the Earth. What is the velocity of the rocket with respect to the Earth?

7. An observer on Earth sees two spaceships A and B approaching her along the same straight line: A approaches from the left with speed c/2, and B from the right with speed 3c/4. With what speed does each spaceship approach the other?

8. A man on a station platform sees two trains approaching each other at the rate 7c/5, but an observer on one of the trains sees the other train approaching her with a velocity 35c/37. What are the velocities of the trains with respect to the station?

$$v = \frac{v' + u}{1 + u v'/c^2}$$

9. Show that if two Lorentz transformations with relative velocities given by v1 and v2, respectively, are carried out consecutively, the result is the same as that of a single Lorentz transformation with relative velocity v given by

$$\beta = \frac{\beta_1 + \beta_2}{1 + \beta_1 \beta_2}$$

where β = v/c, β1 = v1/c, and β2 = v2/c.

10. Show that even if the relativistic mass is used, the expression mv²/2 is not the correct relativistic energy.

11. To an observer O, a particle moves with the velocity $\vec{v}$ and orientation specified by the usual spherical coordinates θ and ϕ. Find its apparent velocity $\vec{v}'$ for a second observer O′ moving with a uniform velocity u in the positive x-direction relative to O.

12. A pion of mass π comes to rest and then decays into a muon of mass μ and a neutrino of mass zero. Show that the kinetic energy of the muon is c²(π − μ)²/2π.

13. Two identical bodies, each with rest mass m0, approach each other with equal velocities u, collide, and stick together in a perfectly inelastic collision.
(a) What is the rest mass of the composite body?
(b) What is the rest mass of the composite body as determined by an observer who is at rest with respect to one of the initial bodies?

14. A particle of rest mass m0 and velocity v = 3c/5 collides with a stationary particle of rest mass m0. Assuming that, after collision, the two particles stick together, find the velocity and the rest mass of the composite particle.

15. Consider a system of non-interacting particles. Show that its rest mass M exceeds the sum of rest masses of constituent particles by the total kinetic energy of the particles (divided by c²):

$$M = \sum_j m_j + \frac{1}{c^2} \sum_j T_j .$$

16. A particle of rest mass m0 moves under the action of a constant force. Find the time dependence of the particle’s velocity.

17. A particle of rest mass M0 moves with an instantaneous velocity $\vec{v}$ under the action of a force $\vec{F}$. Show that
(a) If $\vec{F}$ is parallel to $\vec{v}$, then $\vec{F} = M_0 \gamma^3 \vec{a}$, where $\vec{a} = d\vec{v}/dt$ and $\gamma = 1/\sqrt{1 - v^2/c^2}$.
(b) If $\vec{F}$ is perpendicular to $\vec{v}$, then $\vec{F} = M_0 \gamma \vec{a}$.

18. As viewed by an observer in the laboratory, a proton collides with another proton initially at rest. After the collision, a proton and an antiproton come off in addition to the original protons. Find the minimum kinetic energy that the incoming proton must have to make this reaction energetically possible.

19. The relativistic Doppler effect. A light source flashes with period τ0 = 1/ν0 in its rest frame, and the source moves toward an observer with velocity v. Show that the frequency of the pulses ν received by the observer is

$$\nu = \nu_0 \left( \frac{1 + v/c}{1 - v/c} \right)^{1/2}$$

and that if the observer is at angle θ from the line of motion, ν_D is given by

$$\nu_D = \nu_0 \, \frac{(1 - v^2/c^2)^{1/2}}{1 - (v/c)\cos\theta} .$$

20. Find the relativistic path of an electron moving in the field of a fixed-point charge.

21. Derive an expression for the acceleration four-vector that gives the rate of change of the velocity four-vector of a particle with respect to its proper time. What are its components in the case when either the magnitude or the direction of velocity is invariable?


22. The result of any two Galilean transformations is another transformation of the same type.
(a) Show that the same is true of two Lorentz transformations if the directions of $\vec{v}_1$ and $\vec{v}_2$ are the same.
(b) If $\vec{v}_1$ and $\vec{v}_2$ have different directions, then there is a small resultant space rotation, known as the Thomas precession. Show that the small angle of the space rotation is given by

$$d\vec{\theta} = \frac{\vec{v}_1 \times \vec{v}_2}{2c^2} .$$

This result was applied by Thomas to the case of the spinning electron in the hydrogen atom.



Chapter 11
Relativity (Kinematics)

We now come to Einstein’s theory of relativity. This is where we find out that everything we’ve done so far in this book has been wrong. Well, perhaps “incomplete” would be a better word. The important point to realize is that Newtonian physics is a limiting case of the more correct relativistic theory. Newtonian physics works perfectly fine when the speeds we’re dealing with are much less than the speed of light, which is about 3 · 10⁸ m/s. It would be silly, to put it mildly, to use relativity to solve a problem involving the length of a baseball trajectory. But in problems involving large speeds, or in problems where a high degree of accuracy is required, we must use the relativistic theory.¹ This is the subject of the remainder of this book.

The theory of relativity is certainly one of the most exciting and talked-about topics in physics. It is well known for its “paradoxes,” which are quite conducive to discussion. There is, however, nothing at all paradoxical about it. The theory is logically and experimentally sound, and the whole subject is actually quite straightforward, provided that you proceed calmly and keep a firm hold of your wits.

The theory rests upon certain postulates. The one that most people find counterintuitive is that the speed of light has the same value in any inertial (that is, nonaccelerating) reference frame. This speed is much greater than the speed of everyday objects, so most of the consequences of this new theory aren’t noticeable. If we instead lived in a world identical to ours except for the fact that the speed of light was 50 mph, then the consequences of relativity would be ubiquitous. We wouldn’t think twice about time dilations, length contractions, and so on.

I have included a large number of puzzles and “paradoxes” in the problems and exercises. When attacking these, be sure to follow them through to completion, and don’t say, “I could finish this one if I wanted to, but all I’d have to do would be such-and-such, so I won’t bother,” because the essence of the paradox may very well be contained in the such-and-such, and you will have missed out on all the fun. Most of the paradoxes arise because different frames of reference seem to give different results. Therefore, in explaining a paradox, you not only need to give the correct reasoning, you also need to say what’s wrong with incorrect reasoning.

1 You shouldn’t feel too bad about having spent so much time learning about a theory that’s just the limiting case of another theory, because you’re now going to do it again. Relativity is also the limiting case of another theory (quantum field theory). And likewise, quantum field theory is the limiting case of yet another theory (string theory). And likewise . . . well, you get the idea. Who knows, maybe it really is turtles all the way down.

There are two main topics in relativity. One is Special Relativity (which doesn’t deal with gravity), and the other is General Relativity (which does). We’ll deal mostly with the former, but Chapter 14 contains some of the latter. Special Relativity may be divided into two topics, kinematics and dynamics. Kinematics deals with lengths, times, speeds, etc. It is concerned only with the space and time coordinates of an abstract particle, and not with masses, forces, energy, momentum, etc. Dynamics, on the other hand, does deal with these quantities. This chapter covers kinematics. Chapter 12 covers dynamics. Most of the fun paradoxes fall into the kinematics part, so the present chapter is the longer of the two. In Chapter 13, we’ll introduce the concept of 4-vectors, which ties much of the material in Chapters 11 and 12 together.

11.1 Motivation

Although it was obviously a stroke of genius that led Einstein to his theory of relativity, it didn’t just come out of the blue. A number of things going on in nineteenth-century physics suggested that something was amiss. There were many efforts made by many people to explain away the troubles that were arising, and at least a few steps had been taken toward the correct theory. But Einstein was the one who finally put everything together, and he did so in a way that had consequences far beyond the realm of the specific issues that people were trying to understand. Indeed, his theory turned our idea of space and time on its head. But before we get into the heart of the theory, let’s look at two of the major problems in late nineteenth-century physics.²

11.1.1 Galilean transformations, Maxwell’s equations

[Fig. 11.1: frames S (axes x, y, z) and S′ (axes x′, y′, z′), with velocity v]

Imagine standing on the ground and watching a train travel by with constant speed v in the x direction. Let the train frame be S′ and the ground frame be S, as shown in Fig. 11.1. Consider two events that happen on the train. For example, one person claps her hands, and another person stomps his feet. If the space and time separations between these two events in the frame of the train are Δx′ and Δt′, what are the space and time separations, Δx and Δt, in the frame of the ground? Ignoring what we’ll be learning about relativity in this chapter, the answers are “obvious” (well, in that incorrectly obvious sort of way, as we’ll see in Section 11.4.1). The time separation, Δt, is the same as on the train, so we have Δt = Δt′. We know from everyday experience that nothing strange happens with time. When you see people exiting a train station, they’re not fiddling with their watches, trying to recalibrate them with a ground-based clock.

2 If you can’t wait to get to the postulates and results of Special Relativity, you can go straight to Section 11.2. The present section can be skipped on a first reading.

The spatial separation is a little more exciting, but still nothing too complicated. The train is moving, so everything in it (in particular, the second event) gets carried along at speed v during the time Δt′ between the two events. So we have Δx = Δx′ + vΔt′. As a special case, if the two events happen at the same place on the train (so that Δx′ = 0), then we have Δx = vΔt′. This makes sense, because the spot on the train where the events occur simply travels a distance vΔt by the time the second event happens. The Galilean transformations are therefore

Δx = Δx′ + vΔt′,
Δt = Δt′. (11.1)

Also, nothing interesting happens in the y and z directions, so we have Δy = Δy′ and Δz = Δz′.
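As a small illustration, Eq. (11.1) can be written as a two-line function. This is only a sketch: the function name and the sample numbers (a 30 m/s train, events 2 s apart) are assumptions chosen for the example, not values from the text.

```python
def galilean_intervals(dx_train, dt_train, v):
    """Map separations (dx', dt') in the train frame S' to (dx, dt) in the ground frame S."""
    dx_ground = dx_train + v * dt_train  # spatial part of Eq. (11.1)
    dt_ground = dt_train                 # time part of Eq. (11.1): time is untouched
    return dx_ground, dt_ground


# Hypothetical numbers: two events at the same spot on the train (dx' = 0),
# 2 s apart, with the train moving at 30 m/s.
dx, dt = galilean_intervals(0.0, 2.0, 30.0)
print(dx, dt)  # 60.0 2.0
```

With Δx′ = 0 the result is just vΔt′, the distance the spot on the train travels between the two events.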

The principle of Galilean invariance says that the laws of physics are invariant under the above Galilean transformations. Alternatively, it says that the laws of physics hold in all inertial frames.³ This is quite believable. For example, Newton’s second law holds in all inertial frames, because the constant relative velocity between any two frames implies that the acceleration of a given particle is the same in all frames.

Remarks: Note that the Galilean transformations aren’t symmetric in x and t. This isn’t automatically a bad thing, but it turns out that it will in fact be a problem in Special Relativity, where space and time are treated on a more equal footing. We’ll find in Section 11.4.1 that the Galilean transformations are replaced by the Lorentz transformations (at least in the world we live in), and the latter are indeed symmetric in x and t (up to factors of the speed of light, c).

Note also that Eq. (11.1) deals only with the differences in x and t between two events, and not with the values of the coordinates themselves. The values of the coordinates of a single event depend on where you pick your origin, which is an arbitrary choice. The coordinate differences between two events, however, are independent of this choice, and this allows us to make the physically meaningful statement in Eq. (11.1). It makes no sense for a physical result to depend on the arbitrary choice of origin, and so the Lorentz transformations we derive later on will also involve only differences in coordinates. ♣

One of the great triumphs of nineteenth-century physics was the theory of electromagnetism. In 1864, James Clerk Maxwell wrote down a set of equations that collectively described everything that was known about the subject. These equations involve the electric and magnetic fields through their space and time derivatives. We won’t worry about the specific form of the equations here,⁴ but it turns out that if you transform them from one frame to another via the Galilean transformations, they end up taking a different form. That is, if you’ve written down Maxwell’s equations in one frame (where they take their standard nice-looking form), and if you then replace the coordinates in this frame by those in another frame, using Eq. (11.1), then the equations look different (and not so nice). This presents a major problem. If Maxwell’s equations take a nice form in one frame and a not-so-nice form in every other frame, then why is one frame special? Said in another way, Maxwell’s equations predict that light moves with a certain speed c. But which frame is this speed measured with respect to? The Galilean transformations imply that if the speed is c with respect to a given frame, then it is not c with respect to any other frame. The proposed special frame where Maxwell’s equations are nice and the speed of light is c was called the frame of the ether. We’ll talk in detail about the ether in the next section, but what experiments showed was that light surprisingly moved with speed c in every frame, no matter which way the frame was moving through the supposed ether.

3 It was assumed prior to Einstein that these two statements say the same thing, but we will soon see that they do not. The second statement is the one that remains valid in relativity.

4 Maxwell’s original formulation involved a large number of equations, but these were later written more compactly, using vectors, as four equations.

There were therefore two possibilities. Either something was wrong with Maxwell’s equations, or something was wrong with the Galilean transformations. Considering how “obvious” the latter are, the natural assumption in the late nineteenth century was that something was wrong with Maxwell’s equations, which were quite new, after all. However, after a good deal of effort by many people to make Maxwell’s equations fit with the Galilean transformations, Einstein finally showed that the trouble was in fact with the latter. More precisely, in 1905 he showed that the Galilean transformations are a special case of the Lorentz transformations, valid only when the speed involved is much less than the speed of light.⁵ As we’ll see in Section 11.4.1, the coefficients in the Lorentz transformations depend on both v and the speed of light c, where the c’s appear in various denominators. Since c is quite large (about 3 · 10⁸ m/s) compared with everyday speeds v, the parts of the Lorentz transformations involving c are negligible, for any typical v. This is why no one prior to Einstein realized that the transformations had anything to do with the speed of light. Only the terms in Eq. (11.1) were noticeable.
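To get a feel for how small the c-dependent parts are, the sketch below compares Eq. (11.1) with the standard interval form of the Lorentz transformations, Δx = γ(Δx′ + vΔt′) and Δt = γ(Δt′ + vΔx′/c²), with γ = 1/√(1 − v²/c²). That form is not derived until Section 11.4.1, so its use here is an assumption borrowed from the standard result, and the function names and sample separations are purely illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_intervals(dx_p, dt_p, v, c=C):
    """Standard Lorentz transformation of separations from S' to S (assumed form)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    dx = gamma * (dx_p + v * dt_p)
    dt = gamma * (dt_p + v * dx_p / c**2)
    return dx, dt


def galilean_intervals(dx_p, dt_p, v):
    """Eq. (11.1): the everyday, low-speed transformation."""
    return dx_p + v * dt_p, dt_p


# Compare an everyday train speed with half the speed of light.
for v in (30.0, 0.5 * C):
    lx, lt = lorentz_intervals(100.0, 1.0, v)
    gx, gt = galilean_intervals(100.0, 1.0, v)
    print(f"v = {v:.3e} m/s: relative difference in dx = {abs(lx - gx) / gx:.2e}")
```

At 30 m/s the two transformations differ only at a level far below anything a nineteenth-century experiment could resolve, whereas at half the speed of light they differ by about 15 percent.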

As he pondered the long futile fight

To make Galileo’s world right,

In a new variation

On the old transformation,

It was Einstein who first saw the light.

5 It was well known that Maxwell’s equations were invariant under the Lorentz transformations (in contrast with their noninvariance under the Galilean transformations), but Einstein was the first to recognize the full meaning of these transformations. Instead of being relevant only to electromagnetism, the Lorentz transformations replaced the Galilean transformations universally.


In short, the reasons why Maxwell’s equations were in conflict with the Galilean transformations are: (1) The speed of light is what determines the scale on which the Galilean transformations break down; (2) Maxwell’s equations inherently involve the speed of light, because light is an electromagnetic wave.

11.1.2 Michelson–Morley experiment

As mentioned above, it was known in the late nineteenth century, after Maxwell wrote down his equations, that light is an electromagnetic wave and that it moves with a speed of about 3 · 10⁸ m/s. Now, every other wave that people knew about at the time needed a medium to propagate in. Sound waves need air, ocean waves need water, waves on a string of course need the string, and so on. It was therefore natural to assume that light also needed a medium to propagate in. This proposed medium was called the ether. However, if light propagates in a given medium, and if the speed in this medium is c, then the speed in a reference frame moving relative to the medium will be different from c. Consider, for example, sound waves in air. If the speed of sound in air is v_sound, and if you run toward a sound source with speed v_you, then the speed of the sound waves with respect to you (assuming it’s a windless day) is v_sound + v_you. Equivalently, if you are standing downwind and the speed of the wind is v_wind, then the speed of the sound waves with respect to you is v_sound + v_wind.

If this ether really exists, then a reasonable thing to do is to try to measure one’s speed with respect to it. This can be done in the following way (we’ll work in terms of sound waves in air here).⁶ Let v_s be the speed of sound in air. Imagine two people standing on the ends of a long platform of length L that moves at speed v_p with respect to the reference frame in which the air is at rest. One person claps, the other person claps immediately when he hears the first clap (assume that the reaction time is negligible), and then the first person records the total time elapsed when she hears the second clap. What is this total time? Well, the answer is that we can’t say without knowing in which direction the platform is moving. Is it moving parallel to its length, or transverse to it (or somewhere in between)? Let’s look at these two basic cases. For both of these, we’ll view the setup and do the calculation in the frame in which the air is at rest.

Consider first the case where the platform moves parallel to its length. In the frame of the air, assume that the person at the rear is the one who claps first. Then it takes a time of L/(v_s − v_p) for the sound to reach the front person. This is true because the sound must close the initial gap of L at a relative speed of v_s − v_p,

6 As we’ll soon see, there is no ether, and light travels at the same speed with respect to any frame. This is a rather bizarre fact, and it takes some getting used to. It’s hard enough to get away from the old way of thinking, even without any further reminders, so I can’t bring myself to work through this method in terms of light waves in an ether. I’ll therefore work in terms of sound waves in air.


indeed been measured. Note that it is the velocity of the telescope that matters here, and not its position.¹²

However, if frame dragging were real, then the light from the star would get dragged along with the earth and would therefore travel down a telescope that was pointed directly at the star, in disagreement with the observed fact that the telescope must point at the slight angle mentioned above. Or even worse, the dragging might produce a boundary layer of turbulence which would blur the stars. The existence of stellar aberration therefore implies that frame dragging doesn’t occur. ♣

11.2 The postulates

Let’s now start from scratch and see what the theory of Special Relativity is all about. We’ll take the route that Einstein took and use two postulates as the basis of the theory. We’ll start with the speed-of-light postulate:

• The speed of light has the same value in any inertial frame.

I don’t claim that this statement is obvious, or even believable. But I do claim that it’s easy to understand what the statement says (even if you think it’s too silly to be true). It says the following. Consider a train moving along the ground at constant velocity. Someone on the train shines a light from one point on the train to another. Let the speed of the light with respect to the train be c (≈ 3 · 10⁸ m/s). Then the above postulate says that a person on the ground also sees the light move at speed c.

This is a rather bizarre statement. It doesn’t hold for everyday objects. If a baseball is thrown on a train, then the speed of the baseball is different in the different frames. The observer on the ground must add the velocity of the ball (with respect to the train) and the velocity of the train (with respect to the ground) to obtain the velocity of the ball with respect to the ground.¹³
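The contrast between the baseball and the light beam can be made concrete with a short sketch. It assumes the standard relativistic velocity-addition formula for collinear velocities, (u + v)/(1 + uv/c²), which is not derived in this section; the function names and the sample speeds are illustrative only.

```python
def add_galilean(u, v):
    """Everyday rule: velocities along a line simply add."""
    return u + v


def add_relativistic(u, v, c=1.0):
    """Standard relativistic addition of collinear velocities (assumed formula)."""
    return (u + v) / (1.0 + u * v / c**2)


C = 299_792_458.0  # speed of light in m/s

# A 40 m/s baseball thrown forward on a 30 m/s train: the two rules agree
# to far better than any everyday measurement could resolve.
print(add_galilean(40.0, 30.0), add_relativistic(40.0, 30.0, c=C))

# Two speeds of 0.75c (in units where c = 1): the Galilean sum exceeds c,
# while the relativistic result stays below it.
print(add_galilean(0.75, 0.75), add_relativistic(0.75, 0.75))

# Light (speed c) emitted on a train moving at 0.5c still moves at c
# with respect to the ground.
print(add_relativistic(1.0, 0.5))
```

The last line is the speed-of-light postulate in miniature: adding any speed to c with this rule returns c.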

The truth of the speed-of-light postulate cannot be demonstrated from first principles. No statement with any physical content in physics (that is, one that isn’t purely mathematical, such as, “two apples plus two apples gives four apples”) can be proven. In the end, we must rely on experiment. And indeed, all the consequences of the speed-of-light postulate have been verified countless times during the past century. As discussed in the previous section, the most well-known of the early experiments on the speed of light was the one performed by Michelson and Morley. And in more recent years, the consequences of the postulate have been verified continually in high-energy particle accelerators, where elementary particles reach speeds very close to c. The collection of all the data from numerous experiments over the years allows us to conclude with near certainty that our starting assumption of an invariant speed of light is correct (or is at least the limiting case of a more accurate theory).

12 This aberration effect is not the same as the parallax effect in which the direction of the actual position of an object changes, depending on the location of the observer. For example, people at different locations on the earth see the moon at different angles (that is, they see the moon in line with different distant stars). Although stellar parallax has been measured for nearby stars (as the earth goes around the sun), its angular effect is much smaller than the angular effect from stellar aberration. The former decreases with distance, whereas the latter doesn’t. For further discussion of aberration, and of why it is only the earth’s velocity (or rather, the change in its velocity) that matters, and not also the star’s velocity (since you might think, based on the title of this chapter, that it is the relative velocity that matters), see Eisner (1967).

13 Actually, this isn’t quite true, as the velocity-addition formula in Section 11.5.1 shows. But it’s true enough for the point we’re making here.

There is one more postulate in the Special Relativity theory, namely the “relativity” postulate (also called the Principle of Relativity). It is much more believable than the speed-of-light postulate, so you might just take it for granted and forget to consider it. But like any postulate, of course, it is crucial. It can be stated in various ways, but we’ll simply word it as:

• All inertial frames are “equivalent.”

This postulate basically says that a given inertial frame is no better than any other. There is no preferred reference frame. That is, it makes no sense to say that something is moving; it makes sense only to say that one thing is moving with respect to another. This is where the “Relativity” in Special Relativity comes from. There is no absolute frame; the motion of any frame is defined only relative to other frames.

This postulate also says that if the laws of physics hold in one inertial frame (and presumably they do hold in the frame in which I now sit),¹⁴ then they hold in all others. It also says that if we have two frames S and S′, then S should see things in S′ in exactly the same way as S′ sees things in S, because we can just switch the labels of S and S′ (we’ll get our money’s worth out of this statement in the next few sections). It also says that empty space is homogeneous (that is, all points look the same), because we can pick any point to be, say, the origin of a coordinate system. It also says that empty space is isotropic (that is, all directions look the same), because we can pick any axis to be, say, the x axis of a coordinate system.

Unlike the first postulate, this second one is entirely reasonable. We’ve gotten used to having no special places in the universe. We gave up having the earth as the center, so let’s not give any other point a chance, either.

Copernicus gave his reply

To those who had pledged to deny.

“All your addictions

To ancient convictions

Won’t bring back your place in the sky.”

14 Technically, the earth is spinning while revolving around the sun, and there are also little vibrations in the floor beneath my chair, etc., so I’m not really in an inertial frame. But it’s close enough for me.


The second postulate is nothing more than the familiar principle of Galilean invariance, assuming that the latter is written in the “The laws of physics hold in all inertial frames” form, and not in the form that explicitly mentions the Galilean transformations, which are inconsistent with the speed-of-light postulate.

Everything we’ve said here about the second postulate refers to empty space. If we have a chunk of mass, then there is certainly a difference between the position of the mass and a point a meter away. To incorporate mass into the theory, we would have to delve into General Relativity. But we won’t have anything to say about that in this chapter. We will deal only with empty space, containing perhaps a few observant souls sailing along in rockets or floating aimlessly on little spheres. Though it may sound boring at first, it will turn out to be more exciting than you’d think.

Remark: Given the second postulate, you might wonder if we even need the first. If all inertial frames are equivalent, shouldn’t the speed of light be the same in any frame? Well, no. For all we know, light might behave like a baseball. A baseball certainly doesn’t have the same speed with respect to different frames, and this doesn’t ruin the equivalence of the frames.

It turns out (see Section 11.10) that nearly all of Special Relativity can be derived by invoking only the second postulate. The first postulate simply fills in the last bit of necessary information by stating that something has the same finite speed in every frame. It’s actually not important that this thing happens to be light. It could be mashed potatoes or something else (well, it has to be massless, as we’ll see in Chapter 12, so they’d have to be massless potatoes, but whatever), and the theory would come out the same. So to be a little more minimalistic, it’s sufficient to state the first postulate as, “There is something that has the same speed in any inertial frame.” It just so happens that in our universe this thing is what allows us to see.¹⁵ ♣

11.3 The fundamental effects

The most striking effects of our two postulates are (1) the loss of simultaneity, (2) length contraction, and (3) time dilation. In this section, we’ll discuss these three effects using some time-honored concrete examples. In the following section, we’ll derive the Lorentz transformations using these three results.

11.3.1 Loss of simultaneity

[Fig. 11.4: A’s frame, with light traveling at speed c a distance ℓ′ in each direction; B moves with speed v]

Consider the following setup. In A’s reference frame, a light source is placed midway between two receivers, a distance ℓ′ from each (see Fig. 11.4). The light source emits a flash. From A’s point of view, the light hits the two receivers at the same time, ℓ′/c seconds after the flash. Now consider another observer, B, who travels to the left at speed v. From her point of view, does the light hit the receivers at the same time? We will show that it does not.

15 To go a step further, it’s actually not even necessary for there to exist something that has the same speed in any frame. The theory will still come out the same if we write the first postulate as, “There is a limiting speed of an object in any frame.” (See Section 11.10 for a discussion of this.) There’s no need to have something that actually travels at this speed. It’s conceivable to have a theory that contains no massless objects, so that everything travels slower than this limiting speed.


It’s definitely true that when the two twins are standing next to each other (that is, when they are in the same frame), we can’t have both B younger than A, and A younger than B. So what is wrong with the reasoning at the end of the previous paragraph? The error lies in the fact that there is no “one frame” that B is in. The inertial frame for the outward trip is different from the inertial frame for the return trip. The derivation of our time-dilation result applies only to one inertial frame.

Said in a different way, B accelerates when she turns around, and our time-dilation result holds only from the point of view of an inertial observer.²² The symmetry in the problem is broken by the acceleration. If both A and B are blindfolded, they can still tell who is doing the traveling, because B will feel the acceleration at the turnaround. Constant velocity cannot be felt, but acceleration can be. (However, see Chapter 14 on General Relativity. Gravity complicates things.)

The above paragraphs show what is wrong with the “A is younger” reasoning, but they don’t show how to modify it quantitatively to obtain the correct answer. There are many different ways of doing this, and you can tackle some of them in the problems (Exercise 11.67, Problems 11.2, 11.19, 11.24, and various problems in Chapter 14). Also, Appendix H gives a list of all the possible resolutions to the twin paradox that I can think of.

Example (Muon decay): Elementary particles called muons (which are identical to electrons, except that they are about 200 times as massive) are created in the upper atmosphere when cosmic rays collide with air molecules. The muons have an average lifetime of about 2 · 10⁻⁶ seconds²³ (then they decay into electrons and neutrinos), and move at nearly the speed of light. Assume for simplicity that a certain muon is created at a height of 50 km, moves straight downward, has a speed v = 0.99998 c, decays in exactly T = 2 · 10⁻⁶ seconds, and doesn’t collide with anything on the way down.²⁴ Will the muon reach the earth before it (the muon!) decays?

Solution: The naive thing to say is that the distance traveled by the muon is d = vT ≈ (3 · 10⁸ m/s)(2 · 10⁻⁶ s) = 600 m, and that this is less than 50 km, so the muon doesn’t reach the earth. This reasoning is incorrect, because of the time-dilation effect. The muon lives longer in the earth frame, by a factor of γ, which is γ = 1/√(1 − v²/c²) ≈ 160 here. The correct distance traveled in the earth frame is therefore v(γT) ≈ 100 km. Hence, the muon travels the 50 km, with room to spare. The real-life fact that we actually do detect muons reaching the surface of the earth in the predicted abundances (while the naive d = vT reasoning would predict that

22 For the entire outward and return parts of the trip, B does observe A’s clock running slow, but enough strangeness occurs during the turning-around period to make A end up older. Note, however, that a discussion of acceleration is not required to quantitatively understand the paradox, as Problem 11.2 shows.

23 This is the “proper” lifetime, that is, the lifetime as measured in the frame of the muon.

24 In the real world, the muons are created at various heights, move in different directions, have different speeds, decay in lifetimes that vary according to a standard half-life formula, and may very well bump into air molecules. So technically we’ve got everything wrong here. But that’s no matter. This example will work just fine for the present purpose.


How does A view the situation? He sees the ground and the stick fly by at speed v. The time between the two ends passing him is ℓ/γv (because that is the time elapsed on his watch). To get the length of the stick in his frame, he simply multiplies the speed by the time. That is, he measures the length to be (ℓ/γv)v = ℓ/γ, which is the desired contraction. The same argument also shows that length contraction implies time dilation.

4. As mentioned earlier, the length contraction factor γ is independent of position on the object. That is, all parts of the train are contracted by the same amount. This follows from the fact that all points in space are equivalent. Equivalently, we could put a large number of small replicas of the above source–mirror system along the length of the train. They would all produce the same value for γ, independent of the position on the train.

5. If you still want to ask, “Is the contraction really real?” then consider the following hypothetical undertaking. Imagine a sheet of paper moving sideways past the Mona Lisa, skimming the surface. A standard sheet of paper is plenty large enough to cover her face, so if the paper is moving slowly, and if you take a photograph at the appropriate time, then in the photo you’ll see her entire face covered by the paper. However, if the sheet is flying by sufficiently fast, and if you take a photograph at the appropriate time, then in the photo you’ll see a thin vertical strip of paper covering only a small fraction of her face. So you’ll still see her smiling at you. ♣

Example (Passing trains): Two trains, A and B, each have proper length L and move in the same direction. A’s speed is 4c/5, and B’s speed is 3c/5. A starts behind B (see Fig. 11.16). How long, as viewed by person C on the ground, does it take for A to overtake B? By this we mean the time between the front of A passing the back of B, and the back of A passing the front of B.

[Fig. 11.16: train A (speed 4c/5) behind train B (speed 3c/5), both moving in the same direction past person C on the ground.]

Solution: Relative to C on the ground, the γ factors associated with A and B are 5/3 and 5/4, respectively. Therefore, their lengths in the ground frame are 3L/5 and 4L/5. While overtaking B, A must travel farther than B, by an excess distance equal to the sum of the lengths of the trains, which is 7L/5. The relative speed of the two trains (as viewed by C on the ground) is the difference of the speeds, which is c/5. The total time is therefore

$$t_C = \frac{7L/5}{c/5} = \frac{7L}{c}. \tag{11.15}$$
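The arithmetic of this example can be checked with exact fractions. The following is only a sketch: the helper `gamma_exact` and the choice of units (lengths in units of L, times in units of L/c) are assumptions made for illustration and are not part of the text.

```python
from fractions import Fraction
from math import isqrt

def gamma_exact(beta: Fraction) -> Fraction:
    """Exact gamma = 1/sqrt(1 - beta^2), for beta values where it is rational."""
    s = 1 - beta * beta
    root_num, root_den = isqrt(s.numerator), isqrt(s.denominator)
    if root_num**2 != s.numerator or root_den**2 != s.denominator:
        raise ValueError("gamma is not rational for this beta")
    return Fraction(root_den, root_num)

beta_A, beta_B = Fraction(4, 5), Fraction(3, 5)   # speeds in units of c

# Contracted lengths in the ground frame, in units of the proper length L.
len_A = 1 / gamma_exact(beta_A)    # 3/5
len_B = 1 / gamma_exact(beta_B)    # 4/5

# In the ground frame the gap closes at the plain difference of the speeds.
t_C = (len_A + len_B) / (beta_A - beta_B)   # in units of L/c

print(len_A, len_B, t_C)
```

Note that the closing speed here is a simple difference, because both speeds are measured by the same observer C.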

Example (Muon decay, again): Consider the “Muon decay” example from Section 11.3.2. From the muon’s point of view, it lives for a time of T = 2 · 10⁻⁶ seconds, and the earth is speeding toward it at v = 0.99998c. How, then, does the earth (which travels only d = vT ≈ 600 m before the muon decays) reach the muon?

Solution: The important point here is that in the muon’s frame, the distance to the earth is contracted by a factor γ ≈ 160. Therefore, the earth starts only 50 km/160 ≈ 300 m away. Since the earth can travel a distance of 600 m during the muon’s lifetime, the earth collides with the muon, with time to spare.

As stated in the third remark above, time dilation and length contraction are intimately related. We can’t have one without the other. In the earth’s frame, the muon’s arrival on the earth is explained by time dilation. In the muon’s frame, it is explained by length contraction.

Observe that for muons created,

The dilation of time is related

To Einstein’s insistence

Of shrunken-down distance

In the frame where decays aren’t belated.
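The two views of the muon example can be put side by side numerically. This is a minimal sketch using the rounded inputs quoted in the example (c ≈ 3 · 10⁸ m/s is the text’s rounding; the variable names are chosen here for illustration).

```python
import math

C = 3.0e8           # speed of light in m/s, rounded as in the example
BETA = 0.99998      # muon speed as a fraction of c
T_PROPER = 2.0e-6   # proper lifetime in seconds
HEIGHT = 50.0e3     # creation height in the earth frame, in metres

gamma = 1.0 / math.sqrt(1.0 - BETA**2)

# Earth frame: the muon's lifetime is dilated to gamma * T.
range_earth_frame = BETA * C * gamma * T_PROPER
reaches_in_earth_frame = range_earth_frame >= HEIGHT

# Muon frame: the distance to the earth is contracted to HEIGHT / gamma,
# and the earth covers v * T during the muon's (undilated) lifetime.
gap_muon_frame = HEIGHT / gamma
earth_travel = BETA * C * T_PROPER
reaches_in_muon_frame = earth_travel >= gap_muon_frame

print(f"gamma = {gamma:.1f}")
print(f"earth frame: muon range = {range_earth_frame / 1e3:.1f} km")
print(f"muon frame: gap = {gap_muon_frame:.0f} m, earth travels {earth_travel:.0f} m")
```

Both frames agree on the outcome, as they must; only the explanation (time dilation versus length contraction) differs.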

An extremely important strategy in solving relativity problems is to plant yourself in a frame and stay there. The only thoughts running through your head should be what you observe. That is, don’t try to use reasoning along the lines of, “Well, the person I’m looking at in this other frame sees such-and-such.” This will almost certainly cause an error somewhere along the way, because you will inevitably end up writing down an equation that combines quantities that are measured in different frames, which is a no-no. Of course, you might want to solve another part of the problem by working in another frame, or you might want to redo the whole problem in another frame. That’s fine, but once you decide which frame you’re going to use, make sure you put yourself there and stay there.

Another very important strategy is to draw a picture of the setup (in whatever frame you’ve chosen) at every moment when something significant happens, as we did in Fig. 11.13. Once we drew the pictures there, it was clear what we needed to do. But without the pictures, we almost certainly would have gotten confused.

At this point you might want to look at the “Qualitative relativity questions” in Appendix F, just to make sure we’re all on the same page. Some of the questions deal with material we haven’t covered yet, but most are relevant to what we’ve done so far.

This concludes our treatment of the three fundamental effects. In the next section, we’ll combine all the information we’ve gained and use it to derive the Lorentz transformations. But one last comment before we get to those:

Lattice of clocks and meter sticks

In everything we’ve done so far, we’ve taken the route of having observers sitting in various frames, making various measurements. But as mentioned earlier, this can cause some ambiguity, because you might think that the time when light reaches the observer is important, whereas what we are generally concerned with is the time when something actually happens.

A way to avoid this ambiguity is to remove the observers and then imagine filling up space with a large rigid lattice of meter sticks and synchronized clocks. Different frames are defined by different lattices; assume that the lattices of different frames can somehow pass freely through each other. All the meter sticks in a given frame are at rest with respect to all the others, so we don’t have to worry about issues of length contraction within each frame. But the lattice of a frame moving past you is squashed in the direction of motion, because all the meter sticks in that direction are contracted.

To measure the length of an object in a given frame, we just need to determine where the ends are (at simultaneous times, as measured in that frame) with respect to the lattice. As far as the synchronization of the clocks within each frame goes, this can be accomplished by putting a light source midway between any two clocks and sending out signals, and then setting the clocks to a certain value when the signals hit them. Alternatively, a more straightforward method of synchronization is to start with all the clocks synchronized right next to each other, and then move them very slowly to their final positions. Any time-dilation effects can be made arbitrarily small by moving the clocks sufficiently slowly. This is true because the time-dilation γ factor is second order in v, but the time it takes a clock to reach its final position is only first order in 1/v.
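The slow-transport argument can be illustrated numerically. In this sketch (the function name `transport_lag` and the 1 km distance are illustrative assumptions), a clock carried a distance d at speed v falls behind the stationary clocks by (d/v)(1 − 1/γ), which is approximately vd/(2c²) for small v and therefore shrinks in proportion to v.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def transport_lag(distance_m: float, speed_m_s: float) -> float:
    """Time (in s) by which a transported clock lags the lattice clocks.

    lag = (d/v) * (1 - sqrt(1 - beta^2)); the rewritten form
    beta^2 / (1 + sqrt(1 - beta^2)) avoids cancellation at small beta.
    """
    beta = speed_m_s / C
    travel_time = distance_m / speed_m_s
    return travel_time * beta**2 / (1.0 + math.sqrt(1.0 - beta**2))

speeds = (1.0e6, 1.0e4, 1.0e2, 1.0)           # m/s, progressively slower
lags = [transport_lag(1000.0, v) for v in speeds]

for v, lag in zip(speeds, lags):
    print(f"v = {v:>9.1f} m/s  ->  lag = {lag:.3e} s")
```

Each hundredfold reduction in speed reduces the accumulated lag by about the same factor, even though the trip takes a hundred times longer.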

This lattice way of looking at things emphasizes that observers are not important, and that a frame is defined simply as a lattice of space and time coordinates. Anything that happens (an “event”) is automatically assigned a space and time coordinate in every frame, independent of any observer. The concept of an “event” will be very important in the next section.

11.4 The Lorentz transformations

11.4.1 The derivation

[Fig. 11.17: frame S′ (axes x′, y′, z′) moving at speed v along the x axis of frame S (axes x, y, z).]

Consider a coordinate system, S′, moving relative to another system, S (see Fig. 11.17). Let the constant relative speed of the frames be v. Let the corresponding axes of S and S′ point in the same direction, and let the origin of S′ move along the x axis of S, in the positive direction. Nothing exciting happens in the y and z directions (see Problem 11.1), so we’ll ignore them.

Our goal in this section is to look at two events (an event is anything that has space and time coordinates) in spacetime and relate the Δx and Δt of the coordinates in one frame to the Δx′ and Δt′ of the coordinates in another. We therefore want to find the constants A, B, C, and D in the relations,

$$\begin{aligned} \Delta x &= A\,\Delta x' + B\,\Delta t', \\ \Delta t &= C\,\Delta t' + D\,\Delta x'. \end{aligned} \tag{11.16}$$

The four constants here will end up depending on v (which is constant, given the two inertial frames). But we won’t explicitly write this dependence, for ease of notation.


Remarks:

1. We have assumed in Eq. (11.16) that Δx and Δt are linear functions of Δx′ and Δt′. And we have also assumed that A, B, C, and D are constants, that is, they depend at most on v, and not on x, t, x′, t′.

The first of these assumptions is justified by the fact that any finite interval can be built up from a series of many infinitesimal ones. But for an infinitesimal interval, any terms such as (Δt′)², for example, are negligible compared with the linear terms. Therefore, if we add up all the infinitesimal intervals to obtain a finite one, we will be left with only the linear terms. Equivalently, it shouldn’t matter whether we make a measurement with, say, meter sticks or half-meter sticks.

The second assumption can be justified in various ways. One is that all inertial frames should agree on what “nonaccelerating” motion is. That is, if Δx′ = u′ Δt′, then we should also have Δx = u Δt, for some constant u. This is true only if the above coefficients are constants, as you can check. Another justification comes from the second of our two relativity postulates, which says that all points in (empty) space are indistinguishable. With this in mind, let’s assume that we have a transformation of the form, say, Δx = A Δx′ + B Δt′ + E x′ Δx′. The x′ in the last term implies that the absolute location in spacetime (and not just the relative position) is important. Therefore, this last term cannot exist.

2. If the relations in Eq. (11.16) turned out to be the usual Galilean transformations (which are the ones that hold for everyday relative speeds v) then we would have Δx = Δx′ + v Δt, and Δt = Δt′ (that is, A = C = 1, B = v, and D = 0). We will find, however, that under the assumptions of Special Relativity, this is not the case. The Galilean transformations are not the correct transformations. But we will show below that the correct transformations do indeed reduce to the Galilean transformations in the limit of slow speeds, as they must. ♣

The constants A, B, C, and D in Eq. (11.16) are four unknowns, and we can solve for them by using four facts we found above in Section 11.3. The four facts we will use are:

     Effect                 Condition   Result          Eq. in text
  1  Time dilation          x′ = 0      t = γt′         (11.9)
  2  Length contraction     t′ = 0      x′ = x/γ        (11.14)
  3  Relative v of frames   x = 0       x′ = −vt′
  4  Rear clock ahead       t = 0       t′ = −vx′/c²    (11.6)

We have taken the liberty of dropping the Δ’s in front of the coordinates, lest things get too messy. We will often omit the Δ’s, but it should be understood that x really means Δx, etc. We are always concerned with the difference between coordinates of two events in spacetime. The actual value of any coordinate is irrelevant, because there is no preferred origin in any frame.

You should pause for a moment and verify that the four “results” in the above table are in fact the proper mathematical expressions for the four effects, given the stated “conditions” [footnote 26]. My advice is to keep pausing until you’re comfortable with all the entries in the table. Note that the sign in the “rear clock ahead” effect is indeed correct, because the front clock shows less time than the rear clock. So the clock with the higher x′ value is the one with the lower t′ value.

26 We can state the effects in other ways too, by switching the primes and unprimes. For example, time dilation can be written as “t′ = γt when x = 0.” But we’ve chosen the above ways of writing things because they will allow us to solve for the four unknowns in the most efficient way.

We can now use our four facts in the above table to quickly solve for the unknowns A, B, C, and D in Eq. (11.16).

Fact (1) gives C = γ.
Fact (2) gives A = γ.
Fact (3) gives B/A = v ⟹ B = γv.
Fact (4) gives D/C = v/c² ⟹ D = γv/c².

Equations (11.16), which are known as the Lorentz transformations, are therefore given by

$$\begin{aligned} \Delta x &= \gamma(\Delta x' + v\,\Delta t'), \\ \Delta t &= \gamma(\Delta t' + v\,\Delta x'/c^2), \\ \Delta y &= \Delta y', \\ \Delta z &= \Delta z', \end{aligned} \tag{11.17}$$

where

$$\gamma \equiv \frac{1}{\sqrt{1 - v^2/c^2}}. \tag{11.18}$$

We have tacked on the trivial transformations for y and z, but we won’t bother writing these in the future. Also, we’ll drop the Δ’s from now on, but remember that they’re always really there.

If we solve for x′ and t′ in terms of x and t in Eq. (11.17), then we see that the inverse Lorentz transformations are given by

$$\begin{aligned} x' &= \gamma(x - vt), \\ t' &= \gamma(t - vx/c^2). \end{aligned} \tag{11.19}$$

Of course, which ones you label as the “inverse” transformations depends on your point of view. But it’s intuitively clear that the only difference between the two sets of equations is the sign of v, because S is simply moving backward with respect to S′.
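As a sanity check on the derivation, the sketch below verifies numerically that Eqs. (11.17) reproduce the four facts in the table, and that Eqs. (11.19) undo Eqs. (11.17). The function names and the sample speed v = 0.6 (in units where c = 1) are assumptions made for illustration.

```python
import math

def lorentz(x_p, t_p, v, c=1.0):
    """Map an interval (x', t') in S' to (x, t) in S, as in Eq. (11.17)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x_p + v * t_p), g * (t_p + v * x_p / c**2)

def lorentz_inverse(x, t, v, c=1.0):
    """Map (x, t) in S to (x', t') in S': Eq. (11.19) is Eq. (11.17) with v -> -v."""
    return lorentz(x, t, -v, c)

v = 0.6                              # sample speed, units where c = 1
g = 1.0 / math.sqrt(1.0 - v**2)      # gamma = 1.25

_, t1 = lorentz(0.0, 2.0, v)         # Fact 1: x' = 0        =>  t = gamma t'
x2, _ = lorentz(4.0, 0.0, v)         # Fact 2: t' = 0        =>  x = gamma x'
x3, _ = lorentz(-v * 2.0, 2.0, v)    # Fact 3: x' = -v t'    =>  x = 0
_, t4 = lorentz(5.0, -v * 5.0, v)    # Fact 4: t' = -v x'    =>  t = 0

# Round trip: S' -> S -> S' should return the original interval.
x_back, t_back = lorentz_inverse(*lorentz(1.5, -0.7, v), v)

print(t1, x2, x3, t4, x_back, t_back)
```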

The reason why the derivation of Eqs. (11.17) was so quick is that we already did most of the work in Section 11.3 when we derived the fundamental effects. If we wanted to derive the Lorentz transformations from scratch, that is, by starting with the two postulates in Section 11.2, then the derivation would be longer. In Appendix I we give such a derivation, where it is clear what information comes from each of the postulates. The procedure there is somewhat cumbersome, but it’s worth taking a look at, because we will invoke the result in a very cool way in Section 11.10.


Remarks:

1. In the limit v ≪ c (or more precisely, in the limit vx′/c² ≪ t′, which means that even if v is small, we have to be careful that x′ isn’t too large), Eqs. (11.17) reduce to x = x′ + vt and t = t′, which are the good old Galilean transformations. This must be the case, because we know from everyday experience (where v ≪ c) that the Galilean transformations work just fine.

2. Equations (11.17) exhibit a nice symmetry between x and ct. With β ≡ v/c, we have

$$\begin{aligned} x &= \gamma\big(x' + \beta(ct')\big), \\ ct &= \gamma\big((ct') + \beta x'\big). \end{aligned} \tag{11.20}$$

Equivalently, in units where c = 1 (for example, where one unit of distance equals 3 · 10⁸ meters, or where one unit of time equals 1/(3 · 10⁸) seconds), Eqs. (11.17) take the symmetric form,

$$\begin{aligned} x &= \gamma(x' + vt'), \\ t &= \gamma(t' + vx'). \end{aligned} \tag{11.21}$$

3. In matrix form, Eq. (11.20) can be written as

$$\begin{pmatrix} x \\ ct \end{pmatrix} = \begin{pmatrix} \gamma & \gamma\beta \\ \gamma\beta & \gamma \end{pmatrix} \begin{pmatrix} x' \\ ct' \end{pmatrix}. \tag{11.22}$$

This looks similar to a rotation matrix. More about this in Section 11.9, and in Problem 11.27.

4. We did the above derivation in terms of a primed and an unprimed system. But when you’re doing problems, it’s usually best to label your coordinates with subscripts such as A for Alice, or T for train. In addition to being more informative, this notation is less likely to make you think that one frame is more fundamental than the other.

5. It’s easy to get confused about the sign on the right-hand side of the Lorentz transformations. To figure out if it should be a plus or a minus, write down x_A = γ(x_B ± vt_B), and then imagine sitting in system A and looking at a fixed point in B. This fixed point satisfies (putting the Δ’s back in to avoid any mixup) Δx_B = 0, which gives Δx_A = ±γv Δt_B. So if the point moves to the right (that is, if it increases as time increases), then pick the “+.” And if it moves to the left, then pick the “−.” In other words, the sign is determined by which way A (the person associated with the coordinates on the left-hand side of the equation) sees B (ditto for the right-hand side) moving.

6. One very important thing we must check is that two successive Lorentz transformations (from S₁ to S₂ and then from S₂ to S₃) again yield a Lorentz transformation (from S₁ to S₃). This must be true because we showed that any two frames must be related by Eqs. (11.17). If we composed two L.T.’s (along the same direction) and found that the transformation from S₁ to S₃ was not of the form of Eqs. (11.17), for some new v, then the whole theory would be inconsistent, and we would have to drop one of our postulates [footnote 27]. You can show that the combination of an L.T. (with speed v₁) and an L.T. (with speed v₂) does indeed yield an L.T., and it has speed (v₁ + v₂)/(1 + v₁v₂/c²). This is the task of Exercise 11.47, and also Problem 11.27 (which is stated in terms of rapidity, introduced in Section 11.9). This resulting speed is one that we’ll see again when we get to the velocity-addition formula in Section 11.5.1. ♣

27 This statement is true only for the composition of two L.T.’s in the same direction. If we composed an L.T. in the x direction with one in the y direction, the result would interestingly not be an L.T. along some new direction, but rather the composition of an L.T. along some direction and a rotation through some angle. This rotation results in what is known as the Thomas precession. See the appendix of Muller (1992) for a quick derivation of the Thomas precession. For further discussion, see Costella et al. (2001) and Rebilas (2002).
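The composition property in remark 6 can be checked numerically with the matrix form of remark 3. This is a sketch only: the helper names `boost` and `matmul` and the sample values β₁ = β₂ = 1/2 are illustrative assumptions, and a numerical check for one pair of speeds is of course not a proof.

```python
import math

def boost(beta):
    """2x2 matrix of Eq. (11.22), acting on the column (x, ct)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return ((g, g * beta), (g * beta, g))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

beta1, beta2 = 0.5, 0.5                      # sample speeds in units of c

composed = matmul(boost(beta1), boost(beta2))

# Claimed speed of the combined transformation: (v1 + v2)/(1 + v1 v2/c^2).
beta_total = (beta1 + beta2) / (1.0 + beta1 * beta2)
single = boost(beta_total)

max_diff = max(
    abs(composed[i][j] - single[i][j]) for i in range(2) for j in range(2)
)
print(f"beta_total = {beta_total}, largest matrix discrepancy = {max_diff:.2e}")
```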

Example: A train with proper length L moves at speed 5c/13 with respect to the ground. A ball is thrown from the back of the train to the front. The speed of the ball with respect to the train is c/3. As viewed by someone on the ground, how much time does the ball spend in the air, and how far does it travel?

Solution: The γ factor associated with the speed 5c/13 is γ = 13/12. The two events we are concerned with are “ball leaving back of train” and “ball arriving at front of train.” The spacetime separation between these events is easy to calculate on the train. We have Δx_T = L, and Δt_T = L/(c/3) = 3L/c. The Lorentz transformations giving the coordinates on the ground are therefore

$$\begin{aligned} x_G &= \gamma(x_T + v t_T) = \frac{13}{12}\left(L + \left(\frac{5c}{13}\right)\left(\frac{3L}{c}\right)\right) = \frac{7L}{3}, \\ t_G &= \gamma(t_T + v x_T/c^2) = \frac{13}{12}\left(\frac{3L}{c} + \frac{(5c/13)L}{c^2}\right) = \frac{11L}{3c}. \end{aligned} \tag{11.23}$$
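The numbers in Eq. (11.23) can be reproduced with exact fractions, together with an independent cross-check: the ball’s ground-frame speed x_G/t_G should match the relativistic combination of c/3 and 5c/13 (the formula quoted in remark 6 above). The units (L for length, L/c for time) and the variable names are choices made for this sketch.

```python
from fractions import Fraction

beta = Fraction(5, 13)       # train speed in units of c
gamma = Fraction(13, 12)     # satisfies gamma^2 (1 - beta^2) = 1

# Separation of the two events in the train frame.
x_T = Fraction(1)            # in units of L
t_T = Fraction(3)            # in units of L/c

# Lorentz transformation to the ground frame, Eq. (11.17) with c = 1.
x_G = gamma * (x_T + beta * t_T)
t_G = gamma * (t_T + beta * x_T)

# Cross-check: ball speed on the ground from the combination formula.
beta_ball = Fraction(1, 3)
u_ground = (beta_ball + beta) / (1 + beta_ball * beta)

print(x_G, t_G, x_G / t_G, u_ground)
```

Both routes give a ground-frame ball speed of 7c/11.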

In a given problem, such as the above example, one of the frames usually allows for a quick calculation of Δx and Δt, so you simply have to mechanically plug these quantities into the L.T.’s to obtain Δx′ and Δt′ in the other frame, where they may not be as obvious.

Relativity is a subject in which there are usually many ways to do a problem. If you’re trying to find some Δx’s and Δt’s, then you can use the L.T.’s, or perhaps the invariant interval (introduced in Section 11.6), or maybe a velocity-addition approach (introduced in Section 11.5.1), or even the sending-of-light-signals strategy used in Section 11.3. Depending on the specific problem and what your personal preferences are, certain approaches will be more enjoyable than others. But no matter which method you choose, you should take advantage of the plethora of possibilities by picking a second method to double-check your answer. Personally, I find the L.T.’s to be the perfect option for this, because the other methods are generally more fun when solving a problem for the first time, while the L.T.’s are usually quick and easy to apply (perfect for a double-check) [footnote 28].

The excitement will build in your voice,

As you rise from your seat and rejoice,

“A Lorentz transformation

Provides information,

As an alternate method of choice!”

28 I would, however, be very wary of solving a problem using only the L.T.’s, with no other check, because it’s very easy to mess up a sign in the transformations. And since there’s nothing to do except mechanically plug in numbers, there’s not much opportunity for an intuitive check, either.


11.4.2 The fundamental effects

Let’s now see how the Lorentz transformations imply the three fundamental effects (namely, loss of simultaneity, time dilation, and length contraction) discussed in Section 11.3. Of course, we just used these effects to derive the L.T.’s, so we know everything will work out. We’ll just be going in circles. But since these fundamental effects are, well, fundamental, let’s belabor the point and discuss them one more time, with the starting point being the L.T.’s.

Loss of simultaneity

Let two events occur simultaneously in frame S′. Then the separation between them, as measured by S′, is (x′, t′) = (x′, 0). As usual, we are not bothering to write the Δ’s in front of the coordinates. Using the second of Eqs. (11.17), we see that the time between the events, as measured by S, is t = γvx′/c². This is not equal to zero (unless x′ = 0). Therefore, the events do not occur simultaneously in frame S.

Time dilation

Consider two events that occur in the same place in S′. Then the separation between them is (x′, t′) = (0, t′). Using the second of Eqs. (11.17), we see that the time between the events, as measured by S, is

$$t = \gamma t' \quad (\text{if } x' = 0). \tag{11.24}$$

The factor γ is greater than or equal to 1, so t ≥ t′. The passing of one second on S′’s clock takes more than one second on S’s clock. S sees S′ drinking his coffee very slowly.

The same strategy works if we interchange S and S′. Consider two events that occur in the same place in S. The separation between them is (x, t) = (0, t). Using the second of Eqs. (11.19), we see that the time between the events, as measured by S′, is

$$t' = \gamma t \quad (\text{if } x = 0). \tag{11.25}$$

Therefore, t′ ≥ t. Another way to derive this is to use the first of Eqs. (11.17) to write x′ = −vt′, and then substitute this into the second equation.

Remark: If we write down the two above equations by themselves, t = γt′ and t′ = γt, they appear to contradict each other. This apparent contradiction arises from the omission of the conditions they are based on. The former equation is based on the assumption that x′ = 0. The latter equation is based on the assumption that x = 0. They have nothing to do with each other. It would perhaps be better to write the equations as

$$(t = \gamma t')_{x'=0}, \quad\text{and}\quad (t' = \gamma t)_{x=0}, \tag{11.26}$$

but this is somewhat cumbersome. ♣


Length contraction

This proceeds just like the time dilation above, except that now we want to set certain time intervals equal to zero, instead of certain space intervals. We want to do this because to measure a length, we calculate the distance between two points whose positions are measured simultaneously. That’s what a length is.

Consider a stick at rest in S′, where it has length ℓ′. We want to find the length ℓ in S. Simultaneous measurements of the coordinates of the ends of the stick in S yield a separation of (x, t) = (x, 0). Using the first of Eqs. (11.19), we have

$$x' = \gamma x \quad (\text{if } t = 0). \tag{11.27}$$

But x is by definition the length in S. And x′ is the length in S′, because the stick isn’t moving in S′ [footnote 29]. Therefore, ℓ = ℓ′/γ. And since γ ≥ 1, we have ℓ ≤ ℓ′, so S sees the stick shorter than S′ sees it.

Now interchange S and S′. Consider a stick at rest in S, where it has length ℓ. We want to find the length in S′. Measurements of the coordinates of the ends of the stick in S′ yield a separation of (x′, t′) = (x′, 0). Using the first of Eqs. (11.17), we have

$$x = \gamma x' \quad (\text{if } t' = 0). \tag{11.28}$$

But x′ is by definition the length in S′. And x is the length in S, because the stick is not moving in S. Therefore, ℓ′ = ℓ/γ, so ℓ′ ≤ ℓ.

Remark: As with time dilation, if we write down the two equations by themselves, ℓ = ℓ′/γ and ℓ′ = ℓ/γ, they appear to contradict each other. But as before, this apparent contradiction arises from the omission of the conditions they are based on. The former equation is based on the assumptions that t = 0 and that the stick is at rest in S′. The latter equation is based on the assumptions that t′ = 0 and that the stick is at rest in S. They have nothing to do with each other. We should really write,

$$(x = x'/\gamma)_{t=0}, \quad\text{and}\quad (x' = x/\gamma)_{t'=0}, \tag{11.29}$$

and then identify x′ in the first equation with ℓ′ only after invoking the further assumption that the stick is at rest in S′. Likewise for the second equation. But this is a pain. ♣
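The three effects can be read off from the transformations with concrete numbers. The sketch below assumes v = 3c/5 (so γ = 5/4) and units where c = 1; the function names `to_S` and `to_S_prime` are illustrative.

```python
from fractions import Fraction

BETA = Fraction(3, 5)     # speed of S' relative to S, in units of c
GAMMA = Fraction(5, 4)    # 1/sqrt(1 - 9/25)

def to_S(x_p, t_p):
    """Eq. (11.17): interval (x', t') in S' -> (x, t) in S, with c = 1."""
    return GAMMA * (x_p + BETA * t_p), GAMMA * (t_p + BETA * x_p)

def to_S_prime(x, t):
    """Eq. (11.19): interval (x, t) in S -> (x', t') in S', with c = 1."""
    return GAMMA * (x - BETA * t), GAMMA * (t - BETA * x)

# Loss of simultaneity: simultaneous in S' (t' = 0), separated by x' = 4.
_, dt_simul = to_S(Fraction(4), Fraction(0))       # t = gamma v x' = 3

# Time dilation: same place in S' (x' = 0), separated by t' = 1.
_, dt_dilated = to_S(Fraction(0), Fraction(1))     # t = gamma t' = 5/4

# Length contraction: S marks both ends of a stick (at rest in S') at t = 0
# and finds them x = 4 apart. In S' the same two events are separated by:
x_p, t_p = to_S_prime(Fraction(4), Fraction(0))    # x' = gamma x = 5, t' = -3

print(dt_simul, dt_dilated, x_p, t_p)
```

The nonzero t′ in the last line illustrates that the two measurements made by S are not simultaneous in S′, yet x′ = 5 is still the stick’s rest length because the stick does not move in S′.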

11.5 Velocity addition

11.5.1 Longitudinal velocity addition

Consider the following setup. An object moves at speed v₁ with respect to frame S′. And frame S′ moves at speed v₂ with respect to frame S, in the same direction as the motion of the object (see Fig. 11.18). What is the speed, u, of the object with respect to frame S?

[Fig. 11.18: an object moving at speed v₁ with respect to frame S′, which moves at speed v₂ with respect to frame S.]

29 The measurements of the ends made by S are not simultaneous in the S′ frame. In the S′ frame, the separation between the events is (x′, t′), where both x′ and t′ are nonzero. This doesn’t satisfy our definition of a length measurement in S′ (because t′ ≠ 0), but the stick isn’t moving in S′, so S′ can measure the ends whenever he feels like it, and he will always get the same difference. So x′ is indeed the length in the S′ frame.


The Lorentz transformations can be used to easily answer this question. The relative speed of the frames is v₂. Consider two events along the object’s path (for example, say it makes two beeps). We are given that Δx′/Δt′ = v₁. Our goal is to find u ≡ Δx/Δt. The Lorentz transformations from S′ to S, Eqs. (11.17), are

$$\Delta x = \gamma_2(\Delta x' + v_2\,\Delta t'), \quad\text{and}\quad \Delta t = \gamma_2(\Delta t' + v_2\,\Delta x'/c^2), \tag{11.30}$$

where $\gamma_2 \equiv 1/\sqrt{1 - v_2^2/c^2}$. Therefore,

$$u \equiv \frac{\Delta x}{\Delta t} = \frac{\Delta x' + v_2\,\Delta t'}{\Delta t' + v_2\,\Delta x'/c^2} = \frac{\Delta x'/\Delta t' + v_2}{1 + v_2(\Delta x'/\Delta t')/c^2} = \frac{v_1 + v_2}{1 + v_1 v_2/c^2}. \tag{11.31}$$

This is the velocity-addition formula, for adding velocities along the same line. Let’s look at some of its properties.

• It is symmetric with respect to v₁ and v₂, as it should be, because we could switch the roles of the object and frame S.

• For v₁v₂ ≪ c², it reduces to u ≈ v₁ + v₂, which we know holds perfectly well for everyday speeds.

• If v₁ = c or v₂ = c, then we find u = c, as should be the case, because anything that moves with speed c in one frame moves with speed c in another.

• The maximum (or minimum) of u in the region −c ≤ v₁, v₂ ≤ c equals c (or −c), which can be seen by noting that ∂u/∂v₁ and ∂u/∂v₂ are never zero in the interior of the region.

If you take any two velocities that are less than c and add them according to Eq. (11.31), then you will obtain a velocity that is again less than c. This shows that no matter how much you keep accelerating an object (that is, no matter how many times you give the object a speed v₁ with respect to the frame moving at speed v₂ that it was just in), you can’t bring the speed up to the speed of light. We’ll give another argument for this result in Chapter 12 when we discuss energy.
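The properties listed above are easy to probe with exact arithmetic. In this sketch, speeds are expressed as fractions of c, and the function name `add_velocities` and the sample values are illustrative assumptions.

```python
from fractions import Fraction

def add_velocities(b1, b2):
    """Eq. (11.31) with speeds measured in units of c."""
    return (b1 + b2) / (1 + b1 * b2)

half = Fraction(1, 2)

# Two speeds of c/2 combine to 4c/5, not c.
half_plus_half = add_velocities(half, half)

# Symmetry in the two arguments.
symmetric = add_velocities(Fraction(1, 3), Fraction(2, 7)) == add_velocities(
    Fraction(2, 7), Fraction(1, 3)
)

# Anything moving at c in one frame moves at c in the other.
light = add_velocities(Fraction(1), Fraction(3, 5))

# Repeated boosts of c/2 relative to the previous rest frame never reach c.
speed = Fraction(0)
for _ in range(20):
    speed = add_velocities(speed, half)

print(half_plus_half, symmetric, light, float(speed))
```

After 20 boosts the speed is extremely close to c but still strictly below it.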

For a bullet, a train, and a gun,

Adding the speeds can be fun.

Take a trip down the path

Paved with Einstein’s new math,

Where a half plus a half isn’t one.

Remark: Consider the two scenarios shown in Fig. 11.19. If the goal is to find the velocity of A with respect to C, then the velocity-addition formula applies to both scenarios, because the second scenario is the same as the first one, as observed in B’s frame.

[Fig. 11.19: two scenarios involving A, B, and C, with speeds v₁ and v₂.]


The velocity-addition formula applies when we ask, “If A moves at v₁ with respect to B, and B moves at v₂ with respect to C (which means, of course, that C moves at speed v₂ with respect to B), then how fast does A move with respect to C?” The formula does not apply if we ask the more mundane question, “What is the relative speed of A and C, as viewed by B?” The answer to this is just v₁ + v₂.

In short, if the two velocities are given with respect to the same observer, say B, and if you are asking for the relative velocity as measured by B, then you simply have to add the velocities [footnote 30]. But if you are asking for the relative velocity as measured by A or C, then you have to use the velocity-addition formula. It makes no sense to add velocities that are measured with respect to different observers. Doing so would involve adding things that are measured in different coordinate systems, which is meaningless. In other words, taking the velocity of A with respect to B and adding it to the velocity of B with respect to C, hoping to obtain the velocity of A with respect to C, is invalid. ♣

Example (Passing trains, again): Consider again the scenario in the “Passing trains” example in Section 11.3.3.

(a) How long, as viewed by A and as viewed by B, does it take for A to overtake B?

(b) Let event E₁ be “the front of A passing the back of B”, and let event E₂ be “the back of A passing the front of B.” Person D walks at constant speed from the back of B to the front (see Fig. 11.20), such that he coincides with both events E₁ and E₂. How long does the “overtaking” process take, as viewed by D?

[Fig. 11.20: trains A (speed 4c/5) and B (speed 3c/5), person C on the ground, and person D walking along train B.]

Solution:

(a) First consider B’s point of view. From the velocity-addition formula, B sees A move with speed

$$u = \frac{\frac{4c}{5} - \frac{3c}{5}}{1 - \frac{4}{5}\cdot\frac{3}{5}} = \frac{5c}{13}. \tag{11.32}$$

The γ factor associated with this speed is γ = 13/12. Therefore, B sees A’s train contracted to a length 12L/13. During the overtaking, A must travel a distance equal to the sum of the lengths of the trains in B’s frame (see Fig. 11.21), which is L + 12L/13 = 25L/13. Since A moves at speed 5c/13, the total time in B’s frame is

$$t_B = \frac{25L/13}{5c/13} = \frac{5L}{c}. \tag{11.33}$$

The exact same reasoning holds from A’s point of view, so we have t_A = t_B = 5L/c.

[Fig. 11.21: B’s frame at the start and at the end of the overtaking, with A moving at 5c/13.]

30 Note that the resulting speed can certainly be greater than c. If I see a ball heading toward me at 0.9c from the right, and another one heading toward me at 0.9c from the left, then the relative speed of the balls in my frame is 1.8c. In the frame of one of the balls, however, the relative speed is (1.8/1.81)c ≈ (0.9945)c, from Eq. (11.31).


[Fig. 11.22: D’s frame at the start and at the end of the overtaking; trains A and B move past D with equal and opposite speeds v.]

(b) Look at things from D’s point of view. D is at rest, and the two trains move

with equal and opposite speeds v (see Fig. 11.22), because otherwise the second

event E2 wouldn’t be located at D. The relativistic addition of v with itself is

the speed of A as viewed by B. But from part (a), we know that this relative

speed equals 5c/13. Therefore,

2v

1 + v2/c2= 5c

13=⇒ v = c

5, (11.34)

where we have ignored the unphysical solution, v = 5c. Theγ factor associated

with v = c/5 is γ = 5/(2√

6 ). So D sees both trains contracted to a length

2√

6L/5. During the overtaking, each train must travel a distance equal to its

length, because both events, E1 and E2, take place right at D. The total time

in D’s frame is therefore

tD = 2√

6L/5

c/5= 2

√6L

c. (11.35)

Remarks: There are a few double-checks we can perform. The speed of D with respect to the ground can be obtained either via B’s frame by relativistically adding $3c/5$ and $c/5$, or via A’s frame by subtracting $c/5$ from $4c/5$. These both give the same answer, namely $5c/7$, as they must. (The $c/5$ speed can in fact be determined by this reasoning, instead of using Eq. (11.34).) The $\gamma$ factor between the ground and D is therefore $7/(2\sqrt{6})$. We can then use time dilation to say that someone on the ground sees the overtaking take a time of $\bigl(7/(2\sqrt{6})\bigr)\,t_D$ (we can say this because both events happen right at D). Using Eq. (11.35), this gives a ground-frame time of $7L/c$, in agreement with Eq. (11.15). Likewise, the $\gamma$ factor between D and either train is $5/(2\sqrt{6})$. So the time of the overtaking as viewed by either A or B is $\bigl(5/(2\sqrt{6})\bigr)\,t_D = 5L/c$, in agreement with Eq. (11.33).

Note that we cannot use simple time dilation to relate the ground to A or B, because the two events don’t happen at the same place in the train frames. But since both events happen at the same place in D’s frame, namely right at D, it’s legal to use time dilation to go from D’s frame to any other frame. ♣
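
The double-checks in the remarks above can be reproduced numerically. The following is a minimal sketch, assuming units in which speeds are measured in units of c and times in units of L/c; the helper names `add_speeds` and `gamma` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
import math

def add_speeds(b1, b2):
    """Relativistic addition of collinear speeds (units of c), Eq. (11.31)."""
    return (b1 + b2) / (1 + b1 * b2)

def gamma(b):
    """Lorentz factor for a speed b given in units of c."""
    return 1.0 / math.sqrt(1.0 - float(b) ** 2)

# Speeds from the example, as exact rationals in units of c.
beta_A = Fraction(4, 5)   # train A relative to the ground
beta_B = Fraction(3, 5)   # train B relative to the ground
beta_D = Fraction(1, 5)   # each train relative to D, Eq. (11.34)

# Adding c/5 to itself should reproduce the A-B relative speed, 5c/13.
rel_AB = add_speeds(beta_D, beta_D)

# Speed of D relative to the ground, obtained via B's frame and via A's frame.
d_via_B = add_speeds(beta_B, beta_D)
d_via_A = add_speeds(beta_A, -beta_D)

# Times in units of L/c, obtained from t_D by time dilation.
t_D = 2 * math.sqrt(6)                # Eq. (11.35)
t_ground = gamma(d_via_B) * t_D       # should agree with 7 L/c
t_train = gamma(beta_D) * t_D         # should agree with 5 L/c

print(rel_AB, d_via_B, d_via_A)
print(t_ground, t_train)
```

Exact fractions are used for the speeds so that the two routes to the speed of D can be compared with `==`; the times involve $\sqrt{6}$ and are therefore compared only up to floating-point rounding.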

11.5.2 Transverse velocity addition

Consider the following general two-dimensional situation. An object moves with velocity $(u'_x, u'_y)$ with respect to frame $S'$. And frame $S'$ moves with speed $v$ with respect to frame $S$, in the x direction (see Fig. 11.23). What is the velocity, $(u_x, u_y)$, of the object with respect to frame $S$?

Fig. 11.23 (frame S′ moves at speed v along x relative to frame S; the object has velocity components u′x and u′y in S′)

The existence of motion in the y direction doesn’t affect the preceding derivation of the speed in the x direction, so Eq. (11.31) is still valid. In the present notation, it becomes

$$u_x = \frac{u'_x + v}{1 + u'_x v/c^2}. \tag{11.36}$$


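To find $u_y$, we can again make easy use of the Lorentz transformations. Consider two events along the object’s path. We are given that $\Delta x'/\Delta t' = u'_x$, and $\Delta y'/\Delta t' = u'_y$. Our goal is to find $u_y \equiv \Delta y/\Delta t$. The relevant Lorentz transformations from $S'$ to $S$ in Eq. (11.17) are

$$\Delta y = \Delta y', \quad\text{and}\quad \Delta t = \gamma\,(\Delta t' + v\,\Delta x'/c^2). \tag{11.37}$$

Therefore,

$$u_y \equiv \frac{\Delta y}{\Delta t} = \frac{\Delta y'}{\gamma\,(\Delta t' + v\,\Delta x'/c^2)} = \frac{\Delta y'/\Delta t'}{\gamma\,\bigl(1 + v(\Delta x'/\Delta t')/c^2\bigr)} = \frac{u'_y}{\gamma\,(1 + u'_x v/c^2)}. \tag{11.38}$$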
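
Equations (11.36) and (11.38) together can be packaged as one small function. This is a sketch under the assumption that all speeds are given in the same units as `c` (default `c = 1`); the function name `transform_velocity` is a hypothetical choice made for illustration.

```python
import math

def transform_velocity(ux_p, uy_p, v, c=1.0):
    """Velocity (ux, uy) in S of an object that has velocity (ux_p, uy_p) in S',
    where S' moves at speed v along x relative to S.
    Implements Eqs. (11.36) and (11.38)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)    # gamma factor for the frame speed v
    denom = 1.0 + ux_p * v / c ** 2            # common denominator of both formulas
    ux = (ux_p + v) / denom                    # longitudinal component, Eq. (11.36)
    uy = uy_p / (g * denom)                    # transverse component, Eq. (11.38)
    return ux, uy

# Purely transverse motion in S' (ux_p = 0): the y speed is reduced by gamma.
print(transform_velocity(0.0, 0.5, 0.6))

# A light pulse moving along y' in S' should still have speed c in S.
ux, uy = transform_velocity(0.0, 1.0, 0.6)
print(math.hypot(ux, uy))
```

The second call is a useful sanity check: a transverse light pulse in S′ picks up an x component in S, but its total speed remains c.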

Remark: In the special case where $u'_x = 0$, we have $u_y = u'_y/\gamma$. When $u'_y$ is small and $v$ is large, this result can be seen to be a special case of time dilation, in the following way. Consider a series of equally spaced lines parallel to the x axis (see Fig. 11.24). Imagine that the object’s clock ticks once every time it crosses a line. Since $u'_y$ is small, the object’s frame is essentially frame $S'$. So if $S$ flies by to the left, then the object is essentially moving at speed $v$ with respect to $S$. Therefore, $S$ sees the clock run slow by a factor $\gamma$. This means that $S$ sees the object cross the lines at a slower rate, by a factor $\gamma$ (because the clock still ticks once every time it crosses a line; this is a frame-independent statement). Since distances in the y direction are the same in the two frames, we conclude that $u_y = u'_y/\gamma$. This $\gamma$ factor will be very important when we deal with momentum in Chapter 12.

Fig. 11.24 (the equally spaced lines parallel to the x axis, drawn for frames S′ and S with relative speed v)

To sum up: if you run in the x direction past an object, then its y speed is slower in your frame (or faster, depending on the relative sign of $u'_x$ and $v$). Strange indeed, but no stranger than other effects we’ve seen. Problem 11.16 deals with the special case where $u'_x = 0$, but where $u'_y$ is not necessarily small. ♣

11.6 The invariant interval

Consider the quantity,

$$(\Delta s)^2 \equiv c^2(\Delta t)^2 - (\Delta x)^2. \tag{11.39}$$

Technically, we should also subtract off $(\Delta y)^2$ and $(\Delta z)^2$, but nothing exciting happens in the transverse directions, so we’ll ignore them. Using Eq. (11.17), we can write $(\Delta s)^2$ in terms of the $S'$ coordinates, $\Delta x'$ and $\Delta t'$. The result is (dropping the $\Delta$’s)

$$\begin{aligned}
c^2t^2 - x^2 &= \frac{c^2(t' + vx'/c^2)^2}{1 - v^2/c^2} - \frac{(x' + vt')^2}{1 - v^2/c^2} \\
&= \frac{t'^2(c^2 - v^2) - x'^2(1 - v^2/c^2)}{1 - v^2/c^2} \\
&= c^2t'^2 - x'^2.
\end{aligned} \tag{11.40}$$


We see that the Lorentz transformations imply that the quantity $c^2t^2 - x^2$ doesn’t depend on the frame. This result is more than we bargained for, for the following reason. The speed-of-light postulate says that if $c^2t'^2 - x'^2 = 0$, then $c^2t^2 - x^2 = 0$. But Eq. (11.40) says that if $c^2t'^2 - x'^2 = b$, then $c^2t^2 - x^2 = b$, for any value of $b$, not just zero. This is, as you might guess, very useful. There are enough things that change when we go from one frame to another, so it’s nice to have a frame-independent quantity that we can hang on to. The fact that $s^2$ is invariant under Lorentz transformations of $x$ and $t$ is exactly analogous to the fact that $r^2$ is invariant under rotations in the x-y plane. The coordinates themselves change under the transformation, but the special combination of $c^2t^2 - x^2$ for Lorentz transformations, or $x^2 + y^2$ for rotations, remains the same. All inertial observers agree on the value of $s^2$, independent of what they measure for the actual coordinates.
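
The frame independence of $c^2t^2 - x^2$ is easy to confirm numerically. The sketch below assumes the $S' \to S$ form of the Lorentz transformations used in Eq. (11.40), namely $x = \gamma(x' + vt')$ and $t = \gamma(t' + vx'/c^2)$, and works in units with `c = 1` by default; the function names are illustrative.

```python
import math

def boost_to_S(x_p, t_p, v, c=1.0):
    """Transform a separation (x', t') in S' to (x, t) in S,
    where S' moves at speed v along x relative to S."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x = g * (x_p + v * t_p)
    t = g * (t_p + v * x_p / c ** 2)
    return x, t

def interval_squared(x, t, c=1.0):
    """The quantity s^2 = c^2 t^2 - x^2 of Eq. (11.39)."""
    return (c * t) ** 2 - x ** 2

# One fixed separation in S', viewed from frames with several relative speeds.
x_p, t_p = 3.0, 5.0
for v in (0.0, 0.3, 0.6, 0.9, -0.9):
    x, t = boost_to_S(x_p, t_p, v)
    print(v, x, t, interval_squared(x, t))
```

The coordinates `x` and `t` change from line to line, while the last column stays at the value computed in S′ (up to rounding).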

“Potato?! Potahto!” said she,

“And of course it’s tomahto, you see.

But the square of ct

Minus $x^2$ will be

Always something on which we agree.”

A note on terminology: The separation in the coordinates, $(c\,\Delta t, \Delta x)$, is usually referred to as the spacetime interval, while the quantity $(\Delta s)^2 \equiv c^2(\Delta t)^2 - (\Delta x)^2$ is referred to as the invariant interval (or technically the square of the invariant interval). At any rate, just call it $s^2$, and people will know what you mean. The invariance of $s^2$ is actually just a special case of more general results involving inner products and 4-vectors, which we’ll discuss in Chapter 13. Let’s now look at the physical significance of $s^2 \equiv c^2t^2 - x^2$; there are three cases to consider.

Case 1: $s^2 > 0$ (timelike separation)

In this case, we say that the two events are timelike separated. We have $c^2t^2 > x^2$, and so $|x/t| < c$. Consider a frame $S'$ moving at speed $v$ with respect to $S$. The Lorentz transformation for $x'$ is

$$x' = \gamma\,(x - vt). \tag{11.41}$$

Since $|x/t| < c$, there exists a $v$ that is less than $c$ (namely $v = x/t$) that makes $x' = 0$. In other words, if two events are timelike separated, it is possible to find a frame $S'$ in which the two events happen at the same place. (In short, the condition $|x/t| < c$ means that it is possible for a particle to travel from one event to the other.) The invariance of $s^2$ then gives $s^2 = c^2t'^2 - x'^2 = c^2t'^2$. So we see that $s/c$ is the time between the events in the frame in which the events occur at the same place. This time is called the proper time.


Case 2: $s^2 < 0$ (spacelike separation)

In this case, we say that the two events are spacelike separated.31 We have $c^2t^2 < x^2$, and so $|t/x| < 1/c$. Consider a frame $S'$ moving at speed $v$ with respect to $S$. The Lorentz transformation for $t'$ is

$$t' = \gamma\,(t - vx/c^2). \tag{11.42}$$

Since $|t/x| < 1/c$, there exists a $v$ that is less than $c$ (namely $v = c^2t/x$) that makes $t' = 0$. In other words, if two events are spacelike separated, it is possible to find a frame $S'$ in which the two events happen at the same time. (This statement is not as easy to see as the corresponding one in the timelike case above. But if you draw a Minkowski diagram, described in the next section, it becomes clear.) The invariance of $s^2$ then gives $s^2 = c^2t'^2 - x'^2 = -x'^2$. So we see that $|s|$ is the distance between the events in the frame in which the events occur at the same time. This distance is called the proper distance, or proper length.

Case 3: $s^2 = 0$ (lightlike separation)

In this case, we say that the two events are lightlike separated. We have $c^2t^2 = x^2$, and so $|x/t| = c$. This holds in every frame, so in every frame a photon emitted at one of the events will arrive at the other. It is not possible to find a frame $S'$ in which the two events happen at the same place or the same time, because the frame would have to travel at the speed of light.
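
The three cases can be summarized in a short classifier. This is a sketch only: the function name `classify_separation` and its return convention (kind, speed of the special frame, proper time or proper distance) are assumptions made for illustration, and the exact comparison with zero is reliable only for inputs that are exactly representable.

```python
import math

def classify_separation(dx, dt, c=1.0):
    """Classify the separation (dx, dt) between two events by the sign of s^2.

    Returns (kind, v, value):
      timelike  -> v = dx/dt makes x' = 0;      value is the proper time s/c
      spacelike -> v = c^2 dt/dx makes t' = 0;  value is the proper distance |s|
      lightlike -> no such frame exists (v is None); value is 0
    """
    s2 = (c * dt) ** 2 - dx ** 2
    if s2 > 0:
        return "timelike", dx / dt, math.sqrt(s2) / c
    if s2 < 0:
        return "spacelike", c ** 2 * dt / dx, math.sqrt(-s2)
    return "lightlike", None, 0.0

print(classify_separation(3.0, 5.0))   # timelike separation
print(classify_separation(5.0, 3.0))   # spacelike separation
print(classify_separation(2.0, 2.0))   # lightlike separation
```

In both non-degenerate branches the returned speed has magnitude less than c, which is the content of the statement "there exists a v that is less than c" in Cases 1 and 2.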

Example (Time dilation): An illustration of the usefulness of the invariance of $s^2$ is a derivation of time dilation. Let frame $S'$ move at speed $v$ with respect to frame $S$. Consider two events at the origin of $S'$, separated by time $t'$. The separation between the events is

$$\begin{aligned}
\text{in } S'{:}\quad & (x', t') = (0, t'), \\
\text{in } S{:}\quad & (x, t) = (vt, t).
\end{aligned} \tag{11.43}$$

The invariance of $s^2$ implies $c^2t'^2 - 0 = c^2t^2 - v^2t^2$. Therefore,

$$t = \frac{t'}{\sqrt{1 - v^2/c^2}}. \tag{11.44}$$

This method makes it clear that the time-dilation result rests on the assumption that $x' = 0$.

Example (Passing trains, yet again): Consider again the scenario in the “Passing trains” examples in Sections 11.3.3 and 11.5.1. Verify that the $s^2$ between the events E1 and E2 is the same in all of the frames, A, B, C, and D (see Fig. 11.25).

Fig. 11.25 (trains A and B, ground observer C, and D; A moves at 4c/5 and B at 3c/5)

31 It’s fine that $s^2$ is negative in this case, which means that $s$ is imaginary. We can take the absolute value of $s$ if we want to obtain a real number.


Solution: The only quantity that we’ll need that we haven’t already found in the two examples above is the distance between E1 and E2 in C’s frame (the ground frame). In this frame, train A travels at a rate $4c/5$ for a time $t_C = 7L/c$, covering a distance of $28L/5$. But event E2 occurs at the back of the train, which is a distance $3L/5$ behind the front end (this is the contracted length in the ground frame). Therefore, the distance between events E1 and E2 in the ground frame is $28L/5 - 3L/5 = 5L$. You can also apply the same line of reasoning using train B, in which the $5L$ result takes the form, $(3c/5)(7L/c) + 4L/5$.

Putting the previous results together, we have the following separations between the events in the various frames:

        A        B        C        D
Δt      5L/c     5L/c     7L/c     2√6 L/c
Δx      −L       L        5L       0

From the table, we see that $\Delta s^2 \equiv c^2\Delta t^2 - \Delta x^2 = 24L^2$ for all four frames, as desired. We could have worked backwards, of course, and used the $s^2 = 24L^2$ result from frames A, B, or D, to deduce that $\Delta x = 5L$ in frame C. In Problem 11.10, you are asked to perform the tedious task of checking that the values in the above table satisfy the Lorentz transformations between the six different pairs of frames.
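
The table entries can be checked in a few lines. This sketch assumes units with L = c = 1 and stores $(\Delta t)^2$ rather than $\Delta t$ so that D’s entry, $(2\sqrt{6})^2 = 24$, stays exact. As a partial illustration of the check requested in Problem 11.10, it also transforms the ground-frame separation into the two train frames, using the forms of Eqs. (11.41) and (11.42); the names `separations` and `boost` are illustrative.

```python
from fractions import Fraction
import math

# frame -> ((Δt)^2, Δx), in units where L = c = 1.
separations = {
    "A": (Fraction(25), Fraction(-1)),
    "B": (Fraction(25), Fraction(1)),
    "C": (Fraction(49), Fraction(5)),
    "D": (Fraction(24), Fraction(0)),
}

# s^2 = (Δt)^2 - (Δx)^2 in each frame.
s_squared = {frame: dt2 - dx ** 2 for frame, (dt2, dx) in separations.items()}
print(s_squared)

def boost(dx, dt, beta):
    """Separation seen in a frame moving at speed beta (units of c) along x:
    Δx' = γ(Δx - βΔt), Δt' = γ(Δt - βΔx)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (dx - beta * dt), g * (dt - beta * dx)

# Ground frame C -> train A (speed 4c/5) and train B (speed 3c/5).
dx_A, dt_A = boost(5.0, 7.0, 4 / 5)
dx_B, dt_B = boost(5.0, 7.0, 3 / 5)
print(dx_A, dt_A, dx_B, dt_B)
```

Only two of the six frame pairs are checked here; the remaining pairs follow the same pattern once the relative speeds from the earlier examples are supplied.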

11.7 Minkowski diagrams

Minkowski diagrams (sometimes called “spacetime” diagrams) are extremely useful in seeing how coordinates transform between different reference frames. If you want to produce exact numbers in a problem, you’ll probably have to use one of the strategies we’ve encountered so far. But as far as getting an overall intuitive picture of a setup goes (if there is in fact any such thing as intuition in relativity), there is no better tool than a Minkowski diagram. Here’s how you make one.

Let frame $S'$ move at speed $v$ with respect to frame $S$ (along the x axis, as usual, and ignore the y and z components). Draw the $x$ and $ct$ axes of frame $S$.32 What do the $x'$ and $ct'$ axes of $S'$ look like, superimposed on this diagram? That is, at what angles are the axes inclined, and what is the size of one unit on these axes? (There is no reason why one unit on the $x'$ and $ct'$ axes should have the same length on the paper as one unit on the $x$ and $ct$ axes.) We can answer these questions by using the Lorentz transformations, Eqs. (11.17). We’ll first look at the $ct'$ axis, and then the $x'$ axis.

32 We choose to plot $ct$ instead of $t$ on the vertical axis, so that the trajectory of a light beam lies at a nice 45° angle. Alternatively, we could choose units where $c = 1$.


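$ct'$-axis angle and unit size

Fig. 11.26 (the x′ and ct′ axes superimposed on the orthogonal x and ct axes, tilted inward by angles θ1 and θ2; the points (x′, ct′) = (0, 1) and (x′, ct′) = (1, 0) are marked)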

Look at the point $(x', ct') = (0, 1)$, which lies on the $ct'$ axis, one $ct'$ unit from the origin (see Fig. 11.26). Equations (11.17) tell us that this point is the point $(x, ct) = (\gamma v/c, \gamma)$. The angle between the $ct'$ and $ct$ axes is therefore given by $\tan\theta_1 = x/ct = v/c$. With $\beta \equiv v/c$, we have

$$\tan\theta_1 = \beta. \tag{11.45}$$

Alternatively, the $ct'$ axis is simply the “worldline” of the origin of $S'$. (A worldline is the path an object takes as it travels through spacetime.) The origin moves at speed $v$ with respect to $S$. Therefore, points on the $ct'$ axis satisfy $x/t = v$, or $x/ct = v/c$.

On the paper, the point $(x', ct') = (0, 1)$, which we just found to be the point $(x, ct) = (\gamma v/c, \gamma)$, is a distance $\gamma\sqrt{1 + v^2/c^2}$ from the origin. Therefore, using the definitions of $\beta$ and $\gamma$, we see that

$$\frac{\text{one } ct' \text{ unit}}{\text{one } ct \text{ unit}} = \sqrt{\frac{1 + \beta^2}{1 - \beta^2}}, \tag{11.46}$$

as measured on a grid where the $x$ and $ct$ axes are orthogonal. This ratio approaches infinity as $\beta \to 1$. And it of course equals 1 if $\beta = 0$.

$x'$-axis angle and unit size

The same basic argument holds here. Look at the point $(x', ct') = (1, 0)$, which lies on the $x'$ axis, one $x'$ unit from the origin (see Fig. 11.26). Equations (11.17) tell us that this point is the point $(x, ct) = (\gamma, \gamma v/c)$. The angle between the $x'$ and $x$ axes is therefore given by $\tan\theta_2 = ct/x = v/c$. So, as in the $ct'$-axis case,

$$\tan\theta_2 = \beta. \tag{11.47}$$

On the paper, the point $(x', ct') = (1, 0)$, which we just found to be the point $(x, ct) = (\gamma, \gamma v/c)$, is a distance $\gamma\sqrt{1 + v^2/c^2}$ from the origin. So, as in the $ct'$-axis case,

$$\frac{\text{one } x' \text{ unit}}{\text{one } x \text{ unit}} = \sqrt{\frac{1 + \beta^2}{1 - \beta^2}}, \tag{11.48}$$

as measured on a grid where the $x$ and $ct$ axes are orthogonal. Both the $x'$ and $ct'$ axes are therefore stretched by the same factor, and tilted in by the same angle, relative to the $x$ and $ct$ axes. This “squeezing in” of the axes in a Lorentz transformation is different from what happens in a rotation, where both axes rotate in the same direction.
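
The tilt angle and unit size of the primed axes can be computed directly from the images of the unit points. This is a sketch, assuming $\beta = v/c$ is given and the unprimed axes are orthogonal with equal scales; the function name `primed_axis_geometry` is a hypothetical choice.

```python
import math

def primed_axis_geometry(beta):
    """Return (theta, unit) for the primed axes on an orthogonal (x, ct) grid.

    theta: tilt of the ct' axis away from the ct axis (radians); by symmetry
           the x' axis is tilted by the same angle away from the x axis.
    unit:  on-paper length of one ct' (or x') unit, in ct (or x) units.
    """
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    # Image of the point (x', ct') = (0, 1): (x, ct) = (gamma*beta, gamma).
    x, ct = g * beta, g
    theta = math.atan2(x, ct)     # tan(theta) = x/ct = beta, Eq. (11.45)
    unit = math.hypot(x, ct)      # gamma * sqrt(1 + beta^2), Eq. (11.46)
    return theta, unit

for beta in (0.0, 0.6, 0.99):
    theta, unit = primed_axis_geometry(beta)
    print(beta, math.degrees(theta), unit)
```

As $\beta$ approaches 1 the printed angle approaches 45° and the unit size grows without bound, matching the limits stated above.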

Remarks: If $v/c \equiv \beta = 0$, then $\theta_1 = \theta_2 = 0$, so the $ct'$ and $x'$ axes coincide with the $ct$ and $x$ axes, as they should. If $\beta$ is very close to 1, then the $x'$ and $ct'$ axes are both very close to the 45° light-ray line. Note that since $\theta_1 = \theta_2$, the light-ray line bisects the $x'$ and $ct'$ axes. Therefore (as we verified above), the scales on these axes must be the same, because a light ray must satisfy $x' = ct'$. ♣

We now know what the $x'$ and $ct'$ axes look like. Given any two points in a Minkowski diagram (that is, given any two events in spacetime), we can just read off the $\Delta x$, $\Delta ct$, $\Delta x'$, and $\Delta ct'$ quantities that our two observers measure, assuming that our graph is accurate enough. Although these quantities must of course be related by the Lorentz transformations, the advantage of a Minkowski diagram is that you can actually see geometrically what’s going on.

There are very useful physical interpretations of the $ct'$ and $x'$ axes. If you stand at the origin of $S'$, then the $ct'$ axis is the “here” axis, and the $x'$ axis is the “now” axis (the line of simultaneity). That is, all events on the $ct'$ axis take place at your position (the $ct'$ axis is your worldline, after all), and all events on the $x'$ axis take place simultaneously (they all have $t' = 0$).

Example (Length contraction): For both parts of this problem, use a Minkowski diagram where the axes in frame $S$ are orthogonal.

(a) The relative speed of $S'$ and $S$ is $v$ (along the x direction). A meter stick lies along the $x'$ axis and is at rest in $S'$. If $S$ measures its length, what is the result?

(b) Now let the meter stick lie along the $x$ axis and be at rest in $S$. If $S'$ measures its length, what is the result?

Solution:

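Fig. 11.27 (worldlines of the left end and right end of the stick on the x, ct and x′, ct′ axes, with the points A, B, C, D and the tilt angle θ marked)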

(a) Without loss of generality, pick the left end of the stick to be at the origin in $S'$. Then the worldlines of the two ends are shown in Fig. 11.27. The distance AC is 1 meter in the $S'$ frame, because A and C are the endpoints of the stick at simultaneous times in the $S'$ frame; this is how a length is measured. And since one unit on the $x'$ axis has length $\sqrt{1 + \beta^2}/\sqrt{1 - \beta^2}$, this is the length on the paper of the segment AC.

How does $S$ measure the length of the stick? He writes down the x coordinates of the ends at simultaneous times (as measured by him, of course), and takes the difference. Let the time he makes the measurements be $t = 0$. Then he measures the ends to be at the points A and B.33 Now it’s time to do some geometry. We have to find the length of segment AB in Fig. 11.27, given that segment AC has length $\sqrt{1 + \beta^2}/\sqrt{1 - \beta^2}$. We know that the primed axes are tilted at an angle $\theta$, where $\tan\theta = \beta$. Therefore, $CD = (AC)\sin\theta$. And since $\angle BCD = \theta$, we have $BD = (CD)\tan\theta = (AC)\sin\theta\tan\theta$. Therefore (using $\tan\theta = \beta$),

$$\begin{aligned}
AB = AD - BD &= (AC)\cos\theta - (AC)\sin\theta\tan\theta \\
&= (AC)\cos\theta\,(1 - \tan^2\theta) \\
&= \sqrt{\frac{1 + \beta^2}{1 - \beta^2}}\;\frac{1}{\sqrt{1 + \beta^2}}\;(1 - \beta^2) \\
&= \sqrt{1 - \beta^2}.
\end{aligned} \tag{11.49}$$

Therefore, $S$ measures the meter stick to have length $\sqrt{1 - \beta^2}$, which is the standard length-contraction result.

33 If $S$ measures the ends in a dramatic fashion by, say, blowing them up, then $S'$ will see the right end blow up first (the event at B has a negative $t'$ coordinate, because it lies below the $x'$ axis), and then a little while later $S'$ will see the left end blow up (the event at A has $t' = 0$). So $S$ measures the ends at different times in the $S'$ frame. This is part of the reason why $S'$ should not be at all surprised that $S$’s measurement is smaller than one meter.

Fig. 11.28 (worldlines of the left end and right end of the stick at rest in S, with the points A, B, E and the tilt angle θ marked)

(b) The stick is now at rest in $S$, and we want to find the length that $S'$ measures. Pick the left end of the stick to be at the origin in $S$. Then the worldlines of the two ends are shown in Fig. 11.28. The distance AB is 1 meter in the $S$ frame. In measuring the length of the stick, $S'$ writes down the $x'$ coordinates of the ends at simultaneous times (as measured by him), and takes the difference. Let the time he makes the measurements be $t' = 0$. Then he measures the ends to be at the points A and E. Now we do the geometry, which is easy in this case. The length of AE is simply $1/\cos\theta = \sqrt{1 + \beta^2}$. But since one unit along the $x'$ axis has length $\sqrt{1 + \beta^2}/\sqrt{1 - \beta^2}$ on the paper, we see that AE is $\sqrt{1 - \beta^2}$ of one unit in the $S'$ frame. Therefore, $S'$ measures the meter stick to have length $\sqrt{1 - \beta^2}$, which again is the standard length-contraction result.
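
The geometry in both parts of the example can be retraced numerically. This is a sketch that follows the segment names used in the solution (AC, AD, CD, BD, AE); the function names are illustrative, and $\beta$ is assumed to satisfy $0 \le \beta < 1$.

```python
import math

def paper_unit(beta):
    """On-paper length of one primed unit, Eq. (11.46)."""
    return math.sqrt((1 + beta ** 2) / (1 - beta ** 2))

def length_part_a(beta):
    """Stick at rest in S', measured by S: on-paper length AB, in S units."""
    theta = math.atan(beta)          # tilt of the primed axes
    AC = paper_unit(beta)            # one meter along the x' axis
    AD = AC * math.cos(theta)        # projection of AC onto the x axis
    CD = AC * math.sin(theta)        # height of C above the x axis
    BD = CD * math.tan(theta)        # since angle BCD = theta
    return AD - BD

def length_part_b(beta):
    """Stick at rest in S, measured by S': AE expressed in x' units."""
    theta = math.atan(beta)
    AE_on_paper = 1.0 / math.cos(theta)
    return AE_on_paper / paper_unit(beta)

for beta in (0.0, 0.6, 0.9):
    print(beta, length_part_a(beta), length_part_b(beta), math.sqrt(1 - beta ** 2))
```

Both functions return the same number as $\sqrt{1 - \beta^2}$, even though the two constructions on the diagram look quite different.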

The analysis used in the above example also works for time intervals. The derivation of time dilation, using a Minkowski diagram, is the task of Exercise 11.62. And the derivation of the $Lv/c^2$ rear-clock-ahead result is the task of Exercise 11.63.

11.8 The Doppler effect

11.8.1 Longitudinal Doppler effect

Fig. 11.29 (a source moving directly toward you at speed v, emitting flashes that travel at speed c)

Consider a source that emits flashes at frequency $f'$ (in its own frame) while moving directly toward you at speed $v$, as shown in Fig. 11.29. With what frequency do the flashes hit your eye? In these Doppler-effect problems, you must be careful to distinguish between the time at which an event occurs in your frame, and the time at which you see the event occur. This is one of the few situations where we are concerned with the latter.

There are two effects contributing to the longitudinal Doppler effect. The first is relativistic time dilation. There is more time between the flashes in your frame, which means that they occur at a smaller frequency. The second is the everyday


Alternatively, the error can be stated as follows. The time dilation result, $\Delta t = \gamma\,\Delta t'$, rests on the assumption that the $\Delta x'$ between the two events is zero. This applies fine to two emissions of light from the source. However, the two events in question are the absorption of two light flashes by your eye (which is moving in $S'$), so $\Delta t = \gamma\,\Delta t'$ is not applicable. Instead, $\Delta t' = \gamma\,\Delta t$ is the relevant result, valid when $\Delta x = 0$. (But we still need to invoke the fact that all the relevant photons travel the same distance, which means that we don’t have to worry about any longitudinal effects.) ♣

11.9 Rapidity

11.9.1 Definition

Let us define the rapidity, φ, by

$$\tanh\phi \equiv \beta \equiv \frac{v}{c}. \tag{11.54}$$

A few properties of the hyperbolic trig functions are given in Appendix A. In particular, $\tanh\phi \equiv (e^{\phi} - e^{-\phi})/(e^{\phi} + e^{-\phi})$. The rapidity defined in Eq. (11.54) is very useful in relativity because many of our expressions take on a particularly nice form when written in terms of it. Consider, for example, the velocity-addition formula. Let $\beta_1 = \tanh\phi_1$ and $\beta_2 = \tanh\phi_2$. Then if we add $\beta_1$ and $\beta_2$ using the velocity-addition formula, Eq. (11.31), we obtain

$$\frac{\beta_1 + \beta_2}{1 + \beta_1\beta_2} = \frac{\tanh\phi_1 + \tanh\phi_2}{1 + \tanh\phi_1\tanh\phi_2} = \tanh(\phi_1 + \phi_2), \tag{11.55}$$

where we have used the addition formula for $\tanh\phi$, which you can prove by writing things in terms of the exponentials, $e^{\pm\phi}$. Therefore, while the velocities add in the strange manner of Eq. (11.31), the rapidities add by standard addition.
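
Equation (11.55) says that adding speeds relativistically is the same as adding rapidities. A minimal numerical sketch, with speeds in units of c and illustrative function names:

```python
import math

def add_speeds(b1, b2):
    """Velocity-addition formula, Eq. (11.31), with speeds in units of c."""
    return (b1 + b2) / (1 + b1 * b2)

def add_via_rapidity(b1, b2):
    """Convert each speed to a rapidity, add the rapidities, convert back."""
    phi1 = math.atanh(b1)     # tanh(phi) = beta, Eq. (11.54)
    phi2 = math.atanh(b2)
    return math.tanh(phi1 + phi2)

# The two routes agree, e.g. for two speeds of 0.9c.
print(add_speeds(0.9, 0.9), add_via_rapidity(0.9, 0.9))

# Rapidity is unbounded even though beta stays below 1.
for beta in (0.5, 0.9, 0.999):
    print(beta, math.atanh(beta))
```

For the pair (0.9, 0.9) both routes give 1.8/1.81, the combination of two 0.9c speeds under Eq. (11.31).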

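The Lorentz transformations also take a nice form when written in terms of the rapidity. Our friendly $\gamma$ factor can be written as

$$\gamma \equiv \frac{1}{\sqrt{1 - \beta^2}} = \frac{1}{\sqrt{1 - \tanh^2\phi}} = \cosh\phi. \tag{11.56}$$

Also,

$$\gamma\beta \equiv \frac{\beta}{\sqrt{1 - \beta^2}} = \frac{\tanh\phi}{\sqrt{1 - \tanh^2\phi}} = \sinh\phi. \tag{11.57}$$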

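Therefore, the Lorentz transformations in matrix form, Eqs. (11.22), become

$$\begin{pmatrix} x \\ ct \end{pmatrix} = \begin{pmatrix} \cosh\phi & \sinh\phi \\ \sinh\phi & \cosh\phi \end{pmatrix} \begin{pmatrix} x' \\ ct' \end{pmatrix}. \tag{11.58}$$

This transformation looks similar to a rotation in a plane, which is given by

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}, \tag{11.59}$$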


except that we now have hyperbolic trig functions instead of trig functions. The fact that the interval $s^2 \equiv c^2t^2 - x^2$ does not depend on the frame is clear from Eq. (11.58), because the cross terms in the squares cancel, and $\cosh^2\phi - \sinh^2\phi = 1$. (Compare with the invariance of $r^2 \equiv x^2 + y^2$ for rotations in a plane, where the cross terms from Eq. (11.59) likewise cancel, and $\cos^2\theta + \sin^2\theta = 1$.)
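
The matrix form makes two facts easy to check numerically: a boost preserves $c^2t^2 - x^2$, and two successive boosts along x combine into a single boost whose rapidity is the sum. The sketch below uses plain nested lists for the 2×2 matrices; the helper names are illustrative, and vectors are ordered as $(x, ct)$ to match Eq. (11.58).

```python
import math

def boost_matrix(phi):
    """The 2x2 matrix of Eq. (11.58), acting on the column vector (x', ct')."""
    return [[math.cosh(phi), math.sinh(phi)],
            [math.sinh(phi), math.cosh(phi)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, vec):
    """Apply a 2x2 matrix to a 2-component vector (x, ct)."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]

# Invariance of (ct)^2 - x^2 under a boost.
x, ct = apply(boost_matrix(0.7), [3.0, 5.0])
print(ct ** 2 - x ** 2)

# Composition: boost(0.3) followed by boost(0.9) equals boost(1.2).
combined = matmul(boost_matrix(0.3), boost_matrix(0.9))
print(combined)
print(boost_matrix(1.2))
```

The composition rule is just the addition formulas for cosh and sinh, which is the matrix version of the statement that rapidities add.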

The quantities associated with a Minkowski diagram also take a nice form when written in terms of the rapidity. The angle between the $S$ and $S'$ axes satisfies

$$\tan\theta = \beta = \tanh\phi. \tag{11.60}$$

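And the size of one unit on the $x'$ or $ct'$ axes is, from Eq. (11.46),

$$\sqrt{\frac{1 + \beta^2}{1 - \beta^2}} = \sqrt{\frac{1 + \tanh^2\phi}{1 - \tanh^2\phi}} = \sqrt{\cosh^2\phi + \sinh^2\phi} = \sqrt{\cosh 2\phi}. \tag{11.61}$$

For large $\phi$, this is approximately equal to $e^{\phi}/\sqrt{2}$.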
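
A quick numerical look at Eq. (11.61) and its large-$\phi$ limit. This is a sketch with illustrative function names; the particular values of $\phi$ are arbitrary.

```python
import math

def unit_size_from_beta(beta):
    """Unit size of the primed axes in terms of beta, Eq. (11.46)."""
    return math.sqrt((1 + beta ** 2) / (1 - beta ** 2))

def unit_size_from_rapidity(phi):
    """The same quantity in terms of the rapidity, Eq. (11.61)."""
    return math.sqrt(math.cosh(2 * phi))

# The two expressions agree when beta = tanh(phi).
phi = 0.8
print(unit_size_from_beta(math.tanh(phi)), unit_size_from_rapidity(phi))

# For large phi, the ratio to e^phi / sqrt(2) tends to 1.
for phi in (1.0, 3.0, 6.0):
    approx = math.exp(phi) / math.sqrt(2)
    print(phi, unit_size_from_rapidity(phi) / approx)
```

The approximation follows from $\cosh 2\phi = (e^{2\phi} + e^{-2\phi})/2$, where the second term becomes negligible for large $\phi$.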

11.9.2 Physical meaning

The fact that the rapidity makes many of our formulas look nice and pretty is reason enough to consider it. But in addition, it turns out to have a very meaningful physical interpretation. Consider the following setup. A spaceship is initially at rest in the lab frame. At a given instant, it starts to accelerate. Let $a$ be the proper acceleration, which is defined as follows. Let $t$ be the time coordinate in the spaceship’s frame.36 If the proper acceleration is $a$, then at time $t + dt$, the spaceship is moving at speed $a\,dt$ relative to the frame it was in at time $t$. An equivalent definition is that the astronaut feels a force of $ma$ applied to his body by the spaceship. If he is standing on a scale, then the scale shows a reading of $F = ma$.

What is the relative speed of the spaceship and the lab frame at (the spaceship’s) time $t$? We can answer this question by considering two nearby times and using the velocity-addition formula, Eq. (11.31). From the definition of $a$, Eq. (11.31) gives, with $v_1 \equiv a\,dt$ and $v_2 \equiv v(t)$,

$$v(t + dt) = \frac{v(t) + a\,dt}{1 + v(t)\,a\,dt/c^2}. \tag{11.62}$$

36 This frame is of course changing as time goes by, because the spaceship is accelerating. The time $t$ is simply the spaceship’s proper time. Normally, we would denote this by $t'$, but we don’t want to have to keep writing the primes over and over in the following calculation.