Telescopes and Optics II
Observational Astronomy 2018, Part 4
Prof. S.C. Trager
Fermat’s principle
Optics using Fermat’s principle
The path a (light) ray takes is such that the time of travel between two fixed points is stationary with respect to small changes in that path
In other words, the travel time of a ray is infinitesimally close to that of a neighboring path
This is a generalization of the “principle of least time” or “principle of least action”
To be precise, given a surface lying between two points P0 and P1, consider two paths from P0 to P1
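If the travel time from P0 to P1 is τ, then
$$ \frac{d\tau}{dy} = \frac{d\tau}{dz} = 0 $$
where y and z are the coordinates of the intersection of the path with the surface
[Figure: two neighboring paths from P0 to P1 crossing the intervening surface at (y,z)]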
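As a minimal numerical sketch of this stationarity condition (an illustration, not part of the original slides), take a flat surface at z=0 separating two homogeneous media. The indices N1 and N2, the endpoints P0 and P1, and all function names are assumptions chosen for the example. The code finds the crossing point y at which the travel time is stationary and checks that it satisfies n1 sin θ1 = n2 sin θ2 (Snell's law).

```python
import math

# Assumed setup: flat surface at z = 0 between two homogeneous media.
N1, N2 = 1.0, 1.5        # illustrative refractive indices (air, a generic glass)
P0 = (0.0, -1.0)         # (y, z) of the start point, in medium N1
P1 = (1.0, 1.0)          # (y, z) of the end point, in medium N2


def travel_time(y):
    """Travel time (in units where c = 1) via the crossing point (y, 0)."""
    l1 = math.hypot(y - P0[0], P0[1])
    l2 = math.hypot(P1[0] - y, P1[1])
    return N1 * l1 + N2 * l2


def d_travel_time_dy(y):
    """Analytic derivative of travel_time with respect to y."""
    l1 = math.hypot(y - P0[0], P0[1])
    l2 = math.hypot(P1[0] - y, P1[1])
    return N1 * (y - P0[0]) / l1 - N2 * (P1[0] - y) / l2


def stationary_crossing(lo, hi, tol=1e-12):
    """Bisect on the derivative, which increases monotonically with y."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_travel_time_dy(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


y_star = stationary_crossing(P0[0], P1[0])

# Sines of the angles of incidence and refraction, measured from the z-axis.
sin_theta1 = (y_star - P0[0]) / math.hypot(y_star - P0[0], P0[1])
sin_theta2 = (P1[0] - y_star) / math.hypot(P1[0] - y_star, P1[1])

print(f"crossing point y = {y_star:.6f}")
print(f"n1 sin(theta1)   = {N1 * sin_theta1:.6f}")
print(f"n2 sin(theta2)   = {N2 * sin_theta2:.6f}")
```

Bisection is used here only because the derivative is monotonic for this simple geometry; any one-dimensional root finder would do.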
If we replace the words “time of travel” from Fermat’s principle above with the words “optical path length (OPL)”, c dt, we see that Fermat’s principle is recovered
The OPL is defined as
$$ d(\mathrm{OPL}) = c\,dt = \frac{c}{v}\,v\,dt = n\,ds $$
where dt is an infinitesimal travel time, v is the speed of light in a medium of index of refraction n, and ds is the infinitesimal geometrical path length
Then
$$ \mathrm{OPL} = c\int dt = \int n\,ds $$
Note that n can be a function of position here, so consider n(y,z) and
$$ ds = \sqrt{dy^2 + dz^2} $$
Letting y′=dy/dz, Fermat’s principle is δ(OPL)=0 or
$$ \delta\int_{P_0}^{P_1} n(y,z)\sqrt{1 + y'^2}\,dz = 0 $$
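If we call
$$ F(y, y', z) = n(y,z)\sqrt{1 + y'^2}, $$
then
$$ \delta\int_{P_0}^{P_1} F(y, y', z)\,dz = \int_{P_0}^{P_1} \delta F(y, y', z)\,dz = 0, $$
where
$$ \delta F = \frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y' = \frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\frac{d}{dz}(\delta y) $$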
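Substituting and integrating by parts, we have
$$ \int_{P_0}^{P_1} \frac{\partial F}{\partial y}\,\delta y\,dz + \left.\frac{\partial F}{\partial y'}\,\delta y\right|_{P_0}^{P_1} - \int_{P_0}^{P_1} \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right)\delta y\,dz = 0 $$
where the second term vanishes because δy=0 at P0 and P1, and then
$$ \int_{P_0}^{P_1} \left[\frac{\partial F}{\partial y} - \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right)\right]\delta y\,dz = 0 $$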
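This must vanish for arbitrary δy, so
$$ \frac{\partial F}{\partial y} - \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right) = 0 $$
Substituting
$$ F(y, y', z) = n(y,z)\sqrt{1 + y'^2}, $$
we have (after some algebra)
$$ \sqrt{1 + y'^2}\,\frac{\partial n}{\partial y} - n\,\frac{d}{dz}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) - \frac{y'}{\sqrt{1 + y'^2}}\,\frac{dn}{dz} = 0 $$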
Now, this looks ugly (and it is), but the presence of terms like $\sqrt{1 + y'^2}$ suggests that trigonometric solutions would be useful:
$$ \tan\alpha = \frac{dy}{dz} = y', \quad \sin\alpha = \frac{dy}{ds} = \frac{y'}{\sqrt{1 + y'^2}}, $$
$$ \cos\alpha = \frac{dz}{ds} = \frac{1}{\sqrt{1 + y'^2}}, \quad \frac{d(\sin\alpha)}{dz} = \cos\alpha\,\frac{d\alpha}{dz} $$
Here α is the angle made by the tangent of the path with the z-axis
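Then noting that
$$ \frac{dn}{dz} = \frac{\partial n}{\partial z} + y'\,\frac{\partial n}{\partial y}, $$
we can write
$$ \cos\alpha\,\frac{\partial n}{\partial y} - \sin\alpha\,\frac{\partial n}{\partial z} - n\cos\alpha\,\frac{d\alpha}{dz} = 0 $$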
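Finally, note that we can write the curvature K of a path as
$$ K = \frac{d\alpha}{ds} = \frac{d\alpha}{dz}\frac{dz}{ds} = \cos\alpha\,\frac{d\alpha}{dz} $$
and therefore
$$ nK = n\cos\alpha\,\frac{d\alpha}{dz} = \cos\alpha\,\frac{\partial n}{\partial y} - \sin\alpha\,\frac{\partial n}{\partial z} \qquad (15) $$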
This is the equation for the local curvature of a light ray subject to Fermat’s principle in a medium in which n is a smoothly varying function of position (in the y,z-plane)
As a special case, consider n=constant. Then K=0, and the path of a light ray in a homogeneous medium is a straight line — as expected
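A short sketch of Eq. (15) solved for K may help make the special case concrete. The function name and the numerical gradient below are hypothetical, chosen only for illustration: with both gradients zero the curvature vanishes, and with n increasing along z the curvature is negative for α>0, so the ray turns towards the z-axis.

```python
import math


def ray_curvature(n, dn_dy, dn_dz, alpha):
    """Local ray curvature K from Eq. (15), solved for K.

    alpha is the angle (radians) between the ray tangent and the z-axis.
    """
    return (math.cos(alpha) * dn_dy - math.sin(alpha) * dn_dz) / n


alpha = math.radians(30.0)

# Homogeneous medium: both gradients vanish, so the ray is straight.
k_uniform = ray_curvature(1.5, 0.0, 0.0, alpha)

# Illustrative gradient along z only (hypothetical value, per unit length).
k_layered = ray_curvature(1.0003, 0.0, 1.0e-7, alpha)

print("uniform medium K =", k_uniform)
print("layered medium K =", k_layered)
```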
A useful example: atmospheric refraction
Assume the atmosphere is a flat, layered medium with n=n(z) only, where the z-axis points to the center of the Earth; that is, the curvature of the atmosphere is taken to be negligible
Then Eq. (15) becomes
$$ nK = n\cos\alpha\,\frac{d\alpha}{dz} = -\sin\alpha\,\frac{\partial n}{\partial z} $$
[Figure: plane-parallel atmosphere with the z-axis pointing down; n=1.00000 at the top of the atmosphere and n=1.00029 at the ground]
Because the change in n from the top of the atmosphere to the surface is very small, the path of a ray from a star will not be significantly deviated if α is not close to 90º
Then we can integrate the expression above (taking n≈1 and α≈α0 along the path) to find
$$ \delta\alpha = -\tan\alpha_0\,\delta n $$
where α0 is the zenith angle at the top of the atmosphere
For a ray passing down through the atmosphere, δn>0 and so δα<0
Thus the ray is bent towards the z-axis
If α0=45º, then |δα|=0.00029 rad≈60″
Note that n=n(λ), so different wavelengths get bent by different amounts
Very important for wide-band spectroscopy!
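The plane-parallel estimate δα = −tan α0 δn is easy to evaluate for a few zenith angles. The sketch below uses δn=0.00029 from the figure values (1.00029 at the ground, 1.00000 at the top); the function name and the chosen angles are illustrative assumptions, and the estimate should not be trusted close to the horizon.

```python
import math

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0
DELTA_N = 0.00029  # n(ground) - n(top of atmosphere), from the figure values


def refraction_arcsec(zenith_deg, delta_n=DELTA_N):
    """Plane-parallel estimate delta_alpha = -tan(alpha0) * delta_n, in arcsec."""
    return -math.tan(math.radians(zenith_deg)) * delta_n * RAD_TO_ARCSEC


for zenith in (0.0, 30.0, 45.0, 60.0):
    print(f"alpha0 = {zenith:4.1f} deg -> delta_alpha = {refraction_arcsec(zenith):8.2f} arcsec")
```

Passing a wavelength-dependent `delta_n` to the same function gives the differential refraction between two wavelengths.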
Another useful example is a dispersing prism with n=n(λ) in air (n=1)
There is some wavelength whose rays follow paths parallel to the prism base
For these rays, the diagram is symmetric about the prism’s vertical bisector, so that
$$ s_1 = s_2 = s, \quad \alpha_1 = \alpha_2 = \alpha, \quad a_1 = a_2 = a $$
[Figure: dispersing prism with apex angle A, base length t, index n(λ), and deviation angle θ(λ); labels L, s1, s2, a1, a2, α1, α2]
The OPL of the bottom ray (at the prism’s base) is just nt
The OPL of the top ray (at the prism’s vertex) is
$$ 2L\cos\alpha $$
Fermat’s principle says that these two OPLs must be the same, so
$$ nt = 2L\cos\alpha $$
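We’re interested in the change of θ with wavelength, so let’s differentiate the OPL equation above:
$$ t\,\frac{dn}{d\lambda} = -2L\sin\alpha\,\frac{d\alpha}{d\lambda} = -2L\sin\alpha\,\frac{d\alpha}{d\theta}\frac{d\theta}{d\lambda} $$
From the figure, we see that
$$ L\sin\alpha = a \quad\text{and}\quad \theta = \pi - A - 2\alpha, \quad\text{so}\quad \frac{d\alpha}{d\theta} = -\frac{1}{2} $$
Thus
$$ \frac{d\theta}{d\lambda} = \frac{t}{a}\,\frac{dn}{d\lambda} $$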
For most optical glasses, we can write (approximately)
$$ n(\lambda) = C_0 + \frac{C_1}{\lambda^2} $$
where C0 and C1 are constants
Therefore
$$ \frac{d\theta}{d\lambda} = -\frac{2t}{a}\,\frac{C_1}{\lambda^3} $$
The negative sign means that θ decreases as λ increases, so that blue light is deviated more than red light
The angular dispersion dθ/dλ is larger for shorter wavelengths
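To see how strongly the λ⁻³ dependence favors the blue, here is a small sketch that evaluates dθ/dλ = (t/a) dn/dλ for the two-term dispersion formula. The coefficients C0 and C1 and the ratio t/a are hypothetical, roughly crown-glass-like values picked for illustration; they are not taken from the slides.

```python
# Hypothetical coefficients for n(lambda) = C0 + C1 / lambda^2 (lambda in microns).
C0 = 1.5046       # dimensionless, assumed
C1 = 0.0042       # micron^2, assumed
T_OVER_A = 2.0    # assumed ratio t/a for the prism geometry


def index(lam_um):
    """Refractive index at wavelength lam_um (microns)."""
    return C0 + C1 / lam_um**2


def dn_dlambda(lam_um):
    """Analytic derivative dn/dlambda, per micron."""
    return -2.0 * C1 / lam_um**3


def angular_dispersion(lam_um, t_over_a=T_OVER_A):
    """d(theta)/d(lambda) = (t/a) dn/dlambda, in radians per micron."""
    return t_over_a * dn_dlambda(lam_um)


for lam in (0.4, 0.55, 0.7):
    print(f"lambda = {lam:.2f} um: n = {index(lam):.4f}, "
          f"dtheta/dlambda = {angular_dispersion(lam):.4f} rad/um")
```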
Let’s return to the case of reflecting mirrors
What shape must a mirror have to satisfy Fermat’s principle?
Let’s consider three cases:
A concave mirror with one conjugate at ∞
A concave mirror with both conjugates finite
A convex mirror with both conjugates finite
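Case 1: concave mirror, one conjugate at infinity
[Figure: concave mirror with vertex O, focus B′, and center of curvature C on the z-axis; a ray at height y strikes the mirror at P, with f, l, and Δ as labeled]
For convenience, let f, l, and Δ be positive (note that this violates our sign convention!)
Applying Fermat’s principle to a ray on the optical axis and one at height y, we see that
$$ 2f = l + (f - \Delta) \quad\text{and so}\quad l = f + \Delta $$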
We also know (from Pythagoras) that
$$ l^2 = y^2 + (f - \Delta)^2 $$
Eliminating l from these two equations we see that
$$ y^2 = 4f\Delta = -4fz $$
This is the equation of a parabola with a vertex at (0,0)
Using Eq. (7) from the last set of slides and applying the sign convention we have
$$ y^2 = 2Rz $$
where R is the radius of curvature and both R and z are negative
In three-space, we replace y² with x²+y², and we find that a paraboloid satisfies Fermat’s principle in this case
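The equal-path condition of Case 1 can be checked numerically. The sketch below uses the positive-quantity convention of the derivation (f, Δ, l > 0) and measures each path from the plane through the focus to the mirror and back to the focus. The focal length, the ray heights, and the function names are assumptions for illustration; the sphere with radius 2f is included only as a comparison surface with the same paraxial focus.

```python
import math

F = 1.0  # focal length in arbitrary units (positive, as in Case 1)


def sag_parabola(y, f=F):
    """Depth Delta of the parabola y^2 = 4 f Delta at height y."""
    return y * y / (4.0 * f)


def sag_sphere(y, f=F):
    """Depth of a sphere of radius 2f (same paraxial focus) at height y."""
    r = 2.0 * f
    return r - math.sqrt(r * r - y * y)


def path_to_focus(y, sag, f=F):
    """Path length: focal plane -> mirror at height y -> focus."""
    depth = sag(y, f)
    to_mirror = f - depth                 # the (f - Delta) leg
    to_focus = math.hypot(y, f - depth)   # the leg l, from Pythagoras
    return to_mirror + to_focus


for y in (0.1, 0.3, 0.5):
    print(f"y = {y:.1f}: parabola {path_to_focus(y, sag_parabola):.6f}, "
          f"sphere {path_to_focus(y, sag_sphere):.6f}, on-axis {2.0 * F:.6f}")
```

For the parabola every path equals 2f to rounding error, while the sphere's paths drift away from 2f as y grows.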
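Case 2: concave mirror, both conjugates finite
[Figure: concave mirror with conjugate points B and B′ on the z-axis at distances s and s′ from the vertex O; labels C, P, R, l, Δ, y]
In this case we find (using the same analysis)
$$ y^2 - 2z\,\frac{b^2}{a} + z^2\,\frac{b^2}{a^2} = 0 \qquad (16) $$
where
$$ 2a = s + s' \quad\text{and}\quad b^2 = ss' $$
This is the equation for an ellipse with center (0,a)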
Case 3: convex mirror, both conjugates finite
In this case we find (using the same analysis)
$$ y^2 + 2z\,\frac{b^2}{a} - z^2\,\frac{b^2}{a^2} = 0 $$
where
$$ 2a = s + s' \quad\text{and}\quad b^2 = -ss' $$
This is the equation for a hyperbola with center (0,a) and a vertex at (0,0)
All of these forms can be written in a simple way:
First, recall that Eq. (7) from the last set of slides says
$$ \frac{1}{s} + \frac{1}{s'} = \frac{2}{R} \quad\text{or}\quad \frac{ss'}{s + s'} = \frac{R}{2} = \frac{b^2}{2a} $$
(in the paraxial approximation)
Next, consider an ellipse with eccentricity e=c/a where c is the distance from one focus to the center of the ellipse:
$$ c^2 = a^2 - b^2 $$
For our elliptical mirror above, then, we can write
$$ e^2 = \left(\frac{s - s'}{s + s'}\right)^2 \quad\text{and}\quad 1 - e^2 = \frac{4ss'}{(s + s')^2} = \frac{b^2}{a^2} $$
We can then write Eq. (16) as
$$ y^2 - 2Rz + (1 - e^2)z^2 = 0 \qquad (17) $$
Although we’ve derived this equation for an ellipse with –1<e<1, it’s actually the correct formula for any conic section
We can write this equation for a surface of revolution as
$$ \rho^2 - 2Rz + (1 + K)z^2 = 0 $$
where K is Schwarzschild’s conic constant (K = −e²) and ρ²=x²+y². Then

conic section        e²        K
oblate ellipsoid     <0        >0
sphere               0         0
prolate ellipsoid    0 to 1    –1 to 0
paraboloid           1         –1
hyperboloid          >1        <–1
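In practice one usually wants the surface height z at a given ρ. A minimal sketch, assuming the root of the conic equation that passes through the vertex (z=0 at ρ=0) is the one wanted; the function names and the sample values of R, ρ, and K are hypothetical.

```python
import math


def conic_sag(rho, R, K):
    """Root of rho^2 - 2 R z + (1 + K) z^2 = 0 that passes through the vertex."""
    root = math.sqrt(1.0 - (1.0 + K) * rho * rho / (R * R))
    return (rho * rho / R) / (1.0 + root)


def residual(rho, R, K):
    """Left-hand side of the conic equation evaluated at the computed sag."""
    z = conic_sag(rho, R, K)
    return rho * rho - 2.0 * R * z + (1.0 + K) * z * z


R = -2.0    # hypothetical radius of curvature (negative, per the sign convention)
rho = 0.3   # hypothetical radial distance from the axis

for K in (0.5, 0.0, -0.5, -1.0, -2.0):
    print(f"K = {K:5.1f}: z = {conic_sag(rho, R, K):.8f}, "
          f"residual = {residual(rho, R, K):.1e}")
```

Writing the root in this rationalized form avoids dividing by 1+K, so the paraboloid (K=−1) needs no special case and reduces to z = ρ²/(2R).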
Note that the above calculation suggests that for both “classic” Cassegrain and Gregorian telescopes, the primary should be a paraboloid, while the secondary should be
an ellipsoid for a Gregorian
a hyperboloid for a Cassegrain
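Resolution
Consider a perfect optical system, one that satisfies Fermat’s principle
Light from two neighboring sources (say, stars A and B) separated by angle θ fills the aperture D
[Figure: rays from stars A and B, separated by angle θ, crossing an aperture of diameter D and forming images A′ and B′; Δ is the path difference across the aperture; labels L, z]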
The wave theory of light says that two image points cannot be separated (resolved) if the difference in light travel time to them from opposite sides of an aperture is less than ~one period of the wave
In other words, the points can’t be resolved if the OPL difference between the rays is less than ~one wavelength
The OPL difference in the figure is Δ, so we require Δ⪆λ
Since Δ≈Dθ for small θ, this then implies that
$$ \theta_{\min} \sim \lambda/D $$
The exact result from diffraction theory for a circular aperture is
$$ \theta_{\min} = 1.22\,\lambda/D $$
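A quick way to get a feel for θmin = 1.22 λ/D is to evaluate it in arcseconds. In the sketch below the wavelength (550 nm) and the aperture diameters are illustrative inputs chosen for the example, and the function name is hypothetical.

```python
import math

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def diffraction_limit_arcsec(wavelength_m, diameter_m):
    """theta_min = 1.22 * lambda / D, converted from radians to arcseconds."""
    return 1.22 * wavelength_m / diameter_m * RAD_TO_ARCSEC


WAVELENGTH = 550e-9  # metres; illustrative visible wavelength

for diameter in (0.1, 1.0, 2.4, 8.0):
    theta = diffraction_limit_arcsec(WAVELENGTH, diameter)
    print(f"D = {diameter:4.1f} m -> theta_min = {theta:.4f} arcsec")
```

Doubling D halves θmin, while moving to longer wavelengths degrades the resolution in proportion to λ.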
Aberrations
Departures from Gaussian optics are called aberrations
They occur in imperfect optical systems
Aberrations come in two main types:
chromatic aberrations due to wavelength variation in the index of refraction
monochromatic aberrations, which are independent of wavelength
Monochromatic aberrations come in two types:
those that deteriorate the image
those that deform the image
These are inherent to each optical element and can be corrected in multiple-element systems (hopefully)
Since Gaussian optics are based on the paraxial approximation, where sin θ≈θ, this will clearly break down when considering rays that are either
at a large distance y from the optical axis
like the marginal rays
or at a large angle θ to the optical axis
So let’s expand sin θ in a Taylor series:
$$ \sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots $$
Taking the first (paraxial) and second terms, we have third-order (Seidel) aberration theory
Higher-order (Zernike) terms can of course still be present
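To see where the paraxial approximation starts to fail, the sketch below compares sin θ with its one-term and two-term truncations. The sample angles and function names are illustrative choices.

```python
import math


def paraxial(theta):
    """First term of the series: sin(theta) ~ theta."""
    return theta


def third_order(theta):
    """First two terms of the series: sin(theta) ~ theta - theta^3 / 3!."""
    return theta - theta**3 / 6.0


for deg in (1, 5, 10, 20, 30):
    t = math.radians(deg)
    exact = math.sin(t)
    print(f"theta = {deg:2d} deg: paraxial error = {paraxial(t) - exact:.2e}, "
          f"third-order error = {third_order(t) - exact:.2e}")
```

The paraxial error grows as θ³, which is why the leading corrections are called third-order aberrations.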
In third-order theory, there are five primary aberrations, which scale as
$$ y^m\theta^n $$
where m+n=3
For a telescope of radius R∝yF, we call the aberrations

aberration            type           scaling for a ray   scaling for a telescope
spherical aberration  deterioration  (y/R)³              F⁻³
coma                  deterioration  θ(y/R)²             θF⁻²
astigmatism           deterioration  θ²(y/R)             θ²F⁻¹
field curvature       deformation    θ²(y/R)             θ²F⁻¹
distortion            deformation    θ³                  θ³
Note that the last four aberrations are dependent on θ
these are off-axis aberrations
Spherical aberration is independent of θ
occurs even for on-axis sources!
Spherical aberration
This aberration occurs when rays do not come to a focus at the same point on the optical axis
a point source makes a blurred disk
This is the spherical aberration of the Hubble Space Telescope before (left) and after (right) correction
image on left covers ~2″; image on right covers ≈0.05″
This was due to a 1.3mm washer inserted during polishing of the primary mirror, which caused the margins of the primary to be too flat by ~λ/2!
Coma
Rays from an off-axis source converge at different points on the focal plane
This occurs off-axis, and is asymmetric because rays from the opposite side of the optical axis land on the same side of the axis
Coma from a parabolic mirror
This is a picture of a far off-axis part of a field with significant coma
note the “head-tail” structure like a comet
Coma also appears when optical elements are misaligned
Astigmatism
Astigmatism is caused by rays in the horizontal plane and the vertical plane coming to different foci
The best focus is then the circle of least confusion
Astigmatism from a parabolic mirror
This is the image of a star in an astigmatic system
Field curvature
Outside of the paraxial region the image plane is curved
the mapping from object to image follows spherical surfaces
You can correct this by adding lenses or mirrors to flatten the field
If you can’t remove the curvature, you need to have a curved detector (or curved slit in a spectrograph)
Distortion
This is a radial change in plate scale with field angle
Does not change the focus but deforms the images
Can be calibrated out, but it’s annoying!
Chromatic aberrations
Caused by rays of light with different wavelengths coming to different foci
Note the large blue halos and smaller yellow and red halos
this is chromatic aberration
You can correct chromatic aberration by creating achromatic doublets (or triplets) by combining positive and negative lenses
Doublets bring light from two wavelengths to a common focus
Triplets bring light from three wavelengths to a common focus, which is generally sufficient for broad-band sources without causing undue aberrations
Aberration compensation
Achromatic doublets and triplets demonstrate a general rule for compensating aberrations:
“one can generally correct n primary aberrations with n reasonably separated powered optical elements”
for example, we correct spherical aberration by using a single mirror shaped as a paraboloid instead of a sphere
But with a single mirror, the other aberrations (coma, astigmatism, field curvature, distortion) remain
We correct these by adding more (and more) elements
For example, by choosing the correct shapes (the conic constants Ki) for the primary and secondary, we can correct two aberrations
Schwarzschild (1905) and Ritchey & Chrétien (1910) showed that two hyperboloidal mirrors (with the focus of the hyperboloid of the primary shared with the focus of the hyperboloid of the secondary) form an aplanatic telescope
an aplanat is an optical system free of both spherical aberration and coma
A Ritchey-Chrétien (RC) telescope is then limited by astigmatism, not coma like a “classical” Cassegrain
HST is an RC telescope
Non-spherical surfaces are difficult to manufacture!
But paraboloids can be made by spinning the liquid glass — so many large telescopes are now constructed as “classical” Cassegrain or Gregorian telescopes
The WEAVE PFC
PFC = prime focus corrector
6 lenses!