
Astrometry: Information Limits and Estimators

Jorge F. Silva, René Mendez, Marcos Orchard, Rodrigo Lobos, Alex Echeverria, Sebastian Espinosa

Information Theory


Channel Coding Theorem [Shannon, 1948]

$X(u) \sim P_X \;\longrightarrow\; \text{channel } P_{Y|X} \;\longrightarrow\; Y(u) \sim P_Y$

$$C = \sup_{P_X} I(X;Y)$$

Astrometry: An Information Flow Problem

Source position $x_c$ $\;\longrightarrow\;$ Data: $I_1, .., I_n \sim p_{x_c}$

• What are the “fundamental bounds” for astrometry?

• How do the bounds depend on key aspects of the problem?

• quality of the site ($\sigma$ of the PSF)

• attributes of the objects (F)

• quality of the instrument (RON, D, etc.)

• observation setting (exposure time, etc.)

• Are there data-processing schemes that “achieve” the limits?

Content

• Astrometry: A Statistical Decision Task

• The Cramér-Rao (CR) bound: Information limits

• Practical Estimators for Astrometry

• LS and ML estimators

• The Role of Priors: the Bayesian Cramér-Rao Bound

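Astrometry: The Decision Setting

✴ Estimating the position $x_c$ of a point source with flux $\tilde{F}$ and profile

$$\tilde{F}_{x_c,\tilde{F}}(x) = \tilde{F} \cdot \phi(x - x_c, \sigma),$$

the final expected pixel intensity is given by:

$$\lambda_k(x_c, \tilde{F}) \equiv E\{I_k\} = \tilde{F} \cdot g_k(x_c) + \tilde{B}_k, \quad \forall k \in \mathbb{Z},$$

where

$$g_k(x_c) \equiv \underbrace{\int_{x_k - \Delta x/2}^{x_k + \Delta x/2}}_{\text{quantization}} \phi(x - x_c, \sigma)\, dx, \quad \forall k \in \mathbb{Z}.$$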

Astrometry as an Inference Problem

Given the expected intensities $\{\lambda_k(x_c, \tilde{F}),\ k = 1, .., n\}$, randomness enters through the observations

$$I^n = (I_1, .., I_n) \sim p_{x_c};$$

more precisely,

$$\underbrace{P(I^n = i^n = (i_1, ..., i_n))}_{\equiv\, p_{x_c}(i^n)} = p_{\lambda_1(x_c,\tilde{F})}(i_1) \cdot p_{\lambda_2(x_c,\tilde{F})}(i_2) \cdots p_{\lambda_n(x_c,\tilde{F})}(i_n),$$

where $p_\lambda(x) = \dfrac{e^{-\lambda} \cdot \lambda^x}{x!}$ (Poisson distribution).
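The observation model above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes a Gaussian PSF (introduced later in the talk), a pixel grid centred at zero, and parameter values borrowed from the Figure 2 caption (F = 5000 ADU, FWHM = 1″, Δx = 0.2″). The function names `pixel_fractions` and `simulate_counts` are hypothetical.

```python
import math
import numpy as np

def pixel_fractions(x_c, sigma, dx, n):
    """g_k(x_c): fraction of a unit-flux Gaussian PSF collected by each of n pixels."""
    # Assumption: pixel centres x_k lie on a uniform grid centred at 0.
    centres = (np.arange(n) - (n - 1) / 2) * dx
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - x_c) / (sigma * math.sqrt(2))))
    return np.array([cdf(xk + dx / 2) - cdf(xk - dx / 2) for xk in centres])

def simulate_counts(x_c, flux, background, sigma, dx, n, rng):
    """Draw I_k ~ Poisson(lambda_k) with lambda_k = F * g_k(x_c) + B."""
    lam = flux * pixel_fractions(x_c, sigma, dx, n) + background
    return rng.poisson(lam), lam

sigma = 1.0 / (2 * math.sqrt(2 * math.log(2)))  # FWHM = 1 arcsec
rng = np.random.default_rng(0)
counts, lam = simulate_counts(x_c=0.03, flux=5000.0, background=412.5,
                              sigma=sigma, dx=0.2, n=51, rng=rng)
print(counts[20:31])  # counts in the pixels around the source
```

The independent (but not identically distributed) Poisson draws mirror the product form of $p_{x_c}(i^n)$.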

Astrometric Cramér-Rao Bound

Echeverria et al.: Bayesian Cramér-Rao bound in Astrometry

(sources of uncertainty) that affect all measurements. The first is an additive background noise, which captures the photon emissions of the open (diffuse) sky and the noise of the instrument itself (the read-out noise and dark-current, Howell (2006); Tyson (1986)) modeled by $\tilde{B}_k$ in Eq. (2). The second is an intrinsic uncertainty between the aggregated intensity (the nominal object brightness plus the background) and the actual detection and measurement process by the PID, which is modeled by independent random variables that follow a Poisson probability law. The third is the spatial quantization process associated with the pixel-resolution of the PID as specified in Eqs. (2) and (3). Including these three effects, we have a countable collection of random variables (fluxes or counts measurable by the PID) $\{I_k : k \in \mathbb{Z}\}$, where the $I_k \sim \mathrm{Poisson}(\lambda_k(x_c, \tilde{F}))$ are driven by the expected intensity at each pixel element $k$. The underlying (expected) pixel intensity is given by

$$\lambda_k(x_c, \tilde{F}) \equiv E\{I_k\} = \underbrace{\tilde{F} \cdot g_k(x_c)}_{\equiv\, \tilde{F}_k(x_c,\tilde{F})} + \tilde{B}_k, \quad \forall k \in \mathbb{Z} \qquad (2)$$

and

$$g_k(x_c) \equiv \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \phi(x - x_c, \sigma)\, dx, \quad \forall k \in \mathbb{Z}, \qquad (3)$$

where $E$ is the expectation value of the argument, while $\{x_k : k \in \mathbb{Z}\}$ denotes the standard uniform quantization of the real line-array with pixel resolution $x_{k+1} - x_k = \Delta x > 0$ for all $k \in \mathbb{Z}$. In practice, the PID has a finite collection of $n$ measuring elements (or pixels), so a basic assumption here is that we have a good coverage of the object of interest, in the sense that for a given position of the source $x_c$, it follows that

$$\sum_{k=1}^{n} g_k(x_c) \approx \sum_{k \in \mathbb{Z}} g_k(x_c) = \int_{-\infty}^{\infty} \phi(x - x_c, \sigma)\, dx = 1. \qquad (4)$$

In Eq. (3), we have assumed the idealized situation that every pixel has the exact response function (equal to unity), or, equivalently, that the flat-field process has been achieved with minimal uncertainty. This equation also assumes that the intra-pixel response is uniform. This is more important in the severely undersampled regime (see, e.g., Adorf (1996, Fig. 1)), which is not explored in this paper. However, a relevant aspect of data calibration is achieving a proper flat-fielding, which can affect the correctness of our analysis and the form of the adopted likelihood function (more details below).

Finally, given the source parameters $(x_c, \tilde{F})$, the joint probability mass function $P$ (hereafter pmf) of the observation vector $I^n = (I_1, ..., I_n)$ (with values in $\mathbb{N}^n$) is given by

$$\underbrace{P(I^n = i^n = (i_1, ..., i_n))}_{\equiv\, p_{x_c}(i^n)} = p_{\lambda_1(x_c,\tilde{F})}(i_1) \cdot p_{\lambda_2(x_c,\tilde{F})}(i_2) \cdots p_{\lambda_n(x_c,\tilde{F})}(i_n), \qquad (5)$$

$\forall (i_1, ..., i_n) \in \mathbb{N}^n$, where $p_\lambda(x) = \dfrac{e^{-\lambda} \cdot \lambda^x}{x!}$ denotes the pmf of the Poisson law (Gray & Davisson 2004)². The adoption of this probabilistic model is common in contemporary astrometry (e.g., in Gaia, see Lindegren (2008)).

² Throughout this paper, in general, capital letters (e.g., $I^n$) denote a random variable (or, in this case, a random vector of $n$ elements), and lower-case letters (e.g., $i^n$) denote a particular realization (or measured value) of the variable. This distinction will become particularly important in the Bayesian context described in the next section.

It is important to mention that Eq. (5) assumes that the observations are independent (although not identically distributed since they follow $\lambda_i$). This is only an approximation to the real situation since it implies that we are neglecting any electronic defects or features in the device, such as the cross-talk present in multi-port CCDs (Freyhammer et al. 2001), or read-out correlations, such as the odd-even column effect in IR detectors (Mason 2008), as well as calibration or data reduction deficiencies (e.g., due to inadequate flat-fielding; Gawiser et al. (2006)) that may alter this idealized detection process. In essence, we are considering an ideal detector that would satisfy the proposed likelihood function given by Eq. (5); in real detectors the likelihood function could be considerably more complex³. Serious attempts have been made by manufacturers and observatories to minimize the impact of these defects, either by an appropriate electronic design or by adjusting the detector operational regimes (e.g., cross-talk can be reduced to less than 1 part in $10^4$ by adjusting the read-out speed and by a proper reduction process; see Freyhammer et al. (2001)).

2.2. Astrometric Cramér-Rao lower bound

The CR inequality offers a lower bound for the variance of the family of unbiased estimators. More precisely, the CR theorem is as follows:

Theorem 1. (Rao (1945); Cramér (1946)) Let $\{I_k : k = 1, ..., n\}$ be a collection of independent observations that follow a parametric pmf $p_{\theta^m}$ defined on $\mathbb{N}$. The parameters to be estimated from $I^n = (I_1, ..., I_n)$ will be denoted in general by the vector $\theta^m = (\theta_1, \theta_2, ..., \theta_m) \in \Theta = \mathbb{R}^m$. Let

$$L(i^n; \theta^m) \equiv p_{\theta^m}(i_1) \cdot p_{\theta^m}(i_2) \cdots p_{\theta^m}(i_n)$$

be the likelihood of the observation $i^n \in \mathbb{N}^n$ given $\theta^m \in \Theta$. If the following condition is satisfied

$$E_{I^n \sim p^n_{\theta^m}} \left\{ \frac{\partial \ln L(I^n; \theta^m)}{\partial \theta_j} \right\} = 0, \quad \forall j \in \{1, \ldots, m\}, \qquad (6)$$

then for any unbiased estimator $\tau_n(\cdot) : \mathbb{N}^n \to \Theta$ of $\theta^m$ (i.e., $E_{I^n \sim p^n_{\theta^m}}\{\tau_n(I^n)\} = \theta^m$) it follows that

$$\mathrm{Var}(\tau_n(I^n)_j) \geq [\mathcal{I}_{\theta^m}(n)^{-1}]_{j,j}, \qquad (7)$$

where $\mathcal{I}_{\theta^m}(n)$ is the Fisher information matrix given by

$$[\mathcal{I}_{\theta^m}(n)]_{j,l} = E_{I^n \sim p^n_{\theta^m}} \left\{ \frac{\partial \ln L(I^n; \theta^m)}{\partial \theta_j} \cdot \frac{\partial \ln L(I^n; \theta^m)}{\partial \theta_l} \right\}, \qquad (8)$$

$\forall j, l \in \{1, \ldots, m\}$.

Returning to the observational problem in Sect. 2.1, Mendez et al. (2013, 2014) have characterized and analyzed the CR lower bound in Eq. (7) for the isolated problem of astrometry and photometry, respectively, as well as the joint problem of photometry and astrometry. For completeness, we highlight their 1-D astrometric result, which will be relevant for the discussion in subsequent sections of the paper:

Proposition 1. (Mendez et al. (2014, pp. 800)) Let us assume that $\tilde{F}$ and $\sigma \in \mathbb{R}^+$ are fixed and known, and we want to

³ In a real scenario, we may not even be able to write such a function owing to our imperfect characterization or limited knowledge of the detector device.


The likelihood

Minimum variance bound

Information of the measurements


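For the scalar case:

$$\min_{\tau_n(\cdot) \in \mathcal{T}^n} \mathrm{Var}(\tau_n(I^n)) \geq \mathcal{I}_\theta(n)^{-1} = \left( E_{I^n \sim p^n_\theta} \left\{ \left[ \left( \frac{d \ln L(I^n; \theta)}{d\theta} \right)^2 \right] \right\} \right)^{-1},$$

Exclusive function of the channel!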


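For the 1D Astrometric case we have that (Mendez et al. 2013)

$$\mathcal{I}_{x_c}(n) = \sum_{k=1}^{n} \frac{\left( \tilde{F} \, \dfrac{d g_k(x_c)}{d x_c} \right)^2}{\tilde{F} g_k(x_c) + \tilde{B}_k},$$

and

$$\min_{\tau_n : \mathbb{N}^n \to \mathbb{R}} \mathrm{Var}(\tau_n(I^n)) \geq \underbrace{\mathcal{I}_{x_c}(n)^{-1}}_{\equiv\, \sigma^2_{CR}}.$$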
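As an illustration of how the 1D Fisher information can be evaluated, here is a minimal sketch. It assumes a Gaussian PSF (so that $g_k$ and $dg_k/dx_c$ have closed forms through the error function), a pixel grid centred at zero, and the Figure 2 parameter values; the function names are hypothetical and this is not the authors' implementation.

```python
import math

def phi(x, sigma):
    """Gaussian PSF density."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def Phi(x, sigma):
    """Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2))))

def fisher_information(x_c, flux, background, sigma, dx, n):
    """I_xc(n) = sum_k (F dg_k/dx_c)^2 / (F g_k + B) for a constant background B."""
    info = 0.0
    for k in range(n):
        x_k = (k - (n - 1) / 2) * dx          # pixel centre (grid centred at 0)
        lo, hi = x_k - dx / 2 - x_c, x_k + dx / 2 - x_c
        g = Phi(hi, sigma) - Phi(lo, sigma)   # g_k(x_c)
        dg = phi(lo, sigma) - phi(hi, sigma)  # d g_k / d x_c
        info += (flux * dg) ** 2 / (flux * g + background)
    return info

sigma = 1.0 / (2 * math.sqrt(2 * math.log(2)))  # FWHM = 1 arcsec
sigma_cr = 1.0 / math.sqrt(fisher_information(0.0, 5000.0, 412.5, sigma, 0.2, 51))
print(f"sigma_CR = {1e3 * sigma_cr:.2f} mas")
```

The derivative uses $g_k(x_c) = \Phi(x_k + \Delta x/2 - x_c) - \Phi(x_k - \Delta x/2 - x_c)$, so differentiating with respect to $x_c$ flips the sign of each boundary term.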

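Numerical Analysis of the Bound

✴ Considering a Gaussian profile for the PSF

$$\phi(x, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}$$

where

$$\mathrm{FWHM} \equiv 2\sqrt{2 \ln 2}\; \sigma \quad \text{(Full-Width at Half-Maximum)}$$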


✴ Constant background model $\{\tilde{B}_i,\ i = 1, .., n\}$:

$$B = f_s \Delta x + \frac{D + RON^2}{G} \ [\mathrm{ADU}],$$

• $f_s$ is the diffuse sky background (quality of the site)

• $\Delta x$ is the pixel resolution (instrument)

• $D$, $RON$ are the dark current and read-out noise (instrument)

Ground-based regime:

$$f_s \Delta x \gg \frac{D + RON^2}{G}$$
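A short worked example of the background model, using the parameter values quoted for Figure 2 ($f_s$ = 2000 ADU/arcsecond, Δx = 0.2″, D = 0 e⁻, RON = 5 e⁻, G = 2 e⁻/ADU). The function name is hypothetical.

```python
def background_per_pixel(f_s, dx, dark, ron, gain):
    """B = f_s * dx + (D + RON^2) / G, in ADU per pixel."""
    return f_s * dx + (dark + ron ** 2) / gain

sky_term = 2000.0 * 0.2                    # f_s * dx, in ADU
instrument_term = (0.0 + 5.0 ** 2) / 2.0   # (D + RON^2) / G, in ADU
B = background_per_pixel(2000.0, 0.2, 0.0, 5.0, 2.0)
print(sky_term, instrument_term, B)
```

With these values the sky term (400 ADU) is much larger than the instrument term (12.5 ADU), which is the ground-based regime described above.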

Numerical Analysis of the Bound

For example, for $P = 0.9$ (aperture containing 90% of the total flux), we have $u_+ \sim 1.164$ (the true “physical” aperture on the detector would be $\frac{1.164}{\sqrt{\ln 2}} \times \mathrm{FWHM} \approx 1.40 \times \mathrm{FWHM}$), whereas for $P = 0.99$ then $u_+ \sim 1.822$ (or $\frac{1.822}{\sqrt{\ln 2}} \times \mathrm{FWHM} \approx 2.19 \times \mathrm{FWHM}$). Because $u_+$ increases faster than $P(u_+)$ (which is bound to a maximum value of 1.0), we see that, for a given source, background, and detector, the S/N computed from equation (28) decreases as $u_+$ increases beyond the main core of the PSF. For example, for $\Delta x = 0.2″$, $G = 2\ e^-/\mathrm{ADU}$, $RON = 5\ e^-$, $f_s = 2000$ ADU/arcsecond, $F = 5000$ ADU, and FWHM $= 1.0″$, then for $P = 0.9$, S/N $\sim 74$, whereas for $P = 0.999$ (for which $u_+ \sim 2.33$), S/N $\sim 68$. As was explained before, we note that equation (21) does not depend directly on the S/N.
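The quoted aperture values can be checked with a few lines of arithmetic. This sketch assumes that, for a Gaussian PSF, the enclosed flux fraction is $P(u_+) = \mathrm{erf}(u_+)$; that relation is not stated in this excerpt, but it is consistent with the numbers given. The function name is hypothetical.

```python
import math

def aperture_in_fwhm(u_plus):
    """Full aperture width in units of FWHM: u_+ / sqrt(ln 2)."""
    return u_plus / math.sqrt(math.log(2))

for u_plus, quoted_P in [(1.164, 0.90), (1.822, 0.99), (2.33, 0.999)]:
    P = math.erf(u_plus)  # assumed relation P(u_+) = erf(u_+)
    print(f"u+ = {u_plus}: P = {P:.4f} (quoted {quoted_P}), "
          f"aperture = {aperture_in_fwhm(u_plus):.2f} x FWHM")
```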

Equation (28) is interesting since it explicitly shows that, as $\Delta x$ becomes smaller and smaller, the RON term starts to dominate over the sky background in its contribution to the total noise, the impact of which, on the Cramér-Rao bound, has already been mentioned in § 3.2. However, when $\Delta x$ increases, the sky background becomes the dominant source of background noise, and the total noise becomes independent of the array pixel size. Also, this equation clearly shows the classical result that, as an image becomes more spread (larger FWHM, or worse image quality) the S/N deteriorates, for a fixed total flux $F$, because of the larger contribution from the sky and the (larger number of) pixels underneath the aperture. As we shall see, the FWHM has a very relevant impact on the Cramér-Rao bound (see, e.g., eq. [45]).

Figure 2 shows the result of evaluating equation (21) under the assumption of a background given by equation (23) for a set of representative values. An interesting point here is that, at very small values of $\Delta x$, we still see the “upturn” in the Cramér-Rao lower bound seen in Figure 1, but it has a much smaller effect. Of course, the reason for this upturn is the prevalence of the RON over the sky background indicated in the previous paragraph, when $\Delta x$ becomes extremely small. As we shall see (eq. [45]), the Cramér-Rao bound goes as $\Delta x^{-1}$ for small S/N and small pixels, a feature clearly seen in Figure 2. Otherwise we see a broad region that exhibits a rather smooth and steady decrease in positional precision when $\Delta x$ becomes larger and larger, and a rather steep increase when $\Delta x$ increases beyond the FWHM. The overall effects of pixel de-centering are qualitatively similar to those already presented in Figure 1, and are thus not repeated in this figure. For very large S/N, equation (45) predicts that the Cramér-Rao bound becomes rather insensitive to $\Delta x$, which also coincides with the behavior in Figure 2.

An interesting prediction of equation (21) is that high-resolution imaging in low-background, even for undersampled images (e.g., HST), is better than imaging with larger aperture ground-based telescopes, not undersampled, due to the worse FWHM and higher-background of the latter, a fact well-known by people doing astrometry with HST (provided, of course, that systematic effects are well understood; e.g., a particularly challenging situation with HST data is the account of time-dependent charge-transfer efficiency corrections. For details see, e.g., Bristow et al. [2005], especially their Fig. 4, or Bristow et al. [2006], especially their Fig. 10). For example, for the same detector parameters as those adopted in Figure 2, and $F = 10000$ (which for a Gaussian PSF leads to maximum flux in the central pixel of ∼1700 ADU [see § 4.1 and eq. (44)]), $f_s = 3000$ ADU/pix, and FWHM $= 0.45″$, the Cramér-Rao bound is ∼1.7 mas (with $\Delta x = 0.08″$). These (source and image) values are similar to those of the QSOs used in the astrometric study by Méndez et al. (2010) (see their Table 1) and Méndez et al. (2011), which demonstrated a single-measurement astrometric precision of 1.5 mas (see § 3.2 in Méndez et al. [2010]) with the NTT (3.5 m aperture) telescope and SUSI2 imager. On the other hand, for HST with $f_s = 30$ ADU/pix and FWHM $= 0.15″$, the Cramér-Rao bound is ∼0.2 mas (in this case $\Delta x = 0.1″$), whereas Piatek et al. (2002) reported a single-measurement precision of 0.25 mas (in our calculation of the Cramér-Rao bound for HST we have approximately taken into account the aperture difference between the NTT and HST, and the different exposure times for the same QSOs adopted in these two studies, from Table 1 in Piatek et al. [2002]).

4.1. The Cramér-Rao Bound in the Small Pixel (High Resolution) Approximation

Under certain circumstances, the summation in the denominator of the RHS of equation (10) can be approximated by an integral, which allows us to explore the behavior of the Cramér-Rao bound in a more explicit manner. Indeed, we see from equations (12) and (18) that the application of the mean-value

FIG. 2. Cramér-Rao bound as given in equation (21) in milliarcseconds (mas), as a function of pixel size $\Delta x$ in arcseconds. All curves were computed for a background given by equation (23) with $f_s = 2000$ ADU/arcsecond, $RON = 5\ e^-$, $D = 0\ e^-$, $G = 2\ e^-/\mathrm{ADU}$, and for a Gaussian source with FWHM $= 1″$ centered on a pixel. The curves shown have different values for the flux, and hence a different S/N. From top to bottom we have $F = 1000$ ADU, S/N ≈ 20, $F = 2000$ ADU, S/N ≈ 35 (both dashed lines); $F = 5000$ ADU, S/N ≈ 74 (solid line) (as a reference; in this case we have $F_{max} \approx 930$ ADU at $\Delta x = 0.2″$; see eq. [33]); and $F = 10000$ ADU, S/N ≈ 120, and $F = 50000$ ADU, S/N ≈ 300 (both dotted lines).


2013 PASP, 125:580–594


Cramér-Rao bound $\sigma_{CR}$ in milliarcseconds (mas), as a function of pixel size $\Delta x$ in arcseconds. From top to bottom, we have $F = 1000$ ADU, S/N = 20; $F = 2000$ ADU, S/N = 35 (both dashed lines); $F = 5000$ ADU, S/N = 74 (solid line).

Parameters: $f_s = 2000$ ADU/arcsecond, $RON = 5\ e^-$, FWHM $= 1″$, $G = 2\ e^-/\mathrm{ADU}$

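Asymptotic Regimes

One interesting case is the “high-resolution regime”, $\Delta x/\sigma \ll 1$:

$$\sigma^2_{x_c} \approx \begin{cases} \dfrac{\sqrt{\pi}}{2\,(2 \ln 2)^{3/2}} \cdot \dfrac{\tilde{B}}{\tilde{F}^2} \cdot \dfrac{\mathrm{FWHM}^3}{\Delta x} & \text{if } \tilde{F} \ll \tilde{B}, \\[2ex] \dfrac{1}{8 \ln 2} \cdot \dfrac{1}{\tilde{F}} \cdot \mathrm{FWHM}^2 & \text{if } \tilde{F} \gg \tilde{B}. \end{cases}$$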
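The two closed forms can be compared against the exact pixel sum in the small-pixel limit. The sketch below assumes a Gaussian PSF centred on a pixel, a constant background, and illustrative flux and background values chosen only to reach each regime; all function names are hypothetical.

```python
import math

def sigma2_cr_exact(flux, background, fwhm, dx, half_width=6.0):
    """Inverse Fisher information from the exact pixel sum (Gaussian PSF at x_c = 0)."""
    sigma = fwhm / (2 * math.sqrt(2 * math.log(2)))
    n_half = int(half_width * sigma / dx)  # pixels cover roughly +/- half_width * sigma
    norm = math.sqrt(2 * math.pi) * sigma
    info = 0.0
    for k in range(-n_half, n_half + 1):
        lo, hi = (k - 0.5) * dx, (k + 0.5) * dx
        g = 0.5 * (math.erf(hi / (sigma * math.sqrt(2))) - math.erf(lo / (sigma * math.sqrt(2))))
        dg = (math.exp(-lo ** 2 / (2 * sigma ** 2)) - math.exp(-hi ** 2 / (2 * sigma ** 2))) / norm
        info += (flux * dg) ** 2 / (flux * g + background)
    return 1.0 / info

def sigma2_cr_low_snr(flux, background, fwhm, dx):
    """Background-dominated limit (F << B)."""
    return (math.sqrt(math.pi) / (2 * (2 * math.log(2)) ** 1.5)
            * background / flux ** 2 * fwhm ** 3 / dx)

def sigma2_cr_high_snr(flux, fwhm):
    """Source-dominated limit (F >> B)."""
    return fwhm ** 2 / (8 * math.log(2) * flux)

r_low = sigma2_cr_exact(100.0, 1e4, 1.0, 0.02) / sigma2_cr_low_snr(100.0, 1e4, 1.0, 0.02)
r_high = sigma2_cr_exact(1e6, 1.0, 1.0, 0.02) / sigma2_cr_high_snr(1e6, 1.0)
print(f"exact/approx: low S/N {r_low:.4f}, high S/N {r_high:.4f}")
```

Both ratios should be close to one when $\Delta x/\sigma \ll 1$; the agreement degrades as the pixels become comparable to the PSF width.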

Mendez et al. 2013, PASP

… more of the 1D Astrometry analysis at

Analysis and Interpretation of the Cramér-Rao Lower-Bound in Astrometry: One-Dimensional Case

RENE A. MENDEZ

Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile; [email protected]

AND

JORGE F. SILVA AND RODRIGO LOBOS

Departamento de Ingeniería Eléctrica, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Beauchef 850, Santiago, Chile; [email protected]; [email protected]

Received 2013 February 08; accepted 2013 April 22; published 2013 May 21

ABSTRACT. In this article we explore the maximum precision attainable in the location of a point source imaged by a pixel array detector in the presence of a background, as a function of the detector properties. For this we use a well-known result from parametric estimation theory, the so-called Cramér-Rao lower bound. We develop the expressions in the one-dimensional case of a linear array detector in which the only unknown parameter is the source position. If the object is oversampled by the detector, analytical expressions can be obtained for the Cramér-Rao limit that can be readily used to estimate the limiting precision of an imaging system, and which are very useful for experimental (detector) design, observational planning, or performance estimation of data analysis software. In particular, we demonstrate that for background-dominated sources, the maximum astrometric precision goes as $B/F^2$, where $B$ is the background in one pixel, and $F$ is the total flux of the source, while when the background is negligible, this precision goes as $F^{-1}$. We also explore the dependency of the astrometric precision on: (1) the size of the source (as imaged by the detector), (2) the pixel detector size, and (3) the effect of source decentering. Putting these results into context, the theoretical Cramér-Rao lower bound is compared to both ground- as well as space-based astrometric results, indicating that current techniques approach this limit very closely. It is furthermore demonstrated that practical astrometric estimators like maximum likelihood or least-squares techniques cannot formally reach the Cramér-Rao bound, but that they approach this limit in the one-dimensional case very tightly, for a wide range of signal-to-noise ratio (S/N) of the source. Our results indicate that we have found in the Cramér-Rao lower variance bound a very powerful astrometric “benchmark” estimator concerning the maximum expected positional precision for a point source, given a prescription for the source, the background, the detector characteristics, and the detection process.

1. INTRODUCTION

Astrometry relies on the precise determination of the relative location of, usually, point sources. The estimation of the precision with which these measurements can be done, both from an empirical, as well as from a theoretical point of view, has been the subject of various papers. Seminal work, as applied to stellar images recorded on photographic plates, are those of van Altena & Auer (1975) and Auer & van Altena (1978), with further refinements by Lee & van Altena (1983), in which statistical estimations for the precision of the position of stellar images were compared to the results from the actual fitting of stellar profiles measured using microdensitometer scans through a classical least squares minimization technique assuming a Gaussian noise on the measured intensities.

Nowadays, discrete digital detectors, such as Charge-Coupled Devices (CCDs, Howell [2006]), being highly efficient area detectors, are widely used in astronomy for photometric, astrometric and spectroscopic observations (Mackay 1986). (For the specific use of CCDs in astrometry see, e.g., Monet [1992], Lindegren [2005], and Howell [2013].) This prompted King (1983) to carry out a similar analysis for CCDs, specifically for HST data, starting also from the assumption that a least squares minimization approach provides the best estimation of the relevant parameters.

The studies by Lee and van Altena (1983) and King (1983) (see also Stone [1989]) provide estimates of the statistical uncertainties of the fitted parameters, given a noise model for the data. However, a related question, less often addressed, is: What would be the maximum attainable precision with which one could expect to estimate the astrometric position of a source, given a prescription of the detection process? This question constitutes a central aspect to astrometric work. For example, in situations in which the detector nearly critically samples the


PUBLICATIONS OF THE ASTRONOMICAL SOCIETY OF THE PACIFIC, 125:580–594, 2013 May. © 2013. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A.


Mendez et al. 2014, PASP

… and extension to 2D Astrometry and joint Astrometry & Photometry

Analysis of the Cramér-Rao Bound in the Joint Estimation of Astrometry and Photometry

RENE A. MENDEZ,1,2 JORGE F. SILVA,3 RODRIGO OROSTICA,2 AND RODRIGO LOBOS3

Received 2014 April 07; accepted 2014 July 03; published 2014 August 19

ABSTRACT. In this paper, we use the Cramér-Rao lower uncertainty bound to estimate the maximum precision that could be achieved on the joint simultaneous (or two-dimensional) estimation of photometry and astrometry of a point source measured by a linear CCD detector array. We develop exact expressions for the Fisher matrix elements required to compute the Cramér-Rao bound in the case of a source with a Gaussian light profile. From these expressions, we predict the behavior of the Cramér-Rao astrometric and photometric precision as a function of the signal and the noise of the observations, and compare them to actual observations, finding a good correspondence between them. From the Cramér-Rao bound, we obtain the well-known fact that the uncertainty in flux on a Poisson-driven detector, such as a CCD, goes approximately as the square root of the flux. However, more generally, higher-order correction factors that depend on the ratio $B/F$ or $F/B$ (where $B$ is the background flux per pixel, and $F$ is the total flux of the source), as well as on the properties of the detector (pixel size) and the source (width of the light profile), are required for a proper calculation of the minimum expected uncertainty bound in flux. Overall, the Cramér-Rao bound predicts that the uncertainty in magnitude goes as $(S/N)^{-1}$ under a broad range of circumstances. As for the astrometry, we show that its Cramér-Rao bound also goes as $(S/N)^{-1}$ but, additionally, we find that this bound is quite sensitive to the value of the background: suppressing the background can greatly enhance the astrometric accuracy. We present a systematic analysis of the elements of the Fisher matrix in the case when the detector adequately samples the source (oversampling regime), leading to closed-form analytical expressions for the Cramér-Rao bound. We show that, in this regime, the joint parametric determinations of photometry and astrometry for the source become decoupled from each other, and furthermore, it is possible to write down expressions (approximate to first order in the small quantities $F/B$ or $B/F$) for the expected minimum uncertainty in flux and position. These expressions are shown to be quite resilient to the oversampling condition, and become thus very valuable benchmark tools to estimate the approximate behavior of the maximum photometric and astrometric precision attainable under prespecified observing conditions and detector properties.

1. INTRODUCTION

In this paper, we extend the one-dimensional (1D) Cramér-Rao analysis done in Mendez et al. (2013) to the two-dimensional (2D) case of simultaneous photometry and astrometry estimation on a linear CCD detector. The goal is to provide an estimation setting that is more realistic than that presented in Mendez et al. (2013), while still being tractable analytically so that useful closed-form expressions can be derived and interpreted from the analysis. This scenario allows us also to explore, in a simple manner, the extent of the interdependence between astrometry and photometry, from the point of view of the Cramér-Rao error bound under different instrumental and detection regimes.

In general, the Cramér-Rao lower variance bound can be used to cover a broad span of applications, ranging from instrument design for specific target accuracy goals to observational planning and data analysis benchmarking (see, e.g., Perryman et al. 1989; Jakobsen et al. 1992; Zaccheo et al. 1995; Adorf 1996). For example, the Cramér-Rao bound can be used to predict how a particular design choice (pixel size, readout noise, etc.) influences the photometric and astrometric performance of the planned instrument, permits the prediction of lower bounds to photometric errors for point sources (and for surface photometry of extended objects), and places lower bounds to the precision with which the position of point sources can be measured (depending on their shape), be it as isolated objects or in a cluster. The Cramér-Rao formalism also allows us to determine the influence of subpixel dither patterns on the astrometric and photometric errors (Mendez et al. [2013] and this paper, § 3.1). Finally, the Cramér-Rao lower bound can be used to test the statistical adequacy of different data reduction and analysis

¹ On leave at the European Southern Observatory, Casilla 19001, Santiago, Chile.

² Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile; [email protected], [email protected].

³ Department of Electrical Engineering, Information and Decision System Group, Universidad de Chile, Av. Tupper 2007, Santiago, Chile; [email protected], [email protected].


PUBLICATIONS OF THE ASTRONOMICAL SOCIETY OF THE PACIFIC, 126:798–810, 2014 August. © 2014. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A.


Achievability of the CR Bound

Achievability Analysis

✴ Is the CR bound attainable by any practical estimator?

✴ How far are practical estimators used in Astrometry (like LS, WLS, ML) from the CR bound?

✴ How do the answers to the previous questions depend on the observational regime (S/N, pixel resolution, site, PSF, etc.)?

Impossibility Result: Lobos et al. 2015, PASP

Proposition: For any fixed and unknown position $x_c$ and any unbiased estimator $\tau_n : \mathbb{N}^n \to \Theta$,

$$\mathrm{Var}(\tau_n(I^n)) > \sigma^2_{CR}.$$

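Least Squares: Performance Approximation

The technical challenge is to find an expression that approximates the performance of an “implicit estimator”, defined as the solution of:

$$\tau_{LS}(I^n) = \arg\min_{\alpha \in \mathbb{R}} \underbrace{\sum_{i=1}^{n} (I_i - \lambda_i(\alpha))^2}_{\equiv\, J(I^n, \alpha)},$$

where

$$\lambda_i(\alpha) = \tilde{F} g_i(\alpha) + \tilde{B}_i.$$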
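Because $\tau_{LS}$ has no closed form, it has to be computed numerically. The sketch below minimizes $J(I^n, \alpha)$ by a brute-force grid search on simulated data. It is only an illustration under stated assumptions: a Gaussian PSF, known flux and constant background, the Figure 2 parameter values, and a grid search standing in for the iterative solvers used in practice. All function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def g(alpha, centres, dx, sigma):
    """g_i(alpha) for a Gaussian PSF centred at alpha."""
    cdf = lambda x: 0.5 * (1.0 + _erf((x - alpha) / (sigma * math.sqrt(2))))
    return cdf(centres + dx / 2) - cdf(centres - dx / 2)

def ls_estimate(counts, centres, dx, sigma, flux, background, grid):
    """Return the grid value of alpha minimising J = sum_i (I_i - lambda_i(alpha))^2."""
    costs = [np.sum((counts - (flux * g(a, centres, dx, sigma) + background)) ** 2)
             for a in grid]
    return grid[int(np.argmin(costs))]

dx, flux, background, x_c = 0.2, 5000.0, 412.5, 0.03
sigma = 1.0 / (2 * math.sqrt(2 * math.log(2)))   # FWHM = 1 arcsec
centres = (np.arange(51) - 25) * dx              # pixel grid centred at 0
rng = np.random.default_rng(1)
counts = rng.poisson(flux * g(x_c, centres, dx, sigma) + background)

grid = np.linspace(-0.2, 0.2, 801)               # 0.5 mas steps
x_hat = ls_estimate(counts, centres, dx, sigma, flux, background, grid)
print(f"true x_c = {x_c:.4f}, LS estimate = {x_hat:.4f} arcsec")
```

Repeating the simulation over many realizations would give an empirical variance to compare against the bounds discussed in the following slides.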

Main Result: Lobos et al. (2015) PASP

On the other hand, a version of the LS estimator (given the model presented in § 2.1) corresponds to the solution of:

$$\tau_{LS}(I^n) = \arg\min_{\alpha \in \mathbb{R}} \underbrace{\sum_{i=1}^{n} (I_i - \lambda_i(\alpha))^2}_{\equiv\, J(I^n;\alpha)}, \qquad (16)$$

with $\lambda_i(\alpha) = \tilde{F} g_i(\alpha) + \tilde{B}_i$ and where $g_i(\cdot)$ is given by equation (3).¹³

In a previous paper (Mendez et al. 2013, § 5), we have carried out numerical simulations using equations (15) and (16), and have demonstrated that both approaches are reasonable. However, an inspection of Mendez et al. (2013, Table 3) suggests that the LS method exhibits a loss of optimality at high S/N in comparison with either the (Poisson variance-) weighted LS or the ML method. This motivates a deeper study of the LS method, to properly understand its behavior and limitations in terms of its MSE and possible statistical bias, which is the focus of the following section.

3.2. Bounding the Performance of the Least-Squares Estimator

The solution to equation (16) is nonlinear (see Fig. 7), and it does not have a closed-form expression. Consequently, a number of iterative approaches have been adopted (see, e.g., Stetson 1987; Mighell 2005) to solve or approximate $\tau_{LS}(I^n)$. Hence, as $\tau_{LS}(I^n)$ is implicit, it is not possible to compute its mean, its variance, nor its estimation error directly. We also note that, since we will be mainly analyzing the behavior of $\tau_{LS}(I^n)$, none of the caveats concerning the properness of the likelihood function (eq. [5]) raised in § 2.1 are relevant in what follows, except for what concerns the adequacy of equation (2), which we take as a valid description of the underlying flux distribution.

The problem of computing the MSE of an estimator that is the solution of an optimization problem has been recently addressed by So et al. (2013) using a general framework. Their basic idea was to provide sufficient conditions on the objective function, in our case $J(I^n;\alpha)$, to derive a good approximation for $E_{I^n \sim f_{x_c}}\{(\tau_{LS}(I^n) - x_c)^2\}$. Based on this idea, we provide below a refined result (specialized to our astrometry problem), which relaxes one of the idealized assumptions proposed in So et al. (2013, their eq. [5]), and which is not strictly satisfied in our problem (see Remark 2 in § 3.3). As a consequence, our result offers upper and lower bounds for the bias and MSE of $\tau_{LS}(I^n)$, respectively.

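Theorem 1. Let us consider a fixed and unknown parameter $x_c \in \mathbb{R}$, and that $I^n \sim f_{x_c}$. In addition, let us define the residual random variable

$$W(I^n;\alpha) \equiv \frac{J''(I^n;\alpha) - E_{I^n \sim f_{x_c}}\{J''(I^n;\alpha)\}}{E_{I^n \sim f_{x_c}}\{J''(I^n;\alpha)\}}.^{14}$$

If there exists $\delta \in (0, 1)$ such that $P(W(I^n; x_c) \in (-\delta, \delta)) = 1$, then:

$$\left| E_{I^n \sim f_{x_c}}\{\tau_{LS}(I^n)\} - x_c \right| \leq \epsilon(\delta), \qquad (17)$$

$$E_{I^n \sim f_{x_c}}\{(\tau_{LS}(I^n) - x_c)^2\} \in \left[ \frac{\sigma^2_{LS}(n)}{(1 + \delta)^2}, \frac{\sigma^2_{LS}(n)}{(1 - \delta)^2} \right], \qquad (18)$$

where

$$\sigma^2_{LS}(n) \equiv \frac{E_{I^n \sim f_{x_c}}\{J'(I^n; x_c)^2\}}{\left( E_{I^n \sim f_{x_c}}\{J''(I^n; x_c)\} \right)^2} \qquad (19)$$

and

$$\epsilon(\delta) \equiv \frac{E_{I^n \sim f_{x_c}}\{|J'(I^n; x_c)|\}}{E_{I^n \sim f_{x_c}}\{J''(I^n; x_c)\}} \cdot \frac{\delta}{1 - \delta}. \qquad (20)$$

(The proof is presented in Appendix B).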

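3.3. Analysis and Interpretation of Theorem 1

Remark 1. Theorem 1 is obtained under a bounded condition (with probability one) over the random variable $W(I^n; x_c)$. To verify whether this condition is actually met, it is therefore important to derive an explicit expression for $W(I^n; x_c)$. Starting from equation (16) it follows that:

$$J''(I^n;\alpha) = 2 \sum_{i=1}^{n} \left[ \left( \frac{d\lambda_i(\alpha)}{d\alpha} \right)^2 + (\lambda_i(\alpha) - I_i) \cdot \frac{d^2\lambda_i(\alpha)}{d\alpha^2} \right]$$

and, consequently, $E_{I^n \sim f_{x_c}}\{J''(I^n; x_c)\} = 2 \sum_{i=1}^{n} \left( \left. \frac{d\lambda_i(\alpha)}{d\alpha} \right|_{\alpha = x_c} \right)^2$. Therefore:

$$W(I^n; x_c) = \sum_{i=1}^{n} (\lambda_i(x_c) - I_i) \cdot \left[ \lambda''_i(x_c) \Big/ \sum_{j=1}^{n} (\lambda'_j(x_c))^2 \right]. \qquad (21)$$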

Then, $W(I^n; x_c)$ is not bounded almost surely, since $I_i$ could take any value in $\mathbb{N}$ with nonzero probability. However, $E_{I^n \sim f_{x_c}}\{W(I^n; x_c)\} = 0$, and its variance in closed-form is:

¹³ Note that if we had assumed a detection process subject to a purely Gaussian noise with an rms of $\sigma$, then the probability mass function for each individual observation would have been given by $f_\lambda(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \lambda)^2}{2\sigma^2}}$. In this case, the log-likelihood function (i.e., the equivalent of eq. [13]) would be given by $\ln L(I^n; x_c) = -n \ln(\sqrt{2\pi}\,\sigma) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (I_i - \lambda_i(x_c))^2$. Therefore, in this scenario, finding the maximum of the log-likelihood would be the same as finding the minimum of the LS, as in eq. (16). This is a well-established result, described in many statistical books.

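¹⁴ As a short-hand notation: $J'(I^n; x_c) \equiv \left. \frac{dJ(I^n;\alpha)}{d\alpha} \right|_{\alpha = x_c}$, and $J''(I^n; x_c) \equiv \left. \frac{d^2 J(I^n;\alpha)}{d\alpha^2} \right|_{\alpha = x_c}$.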


2015 PASP, 127:1166–1182

Main Result

The “predicted nominal value” offers a closed-form expression

$$\sigma^2_{LS}(n) = \frac{\sum_{i=1}^{n} (\tilde{F} g_i(x_c) + \tilde{B}_i) \cdot (g'_i(x_c))^2}{\left( \tilde{F} \sum_{i=1}^{n} (g'_i(x_c))^2 \right)^2}$$

to compare with $\sigma^2_{CR}$.
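The nominal LS variance and the CR bound can be compared directly from their closed forms. This sketch assumes a Gaussian PSF centred on a pixel, a constant background of 412.5 ADU (the Figure 2 value), and hypothetical function names; it only evaluates the two formulas and makes no claim about the behavior of an actual LS fit.

```python
import math

def pixel_terms(x_c, sigma, dx, n):
    """Return lists of g_k(x_c) and g'_k(x_c) for a Gaussian PSF on n pixels."""
    phi = lambda x: math.exp(-x * x / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2))))
    g, dg = [], []
    for k in range(n):
        x_k = (k - (n - 1) / 2) * dx
        lo, hi = x_k - dx / 2 - x_c, x_k + dx / 2 - x_c
        g.append(Phi(hi) - Phi(lo))
        dg.append(phi(lo) - phi(hi))
    return g, dg

def sigma2_ls(flux, background, g, dg):
    """Nominal LS variance: sum (F g + B) g'^2 / (F sum g'^2)^2."""
    num = sum((flux * gi + background) * di ** 2 for gi, di in zip(g, dg))
    return num / (flux * sum(di ** 2 for di in dg)) ** 2

def sigma2_cr(flux, background, g, dg):
    """CR bound: inverse of sum (F g')^2 / (F g + B)."""
    return 1.0 / sum((flux * di) ** 2 / (flux * gi + background) for gi, di in zip(g, dg))

sigma = 1.0 / (2 * math.sqrt(2 * math.log(2)))   # FWHM = 1 arcsec
g, dg = pixel_terms(0.0, sigma, 0.2, 51)

def variance_ratio(flux, background=412.5):
    return sigma2_ls(flux, background, g, dg) / sigma2_cr(flux, background, g, dg)

for flux in (50.0, 5000.0, 1e6):
    print(f"F = {flux:g} ADU: sigma2_LS / sigma2_CR = {variance_ratio(flux):.3f}")
```

By the Cauchy-Schwarz inequality the ratio is never below one; it stays near one for weak sources and grows as the source dominates the background.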

Non-optimality at high S/N:

$$\mathrm{Var}(W(I^n; x_c)) = \sum_{i=1}^{n} \lambda_i(x_c) \cdot \left[ (\lambda''_i(x_c))^2 \Big/ \left( \sum_{j=1}^{n} (\lambda'_j(x_c))^2 \right)^2 \right]. \qquad (22)$$

From this, we can evaluate how far we are from the bounded assumption of Theorem 1. To do this, we can resort to Markov’s inequality (Cover & Thomas 2006), where $P(W(I^n; x_c) \notin (-\rho, \rho)) \leq \mathrm{Var}(W(I^n; x_c))/\rho^2$. Then, for any $\epsilon \in (0, 1)$, we can characterize a critical $\delta(\epsilon) > 0$ such that $P(W(I^n; x_c) \in (-\delta(\epsilon), \delta(\epsilon))) > 1 - \epsilon$. Using this result and Theorem 1, we can bound the conditional bias and conditional MSE of $\tau_{LS}(I^n)$ using equations (17) and (18), respectively. In § 4, we conduct a numerical analysis, where it is shown that the bounded assumption for $W(I^n; x_c)$ is indeed satisfied for a number of important realistic experimental settings in astrometry (with very high probability).
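The critical $\delta(\epsilon)$ follows from setting the right-hand side of the inequality equal to $\epsilon$, which gives $\delta(\epsilon) = \sqrt{\mathrm{Var}(W)/\epsilon}$. A minimal sketch of that calculation, assuming a Gaussian PSF, a constant background, and the Figure 2 parameter values (function names are hypothetical):

```python
import math

def w_variance(x_c, flux, background, sigma, dx, n):
    """Var(W) = sum_i lambda_i (lambda''_i)^2 / (sum_j (lambda'_j)^2)^2, as in eq. (22)."""
    phi = lambda x: math.exp(-x * x / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2))))
    num, s = 0.0, 0.0
    for k in range(n):
        x_k = (k - (n - 1) / 2) * dx
        lo, hi = x_k - dx / 2 - x_c, x_k + dx / 2 - x_c
        g = Phi(hi) - Phi(lo)
        dg = phi(lo) - phi(hi)                              # g'_k(x_c)
        d2g = (lo * phi(lo) - hi * phi(hi)) / sigma ** 2    # g''_k(x_c)
        lam = flux * g + background
        num += lam * (flux * d2g) ** 2
        s += (flux * dg) ** 2
    return num / s ** 2

def critical_delta(var_w, eps):
    """Smallest delta with Var(W) / delta^2 <= eps."""
    return math.sqrt(var_w / eps)

sigma = 1.0 / (2 * math.sqrt(2 * math.log(2)))   # FWHM = 1 arcsec
for flux in (5000.0, 50000.0):
    delta = critical_delta(w_variance(0.0, flux, 412.5, sigma, 0.2, 51), eps=0.01)
    print(f"F = {flux:g} ADU: delta(0.01) = {delta:.3f}")
```

A value of $\delta$ well below one indicates that the interval in equation (18) is tight around the nominal $\sigma^2_{LS}(n)$.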

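Remark 2. Concerning the MSE of the LS estimator, equation (18) offers a lower and upper bound in terms of a nominal value $\sigma^2_{LS}(n)$ (given by eq. [19]), and an interval around it. In the interesting regime where $\delta \ll 1$ (this regime approaches the ideal case $\delta = 0$ studied by So et al. [2013], in which case the variable $W(I^n; x_c)$ becomes deterministic), we have that $\tau_{LS}(\cdot)$ is an unbiased estimator, as shown by equation (17), and, furthermore:

$$\mathrm{Var}(\tau_{LS}(I^n)) = E_{I^n \sim f_{x_c}}\{(\tau_{LS}(I^n) - x_c)^2\} = \sigma^2_{LS}(n) \ge \sigma^2_{CR},$$

from equations (18) and (10). Thus, it is interesting to provide an explicit expression for $\sigma^2_{LS}(n)$ which will be valid for the MSE of the LS method in this regime. First we note that $J'(I^n; x_c) = 2 \cdot \left(\sum_{i=1}^{n} \frac{d\lambda_i(x_c)}{dx_c} \cdot (\lambda_i(x_c) - I_i)\right)$, and therefore:

$$(J'(I^n; x_c))^2 = 4 \cdot \left[\sum_{i=1}^{n} \sum_{j=1, j \ne i}^{n} (I_i I_j - I_i \lambda_j - I_j \lambda_i + \lambda_i \lambda_j) \cdot \frac{d\lambda_i}{dx_c} \cdot \frac{d\lambda_j}{dx_c}\right] + 4 \cdot \left[\sum_{i=1}^{n} (I_i^2 - 2 I_i \lambda_i + \lambda_i^2) \cdot \left[\frac{d\lambda_i}{dx_c}\right]^2\right].$$

Therefore, $E_{I^n \sim f_{x_c}}\{J'(I^n; x_c)^2\} = 4 \cdot \sum_{i=1}^{n} \lambda_i(x_c) \cdot (\lambda_i'(x_c))^2$, which implies that:

$$\sigma^2_{LS}(n) = \frac{\sum_{i=1}^{n} \lambda_i(x_c) \cdot (\lambda_i'(x_c))^2}{\left(\sum_{i=1}^{n} (\lambda_i'(x_c))^2\right)^2} = \frac{\sum_{i=1}^{n} (\tilde{F} g_i(x_c) + \tilde{B}_i) \cdot (g_i'(x_c))^2}{\left(\tilde{F} \sum_{i=1}^{n} (g_i'(x_c))^2\right)^2}. \tag{23}$$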

In the next section, we provide a numerical analysis to compare the predictions of equation (23) with the CR bound computed through equation (12). We also analyze if this nominal value is representative of the performance of the LS estimator.

Remark 3. (Idealized low S/N regime) Following the ideal scenario where $\delta \ll 1$, we explore the weak signal case in which $\tilde{F} g_i(x_c) \ll \tilde{B}_i$, considering a constant background across the pixels, i.e., $\tilde{B}_i = \tilde{B}$ for all $i$. Then adopting equation (23) we have that:

$$\sigma^2_{LS}(n) \approx \frac{\tilde{B}}{\tilde{F}^2 \sum_{i=1}^{n} (g_i'(x_c))^2}. \tag{24}$$

On the other hand, from equation (12), we have that $I_{x_c}(n) \approx \frac{\tilde{F}^2}{\tilde{B}} \sum_{i=1}^{n} (g_i'(x_c))^2$. Remarkably, in this context, the LS estimator is optimal in the sense that it approaches the CR bound asymptotically when a weak signal is observed.¹⁵ This result is consistent with the numerical simulations in Mendez et al. (2013, Table 3).¹⁶

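Remark 4. (Idealized high S/N regime) For the high S/N regime, assuming again that $\delta \ll 1$, we consider the case where $\tilde{F} g_i(x_c) \gg \tilde{B}_i$ for all $i$. In this case:

$$\sigma^2_{LS}(n) \approx \left(\tilde{F}\, \frac{\left(\sum_{i=1}^{n} g_i'(x_c)^2\right)^2}{\sum_{i=1}^{n} g_i(x_c)\, g_i'(x_c)^2}\right)^{-1} \quad \text{and} \quad \sigma^2_{CR} \approx \left(\tilde{F} \sum_{i=1}^{n} \frac{(g_i'(x_c))^2}{g_i(x_c)}\right)^{-1}. \tag{25}$$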

Therefore, in this strong signal scenario, there is no match between the variance of the LS estimator and the CR bound, and consequently, we have that $\sigma^2_{LS}(n) > \sigma^2_{CR}$. To provide more insight on the nature of this performance gap, in the next proposition we offer a closed-form expression for this mismatch in the high-resolution scenario where the source is oversampled, and the size of the pixel is a small fraction of the width parameter $\sigma$ of the PSF in equation (1).

Proposition 3. Assuming the idealized high S/N regime, if we have a Gaussian-like PSF and $\Delta x/\sigma \ll 1$, then:

$$\frac{\sigma^2_{LS}(n)}{\sigma^2_{CR}} \approx \frac{8}{3\sqrt{3}} > 1. \tag{26}$$

(The proof is presented in Appendix C).

Equation (26) shows that there is a very significant performance gap between the CR bound and the MSE of the LS estimator in the high S/N regime. This result should motivate the

15 Recall that, according to § 3.1, we have demonstrated that the CR cannot be exactly reached in astrometry.

16 We note that this asymptotic result can be considered the astrometric counterpart of what has been shown in photometry by Perryman et al. (1989), where the LS estimator approaches the CR bound in the low S/N regime.
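The two limiting regimes of Remarks 3 and 4 can be checked numerically from equations (23) and (12). The sketch below is a hedged illustration in Python: it assumes a Gaussian PSF of unit width, a finely sampled array (Δx/σ = 0.05), a constant background, and uses a zero background as a stand-in for the idealized high S/N limit. All function names and parameter values are assumptions made for this example.

```python
import math

def gaussian_cdf(x, sigma):
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def gaussian_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def pixel_model(x_c, sigma, dx, n):
    """Return g_i(x_c) and g_i'(x_c) for n pixels of width dx centred on zero."""
    g, dg = [], []
    for k in range(n):
        center = (k - (n - 1) / 2.0) * dx
        lo, hi = center - dx / 2.0, center + dx / 2.0
        # PSF integrated over the pixel, and its derivative with respect to x_c
        g.append(gaussian_cdf(hi - x_c, sigma) - gaussian_cdf(lo - x_c, sigma))
        dg.append(gaussian_pdf(lo - x_c, sigma) - gaussian_pdf(hi - x_c, sigma))
    return g, dg

def sigma2_ls(flux, background, g, dg):
    """Nominal LS variance, eq. (23)."""
    num = sum((flux * gi + background) * di ** 2 for gi, di in zip(g, dg))
    den = (flux * sum(di ** 2 for di in dg)) ** 2
    return num / den

def sigma2_cr(flux, background, g, dg):
    """CR bound: inverse of the Fisher information of the Poisson model."""
    fisher = sum((flux * di) ** 2 / (flux * gi + background) for gi, di in zip(g, dg))
    return 1.0 / fisher

g, dg = pixel_model(x_c=0.0, sigma=1.0, dx=0.05, n=241)
ratio_high = sigma2_ls(1e6, 0.0, g, dg) / sigma2_cr(1e6, 0.0, g, dg)   # strong signal
ratio_low = sigma2_ls(1.0, 1e6, g, dg) / sigma2_cr(1.0, 1e6, g, dg)    # weak signal
```

Under these assumptions `ratio_high` should sit close to the value 8/(3√3) of equation (26), while `ratio_low` should be close to one, in line with Remark 3.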


Numerical Analysis:

as well as in the calculation of the background level per pixel $\tilde{B}_i$, according to equation (27). Therefore, a change in $\Delta x$ is not only a design feature of the detector device, but it implies also a change in the distribution of the background underneath the PSF. The impact of this covariant device-and-atmosphere change in the CR bound is explained in detail in Mendez et al. (2013, § 4; see also their Fig. 2).

In our simulations, we also consider different signal strengths $\tilde{F} \in \{1080, 3224, 20004, 60160\}$,¹⁸ measured in photo-e⁻, corresponding to S/N ∈ ∼{12, 33, 120, 230}, respectively.¹⁹ Note that increasing $\tilde{F}$ implies increasing the S/N of the problem, which can be approximately measured by the ratio $\tilde{F}/\tilde{B} \equiv F/B$.²⁰ On a given detector plus telescope setting, these different S/N scenarios can be obtained by changing appropriately the exposure time (open shutter) that generates the image.

4.1. Analyzing the Bounded Condition Over $W(I^n; \alpha)$

To validate how realistic the bounded assumption over $W(I^n; \alpha)$ is in our problem, we first evaluate the variance of $W(I^n; x_c)$ from equation (22). This is presented in Figure 1 for different S/N regimes and pixel resolutions in the array. Overall, the magnitudes are very small considering the admissible range (0,1) for $W(I^n; x_c)$ stipulated in Theorem 1. Also, given that $W(I^n; x_c)$ has zero mean, the bounded condition will happen with high probability. Complementing this, Figure 2 presents the critical $\delta$ across different pixel resolutions and S/N regimes.²¹ For this, we fix a small value of $\epsilon$ ($= 10^{-3}$ in this case), and calculate $\delta$ such that $W(I^n; x_c) \in (-\delta, \delta)$ with probability $1 - \epsilon$. From the curves obtained, we can say that the bounded assumption is holding (values of $\delta$ in (0,1)) for a wide range of representative experimental conditions and, consequently, we can use Theorem 1 to provide a range on the performance of the LS estimator. Note that the idealized condition of $\delta \approx 0$ is realized only for the very high S/N regime (strong signals).

4.2. Performance Analysis of the LS Estimator

We adopt equation (18), which provides an admissible range for the MSE performance of the LS estimator. For that we use the critical $\delta$ obtained in Figure 2. These curves for the different S/N regimes and pixel resolutions are shown in Figure 3. Following the trend reported in Figure 2, the nominal value $\sigma^2_{LS}$ is a precise indicator for the LS estimator performance for strong signals (matching the idealized conditions stated in Remark 4), while on the other hand, Theorem 1 does not indicate whether $\sigma^2_{LS}$ is accurate or not for low S/N, as we deviate from the idealized case elaborated in Remark 3. Nevertheless, we will see, based on some complementary empirical results reported in what follows, that even for low S/N, the nominal $\sigma^2_{LS}$ predicts the performance of the LS estimator quite well.

Assuming for a moment the idealized case in which $\delta \ll 1$, we can reduce the performance analysis to measuring the gap between the nominal value predicted by Theorem 1 (eq. [19]), and the CR bound in Proposition 1. Figure 4 shows the relative difference given by $e\% = \frac{\sigma^2_{LS} - \sigma^2_{CR}}{\sigma^2_{CR}} \cdot 100$. From the figure we can clearly see that, in the low S/N regime, the relative performance difference tends to zero and, consequently, the LS estimator approaches the CR bound, and it is therefore an efficient estimator. This matches what has been stated in Remark 3. On the other hand, for high S/N, we observe a performance gap that is non-negligible (up to ≈27% relative difference above the CR for $\tilde{F} = 60160$ e⁻, and ≈15% above the CR for $\tilde{F} = 20004$ e⁻ for $\Delta x = 0.2''$). This is consistent with what has been argued in Remark 4. Note that in this regime, the idealized scenario in which $\delta \ll 1$ is valid (see Fig. 2) and, thus, $E_{I^n \sim f_{x_c}}\{(\tau_{LS}(I^n) - x_c)^2\} \approx \sigma^2_{LS}$, which is not strictly the case for the low S/N regime (although see Fig. 6 and the discussion that follows).

FIG. 4. Relative performance differences between $\sigma^2_{LS}$ in Theorem 1 (eq. [19]) and the CR bound $\sigma^2_{CR}$ in Proposition 1 (eq. [12]). Results are reported for different S/N and pixel sizes. A significant performance gap between the LS technique and the CR bound is found for FWHM/Δx < 1 (good sampling of the PSF) at high S/N, indicating that, in this regime, the LS method is suboptimal, in agreement with Proposition 3 (see also eq. [26]). This gap becomes monotonically smaller as the S/N decreases.

18 These are the same values explored in Mendez et al. (2013, Table 3).

19 For a given $\tilde{F}$ and $f_s$ there is a weak dependency of S/N on the pixel size $\Delta x$; see eq. (28) in Mendez et al. (2013).

20 We note that while the ratio $\tilde{F}/\tilde{B}$ can be used as a proxy for S/N, in what follows we have used the exact expression to compute this quantity, as given by eq. (28) in Mendez et al. (2013).

21 These values were computed empirically (from frequency counts) using 5000 realizations of the random variable $W(I^n; x_c)$ for the different S/N regimes and pixel resolutions.


$$e\% = 100 \cdot \frac{\sigma^2_{LS} - \sigma^2_{CR}}{\sigma^2_{CR}}$$

A significant performance gap between the LS technique and the CR bound is found for the high S/R regime.

More in Lobos et al., 2015, PASP.

Performance Analysis of the Least-Squares Estimator in Astrometry

RODRIGO A. LOBOS,1 JORGE F. SILVA,1 RENE A. MENDEZ,2 AND MARCOS ORCHARD1

Received 2015 May 08; accepted 2015 August 26; published 2015 October 22

ABSTRACT. We characterize the performance of the widely used least-squares estimator in astrometry in terms of a comparison with the Cramér–Rao lower variance bound. In this inference context the performance of the least-squares estimator does not offer a closed-form expression, but a new result is presented (Theorem 1) where both the bias and the mean-square-error of the least-squares estimator are bounded and approximated analytically, in the latter case in terms of a nominal value and an interval around it. From the predicted nominal value, we analyze how efficient the least-squares estimator is in comparison with the minimum variance Cramér–Rao bound. Based on our results, we show that, for the high signal-to-noise ratio regime, the performance of the least-squares estimator is significantly poorer than the Cramér–Rao bound, and we characterize this gap analytically. On the positive side, we show that for the challenging low signal-to-noise regime (attributed to either a weak astronomical signal or a noise-dominated condition) the least-squares estimator is near optimal, as its performance asymptotically approaches the Cramér–Rao bound. However, we also demonstrate that, in general, there is no unbiased estimator for the astrometric position that can precisely reach the Cramér–Rao bound. We validate our theoretical analysis through simulated digital-detector observations under typical observing conditions. We show that the nominal value for the mean-square-error of the least-squares estimator (obtained from our theorem) can be used as a benchmark indicator of the expected statistical performance of the least-squares method under a wide range of conditions. Our results are valid for an idealized linear (one-dimensional) array detector where intrapixel response changes are neglected, and where flat-fielding is achieved with very high accuracy.

1. MOTIVATION

Astrometry, the branch of observational astronomy that deals with the precise and accurate estimation of angular positions of light-emitting (usually point-like) sources projected against the celestial sphere, is the oldest technique employed in the study of the heavens (Høg 2009, 2011; Reffert 2009). Repeated measurements of positions, spread over time, allow a determination of the motions and distances of these sources, with astrophysical implications on dynamical studies of stellar systems and the Milky Way as a whole. With the advent of solid-state detectors and all-digital techniques applied to radio-interferometry and specialized ground- and space-based missions, astrometry has been revolutionized in recent years, as we have entered a high-precision era in which this technique has started to play an increasingly important role in all areas of astronomy, astrophysics (van Altena 2013), and cosmology (Lattanzi 2012).

Current technology, based on two-dimensional (2D) discrete digital detectors (such as charge-coupled devices (CCDs)), records a (noisy) image (on an array of photo-sensitive pixels) of celestial sources, from which it is possible to estimate both their astrometry and photometry, simultaneously (Howell 2006). The inference problem associated with the determination of these quantities is at the core of the astrometric endeavor described previously.

A number of techniques have been proposed to estimate the location and flux of celestial sources as recorded on digital detectors. In this context, estimators based on the use of a least-squares error function (LS hereafter) have been widely adopted (Stetson 1987; King 1983; Alard & Lupton 1998; Honeycutt 1992; Cameron et al. 2006). The use of this type of decision rule has been traditionally justified through heuristic reasons and because they are conceptually straightforward to formulate based on the observation model of these problems. Indeed, the LS approach was the classical method used when the observations were obtained with analog devices (van Altena & Auer 1975; Auer & van Altena 1978) (which corresponds to a Gaussian noise model for the observations, different from that of modern digital detectors, which is characterized instead by Poisson statistics) and, consequently, the LS method was naturally adopted from the analog to the digital observational setting.³

In contemporary astrometry (Gaia, for instance), stellar positions will be obtained by optimizing a likelihood function (see,

1 Departamento de Ingeniería Eléctrica, Information and Decision Systems Group (IDS), Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Beauchef 850, Santiago, Chile; [email protected], [email protected], [email protected].

2 Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile; [email protected].

3 See footnote 13 in § 3.1.

PUBLICATIONS OF THE ASTRONOMICAL SOCIETY OF THE PACIFIC, 127:1166–1182, 2015 November. © 2015. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A.

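The Maximum Likelihood Estimator

Given a set of observations $I^n$, the maximum likelihood estimator solves:

$$\tau_{ML}(I^n) = \arg\max_{\alpha \in \mathbb{R}} \ln p_{\alpha}(I^n)$$

that reduces to:

$$\tau_{ML}(I^n) = \arg\min_{\alpha \in \mathbb{R}} \sum_{i=1}^{n} -I_i \ln(\lambda_i(\alpha)) + \lambda_i(\alpha).$$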
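As an illustration of the minimization above, the following Python sketch evaluates the Poisson cost on a grid of candidate positions and returns the minimizer. It is a hedged example rather than the estimator used in the cited papers: the Gaussian PSF, the constant background, the known flux, the grid search, and every function name and parameter value are assumptions made for this example. Noise-free counts are used so that the result is deterministic.

```python
import math

def expected_counts(alpha, flux, background, s=1.0, dx=0.5, n=21):
    """lambda_i(alpha) = flux * g_i(alpha) + background for a Gaussian PSF of width s."""
    lam = []
    root2 = s * math.sqrt(2.0)
    for k in range(n):
        lo = (k - n / 2.0) * dx - alpha   # lower pixel edge relative to alpha
        hi = lo + dx                      # upper pixel edge relative to alpha
        g = 0.5 * (math.erf(hi / root2) - math.erf(lo / root2))
        lam.append(flux * g + background)
    return lam

def neg_log_likelihood(alpha, counts, flux, background):
    """Sum over pixels of -I_i ln(lambda_i(alpha)) + lambda_i(alpha)."""
    lam = expected_counts(alpha, flux, background)
    return sum(l - i * math.log(l) for i, l in zip(counts, lam))

def ml_estimate(counts, flux, background, lo=-2.0, hi=2.0, steps=401):
    """Grid-search minimizer of the cost (a simple stand-in for a proper optimizer)."""
    grid = [lo + (hi - lo) * j / (steps - 1) for j in range(steps)]
    return min(grid, key=lambda a: neg_log_likelihood(a, counts, flux, background))

# Noise-free "observations" generated at a hypothetical true position of 0.3.
counts = expected_counts(0.3, flux=1000.0, background=10.0)
estimate = ml_estimate(counts, flux=1000.0, background=10.0)
```

With real data the counts would be Poisson draws around `expected_counts`, and a one-dimensional optimizer would usually replace the grid.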



In other words, the only way in which the performance of the WLS approximates the CR limit is to select the weights as in (29). However, this selection requires knowledge of the true position $x_c$, which is impractical as it contradicts the very essence of the inference task. Another interpretation is that no matter how we choose the weights of the WLS estimator, it is not possible for the WLS to be close to the CR bound for every position $x_c$, which means that the WLS is intrinsically not optimal from the perspective of being close to the CR limit in all possible astrometric scenarios. In particular, this impossibility result is very precise in the high S/N regimes where $\sigma^2_{WLS}(n) \approx \mathrm{Var}_{I^n \sim f_{x_c}}\{\tau_{WLS}(I^n)\}$. This implication is consistent with the analysis presented by Lobos et al. (2015, Fig. 4), where it was shown that the variance of the LS estimator is significantly higher than the CR bound in the high S/N regime.

Remark 1 justifies the study of the ML estimator.

4.2. Bounding the Variance of the ML estimator

The ML estimator, denoted by $\tau_{ML}(I^n)$ in (31), is implicitly defined through a cost function

$$J(\alpha, I^n) = \sum_{i=1}^{n} I_i \ln(\lambda_i(\alpha)) - \lambda_i(\alpha), \tag{30}$$

where $\alpha$ is a general source position parameter. Specifically, given an observation $I^n$ we have that

$$\tau_{ML}(I^n) = \arg\max_{\alpha \in \mathbb{R}} J(\alpha, I^n) = \arg\min_{\alpha \in \mathbb{R}} \sum_{i=1}^{n} -I_i \ln(\lambda_i(\alpha)) + \lambda_i(\alpha). \tag{31}$$

Applying Theorem 1 we obtain the following result:

Theorem 3 Let us consider the ML estimator solution of (31), then we have that

$$\Big|\underbrace{E_{I^n \sim f_{x_c}}\{\tau_{ML}(I^n)\} - x_c}_{\text{bias}}\Big| \le \epsilon_{ML}(n) \tag{32}$$

and

$$\mathrm{Var}_{I^n \sim f_{x_c}}\{\tau_{ML}(I^n)\} \in \left(\sigma^2_{ML}(n) - \beta_{ML}(n),\ \sigma^2_{ML}(n) + \beta_{ML}(n)\right), \tag{33}$$

where

$$\sigma^2_{ML}(n) = \sigma^2_{CR}(n) = \left(\sum_{i=1}^{n} \frac{\left(\tilde{F} \frac{dg_i(x_c)}{dx_c}\right)^2}{\tilde{F} g_i(x_c) + \tilde{B}_i}\right)^{-1}, \tag{34}$$

and $\beta_{ML}(n)$ and $\epsilon_{ML}(n)$ are well-defined analytical expressions of the problem (presented in Appendix C).


[Figure: four panels showing Var [mas²] versus Δx [arcsec] (0 to 0.7), each with the curves $\sigma^2_{CR}(n) + \beta_{ML}(n)$, $\sigma^2_{CR}(n) - \beta_{ML}(n)$, and $\sigma^2_{CR}(n)$. Panels: $\tilde{F}$ = 1080 e⁻ (small S/R), $\tilde{F}$ = 3224 e⁻ (medium S/R), $\tilde{F}$ = 20004 e⁻ (high S/R), $\tilde{F}$ = 60160 e⁻ (very high S/R).]

Integrating Prior Information

Prior Information in Astrometry

Recent work considers the use of “prior information” to improve the accuracy in astrometry (Michalik et al. 2015 and Michalik & Lindegren 2016):

Michalik & Lindegren 2016 propose the use of priors from QSO proper motions to improve astrometry in the Gaia astrometric satellite.

Michalik et al. 2015 use prior information to improve astrometry in simulated Gaia observations for point sources with poor observations (low S/R). The prior comes from reasonable assumptions about the distribution of proper motions and parallaxes.

Michalik, D. & Lindegren, L. 2016, A&A, 586, A26
Michalik, D., Lindegren, L., Hobbs, D., & Butkevich, A. G. 2015b, A&A, 583, A68


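Bayes Setting

The position $X_c$ is a r.v. with a “prior density function” $\psi(\cdot)$:

$$P(X_c \in B) = \int_B \psi(x)\, dx.$$

Then

$$P(I^n = i^n \mid X_c = x_c) = p_{x_c}(i^n) = \underbrace{\prod_{k=1}^{n} p_{\lambda_k(x_c)}(i_k)}_{\equiv L(i^n; x_c)}$$

and we have a joint r.v. $(X_c, I^n)$:

$$P((X_c, I^n) \in B \times A) = \int_B \sum_{i^n \in A} \underbrace{p_x(i^n) \cdot \psi(x)}_{\tilde{L}(x_c, i^n),\ \text{“joint likelihood”}}\, dx.$$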

The Bayes Cramer Rao Lower Bound

A&A proofs: manuscript no. Bayes_CR_astrometry_A&A

estimate $x_c$ (fixed but unknown) from a realization of $I^n \sim p_{x_c}$ in Eq. (5), then the Fisher information in Eq. (8) is given by

$$I_{x_c}(n) = \sum_{k=1}^{n} \frac{\left(\tilde{F} \frac{dg_k(x_c)}{dx_c}\right)^2}{\tilde{F} g_k(x_c) + \tilde{B}_k}, \tag{9}$$

which from Eq. (7) induces a minimum variance bound for the astrometric estimation problem. More precisely

$$\min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} \mathrm{Var}(\tau^n(I^n)) \ge \underbrace{I_{x_c}(n)^{-1}}_{\equiv \sigma^2_{CR}}. \tag{10}$$

The expression $\sigma^2_{CR}$ in Eq. (10) is a shorthand for the 1-D astrometric CR lower bound in this parametric (classical) approach, as opposed to the Bayesian CR bound, which will be introduced in the next section.

3. The Bayesian estimation approach in astrometry

We now consider a Bayesian setting (Moon & Stirling 2000) for the problem of estimating the object location $x_c$ from a set of observations $i^n \in \mathbb{N}^n$. The Bayes scenario considers that the (hidden) position is a random variable $X_c$ (i.e., a random parameter), as opposed to a fixed although unknown parameter considered in the classical setting in Sect. 2.2. The goal is to estimate $X_c$ from a realization of the observation random vector $I^n$. In order for this inference to be nontrivial, $X_c$ and $I^n$ should be statistically dependent. In our case this dependency is modeled by the observation equation in (5). More precisely, we have that the conditional probability of $I^n$ given $X_c$ is given by

$$P(I^n = i^n \mid X_c = x_c) = p_{x_c}(i^n) = \underbrace{\prod_{k=1}^{n} p_{\lambda_k(x_c)}(i_k)}_{\equiv L(i^n; x_c)} \tag{11}$$

where $L(i^n; x_c)$ denotes the likelihood of observing $i^n$ given that the object position is $x_c$, while $p_{\lambda_k(x_c)}(i_k)$ has been defined in Eq. (5).

The other fundamental element of the Bayesian approach is the “prior distribution” of $X_c$, given by a probability density function (pdf) $\psi(\cdot)$, i.e., for all $B \subset \mathbb{R}$,

$$P(X_c \in B) = \int_B \psi(x)\, dx. \tag{12}$$

Consequently, in the Bayes setting we know that, for all $A \subset \mathbb{N}^n$ and $B \subset \mathbb{R}$, the joint distribution of the random vector $(X_c, I^n)$ is given by

$$P((X_c, I^n) \in B \times A) = \int_B \sum_{i^n \in A} p_x(i^n) \cdot \psi(x)\, dx. \tag{13}$$

It is important to highlight the role of the prior distribution of the object position $X_c$ because it is the key mathematical object that allows us to pose the astrometry problem in the context of Bayesian estimation.

For the estimation of the object location, we need to establish the decision function $\tau^n(\cdot): \mathbb{N}^n \to \mathbb{R}$ that minimizes the MSE in inferring $X_c$ from a realization of $I^n$. More precisely, the optimal decision would be the solution of the following problem:

$$\min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} E_{(X_c, I^n)}\left\{(\tau^n(I^n) - X_c)^2\right\}. \tag{14}$$

The expectation value in Eq. (14) is taken with respect to the joint distribution of both variables $(X_c, I^n)$ (see Eq. (13)), and the minimum is taken over all possible mappings (decision rules) from $\mathbb{N}^n$ to $\mathbb{R}$. On the right-hand-side (hereafter RHS) of Eq. (14), $\tau^n(I^n)$ is the estimation of $X_c$ from $I^n$ through the decision rule $\tau^n$, also known as the estimator (Lehmann & Casella 1998). The optimal MSE estimator, which is the solution of Eq. (14), is known as the Bayes rule (or estimator), which for the square error risk function has a known theoretical solution as a function of the posterior density $P(X_c = x_c \mid I^n = i^n)$ (Kay 2010, chap. 8). More details are presented in Sect. 7 (see in particular Eq. (31)).

3.1. Bayes Cramér-Rao lower bound

As was the case in the parametric setting presented in Sect. 2.2, in the Bayes scenario it is also possible and meaningful to establish bounds for the minimum MSE (MMSE hereafter) in Eq. (14). This powerful result is known as the van Trees inequality or the Bayesian CR (BCR) lower bound:

Theorem 2. (Van Trees 2004, Sec. 2.4) For any possible decision rule $\tau^n: \mathbb{N}^n \to \mathbb{R}$, it is true that

$$E_{(X_c, I^n)}\left\{(\tau^n(I^n) - X_c)^2\right\} \ge \left[E_{(I^n, X_c)}\left\{\left(\frac{d \ln \tilde{L}(X_c, I^n)}{dx}\right)^2\right\}\right]^{-1}, \tag{15}$$

where

$$\tilde{L}(x_c, i^n) \equiv p_{x_c}(i^n) \cdot \psi(x_c) = L(i^n; x_c) \cdot \psi(x_c) \tag{16}$$

is shorthand for the joint density of $(X_c, I^n)$ (see Eq. (13)), and where $L(i^n; x_c)$ is the likelihood of observing $i^n$ given that the object position is $x_c$, for all $x_c \in \mathbb{R}$ and $i^n \in \mathbb{N}^n$ (see Eq. (11)).

This result turns out to be the natural extension of the parametric CR lower bound to the Bayes setting (see Sect. 2.2). We note in particular the similarities between Eq. (15) and the expression in Eq. (7). In the Bayes setting this result offers a lower bound for the MSE of any estimator and, consequently, a lower bound for the MMSE, i.e.,

$$\min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} E_{(X_c, I^n)}\left\{(\tau^n(I^n) - X_c)^2\right\} \ge \left[E_{(X_c, I^n)}\left\{\left(\frac{d \ln \tilde{L}(X_c, I^n)}{dx}\right)^2\right\}\right]^{-1}. \tag{17}$$

Proposition 2. If we analyze what we call the Bayes-Fisher information (BFI) term (a function that depends on $F$ and $\psi$) on the RHS of Eq. (17), we can establish that

$$\underbrace{E_{(X_c, I^n)}\left\{\left(\frac{d \ln \tilde{L}(X_c, I^n)}{dx}\right)^2\right\}}_{\text{Bayes Fisher Information} = BFI(F, \psi)} = E_{X_c \sim \psi}\left\{I_{X_c}(n)\right\} + \underbrace{E_{X_c \sim \psi}\left\{\left(\frac{d \ln \psi(X_c)}{dx}\right)^2\right\}}_{\equiv I(\psi)}. \tag{18}$$

(see Appendix A for the derivation of Eq. (18))

From Eq. (18), we conclude that the BFI corresponds to the average Fisher information of the parametric setting (average with respect to $\psi$), plus a non-negative term associated with the prior distribution that we call the “prior information” and denote $I(\psi)$. Finally, from Eq. (15) we have that

$$\min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} E_{(X_c, I^n)}\left\{(\tau^n(I^n) - X_c)^2\right\} \ge \underbrace{\frac{1}{E_{X_c \sim \psi}\left\{I_{X_c}(n)\right\} + I(\psi)}}_{\equiv \sigma^2_{BCR}}, \tag{19}$$


“the Information Term“

Van Trees, H. L. 2004, Detection, estimation, and modulation theory (John Wiley & Sons)

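The Bayes Information

… after some algebra (Echeverria et al. 2016):

$$\underbrace{E_{(X_c, I^n)}\left\{\left(\frac{d \ln \tilde{L}(X_c, I^n)}{dx}\right)^2\right\}}_{\text{Bayes Fisher Information} = BFI(F, \psi)} = E_{X_c \sim \psi}\{I_{X_c}(n)\} + \underbrace{E_{X_c \sim \psi}\left\{\left(\frac{d \ln \psi(X_c)}{dx}\right)^2\right\}}_{\equiv I(\psi)}.$$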

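The Bayes Information

… then:

$$\min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} E_{(X_c, I^n)}\left\{(\tau^n(I^n) - X_c)^2\right\} \ge \underbrace{\frac{1}{E_{X_c \sim \psi}\{I_{X_c}(n)\} + I(\psi)}}_{\equiv \sigma^2_{BCR}},$$

where $E_{X_c \sim \psi}\{I_{X_c}(n)\}$ is the average Fisher information and $I(\psi)$ is the information from the “prior”.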

Gain in Astrometric Precision from the Prior

Definition: The prior information is irrelevant if

$$\frac{I(\psi)}{E_{X_c \sim \psi}\{I_{X_c}(n)\}} \approx 0,$$

otherwise it is relevant!


More meaningful is the following “operational” indicator:

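Definition: The gain in performance attributed to $\psi$ is:

$$Gain(\psi) \equiv \min_{\tau^n_{un}: \mathbb{N}^n \to \mathbb{R}\ \text{and}\ \tau^n_{un}\ \text{is unbiased}} E\left\{(\tau^n_{unbias}(I^n) - x)^2\right\} - \min_{\tau^n: \mathbb{N}^n \to \mathbb{R}} E\left\{(\tau^n(I^n) - X_c)^2\right\},$$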




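… we can show that (Echeverria et al. 2016, Proposition 3)

$$Gain(\psi) \ge \underbrace{E_{X_c \sim \psi}\left\{I_{X_c}(n)^{-1}\right\}}_{\text{Mean CR Bound}} - \underbrace{\frac{1}{E_{X_c \sim \psi}\{I_{X_c}(n)\} + I(\psi)}}_{\text{Bayes CR Bound}}$$

where even for the “worst prior” (from Jensen's inequality):

$$Gain(\psi) \ge E_{X_c \sim \psi}\left\{I_{X_c}(n)^{-1}\right\} - E_{X_c \sim \psi}\{I_{X_c}(n)\}^{-1} \ge 0.$$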


… further details in Echeverria et al., A&A, 2016.

Astronomy & Astrophysics manuscript no. Bayes_CR_astrometry_A&A, © ESO 2016, August 2, 2016

Analysis of the Bayesian Cramér-Rao lower bound in astrometry: Studying the impact of prior information in the location of an object

Alex Echeverria¹, Jorge F. Silva¹, Rene A. Mendez² and Marcos Orchard¹

1 Information and Decision Systems Group, Department of Electrical Engineering, Universidad de Chile, Av. Tupper 2007, Santiago, Chile; e-mail: [email protected], [email protected], [email protected]

2 Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile; e-mail: [email protected]

Received January, 2016

ABSTRACT

Context. The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community.

Aims. We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position.

Methods. We characterize the Bayesian Cramér-Rao (CR) that bounds the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of detector, the source, and the background.

Results. We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator.

Conclusions. The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.

Key words. Astrometry, Bayes estimation, Bayes Cramér-Rao lower bound, performance analysis, minimum mean-square-error estimation

1. Introduction

Astrometry, which relies on the precise determination of the relative location of point sources, is the foundation of classical astronomy and modern astrophysics, and it will remain a cornerstone of the field for the 21st century. Historically, it was the first step in the evolution of astronomy from phenomenology to a science that is rooted in precise measurements and physical theory. Astrometry spans more than two thousand years, from Hipparchus (ca. 130 BC) and Ptolemy (150 AD) to modern digital-based all sky surveys, from the ground and in space. The dramatic improvement in accuracy reflects this historic time scale (Høg (2011), see, e.g., his Figure 4). Nowadays, astronomers take for granted resources such as the ESA Hipparcos mission, which yielded a catalog of more than 100,000 stellar positions to an accuracy of 1 milliarcsecond, and look forward to the results of the ESA Gaia astrometric satellite, which will deliver a catalog of over $10^9$ stars with accuracies smaller than 10-20 microarcseconds for objects brighter than V = 15 and a completeness limit of V = 20.

The determination of the best precision that can be achieved to determine the location of a stellar-like object has been a topic of permanent interest in the astrometric community (van Altena & Auer 1975; Lindegren 1978; Auer & Van Altena 1978; Lee & van Altena 1983; Winick 1986; Jakobsen et al. 1992; Adorf 1996; Bastian 2004; Lindegren 2010). One of the tools used to characterize this precision is the Cramér-Rao (CR) bound, which provides a lower bound for the variance that can be achieved to estimate (with an unbiased estimator) the position of a point source (Mendez et al. (2013, 2014); Lobos et al. (2015)), given the properties of the source and the detector. In astrometry this CR bound offers meaningful closed-form expressions that can be used to analyze the complexity of the inference task in terms of key observational and design parameters, such as position of the object in the array, pixel resolution of the instrument, and background. In particular, Mendez et al. (2013, 2014) have developed closed-form expressions for this bound and have studied its structure and dependency with respect to important observational parameters. Furthermore, the analysis of the CR bound allows us to address the problem of optimal pixel resolution of the array for a given observational setting, and in general to evaluate the complexity of the astrometric task with respect to the signal-to-noise ratio (S/R) and different observational regimes. Complementing these results, Lobos et al. (2015) have studied


Article published by EDP Sciences, to be cited as http://dx.doi.org/10.1051/0004-6361/201628220


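Numerical Analysis

We consider a Gaussian prior

$$\psi(x) = \frac{1}{\sqrt{2\pi}\, \sigma_{priori}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2_{priori}}}, \qquad I(\psi) = E_{X_c \sim \psi}\left\{\left(\frac{d \ln \psi(X_c)}{dx}\right)^2\right\} = \frac{1}{\sigma^2_{priori}},$$

then we have an expression for the astrometric BCR bound

$$\sigma^2_{BCR} = \left[E_{(X_c, I^n)}\left\{\left(\frac{d \ln \tilde{L}(X_c, I^n)}{dx}\right)^2\right\}\right]^{-1} = \left[\frac{\tilde{F}^2}{2\pi\sigma^2 \tilde{B}} \cdot E_{X_c \sim N(\mu, \sigma_{priori})}\left\{\sum_{k=1}^{n} \frac{\left(e^{-\gamma(x_k^- - x_c)} - e^{-\gamma(x_k^+ - x_c)}\right)^2}{\left(1 + \frac{1}{\sqrt{2\pi}\,\sigma} \frac{\tilde{F}}{\tilde{B}} \cdot \int_{x_k^-}^{x_k^+} e^{-\gamma(x - x_c)}\, dx\right)}\right\} + \frac{1}{\sigma^2_{priori}}\right]^{-1}.$$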
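The BCR expression for a Gaussian prior can be evaluated numerically by averaging the Fisher information over the prior and adding 1/σ²_priori. The sketch below is one hedged way to do this in Python, together with the mean CR bound for comparison. It assumes a Gaussian PSF, a constant background, and approximates the prior expectation with a normalized grid over μ ± 5σ_priori; the function names, the grid size, and the parameter values are assumptions for this example, not values from the paper.

```python
import math

def pdf(u, s):
    return math.exp(-0.5 * (u / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def cdf(u, s):
    return 0.5 * (1.0 + math.erf(u / (s * math.sqrt(2.0))))

def fisher_information(x_c, flux, background, s, dx, n):
    """Parametric Fisher information: sum_k (F g_k')^2 / (F g_k + B)."""
    total = 0.0
    for k in range(n):
        lo = (k - n / 2.0) * dx - x_c   # lower pixel edge relative to x_c
        hi = lo + dx                    # upper pixel edge relative to x_c
        g = cdf(hi, s) - cdf(lo, s)
        dg = pdf(lo, s) - pdf(hi, s)
        total += (flux * dg) ** 2 / (flux * g + background)
    return total

def prior_nodes(mu, s_prior, m=201, width=5.0):
    """Grid points and normalized Gaussian weights approximating the prior."""
    xs = [mu + s_prior * width * (2.0 * j / (m - 1) - 1.0) for j in range(m)]
    ws = [pdf(x - mu, s_prior) for x in xs]
    norm = sum(ws)
    return xs, [w / norm for w in ws]

def bcr_and_mcr(flux, background, s=0.4, dx=0.2, n=41, mu=0.0, s_prior=0.5):
    """Return (Bayesian CR bound, mean CR bound) under the Gaussian prior."""
    xs, ws = prior_nodes(mu, s_prior)
    infos = [fisher_information(x, flux, background, s, dx, n) for x in xs]
    mean_info = sum(w * i for w, i in zip(ws, infos))
    bcr = 1.0 / (mean_info + 1.0 / s_prior ** 2)
    # The mean CR bound is undefined (infinite) when the data carry no information.
    mcr = sum(w / i for w, i in zip(ws, infos)) if all(i > 0 for i in infos) else math.inf
    return bcr, mcr

bcr, mcr = bcr_and_mcr(flux=1000.0, background=600.0)
bcr_faint, _ = bcr_and_mcr(flux=0.0, background=600.0)
```

In this sketch `bcr` never exceeds `mcr`, as Jensen's inequality requires, and with zero flux the bound collapses to the prior variance σ²_priori.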

Numerical Analysis


RON = 5 e⁻, G = 2 e⁻/ADU, D = 0

F ∈ {268, 540, 1612} ⇒ S/R ∈ {6, 12, 32}

$\sigma_{priori}$ = 0.5 arcsec.


simple to verify that the prior information becomes irrelevant (Definition 3) relative to the information of the observations $I^n$, and, consequently, the BCR bound in Eq. (27) reduces to:

$$\sigma^2_{BCR} = \left[E_{X_c \sim N(\mu, \sigma_{priori})}\left\{I_{X_c}(n)\right\}\right]^{-1} = \left[\frac{\tilde{F}^2}{2\pi\sigma^2 \tilde{B}} \cdot E_{X_c \sim N(\mu, \sigma_{priori})}\left\{\sum_{k=1}^{n} \frac{\left(e^{-\gamma(x_k^- - x_c)} - e^{-\gamma(x_k^+ - x_c)}\right)^2}{\left(1 + \frac{1}{\sqrt{2\pi}\,\sigma} \frac{\tilde{F}}{\tilde{B}} \cdot \int_{x_k^-}^{x_k^+} e^{-\gamma(x - x_c)}\, dx\right)}\right\}\right]^{-1}. \tag{28}$$

This is the case when the prior statistics of $X_c$ do not have any impact on the performance of the estimation of the object location, because of the high fidelity (informatory content) of the observations.

The second scenario is the background dominated regime, when $\tilde{F} \ll \tilde{B}$ (i.e., the estimation on a very faint object). For this analysis we can fix $\sigma_{priori} > 0$, $\tilde{B}$, $\Delta x$, and the FWHM, and take the limit when $\tilde{F} \to 0$.⁵ In this case the information of the observation is irrelevant relative to the prior information $I(\psi)$ (Definition 4), and, consequently, the BCR bound in Eq. (27) reduces to:

$$\sigma^2_{BCR} = I(\psi)^{-1} = \sigma^2_{priori}. \tag{29}$$

Therefore, when the observations are irrelevant (non-informative), the MMSE of the estimation of $X_c$ (Eq. (19)) reduces to the prior variance of $X_c$. Hence, it is direct to verify that the best MSE estimator (MMSE) in this context is the prior mean, i.e.,

$$E_{X_c \sim N(\mu, \sigma_{priori})}\{X_c\} = \mu,$$

which, as expected, does not depend on the observations $I^n$.

Note that, in general, $I(\psi)^{-1}$ is an upper bound for the expression in Eq. (19), which represents the worst case scenario from the point of view of the quality of the observations.

5.3. Numerical evaluation and analysis

For the site conditions, we consider the scenario of a ground-based station located at a good site with clear atmospheric conditions and the specifications of current science-grade PIDs, where f_s = 2 000 ADU/arcsec, D = 0, RON = 5 e⁻, G = 2 e⁻/ADU and FWHM = 0.5 arcsec or FWHM = 1 arcsec (with these values B = 313 ADU for Δx = 0.2 arcsec using Eq. (23)). In terms of scenarios of analysis, we explore different pixel resolutions for the PID array, Δx ∈ [0.1, 2.0] arcsec, and different signal strengths F ∈ {268, 540, 1 612} ADU (see footnote 6). Note that increasing F implies increasing the signal-to-noise ratio (SNR) of the problem, which can be approximately measured by the ratio F/B. On a given detector plus telescope setting, these different SNR scenarios can be obtained by appropriately changing the exposure time (open shutter) that generates the image (for further details, see Eq. (35)).
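As a rough illustration of this setup, the scenario grid can be laid out as below. The conversion from FWHM to the Gaussian width σ is the standard one for a Gaussian profile; the 0.1 arcsec sampling step over Δx and all names are assumptions made only for this sketch.

```python
import math

def fwhm_to_sigma(fwhm):
    # For a Gaussian profile, FWHM = 2 * sqrt(2 ln 2) * sigma.
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

# Pixel sizes from 0.1 to 2.0 arcsec (the 0.1 arcsec step is an assumption).
PIXEL_SIZES = [round(0.1 * i, 1) for i in range(1, 21)]
FLUXES_ADU = [268, 540, 1612]   # signal strengths F quoted in the text
FWHMS = [0.5, 1.0]              # arcsec

# One scenario per (FWHM, flux, pixel size) combination.
scenarios = [(fwhm, flux, dx)
             for fwhm in FWHMS
             for flux in FLUXES_ADU
             for dx in PIXEL_SIZES]

print(len(scenarios), fwhm_to_sigma(1.0))
```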

[Figure 1, panels (a) FWHM = 1 arcsec and (b) FWHM = 0.5 arcsec: √MSE [arcsec] versus Δx [arcsec]; curves BCR and MCR for SNR = 6, 12, and 32.]

Fig. 1. Relationship between the classical MCR (from Eqs. (B.2) and (24)) and the BCR (from Eq. (27)) lower bounds as a function of pixel size, for three different SNR regimes and two FWHM, for the case of σ_priori = 0.5 arcsec. As can be seen, σ_BCR lies below σ_MCR in all regimes.

Fig. 1 shows the parametric and Bayes CR bounds for three SNR regimes, as a function of pixel size of the PID, and for two FWHM scenarios (0.5 and 1.0 arcsec). We first note, as the theory predicts, that the BCR bound is below the classical CR bound in all cases, and that the gap (the performance gain Gain(ψ) in Eq. (21)) increases with pixel size in all cases. In fact, the difference between the bounds becomes relevant for a pixel size larger than ∼0.8 arcsec in Fig. 1a and larger than ∼0.6 arcsec in Fig. 1b. The way to interpret these results is that, as Δx increases, the astrometric quality of the observation deteriorates and the prior information I(ψ) becomes more and more relevant in the Bayes context, information that is not available in the parametric scenario. This explains the non-decreasing monotonic behaviour of Gain(ψ) as a function of Δx for all the scenarios. If we look at one of the figures, and analyze the gain Gain(ψ) as a function of the SNR regime, we notice that the pixel size Δx at which the prior information I(ψ) becomes relevant, in the sense that Gain(ψ) > τ for some fixed threshold τ, increases with the SNR. In other words, for faint objects the prior information is relevant for a wider range of pixel sizes than in the case of bright objects.

Footnote 5: Alternatively, we can fix σ_priori, F̃, Δx, and FWHM, and take the limit when B̃ → ∞.

Footnote 6: These are the same values explored in Mendez et al. (2013, Table 3).

FWHM = 1 (curve labels: σ_BCR, σ_MCR)

Numerical Analysis

RON = 5 e⁻, G = 2 e⁻/ADU, D = 0

F ∈ {268, 540, 1612} ⇒ S/R ∈ {6, 12, 32}

σ_priori = 0.1 arcsec

[Figure 2, panel (a) FWHM = 1 arcsec and a second panel: √MSE [arcsec] versus Δx [arcsec].]

Fig. 2. Same as Fig. 1 but with σ_priori = 0.1 arcsec.

[Figure 3, panel (a) FWHM = 1 arcsec and a second panel: √MSE [arcsec] versus Δx [arcsec].]

Fig. 3. Same as Fig. 1 but with σ_priori = 0.05 arcsec. In this figure, as well as in Fig. 2, we see that while σ_MCR increases without bound as the quality of the observations deteriorates, σ_BCR is bounded by the prior information in accordance with Eq. (29) (see also comments in the text after that equation).

Figs. 2 and 3 exhibit the trends for the bounds for the same experimental conditions as in Fig. 1, but with a significantly smaller prior variance (σ_priori ∈ {0.1, 0.05} arcsec). Therefore, we increase the prior information I(ψ) to see what happens to the performance gain. In this scenario, the gain Gain(ψ) is significantly non-zero for all pixel resolutions, meaning that even for very small pixel sizes (in the range of 0.1-0.2 arcsec) the Bayes setting offers a boost in performance, which is very significant for faint objects; see for instance the case of FWHM = 1 arcsec and SNR = 6 (Fig. 2a). From Figs. 2 and 3, another interesting observation is that the BCR bound converges to its upper bound limit, provided by the prior information I(ψ)^{-1} = σ²_priori, as Δx increases, which is very clear in Fig. 3a for Δx > 1 in all the SNR regimes. Note that this upper bound limit is absent in the trend of the classical CR limit, as the performance of the best estimator in the parametric setting deteriorates with the (loss of) quality of the observations without a bound.
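The saturation of the Bayes bound at σ_priori can be reproduced qualitatively with a short sweep over pixel size. This is only a sketch under stated assumptions: a Gaussian PSF, a fixed 20 arcsec field, a hypothetical per-pixel background that grows linearly with Δx (the `sky` and `offset` values are invented), and the parametric CR bound evaluated at the prior mean rather than the MCR bound used in the figures.

```python
import math

def fisher_information(xc, flux, background, sigma, dx, field=20.0):
    # Fisher information about xc for pixels of width dx tiling a field centred on 0.
    n = int(round(field / dx))
    left = -0.5 * n * dx
    s2 = sigma * math.sqrt(2.0)
    total = 0.0
    for k in range(n):
        lo, hi = left + k * dx, left + (k + 1) * dx
        g = 0.5 * (math.erf((hi - xc) / s2) - math.erf((lo - xc) / s2))
        dphi = (math.exp(-0.5 * ((lo - xc) / sigma) ** 2)
                - math.exp(-0.5 * ((hi - xc) / sigma) ** 2))
        total += dphi ** 2 / (1.0 + flux / background * g)
    return flux ** 2 / (2.0 * math.pi * sigma ** 2 * background) * total

def bounds(dx, flux, sigma, mu, sigma_prior, sky=1500.0, offset=12.5, steps=121):
    # Hypothetical background model: sky level per arcsec times dx plus a constant.
    background = sky * dx + offset
    num = den = 0.0
    for i in range(steps):
        # Midpoint rule over mu +/- 5 sigma_prior, weighted by the Gaussian prior.
        x = mu + sigma_prior * (-5.0 + 10.0 * (i + 0.5) / steps)
        w = math.exp(-0.5 * ((x - mu) / sigma_prior) ** 2)
        num += w * fisher_information(x, flux, background, sigma, dx)
        den += w
    sigma_bcr = (num / den + sigma_prior ** -2) ** -0.5
    sigma_cr_at_mu = fisher_information(mu, flux, background, sigma, dx) ** -0.5
    return sigma_bcr, sigma_cr_at_mu

SIGMA_PRIOR = 0.1   # arcsec
results = [(dx, *bounds(dx, flux=268.0, sigma=0.42, mu=0.0, sigma_prior=SIGMA_PRIOR))
           for dx in (0.2, 0.5, 1.0, 2.0)]
for dx, s_bcr, s_cr in results:
    print(f"dx={dx:.1f}  sigma_BCR={s_bcr:.4f}  sigma_CR(mu)={s_cr:.4f}")
```

Whatever the pixel size, the first column stays at or below `SIGMA_PRIOR`, which is the bounded behaviour described for Figs. 2 and 3; the parametric column has no such ceiling.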


FWHM = 1 (curve labels: σ_BCR, σ_MCR)

Numerical Analysis

RON = 5 e⁻, G = 2 e⁻/ADU, D = 0

F ∈ {268, 540, 1612} ⇒ S/R ∈ {6, 12, 32}

FWHM = 1




σ_priori = 0.05 arcsec (curve labels: σ_MCR, σ_BCR)

Numerical Analysis: Achievability with the MMSE

The conditional mean estimator (MMSE) achieves σ²_BCR


is available for this problem. By contrast, in the parametric scenario, there is no prescription on how to build an unbiased estimator that reaches the CR lower bound in Eq. (10), unless certain very restrictive conditions are met (see Mendez et al. (2013), especially their Eqs. (5) and (46)). Unfortunately, these conditions are not satisfied in the astrometric case using PID detectors (see Lobos et al. (2015), especially their Sect. 3.1 and Appendix A), so in the parametric case there is no unbiased estimator that can precisely reach the CR bound.

Returning to our problem, the Bayes rule is the well-known posterior mean of X_c given a realization of the observations. More formally, for all i^n ∈ N^n the MMSE estimator is (Weinstein & Weiss 1988):

\[
\tau^{n}_{Bayes}(i^{n})\equiv E_{X_c|I^{n}=i^{n}}\{X_c\}
=\frac{\int_{x\in\mathbb{R}}x\cdot\psi(x)\,p_{x}(i^{n})\,dx}
{\int_{\bar{x}\in\mathbb{R}}\psi(\bar{x})\,p_{\bar{x}}(i^{n})\,d\bar{x}}
\tag{31}
\]
\[
=\int_{x\in\mathbb{R}}x\cdot p_{X_c|I^{n}}(x|i^{n})\,dx.
\tag{32}
\]

Note that ψ(x) · p_x(i^n) is the joint density of the vector (X_c, I^n), the denominator of the RHS of Eq. (31) is the marginal distribution of I^n, which we denote by p_{I^n}(i^n) ≡ ∫_{x̄∈R} ψ(x̄) p_x̄(i^n) dx̄, and, consequently, p_{X_c|I^n}(x|i^n) = ψ(x) · p_x(i^n) / p_{I^n}(i^n) in Eq. (32) denotes the posterior density of X_c, evaluated at x_c, conditioned on I^n = i^n.

Furthermore, the performance of the MMSE estimator τ^n_Bayes(·) has the following analytical expression:

\[
\underbrace{E_{(X_c,I^{n})}\left\{\left(\tau^{n}_{Bayes}(I^{n})-X_c\right)^{2}\right\}}_{MMSE}
=E_{I^{n}}\left\{E_{X_c|I^{n}}\left\{\left(\tau^{n}_{Bayes}(I^{n})-X_c\right)^{2}\right\}\right\}
=E_{I^{n}}\Big\{\underbrace{E_{X_c|I^{n}}\left\{\left(E_{X_c|I^{n}}\{X_c\}-X_c\right)^{2}\right\}}_{Var(X_c|I^{n})}\Big\}
\tag{33}
\]
\[
=\sum_{i^{n}\in\mathbb{N}^{n}}p_{I^{n}}(i^{n})\cdot
\underbrace{\int_{x\in\mathbb{R}}\left(\tau^{n}_{Bayes}(i^{n})-x\right)^{2}\cdot p_{X_c|I^{n}}(x|i^{n})\,dx}_{Var(X_c|I^{n}=i^{n})},
\tag{34}
\]

which can be interpreted as the average variance of X_c given realizations of I^n.

Therefore, revisiting the inequality in expression (19), it is essential to analyze how tight the BCR bound is or, equivalently, how large the difference is between the MMSE in Eq. (34) and the BCR bound. To answer this important question, in the next subsection we conduct some numerical experiments to evaluate how close the BCR bound is to the performance of the optimal estimator given by Eq. (34) under some relevant observational regimes, and in various scenarios of prior information.
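The posterior mean of Eqs. (31)-(32) and the per-realization variance inside Eq. (34) can be approximated on a grid. The sketch below is one possible implementation, not the one used for the paper's figures: it assumes a Gaussian PSF, Poisson counts with mean flux · g_k(x) + background, a Gaussian prior, and a midpoint rule over ±5 σ_priori. Names and numerical values are hypothetical.

```python
import math

def expected_counts(xc, flux, background, sigma, dx, n):
    # Expected count per pixel for n pixels of width dx centred on 0 (Gaussian PSF).
    s2 = sigma * math.sqrt(2.0)
    left = -0.5 * n * dx
    return [flux * 0.5 * (math.erf((left + (k + 1) * dx - xc) / s2)
                          - math.erf((left + k * dx - xc) / s2)) + background
            for k in range(n)]

def log_likelihood(counts, xc, flux, background, sigma, dx):
    # Poisson log-likelihood of the integer counts given the position xc.
    lam = expected_counts(xc, flux, background, sigma, dx, len(counts))
    return sum(i * math.log(l) - l - math.lgamma(i + 1) for i, l in zip(counts, lam))

def posterior_mean_and_variance(counts, flux, background, sigma, dx, mu, sigma_prior,
                                steps=2001, width=5.0):
    # Grid over mu +/- width * sigma_prior; weights are prior times likelihood.
    h = 2.0 * width * sigma_prior / steps
    xs = [mu - width * sigma_prior + (j + 0.5) * h for j in range(steps)]
    logw = [-0.5 * ((x - mu) / sigma_prior) ** 2
            + log_likelihood(counts, x, flux, background, sigma, dx) for x in xs]
    top = max(logw)                       # subtract the maximum to avoid underflow
    w = [math.exp(v - top) for v in logw]
    norm = sum(w)
    mean = sum(wi * x for wi, x in zip(w, xs)) / norm
    var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / norm
    return mean, var

# Noise-free demonstration: counts are the rounded expected values at a known position.
true_xc = 0.1
counts = [round(l) for l in expected_counts(true_xc, 2000.0, 300.0, 0.42, 0.2, 41)]
est, var = posterior_mean_and_variance(counts, 2000.0, 300.0, 0.42, 0.2,
                                       mu=0.0, sigma_prior=0.5)
print(est, math.sqrt(var))
```

With a vanishing flux the likelihood no longer depends on the position, so the same routine returns the prior mean, in line with the background-dominated limit discussed earlier.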

7.1. Numerical results

Figs. 5, 6, and 7 present the MMSE from Eq. (34) side by side with the BCR bound in different observational regimes. From these results we can say that, for all practical purposes, the optimal Bayes rule in Eq. (32) (used to determine the location of a point source) does achieve the BCR lower bound. Consequently, we conclude that for astrometry, the Bayes rule offers a concrete and implementable way to achieve the theoretical gain analyzed and studied in Sects. 4, 5.3 and 6 of this work.
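One way to estimate the MMSE of Eq. (34) numerically is a Monte Carlo average of the posterior variance over simulated observations. The sketch below is an assumption about how such an experiment could be set up, not a description of the authors' procedure: positions are drawn from the Gaussian prior, counts from a Poisson law with a Gaussian PSF, and the posterior is evaluated on a grid. All names and values are hypothetical, and the small number of trials keeps the run short at the cost of a noisy estimate.

```python
import math
import random

def expected_counts(xc, flux, background, sigma, dx, n):
    # Expected count per pixel for n pixels of width dx centred on 0 (Gaussian PSF).
    s2 = sigma * math.sqrt(2.0)
    left = -0.5 * n * dx
    return [flux * 0.5 * (math.erf((left + (k + 1) * dx - xc) / s2)
                          - math.erf((left + k * dx - xc) / s2)) + background
            for k in range(n)]

def poisson(lam, rng):
    # Exact Poisson draw: number of unit-rate exponential arrivals in [0, lam].
    k, t = 0, rng.expovariate(1.0)
    while t <= lam:
        k += 1
        t += rng.expovariate(1.0)
    return k

def posterior_variance(counts, flux, background, sigma, dx, mu, sigma_prior,
                       steps=801, width=5.0):
    # Grid approximation of Var(X_c | I^n = i^n) under a Gaussian prior.
    h = 2.0 * width * sigma_prior / steps
    xs = [mu - width * sigma_prior + (j + 0.5) * h for j in range(steps)]
    logw = []
    for x in xs:
        lam = expected_counts(x, flux, background, sigma, dx, len(counts))
        # The log(i!) term does not depend on x, so it is omitted.
        loglik = sum(i * math.log(l) - l for i, l in zip(counts, lam))
        logw.append(loglik - 0.5 * ((x - mu) / sigma_prior) ** 2)
    top = max(logw)
    w = [math.exp(v - top) for v in logw]
    norm = sum(w)
    mean = sum(wi * x for wi, x in zip(w, xs)) / norm
    return sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / norm

def monte_carlo_mmse(trials, flux, background, sigma, dx, n, mu, sigma_prior, seed=0):
    # Average posterior variance over (position, observation) pairs drawn from the model.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xc = rng.gauss(mu, sigma_prior)
        lam = expected_counts(xc, flux, background, sigma, dx, n)
        counts = [poisson(l, rng) for l in lam]
        total += posterior_variance(counts, flux, background, sigma, dx, mu, sigma_prior)
    return total / trials

mmse = monte_carlo_mmse(trials=5, flux=2000.0, background=300.0, sigma=0.42,
                        dx=0.2, n=41, mu=0.0, sigma_prior=0.5)
print(math.sqrt(mmse))
```

Comparing the square root of this average with the BCR bound for the same parameters gives a comparison in the spirit of Figs. 5 to 7, although many more trials would be needed for a stable value.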

[Figure 5, panels (a) σ_priori = 0.1 arcsec and (b) σ_priori = 0.05 arcsec: √MSE [arcsec] versus Δx [arcsec]; curves BCR, MCR, and MMSE.]

Fig. 5. Comparison between the MMSE from Eq. (34) and the BCR bound from Eq. (19) for two different σ_priori scenarios, considering SNR = 6. In addition, the MCR bound from Eq. (B.2) is plotted to highlight the information gain.





The Bayes CR bound can be attained in Astrometry!!

Bayes CR limits applied on Real Data

Gain in astrometric precision evaluated on real data:

(BCR vs. MCR) applied point-wise for all 226 objects in the USNO-B1 SGP stellar catalog (Echeverria et al. 2016, A&A)

the reported uncertainty of the catalog (in arcsec) is used as prior, under our Gaussian assumption (prior information)

New set of measurements assumed under our realistic ground-based observational conditions and realistic S/R derived from the catalog (data information)

• specification of the EFOSC2 installed on the NTT 3.58 m telescope of the ESO-La Silla observatory (RON of 9.2 e⁻, D = 7 e⁻/pix/hr, G = 1.33 e⁻/ADU, pixel resolution of 0.24 arcsec)

Bayes CR vs. Mean CR on a Real Scenario

A&A proofs: manuscript no. Bayes_CR_astrometry_A&A

[Figure 11, five panels of σ_BCR [arcsec] versus σ_MCR [arcsec]: (a) FWHM = 0.7 arcsec, aperture = 3.5 m; (b) FWHM = 0.7 arcsec, aperture = 1 m; (c) FWHM = 1.2 arcsec, aperture = 3.5 m; (d) FWHM = 1.2 arcsec, aperture = 1 m; (e) FWHM = 2 arcsec, aperture = 1 m.]

Fig. 11. BCR versus MCR bounds for the whole SGP sample analysed in this paper. The dashed line indicates the one-to-one relationship, i.e., no gain of the BCR with respect to the parametric case given by the MCR bound.

measured in terms of a relative gain in astrometric performance, i.e., for each sample of the database we compute 100 · (σ_MCR − σ_BCR)/σ_MCR. This array of figures ratifies the usefulness of a Bayesian strategy to mitigate the inevitable deterioration of astrometric precision as a function of flux (see Eq. (30) in Mendez et al. (2014)). Remarkably, this becomes particularly dramatic when the observations are performed under adverse conditions (panels (d) and (e) in Fig. 12), where one can expect improvements of between 10% and up to 50% in the astrometric precision for the faintest objects when using the Bayesian approach as described here.
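The relative gain quoted above is a one-line computation per catalog object. A minimal sketch follows; the function name and the sample pairs are hypothetical and are not values from the catalog.

```python
def relative_gain_percent(sigma_mcr, sigma_bcr):
    # Relative gain 100 * (sigma_MCR - sigma_BCR) / sigma_MCR, in percent.
    if sigma_mcr <= 0:
        raise ValueError("sigma_mcr must be positive")
    return 100.0 * (sigma_mcr - sigma_bcr) / sigma_mcr

# Hypothetical (sigma_MCR, sigma_BCR) pairs in arcsec, for illustration only.
pairs = [(0.20, 0.10), (0.05, 0.045)]
gains = [relative_gain_percent(m, b) for m, b in pairs]
print(gains)
```

A gain of 0 corresponds to a point on the dashed one-to-one line of Fig. 11.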

9. Conclusions and outlook

In this work we provide a systematic analysis of the best precision that can be achieved to determine the location of a stellar-like object on a CCD-like detector array in a Bayesian setting. This setting changes in a radical way the nature of the inference task at hand: from a parametric context - in which we are estimating a constant (or parameter) from a set of observations - to a setting in which we estimate a random object (i.e., the position is modeled as a random variable) from observations that are statistically dependent on the position. A key new element of the Bayesian setting is the introduction of a prior distribution of the object position: We quantify and analyze in a systematic way how large the gain in astrometric performance is from the use of a prior distribution of the object position, information which is not available (or used) in the classical parametric setting. We tackle this problem from both a theoretical and an experimental point of view.

We derive new closed-form expressions for the Bayesian CR as well as expressions to estimate the gain in astrometric precision. Different observational regimes are evaluated to quantify the gain induced from the prior distribution of the object position. An insightful corollary of this analysis is that the Bayes setting always offers a better performance than the parametric setting, even with the worst-case prior (i.e., that of a uniform distribution).

We evaluate numerically the gain of the Bayes setting with respect to the parametric scenario under realistic experimental conditions: We find that the gain in performance is significant for various observational regimes, which is particularly clear in the case of faint objects, or when the observations are taken in poor conditions (i.e., in the low signal-to-noise regime). In this context we introduce the new concept of the equivalent object brightness. We also submit evidence that the minimum mean-square-error estimator of this problem (the well-known conditional mean) tightly achieves the Bayesian CR lower bound, a remarkable result, demonstrating that all the performance gains presented in the theoretical analysis part of our paper can indeed be achieved with the minimum mean-square-error estimator, which has in principle a practical implementation. We finalize our paper with a simple




Numerical Analysis

… further details in Echeverria et al., A&A, 2016.

Astronomy & Astrophysics manuscript no. Bayes_CR_astrometry_A&A, © ESO 2016, August 2, 2016

Analysis of the Bayesian Cramér-Rao lower bound in astrometry:

Studying the impact of prior information in the location of an object

Alex Echeverria¹, Jorge F. Silva¹, Rene A. Mendez² and Marcos Orchard¹

¹ Information and Decision Systems Group, Department of Electrical Engineering, Universidad de Chile, Av. Tupper 2007, Santiago, Chile; e-mail: [email protected], [email protected], [email protected]

² Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile; e-mail: [email protected]

Received January, 2016

ABSTRACT

Context. The best precision that can be achieved to estimate the location of a stellar-like object is a topic of permanent interest in the astrometric community.

Aims. We analyze bounds for the best position estimation of a stellar-like object on a CCD detector array in a Bayesian setting where the position is unknown, but where we have access to a prior distribution. In contrast to a parametric setting where we estimate a parameter from observations, the Bayesian approach estimates a random object (i.e., the position is a random variable) from observations that are statistically dependent on the position.

Methods. We characterize the Bayesian Cramér-Rao (CR) that bounds the minimum mean square error (MMSE) of the best estimator of the position of a point source on a linear CCD-like detector, as a function of the properties of the detector, the source, and the background.

Results. We quantify and analyze the increase in astrometric performance from the use of a prior distribution of the object position, which is not available in the classical parametric setting. This gain is shown to be significant for various observational regimes, in particular in the case of faint objects or when the observations are taken under poor conditions. Furthermore, we present numerical evidence that the MMSE estimator of this problem tightly achieves the Bayesian CR bound. This is a remarkable result, demonstrating that all the performance gains presented in our analysis can be achieved with the MMSE estimator.

Conclusions. The Bayesian CR bound can be used as a benchmark indicator of the expected maximum positional precision of a set of astrometric measurements in which prior information can be incorporated. This bound can be achieved through the conditional mean estimator, in contrast to the parametric case where no unbiased estimator precisely reaches the CR bound.

Key words. Astrometry, Bayes estimation, Bayes Cramér-Rao lower bound, performance analysis, minimum mean-square-error estimation

1. Introduction

Astrometry, which relies on the precise determination of the relative location of point sources, is the foundation of classical astronomy and modern astrophysics, and it will remain a cornerstone of the field for the 21st century. Historically, it was the first step in the evolution of astronomy from phenomenology to a science that is rooted in precise measurements and physical theory. Astrometry spans more than two thousand years, from Hipparchus (ca. 130 BC) and Ptolemy (150 AD) to modern digital-based all sky surveys, from the ground and in space. The dramatic improvement in accuracy reflects this historic time scale (Høg (2011), see, e.g., his Figure 4). Nowadays, astronomers take for granted resources such as the ESA Hipparcos mission, which yielded a catalog of more than 100,000 stellar positions to an accuracy of 1 milliarcsecond, and look forward to the results of the ESA Gaia astrometric satellite, which will deliver a catalog of over 10^9 stars with accuracies smaller than 10-20 microarcseconds for objects brighter than V = 15 and a completeness limit of V = 20.

The determination of the best precision that can be achieved to determine the location of a stellar-like object has been a topic of permanent interest in the astrometric community (van Altena & Auer 1975; Lindegren 1978; Auer & Van Altena 1978; Lee & van Altena 1983; Winick 1986; Jakobsen et al. 1992; Adorf 1996; Bastian 2004; Lindegren 2010). One of the tools used to characterize this precision is the Cramér-Rao (CR) bound, which provides a lower bound for the variance that can be achieved to estimate (with an unbiased estimator) the position of a point source (Mendez et al. (2013, 2014); Lobos et al. (2015)), given the properties of the source and the detector. In astrometry this CR bound offers meaningful closed-form expressions that can be used to analyze the complexity of the inference task in terms of key observational and design parameters, such as position of the object in the array, pixel resolution of the instrument, and background. In particular, Mendez et al. (2013, 2014) have developed closed-form expressions for this bound and have studied its structure and dependency with respect to important observational parameters. Furthermore, the analysis of the CR bound allows us to address the problem of optimal pixel resolution of the array for a given observational setting, and in general to evaluate the complexity of the astrometric task with respect to the signal-to-noise ratio (S/R) and different observational regimes. Complementing these results, Lobos et al. (2015) have studied


Article published by EDP Sciences, to be cited as http://dx.doi.org/10.1051/0004-6361/201628220

Rene Mendez Marcos Orchard

Rodrigo Lobos Alex Echeverria Sebastian Espinosa

Thanks
