
Fully Bayesian Analysis of Calibration Uncertainty in High Energy Spectral Analysis

JIN XU

Department of Statistics, UCI

February 26, 2013


Background

Methodological Research: Model Building; Principal Component Analysis; Three Inferential Models

Results: Simulation; Quasar Analysis

Future Work: Doubly-intractable Distribution; Other Calibration Uncertainty

New Dataset 1878

PCA for 1000 rmfs


Background

- High-Energy Astrophysics

- Spectral Analysis

- Calibration Products

- Scientific Goals


High-Energy Astrophysics

- Provides insight into the high-energy regions of the Universe.

- The Chandra X-ray Observatory is designed to observe X-rays from high-energy regions of the Universe.

- X-ray detectors typically count a small number of photons in each of a large number of pixels.

- Spectral analysis aims to explore the parameterized relationship between photon counts and energy.


Quasar

The Chandra X-ray image of the quasar PKS 1127-145, a highly luminous source of X-rays and visible light about 10 billion light years from Earth.


An Example of One Dataset

TITLE = EXTENDED EMISSION AROUND A GIGAHERTZ PEAKED RADIO SOURCE

DATE = 2006-12-29T16:10:48


Calibration Uncertainty

- The effective area records sensitivity as a function of energy.

- The energy redistribution matrix can vary with energy and location.

- Point spread functions can vary with energy and location.

[Figure: ACIS-S effective area (cm^2) against energy E [keV], from 0.2 to 10 keV.]


Incorporate Calibration Uncertainty

- Calibration uncertainty has generally been ignored in astronomical analysis.

- No robust, principled method is available.

- Our goal is to incorporate the uncertainty using Bayesian methods.

- In this talk, we focus on uncertainty in the effective area.


Problem Description

- The true effective area curve cannot be observed.

- The lack of a parameterized form for the density of the effective area curve complicates model fitting.

- Simple MCMC is quite expensive because of the complexity of the astronomical model.


Generating Calibration Sample

- Drake et al. (2006) suggest generating a calibration sample of effective area curves to represent the uncertainty.

- The plot shows the coverage of a sample of 1000 effective area curves; the default curve (A0) is the black line.

- Calibration sample: {A1, A2, A3, ..., AL}


Three Main Steps

- Use Principal Component Analysis to parameterize the effective area curve.

- Model building: combine the source model with calibration uncertainty.

- Three inferential models.


A simplified model of telescope

$\mathrm{E}[Y(E_i)] = A(E_i)\, S(E_i), \qquad Y(E_i) \sim \mathrm{Poisson}(\mathrm{E}[Y(E_i)])$

$Y(E_i)$: observed photon count in energy bin $E_i$

$S(E_i)$: true source model, which we set as

$S(E_i) = \exp(-n_H\, \alpha(E_i))\, a\, E_i^{-\Gamma} + b$

$A(E_i)$: effective area curve

$\theta$: source parameters, $\theta = \{n_H, a, \Gamma, b\}$

$\alpha(E_i)$: photo-electric cross-section

$b$: background intensity
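
As a rough illustration of the model above, the following Python sketch evaluates S(E_i) and draws Poisson counts from A(E_i) S(E_i). The energy grid, the flat effective area, the parameter values and the `toy_alpha` cross-section are all assumptions made for this example; none of them come from the talk.

```python
import numpy as np

def source_model(energy, n_h, a, gamma, b, alpha):
    """S(E) = exp(-nH * alpha(E)) * a * E^(-Gamma) + b."""
    return np.exp(-n_h * alpha(energy)) * a * energy ** (-gamma) + b

def simulate_counts(energy, area, theta, alpha, rng):
    """Draw Y(E_i) ~ Poisson(A(E_i) * S(E_i)) for every energy bin."""
    n_h, a, gamma, b = theta
    expected = area * source_model(energy, n_h, a, gamma, b, alpha)
    return rng.poisson(expected), expected

def toy_alpha(energy):
    # Hypothetical stand-in for the photo-electric cross-section (assumption).
    return energy ** -3.0

rng = np.random.default_rng(0)
energy = np.linspace(0.3, 8.0, 100)      # assumed energy grid in keV
area = np.full_like(energy, 500.0)       # assumed flat effective area in cm^2
theta = (0.1, 1.0, 2.0, 0.01)            # assumed (nH, a, Gamma, b)
counts, expected = simulate_counts(energy, area, theta, toy_alpha, rng)
```

A real analysis would replace `toy_alpha` and `area` with tabulated cross-sections and an actual effective area curve.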


Use PCA to represent effective area curve

$A = A_0 + \delta + \sum_{j=1}^{m} e_j r_j v_j$

$A_0$: default effective area,

$\delta$: mean deviation from $A_0$,

$r_j$ and $v_j$: first m principal component eigenvalues and eigenvectors,

$e_j$: independent standard normal deviations.

This captures 95% of the uncertainty with m = 6 to 9 (Lee et al. 2011, ApJ).
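
The representation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the calibration sample is a synthetic array, the function names `fit_pca` and `draw_curve` are made up here, and r_j is taken to be the square root of the j-th covariance eigenvalue so that the variance of e_j r_j v_j matches the sample.

```python
import numpy as np

def fit_pca(curves, a0, var_fraction=0.95):
    """Return delta, r (component scales) and v (component vectors)."""
    mean_curve = curves.mean(axis=0)
    delta = mean_curve - a0
    _, s, vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
    eigvals = s ** 2 / (curves.shape[0] - 1)       # covariance eigenvalues
    frac = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.searchsorted(frac, var_fraction) + 1)  # smallest m reaching the target
    return delta, np.sqrt(eigvals[:m]), vt[:m]

def draw_curve(a0, delta, r, v, rng):
    """Draw A = A0 + delta + sum_j e_j r_j v_j with e_j ~ N(0, 1)."""
    e = rng.standard_normal(r.shape[0])
    return a0 + delta + (e * r) @ v

# Synthetic calibration sample (assumption): 1000 curves on 50 energy bins.
rng = np.random.default_rng(1)
a0 = np.linspace(100.0, 600.0, 50)
basis = rng.standard_normal((3, 50))
sample = a0 + (rng.standard_normal((1000, 3)) * [20.0, 8.0, 3.0]) @ basis
delta, r, v = fit_pca(sample, a0)
new_curve = draw_curve(a0, delta, r, v, rng)
```

Once `delta`, `r` and `v` are stored, a new curve costs only m normal draws, which is what makes the parameterization convenient inside a sampler.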


Three Inferential Models

- Fixed Effective Area Model (Standard Approach)

- Pragmatic Bayesian Model
  - Original Pragmatic Bayesian Scheme (Lee et al. 2011, ApJ)
  - Efficient Pragmatic Bayesian Scheme

- Fully Bayesian Model
  - Gibbs Sampling Scheme
  - Importance Sampling Scheme


Model One: Fixed Effective Area (Standard Approach)

- Model: $p(\theta \mid Y, A_0)$

- We assume $A = A_0$, where $A_0$ is the default effective area curve, which may not be the true one.

- This model does not incorporate calibration uncertainty; it is widely used because of its simplicity.

- The estimates may be biased and the error bars may be underestimated.

- Only one sampling step is involved: $p(\theta \mid Y, A_0) \propto L(Y \mid \theta, A_0)\, \pi(\theta)$

- A mix of Metropolis and Metropolis-Hastings updates is used for sampling.
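
A minimal sketch of this single sampling step is given below. It uses a plain random-walk Metropolis update on a reduced two-parameter version of the model (power law only, no absorption or background) with a flat prior; the reduced model, step size, data and function names are assumptions for illustration and not the sampler used in the talk.

```python
import numpy as np

def log_posterior(theta, counts, energy, area):
    """log[L(Y | theta, A0) * pi(theta)] up to a constant, flat prior assumed."""
    log_a, gamma = theta
    expected = area * np.exp(log_a) * energy ** (-gamma)
    return np.sum(counts * np.log(expected) - expected)

def metropolis(log_post, theta0, step, n_draws, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    theta = np.asarray(theta0, dtype=float)
    current = log_post(theta)
    draws = np.empty((n_draws, theta.size))
    for t in range(n_draws):
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        draws[t] = theta
    return draws

# Simulated data with the effective area held fixed at A0 (all values assumed).
rng = np.random.default_rng(2)
energy = np.linspace(0.3, 8.0, 100)
area0 = np.full_like(energy, 500.0)
counts = rng.poisson(area0 * energy ** -2.0)   # true Gamma = 2, a = 1
draws = metropolis(lambda th: log_posterior(th, counts, energy, area0),
                   [0.0, 2.0], 0.01, 5000, rng)
gamma_mean = draws[1000:, 1].mean()            # discard the first 1000 draws
```

Because the effective area never changes, the posterior spread here reflects only the Poisson noise in the counts.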


Model Two: Pragmatic Bayesian (Lee et al. 2011, ApJ)

- Model: $P_{prag}(\theta, A \mid Y) = p(\theta \mid A, Y)\, \pi(A)$

- Doubly-intractable distribution!

- The main purpose is to reduce the complexity of sampling.

- Step One: sample A from $\pi(A)$

- Step Two: sample $\theta$ from $p(\theta \mid Y, A) \propto L(Y \mid \theta, A)\, \pi(\theta)$

- A mix of Metropolis and Metropolis-Hastings updates is used in Step Two.
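
The two steps can be sketched on a deliberately simple toy model in which Step Two has a closed form: Y_i ~ Poisson(A_i θ) with a Gamma prior on θ, so that p(θ | Y, A) is again a Gamma distribution. The toy model, the 5% log-normal scatter on A and the names below are assumptions chosen to keep the example short; the spectral model in the talk needs MCMC for Step Two.

```python
import numpy as np

def pragmatic_draws(counts, draw_area, n_draws, rng,
                    prior_shape=1.0, prior_rate=1.0):
    """Pragmatic Bayesian draws of theta for the toy model Y_i ~ Poisson(A_i * theta)."""
    thetas = np.empty(n_draws)
    for t in range(n_draws):
        area = draw_area(rng)                  # Step One: A ~ pi(A), data not used
        shape = prior_shape + counts.sum()     # Step Two: theta ~ p(theta | Y, A),
        rate = prior_rate + area.sum()         # a conjugate Gamma posterior
        thetas[t] = rng.gamma(shape, 1.0 / rate)
    return thetas

rng = np.random.default_rng(3)
a0 = np.full(50, 100.0)                        # assumed default effective area
counts = rng.poisson(a0 * 2.0)                 # data simulated with theta = 2

# Assumed calibration prior: A = A0 times a log-normal factor with 5% scatter.
thetas = pragmatic_draws(
    counts, lambda g: a0 * np.exp(0.05 * g.standard_normal()), 2000, rng)

# Fixed effective area for comparison: the same sampler with A = A0 every time.
fixed = pragmatic_draws(counts, lambda g: a0, 2000, rng)
```

Comparing `thetas.std()` with `fixed.std()` shows the wider error bars that result from averaging over the calibration prior.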


Model Two: Efficient Pragmatic Bayesian

- After each draw of $A_i$ (i = 1, ..., n) from $\pi(A)$, we have to find the best Metropolis-Hastings proposal for $p(\theta \mid Y, A_i)$, which takes a long and relatively constant time, say $T_1$. (ML involved.)

- Once the proposal distribution is fixed given $A_i$, each draw of $\theta$ from $p(\theta \mid Y, A_i)$ takes a much shorter time, say $T_2$ ($T_1 > T_2$).

- To obtain the most effective samples of $\theta$, we draw m values of $\theta$ for each $A_i$, denoted $\theta_{ij}$ (j = 1, ..., m).


Example

[Figure: two trace plots of nH against iteration index (0 to 800).]


Model Two: Efficient Pragmatic Bayesian

- The problem can then be reduced to an optimization problem.

- Minimize $\mathrm{Var}\left(\frac{1}{n}\sum_i \frac{1}{m}\sum_j \theta_{ij}\right)$ subject to $T = nT_1 + nmT_2$.

- T is the total time; when m = 1, the scheme reduces to the original Pragmatic Bayesian scheme of Lee et al. (2011, ApJ).

- There is a simple analytical solution:

$n = \frac{\sqrt{B}\, T}{\sqrt{B}\, T_1 + \sqrt{W T_2 T_1}}; \qquad m = \frac{\sqrt{W T_1}}{\sqrt{B T_2}}$

- Here $B = \sigma^2_{\theta} - \sigma^2_{\theta \mid A}$ and $W = \sigma^2_{\theta \mid A}$, and we assume the $\theta$'s given A are independent of each other.

- Note that m does not depend on T.
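
The analytical solution can be checked numerically with a short Python sketch. The function names are made up for this example; the input values are the ones reported for dataset 3104 elsewhere in the talk, and the total time is an assumed budget that corresponds to 300 draws of A with 10 draws of θ each. The variance formula (B + W/m)/n follows from the independence assumption stated above.

```python
import math

def optimal_allocation(B, W, T1, T2, T):
    """Return (n, m) minimizing (B + W/m)/n subject to T = n*T1 + n*m*T2."""
    m = math.sqrt(W * T1 / (B * T2))
    n = T / (T1 + m * T2)          # equivalent to the closed form for n
    return n, m

def variance_of_mean(B, W, n, m):
    """Variance of the overall average when the theta's given A are independent."""
    return (B + W / m) / n

B, W, T1, T2 = 4.01e-5, 2.40e-5, 7.1, 0.045
T = 300 * T1 + 300 * 10 * T2       # assumed time budget in seconds
n, m = optimal_allocation(B, W, T1, T2, T)
best = variance_of_mean(B, W, n, m)
```

With these inputs m comes out a little below 10 and n close to 300; the value of m stays the same for any other budget T.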


Model Two: Efficient Pragmatic Bayesian

- The assumption that the $\theta$'s given A are independent of each other can be met by thinning the iterations within each A by a large factor.

- Alternatively, assume that the $\theta$'s given one A follow an AR(1) process with neighbor correlation $\rho$.

- Then the variance of the overall average is $\frac{1}{n}\left(B + W\, \frac{1+\rho}{m(1-\rho)}\right)$.

- We still get optimization solutions similar to those above; we only need to replace W by $W\, \frac{1+\rho}{1-\rho}$.


Model Two: Efficient Pragmatic Bayesian Sampling

Two ways of Efficient Pragmatic Bayesian sampling of N θ's:

- Pilot run: $(n_1, m_1) \Rightarrow (B, W) \Rightarrow (m) \Rightarrow \left(n = \frac{N - n_1 m_1}{m}\right)$

- Adaptive scheme:

    while (n0 < N) do {
        update B and W;
        calculate m;
        sample A;
        sample m θ's from p(θ|Y, A);
        n0 = n0 + m;
    }

- The second, adaptive scheme has not been verified yet!


Model Two: Efficient Pragmatic Bayesian Sampling

Here are two chains for quasar dataset 3104, one from the original Pragmatic Bayesian sampler and one from the Efficient Pragmatic Bayesian sampler.

[Figure: trace plots of abs.nH over 3000 iterations; panels: Original Pragmatic Bayesian, Efficient Pragmatic Bayesian.]


Model Two: Efficient Pragmatic Bayesian Sampling

QQ plot of these two chains.

[Figure: QQ plot; x-axis: Original Pragmatic Bayesian, y-axis: Efficient Pragmatic Bayesian.]


Model Two: Efficient Pragmatic Bayesian Sampling

Results for dataset 3104:

Before running the Efficient Pragmatic Bayesian sampler, T1, T2, B and W are estimated: T1 = 7.1 sec, T2 = 0.045 sec, B = 4.01e-5, W = 2.40e-5. The optimal allocation is then m = 10, n = 300 if we want to draw 3000 θ's.

                               µ(abs.nH)   σ²(abs.nH)   σ²_µ       T
Pragmatic Bayesian             0.0447      6.36e-05     2.12e-08   5.9 hrs
Efficient Pragmatic Bayesian   0.0443      6.10e-05     1.13e-07   0.6 hrs
Ratio                                                   0.187      10

For the same computing time, EPB almost doubles the effective sample size (0.187 × 10 ≈ 1.9).


Model Three: Fully Bayesian using Gibbs sampling

- Uses the correct Bayesian approach: $P_{full}(\theta, A \mid Y) = p(\theta \mid A, Y)\, p(A \mid Y)$

- This model allows the current data to influence the calibration products.

- Step One: sample A from $p(A \mid Y, \theta) \propto L(Y \mid \theta, A)\, \pi(A)$

- Step Two: sample $\theta$ from $p(\theta \mid Y, A) \propto L(Y \mid \theta, A)\, \pi(\theta)$

- A mix of Metropolis and Metropolis-Hastings updates is used in both steps.

- This is the most difficult of the three approaches to sample.
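
The alternation between the two steps can be sketched on a toy model: Y_i ~ Poisson(s A0_i θ), where the effective area is A = s A0 with a single log-normal scale factor s, and θ has a Gamma prior. Step One uses a random-walk Metropolis update on log s and Step Two is an exact Gamma draw. Everything about this toy (the scalar s, the priors, the step size, the function name) is an assumption for illustration; in this toy the data constrain only the product sθ, unlike the real problem where the shape of the curve matters.

```python
import numpy as np

def fully_bayesian_gibbs(counts, a0, n_draws, rng, tau=0.05, step=0.02,
                         prior_shape=1.0, prior_rate=1.0):
    """Metropolis-within-Gibbs for the toy model Y_i ~ Poisson(s * A0_i * theta)."""
    total_counts, total_area = counts.sum(), a0.sum()

    def log_cond_z(z, theta):
        # log[L(Y | theta, A) * pi(A)] as a function of z = log s, up to a constant.
        return (total_counts * z - np.exp(z) * total_area * theta
                - 0.5 * (z / tau) ** 2)

    z, theta = 0.0, total_counts / total_area
    draws = np.empty((n_draws, 2))
    for t in range(n_draws):
        # Step One: update A (through z) given theta and the data.
        z_new = z + step * rng.standard_normal()
        if np.log(rng.uniform()) < log_cond_z(z_new, theta) - log_cond_z(z, theta):
            z = z_new
        # Step Two: draw theta given A and the data (conjugate Gamma).
        theta = rng.gamma(prior_shape + total_counts,
                          1.0 / (prior_rate + np.exp(z) * total_area))
        draws[t] = np.exp(z), theta
    return draws

rng = np.random.default_rng(4)
a0 = np.full(50, 100.0)                  # assumed default effective area
counts = rng.poisson(a0 * 2.0)           # data simulated with s = 1, theta = 2
draws = fully_bayesian_gibbs(counts, a0, 5000, rng)
```

The difference from the Pragmatic Bayesian sketch of Step One is that the update of A now involves the likelihood, so the data can pull A away from its prior.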


Importance sampling for Fully Bayesian

Fully Bayesian analysis using Gibbs sampling involves many choices of proposal distributions, and these choices strongly influence the performance of the chains. Here we introduce importance sampling for the Fully Bayesian model, which takes advantage of the draws from the Pragmatic Bayesian model. (The Pragmatic Bayesian model has the larger variance.)

[Figure: scatter plots of p1.gamma against abs1.nH; panels: Fixed ARF, Pragmatic Bayesian, Fully Bayesian.]


Importance sampling for Fully Bayesian

Steps:

- Get the draws from the Pragmatic Bayesian method.

- Approximate $P_{prag}(\theta \mid A, Y)$ by a multivariate normal distribution, which we call $P_{new\,prag}(\theta \mid A, Y)$.
  - $P_{prag}(\theta \mid A, Y)$ cannot be calculated because the distribution is doubly intractable.
  - The 18 parameters of A are all independent standard normal.

- Get new draws by sampling $\pi(A)$ and $P_{new\,prag}(\theta \mid A, Y)$.

- Calculate the ratio $r = P_{full}(A, \theta \mid Y) / P_{new\,prag}(A, \theta \mid Y)$.

- Use the ratios to resample for the Fully Bayesian model.
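
The last two steps amount to sampling-importance-resampling. The sketch below shows the generic mechanics on a one-dimensional stand-in: draws from a wide normal proposal are reweighted toward a narrower normal target. The target, the proposal and the function name are assumptions for illustration; in the talk the proposal is π(A) times the multivariate normal fit and the target is the Fully Bayesian posterior.

```python
import numpy as np

def importance_resample(draws, log_target, log_proposal, n_out, rng):
    """Resample draws with probability proportional to target/proposal."""
    log_w = log_target - log_proposal
    w = np.exp(log_w - log_w.max())        # subtract the max for numerical stability
    w /= w.sum()                           # self-normalize: constants cancel
    idx = rng.choice(len(draws), size=n_out, replace=True, p=w)
    ess = 1.0 / np.sum(w ** 2)             # effective sample size of the weights
    return draws[idx], ess

rng = np.random.default_rng(5)
x = 1.5 * rng.standard_normal(20000)       # proposal: N(0, 1.5^2), wider than target
log_proposal = -0.5 * (x / 1.5) ** 2       # unnormalized log densities are enough
log_target = -0.5 * (x - 0.5) ** 2         # target: N(0.5, 1)
resampled, ess = importance_resample(x, log_target, log_proposal, 5000, rng)
```

Because the weights are normalized to sum to one, any constant factor shared by all draws drops out; only factors that change from draw to draw have to be computable.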


Importance sampling for Fully Bayesian

- The great benefit of the new scheme is that everything works automatically, which saves the trouble of choosing "nice" proposal distributions for the Fully Bayesian model using Gibbs sampling.

- The disadvantage is that every time we fit the Fully Bayesian model, we first have to run the Pragmatic Bayesian one. Usually, astronomers would like to fit all three models and compare them.

- The results of the Fully Bayesian model using Gibbs sampling and using importance sampling are usually practically identical.


Importance sampling for Fully Bayesian

Here are two scatter plots for quasar dataset 3105, one from the Fully Bayesian model using Gibbs sampling (black) and one using importance sampling (red).

[Figure: scatter plot of Γ against nH.]


Eight simulated data sets

The first four data sets were all simulated without background contamination using the XSPEC model wabs*powerlaw, the nominal default effective area A0 from the calibration sample of Drake et al. (2006), and a default RMF.

- Simulation 1: $\Gamma = 2$, $N_H = 10^{23}\,\mathrm{cm}^{-2}$, and $10^5$ counts;

- Simulation 2: $\Gamma = 1$, $N_H = 10^{21}\,\mathrm{cm}^{-2}$, and $10^5$ counts;

- Simulation 3: $\Gamma = 2$, $N_H = 10^{23}\,\mathrm{cm}^{-2}$, and $10^4$ counts;

- Simulation 4: $\Gamma = 1$, $N_H = 10^{21}\,\mathrm{cm}^{-2}$, and $10^4$ counts.

The other four data sets (Simulations 5-8) were generated using an extreme instance of an effective area.


Results for Simulation 2

[Figure: posterior density of Γ for SIM 2 ($N_H = 10^{21}$, Γ = 1, N = $10^5$); curves: fix ARF, fully bayesian, pragmatic bayesian.]


Results for Simulation 3

[Figure: posterior density of Γ for SIM 3 ($N_H = 10^{23}$, Γ = 2, N = $10^4$); curves: fix ARF, fully bayesian, pragmatic bayesian.]


Results for Simulation 6

[Figure: posterior density of Γ for SIM 6 ($N_H = 10^{21}$, Γ = 1, N = $10^5$); curves: fix ARF, fully bayesian, pragmatic bayesian.]


Results for Simulation 7

[Figure: posterior density of Γ for SIM 7 ($N_H = 10^{23}$, Γ = 2, N = $10^4$); curves: fix ARF, fully bayesian, pragmatic bayesian.]


Quasar results

- Sixteen quasar data sets were fit with these three models: 377, 836, 866, 1602, 3055, 3056, 3097, 3098, 3100, 3101, 3102, 3103, 3104, 3105, 3106, 3107.

- The most interesting finding for the Fully Bayesian model is the shift in the fitted parameters, in addition to the change in standard errors.

- Comparisons of both the means and the standard errors among the three models are shown below.


mean: fix-prag

The Fixed Effective Area Curve model gives almost the same fitted parameters as the Pragmatic Bayesian model.

[Figure: µ_prag(Γ) against µ_fix(Γ), both from 0.0 to 2.5, one point per quasar data set, labeled by data set ID.]


mean: fix-full

The Fully Bayesian model shifts the fitted parameters, which means that the data themselves influence the choice of effective area curve.

[Figure: µ_full(Γ) against µ_fix(Γ), both from 0.0 to 2.5, one point per quasar data set, labeled by data set ID.]


sd: fix-prag

The Pragmatic Bayesian model has a larger parameter standard deviation than the Fixed Effective Area Curve model, especially for large data sets.

[Figure: σ_prag(Γ) against σ_fix(Γ), log axes from 0.01 to 1.00, one point per quasar data set, labeled by data set ID.]


sd: fix-full

The Fully Bayesian model usually has a larger parameter standard deviation than the Fixed Effective Area Curve model, too.

[Figure: σ_full(Γ) against σ_fix(Γ), log axes from 0.01 to 1.00, one point per quasar data set, labeled by data set ID.]


sd: prag-full

In general, the parameter standard deviation from the Fully Bayesian model lies between those of the Pragmatic Bayesian and Fixed Effective Area Curve models.

[Figure: σ_full(Γ) against σ_prag(Γ), log axes from 0.01 to 1.00, one point per quasar data set, labeled by data set ID.]


more plots

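$\hat{\mu}_{prag}(\Gamma) = \frac{\mu_{prag}(\Gamma) - \mu_{fix}(\Gamma)}{\sigma_{fix}(\Gamma)}$; these lines cover 2 sd.

[Figure: σ_fix(Γ) (log axis, 0.02 to 0.50) against the standardized shift $\hat{\mu}_{prag}(\Gamma)$ (from −4 to 4), one point per quasar data set, labeled by data set ID.]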


more plots

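$\hat{\mu}_{full}(\Gamma) = \frac{\mu_{full}(\Gamma) - \mu_{fix}(\Gamma)}{\sigma_{fix}(\Gamma)}$; these lines cover 2 sd.

[Figure: σ_fix(Γ) (log axis, 0.02 to 0.50) against the standardized shift $\hat{\mu}_{full}(\Gamma)$ (from −4 to 4), one point per quasar data set, labeled by data set ID.]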


Doubly-intractable Distribution

- Murray et al. (2012) define a doubly-intractable distribution this way:

$p(\theta \mid y) = \left( \frac{f(y; \theta)\, \pi(\theta)}{Z(\theta)} \right) / p(y)$

- Here $Z(\theta) = \int f(y; \theta)\, dy$ is called the "unknown constant" and cannot be computed.

- Doubly-intractable distributions are widespread; the best-known method, introduced by Møller et al. (2006), is called the "Auxiliary Variable Method".

- One of the most popular models involving a doubly-intractable distribution is the Ising model.


Doubly-intractable Distribution

- Our Pragmatic Bayesian model involves:

$P_{prag}(\theta, A \mid Y) = p(\theta \mid A, Y)\, \pi(A) = \frac{L(Y \mid \theta, A)\, \pi(\theta)}{p(Y \mid A)}\, \pi(A)$

- Here $p(Y \mid A)$ is the unknown normalizing constant.

- Interestingly, unlike other doubly-intractable distributions, our Pragmatic Bayesian model can easily avoid this constant and still yield posterior draws.

- However, when we use these draws to help sample the Fully Bayesian model by importance sampling, we find that the ratio, $\frac{p(Y \mid A)}{p(Y)}$, cannot avoid the constant!

- That is why we re-estimate the density of the draws with a multivariate normal and use new draws from this multivariate Gaussian for the importance sampling. This avoids the constant again!
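
To see where the constant reappears, the importance ratio can be written out from the definitions of the Pragmatic and Fully Bayesian models given earlier in the talk. The short derivation below only applies Bayes' rule to p(A|Y); the symbol q for the fitted multivariate normal density is notation introduced here for convenience.

```latex
\begin{align*}
r(\theta, A)
  &= \frac{P_{full}(\theta, A \mid Y)}{P_{prag}(\theta, A \mid Y)}
   = \frac{p(\theta \mid A, Y)\, p(A \mid Y)}{p(\theta \mid A, Y)\, \pi(A)} \\
  &= \frac{p(A \mid Y)}{\pi(A)}
   = \frac{p(Y \mid A)\, \pi(A)}{p(Y)\, \pi(A)}
   = \frac{p(Y \mid A)}{p(Y)} .
\end{align*}
% With a proposal \pi(A)\, q(\theta \mid A) whose density q is known:
\begin{align*}
r_{new}(\theta, A)
  = \frac{P_{full}(\theta, A \mid Y)}{\pi(A)\, q(\theta \mid A)}
  \propto \frac{L(Y \mid \theta, A)\, \pi(\theta)}{q(\theta \mid A)} .
\end{align*}
```

In the first ratio, p(Y) is the same for every draw and cancels once the weights are normalized, but p(Y|A) changes with each A and has no closed form. In the second ratio every factor can be evaluated, which is why replacing the Pragmatic Bayesian density by a fitted normal removes the problem.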


Other Calibration Uncertainty

For the effective area curve, Lee et al. (2011) first obtain 1000 sample curves and then use PCA to parameterize the effective area curve.
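A minimal sketch of this idea, assuming the sample curves are stored one per row. The simulated curves, the 99% variance cutoff, and the helper name `sample_curve` are illustrative assumptions and not the exact procedure of Lee et al.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the 1000 effective area curves (one curve per row).
energy = np.linspace(0.3, 8.0, 200)
base = 500.0 * np.exp(-0.5 * ((energy - 1.5) / 1.2) ** 2)
curves = base * (1.0
                 + 0.05 * rng.standard_normal((1000, 1)) * np.sin(energy)
                 + 0.02 * rng.standard_normal((1000, 1)) * np.cos(2 * energy))

mean_curve = curves.mean(axis=0)
centered = curves - mean_curve

# PCA through the SVD: the rows of vt are the principal directions.
_, sing, vt = np.linalg.svd(centered, full_matrices=False)
var = sing ** 2 / (curves.shape[0] - 1)

# Keep the smallest number of components explaining at least 99% of the variance.
n_keep = int(np.searchsorted(np.cumsum(var) / var.sum(), 0.99) + 1)

def sample_curve(rng):
    """Draw a new curve: mean + sum_j e_j * sqrt(var_j) * v_j with e_j ~ N(0, 1)."""
    e = rng.standard_normal(n_keep)
    return mean_curve + (e * np.sqrt(var[:n_keep])) @ vt[:n_keep]

new_curve = sample_curve(rng)
```

Each new curve is then described by only `n_keep` standard normal numbers instead of 200 values.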


Other Calibration Uncertainty

- PCA has clear advantages: vector samples are easy to parameterize, the dimension is greatly reduced, and new samples are easy to draw afterwards.

- However, there is no principled PCA method for parameterizing matrix samples.

- The other calibration products, the RMF and the PSF, are both matrices.

- Our goal is to parameterize these matrix samples, define a probability density for the matrix, and sample new matrices efficiently.


Other Calibration Uncertainty

Some possible approaches:

- A diffusion map maps the data into a coordinate system that efficiently reveals their geometric structure. However, because the transformation is nonlinear, the diffusion map is not invertible.

- Delaigle and Hall (2010) introduced a way to define a probability density for a distribution of random functions. How to extend it to two dimensions is an open question.

- Wavelet techniques provide a good way to extract information from two-dimensional data. In addition, complementary wavelets allow us to recover the original information with minimal loss.


Other Calibration Uncertainty

Here is our potential approach:

- Use wavelets to analyze all the matrix samples;

- Use PCA to summarize the wavelet parameters from step one;

- Construct new wavelet parameters by sampling independent standard normal deviates;

- Use the new wavelet parameters to construct a new matrix sample.
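The four steps can be sketched as below. The one-level Haar transform, the 8×8 simulated matrices, and the choice of three components are assumptions made for illustration. The talk does not fix a wavelet family, and a real RMF would need a deeper transform and possibly thresholding.

```python
import numpy as np

def haar2d(m):
    """One-level orthonormal 2-D Haar transform of a matrix with even sides."""
    a = (m[0::2] + m[1::2]) / np.sqrt(2)        # averages of adjacent rows
    d = (m[0::2] - m[1::2]) / np.sqrt(2)        # differences of adjacent rows
    rows = np.vstack([a, d])
    a2 = (rows[:, 0::2] + rows[:, 1::2]) / np.sqrt(2)
    d2 = (rows[:, 0::2] - rows[:, 1::2]) / np.sqrt(2)
    return np.hstack([a2, d2])

def ihaar2d(w):
    """Inverse of haar2d."""
    half_c = w.shape[1] // 2
    a2, d2 = w[:, :half_c], w[:, half_c:]
    rows = np.empty_like(w)
    rows[:, 0::2] = (a2 + d2) / np.sqrt(2)
    rows[:, 1::2] = (a2 - d2) / np.sqrt(2)
    half_r = w.shape[0] // 2
    a, d = rows[:half_r], rows[half_r:]
    m = np.empty_like(w)
    m[0::2] = (a + d) / np.sqrt(2)
    m[1::2] = (a - d) / np.sqrt(2)
    return m

rng = np.random.default_rng(2)

# Hypothetical stand-in for the matrix samples: 200 small 8x8 matrices.
n, shape = 200, (8, 8)
idx = np.arange(8)
template = np.exp(-0.5 * (np.subtract.outer(idx, idx) / 1.5) ** 2)
samples = np.array([template * (1 + 0.1 * rng.standard_normal()) for _ in range(n)])

# Step 1: wavelet coefficients of every matrix sample, flattened to vectors.
coefs = np.array([haar2d(m).ravel() for m in samples])

# Step 2: PCA of the wavelet coefficients.
mean_coef = coefs.mean(axis=0)
_, sing, vt = np.linalg.svd(coefs - mean_coef, full_matrices=False)
sd = sing / np.sqrt(n - 1)
k = 3  # number of components kept (arbitrary here)

# Step 3: independent standard normal deviates give new wavelet coefficients.
e = rng.standard_normal(k)
new_coef = mean_coef + (e * sd[:k]) @ vt[:k]

# Step 4: invert the wavelet transform to obtain a new matrix sample.
new_matrix = ihaar2d(new_coef.reshape(shape))
```

Nothing in this sketch forces the rows of `new_matrix` to sum to one, so a renormalization step would likely be needed before using it as a response matrix.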


Citations

- Lee, Hyunsook, et al. "Accounting for calibration uncertainties in X-ray analysis: effective areas in spectral fitting." The Astrophysical Journal 731.2 (2011): 126.

- Murray, Iain, Zoubin Ghahramani, and David MacKay. "MCMC for doubly-intractable distributions." arXiv preprint arXiv:1206.6848 (2012).

- Delaigle, Aurore, and Peter Hall. "Defining probability density for a distribution of random functions." The Annals of Statistics 38.2 (2010): 1171-1193.

- Lee, Ann B., and Larry Wasserman. "Spectral connectivity analysis." Journal of the American Statistical Association 105.491 (2010): 1241-1255.


1878

- Model: "xsphabs.abs1*(xsapec.kT1+xsapec.kT2)"

- Set kT2.Abundanc = kT1.Abundanc

- Six source parameters: abs1.nH, kT1.kT, kT1.Abundanc, kT1.norm, kT2.kT, kT2.norm

- Sherpa fit() with the Nelder-Mead method fails to find the MLE

- It gets stuck at the boundaries: abs1.nH=0 and kT1.Abundanc=5

- Multiple modes
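One way to probe the multiple modes is to refit from several starting points and compare the fit statistics. The sketch below assumes a Sherpa build with XSPEC models. The function name, the `pha_path` and `starts` arguments, the Cash statistic, and the explicit thaw of the abundance are assumptions, not details from the talk.

```python
def fit_from_starts(pha_path, starts):
    """Fit the two-temperature model from several starting points.

    `pha_path` and `starts` are hypothetical inputs; `starts` is a list of
    dicts such as {"abs1.nH": 0.05, "kT1.kT": 0.3}.
    """
    # Imported here so the function can be defined without Sherpa installed.
    from sherpa.astro import ui

    ui.clean()
    ui.load_pha(pha_path)
    ui.set_source("xsphabs.abs1*(xsapec.kT1+xsapec.kT2)")
    kT1 = ui.get_model_component("kT1")
    kT2 = ui.get_model_component("kT2")
    ui.link(kT2.Abundanc, kT1.Abundanc)   # tie the two abundances together
    ui.thaw(kT1.Abundanc)                 # assumed frozen by default; check your build
    ui.set_stat("cash")                   # assumption: Poisson likelihood statistic
    ui.set_method("neldermead")

    results = []
    for start in starts:
        for name, value in start.items():
            ui.set_par(name, val=value)
        ui.fit()
        res = ui.get_fit_results()
        results.append((res.statval, dict(zip(res.parnames, res.parvals))))
    # Smallest statistic first, so distinct local optima are easy to compare.
    return sorted(results, key=lambda r: r[0])
```

If different starting points end at clearly different parameter values with similar statistics, that is consistent with the multiple modes noted above.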


Fixed ARF

The results depend strongly on the starting values. Although the chains may appear to converge, they are just stuck in a local mode.

[Figure: plot of abs1.nH and kT1.kT for the fixed-ARF fit]


Fully Bayesian

The same happens with the fully Bayesian model. Once the chain is stuck in a mode, it seems impossible to accept a new ARF.

[Figure: abs1.nH plotted against Index]


Redistribution Matrix File (RMF)

The matrix contains the probability that an incoming photon of energy E will be detected in output detector channel I. The matrix has dimension 1078×1024, but the data are stored in a compressed format. For example, for rmf 0001:

n_grp = UInt64[1078], stores the number of non-zero groups in each row
f_chan = UInt32[1395], stores the starting channel of each group
n_chan = UInt32[1395], stores the number of elements in each group
matrix = Float64[384450], stores all the non-zero values


Redistribution Matrix File (RMF)

- Each row sums to one

- sum(n_grp) = 1395

- sum(n_chan) = 384450

- sum(matrix) = 1078
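The compressed layout and the identities above can be illustrated with a small decompression sketch. The function name `expand_rmf`, the `first_channel` offset, and the toy arrays are assumptions for illustration. A real file should be checked for whether its channels start at 0 or 1.

```python
import numpy as np

def expand_rmf(n_grp, f_chan, n_chan, matrix, n_channels, first_channel=1):
    """Rebuild the dense (energy bins x channels) matrix from grouped storage.

    `first_channel` is the channel number stored for column 0; it is an
    assumption here and depends on the file.
    """
    dense = np.zeros((len(n_grp), n_channels))
    g = 0      # index into f_chan / n_chan
    pos = 0    # index into the flat array of non-zero values
    for row, groups in enumerate(n_grp):
        for _ in range(int(groups)):
            start = int(f_chan[g]) - first_channel
            width = int(n_chan[g])
            dense[row, start:start + width] = matrix[pos:pos + width]
            pos += width
            g += 1
    return dense

# Tiny made-up example: 3 energy bins, 6 channels.
n_grp = np.array([1, 2, 1])
f_chan = np.array([1, 1, 5, 3])
n_chan = np.array([2, 1, 2, 3])
matrix = np.array([0.5, 0.5, 0.2, 0.3, 0.5, 0.25, 0.5, 0.25])
dense = expand_rmf(n_grp, f_chan, n_chan, matrix, n_channels=6)
```

The toy arrays satisfy the same bookkeeping as the real file: `sum(n_grp)` equals the length of `f_chan`, `sum(n_chan)` equals the length of `matrix`, and `sum(matrix)` equals the number of rows because each row sums to one.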


log(Rmf0001+1e-6)


log(Rmf0002+1e-6)


prepare for PCA

- Raw data dimension: (1000, 1078*1024)

- Discard the bottom-right zeros: dimension (1000, 445380)

- I use the Python MDP NIPALS algorithm (Nonlinear Iterative Partial Least Squares)

- (1000, 445380): MemoryError

- (100, 445380): MemoryError

- (1000, 100000): MemoryError

- (1000, 10000): 1 hour, yielding 5 principal components

- Variance: (3.6e-02, 8.8e-03, 1.02e-04, 3.05e-05, 9.74e-06)
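A (1000, 445380) array of 64-bit floats takes about 3.56 GB, so any extra copy is costly. One possible workaround, sketched below, is to run PCA through the 1000×1000 Gram matrix instead of the full feature space. This is a different technique from NIPALS, the function name `pca_wide` is hypothetical, and the random data only stands in for the flattened RMFs.

```python
import numpy as np

def pca_wide(x, k):
    """Top-k PCA of an (n, p) array with n << p via the n x n Gram matrix."""
    n = x.shape[0]
    centered = x - x.mean(axis=0)           # note: this makes one extra (n, p) copy
    gram = centered @ centered.T            # (n, n), small when n = 1000
    evals, evecs = np.linalg.eigh(gram)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]     # indices of the k largest
    evals, evecs = evals[order], evecs[:, order]
    # Map the n-dimensional eigenvectors back to unit-length directions in R^p.
    components = (centered.T @ evecs) / np.sqrt(evals)
    variances = evals / (n - 1)
    return variances, components.T

rng = np.random.default_rng(3)
x = rng.standard_normal((50, 2000)) * np.linspace(3.0, 0.1, 2000)
variances, components = pca_wide(x, 5)

# Cross-check against the singular values of the centered data.
sing = np.linalg.svd(x - x.mean(axis=0), compute_uv=False)
```

Storing the data as 32-bit floats would halve the memory again, at the cost of some precision in the smallest components.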


simulating rmf
