david baker ncar / terrestrial sciences section 23 july 2006

47
The prototype carbon inverse problem: estimation of regional CO 2 sources and sinks from global atmospheric [CO 2 ] measurements David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Upload: mort

Post on 09-Feb-2016

24 views

Category:

Documents


0 download

DESCRIPTION

The prototype carbon inverse problem: estimation of regional CO 2 sources and sinks from global atmospheric [CO 2 ] measurements. David Baker NCAR / Terrestrial Sciences Section 23 July 2006. Description of the source/sink problem CO 2 measurement availability thru time - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

The prototype carbon inverse problem:

estimation of regional CO2 sources and sinks

from global atmospheric [CO2] measurements

David BakerNCAR / Terrestrial Sciences Section

23 July 2006

Page 2: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Outline• Description of the source/sink problem• CO2 measurement availability thru time• Method determined by computational issues• The hierarchy of inverse methods used:

• Batch least-squares• The (full) Kalman filter/smoother• The use of the adjoint: what it can (and can’t) do• The ensemble Kalman filter/smoother• Variational data assimilation

Page 3: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 4: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 5: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 6: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 7: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

TransCom3 Regions (22) and Measurement Sites (78)

Page 8: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Application: CO2 flux estimation

• Discretize problem: solve for monthly fluxes for 22 regions

• Measurement equations: Hx=z– x – regional CO2 fluxes, monthly avg– z – CO2 measurements, monthly avg– H – transport matrix; columns are Green’s

functions of a 1-month pulse of unitized CO2 flux from each region calculated using the transport model

Page 9: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 10: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

HHHHHHHH

HHHHHHHH

HHHHHHHH

HHHH

123

1

1

2

1

3

1

4

1

55

45

35

25

44

34

24

14

33

23

13

03

22

12

02

12

11

01

11

21

−−−

−−

OOOO

MM

MM

zxzx

zxzxzxzxzx

xxx

11

55

44

33

22

11

0

1

2

−−

=

M

0

H

Form of the Batch Measurement Equations

fluxesconcentrations

Transport basis functions

Page 11: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Batch least-squares

Optimal fluxes found by minimizing

wheregiving

)()()()( 11oo

To

TJ xxPxxzHxRzHx −−+−−= −−

111ˆˆ

111ˆˆ

11111

....

)()ˆˆ(

)()(ˆ

−−−

−−−

−−−−−

+=

+==

++=

oT

oTT

ooT

oT

or

ddE

PHRHP

PHRHxxP

xPzRHPHRHx

xx

xx

)();( Tooo

T ddEddE xxPzzR ==

Page 12: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Monthly fluxes

Deseasonalized fluxes

IAV (subtract mean)

IAV w/ errors

Calculation of Interannual Variability (IAV) -- Europe (Region #11)

13 transport models

[PgC/year]

Page 13: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Total flux IAV (land + ocean) [PgC/yr]

Page 14: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

[PgC/yr]OceanFlux IAV

Land flux IAV

Page 15: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

[PgC/yr]

Page 16: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Flux IAV [PgC/yr],11 land regions

Page 17: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Flux IAV [PgC/yr],11 ocean regions

Page 18: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

“Representativeness” or “Discretization” Error

• Incorrect assumptions about space/time patterns across large space/time blocks can cause large biases in inversions

• To fix this, solve at as fine a resolution possible, and transform final estimate to coarser scales post-inversion, if necessary

• Sounds good, but poorly-known correlations must still be used in the fine-scale inversions to constrain the results

• Resolution of robust results ultimately set by measurement density and precision

Page 19: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Computational considerations

• Transport model runs to generate H:– 22 regions x 16 years x 12 months x 36 months =

152 K tracer-months (if using real winds)– 22 x 12 x 36 = 9.5 K tracer-months (climatological

winds)• Matrix inversion computations: O(N3)

– N= 22 regions x 16 years x 12 months = 4.4 K• Matrix storage: O (N*M) --- 66 MB

– M = 78 sites x 16 years x 12 months = 15 K

Page 20: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Present FutureNew measurements augmenting the weekly flasks:• More continuous analyzers• More aircraft flights• More towers (both tall and eddy-flux)• Ships/planes of opportunity• Low-cost in situ analyzers• CO2-sondes, tethered balloons, etc.• Satellite-based column-integrated CO2, CO2 profilesHigher frequency, more sensitive to continental airMost importantly: MORE DATA, better coverage

Will permit more detail to be estimated

Page 21: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Eddy-flux tower sites

Page 22: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Growth of measurement density and likely solution resolution

• ~22 regions, monthly, ~60 sites, 1987-present: N• ~50 regions, weekly, ~100 sites, c. 2000: 10·N• ~200 regions, daily?, ~200 continuous sites, ~2005: 300·N• ~60000 regions (1°x1°), hourly, ~250 meas/hour,

from satellites launched ~2008: 2,000,000·N

• Upshot: batch least squares not feasible for much longer

Page 23: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Three approaches to handling these computational limitations:

• 1) Break span into smaller blocks (Bousquet, et al 1999) -- 1-2 year spin-up time limits this, due to required overlaps

• 2) Solve in the (smaller) measurement space -- i.e., O(M3) (Peylin, et al 2005)

• 3) Use the adjoint to the transport model to efficiently generate H basis functions in time-reverse mode (Kaminski, et al, 1999; Roedenbeck, et al 2005)

• 2) and 3) allow one to efficiently solve for many fluxes when the number of measurements is small(er), but both break down when both the number of measurements and fluxes are large

Page 24: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Computational considerations• Transport model runs to generate H:

– 22 regions x 16 years x 12 months x 36 months = 152 K tracer-months (if using real winds)

– 22 x 12 x 36 = 9.5 K tracer-months (climatological winds)• Matrix inversion computations: O(N3)

– N= 22 regions x 16 years x 12 months = 4.4 K• Matrix storage: O (N*M) --- 66 MB

– M = 78 sites x 16 years x 12 months = 15 K

Three key ideas to get around these limitations:• Use sequential estimators instead of batch inversion• Use indirect (iterative) solvers instead of direct ones• Solve for only an approximate covariance, not the full-rank one

Page 25: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

HHHHHHHH

HHHHHHHH

HHHHHHHH

HHHH

123

1

1

2

1

3

1

4

1

55

45

35

25

44

34

24

14

33

23

13

03

22

12

02

12

11

01

11

21

−−−

−−

OOOO

MM

MM

zxzx

zxzxzxzxzx

xxx

11

55

44

33

22

11

0

1

2

−−

=

M

0

H

Kalman Filter/Smoother

Page 26: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

M

HHHHHHHH

HHHHHHHH

HHHHHHHH

HHHH

123

1

1

2

1

3

1

4

1

55

45

35

25

44

34

24

14

33

23

13

03

22

12

02

12

11

01

11

21

−−−

−−

OOOO

MM

MM

zxzx

zxzxzxzxzx

xxx

11

55

44

33

22

11

0

1

2

−−

=

M

0

H

Kalman Filter/Smoother

Page 27: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Kalman Filter Equations

Measurement update step at time k:

Dynamic propagation step from times k to k+1:

Put multiple months of flux in state vector xk, method becomes effectively a fixed-lag Kalman smoother

][

]ˆ[ˆˆ

1)()(

)()()(

)()()(

−−−

−−+

−−+

+=

−=−+=

kT

kkkT

kkkwhere

kkkkk

kkkkkk

RHPHHPK

PHKPPxHzKxx

kT

kkkk

kkk

QPP

xx

+ΦΦ=

Φ=

+−+

+−+

)()(1

)()(1 ˆˆ

Page 28: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

For retrospective analyses, a 2-sided smoother gives more accurate estimates than a 1-sided filter.

The 4-D Var method is 2-sided, like a smoother.

(Gelb, 1974)

Page 29: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Traditional (full-rank) Kalman filter ensemble Kalman filter

Full KF only a partial fix due to still-large P matrices• Ensemble KF (EnKF): full covariance matrix replaced by

approximation from an ensemble• EnKF can be thought of as a reduced-rank square root

KF with each column of the reduced-rank P1/2 being given by a state deviation (dx) from an ensemble of independently propagated and updated states x

• Matrix operations involving P in the basic KF equations replaced with dot products of state deviations (dxx) summed over the ensemble

• See Peters, et al (2005) for implementation

Page 30: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Ensemble KF vs. Variational DAEnsemble KF still a relatively new method -- details still being

worked out:– Proper method for introducing measurement and dynamical errors into

ensemble– Proper means for handling time/space correlations in measurements– Proper way to apply it to time-dependent CO2 fluxes

• Variational data assimilation is more developed– Pros:

• Wide use in numerical weather prediction• Based on a minimization -- can always be made to converge

– Con*: -- Requires an adjoint model *Since ensemble smoothers need an adjoint, too, for true retrospective analyses (those for which

a fixed-lag smoother won’t work well), this is not necessarily a con in those cases

Page 31: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

4-D Var Data Assimilation Method

Find optimal fluxes u and initial CO2 field xo to minimize

subject to the dynamical constraint

wherex are state variables (CO2 concentrations),h is a vector of measurement operatorsz are the observations,R is the covariance matrix for z,uo is an a priori estimate of the fluxes,Puo is the covariance matrix for uo,xo is an a priori estimate of the initial concentrations,Pxo is the covariance matrix for xo,

Φ is the transition matrix (representing the transport model), andG distributes the current fluxes into the bottom layer of the model

J = (hi(x i) − z j )T R j

−1(hi(x i) − z j )δi, ji= 0

I

+ (ui − uio)T P

u io

−1(ui − uio)

i= 0

I −1

∑ + (x0 − x0o)T P

x 0o

−1(x0 − x0o)

x i+1 = Φ i+1i x i + G iui ≡ di(x i,ui), i = 0, L ,I −1

Page 32: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

4-D Var Data Assimilation Method

Adjoin the dynamical constraints to the cost function using Lagrange multipliers

Setting F/xi = 0 gives an equation for i, the adjoint of xi:

The adjoints to the control variables are given by F/ui and F/xo as:

The gradient vector is given by F/ui and F/xoo, so the optimal u and x0 may

then be found using one’s favorite descent method. I have been using the BFGS method, since it conveniently gives an estimate of the leading terms in the covariance matrix.

F ≡ J + λ i+1T (di(x i,ui) − x i+1)

i= 0

I −1

i = [Φ i+1i ]T λ i+1 + ∂hi

∂x i

T

R j−1(hi(x i) − z j )δ ij i = I −1,K ,1

λ I = ∂hI

∂x I

T

R j−1(hI (x I ) − z j )δIj ≈ 0

∂F∂ui

T

= G iT λ i+1 + P

u io

−1(ui − uio), i = 0,K ,I −1

∂F∂x0

T

= Φ10T

λ1 + Px 0

o−1(x0 − x0

o) + R j−T (h0 − z j )δ0 j

Page 33: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

F = (hi(x i) − z j )T R j

−1(hi(x i) − z j )δiji= 0

I

∑ + (x0 − x0o)T P

x 0o

−1(x0 − x0o)

+ [(ui − uio)T P

u io

−1(ui − uio) + λ i+1

T (Φ i+1i x i + G iui − x i+1)

i= 0

I −1

∑ ]

let Si ≡ (hi(x i) − z j )T R j

−1(hi(x i) − z j )δij

+(ui − uio)T P

uio

−1(ui − uio) + λ i+1

T (Φ i+1i x i + G iui)

F = (Si − λ i+1T

i= 0

I −1

∑ x i+1) + (x0 − x0o)T P

x 0o

−1(x0 − x0o) + (hI (x I ) − z j )

T R j−1(hI (x I ) − z j )δIj

= (Si − λ iT

i=1

I −1

∑ x i) + S0 + (x0 − x0o)T P

x 0o

−1(x0 − x0o) − λ I

T x I + (hI (x I ) − z j )T R j

−1(hI (x I ) − z j )δIj

∂F∂x I

= 0T ⇒ λ I = ∂hI

∂x I

⎡ ⎣ ⎢

⎤ ⎦ ⎥

T

R j−1(hI (x I ) − z j )δIj

∂F∂x i

= ∂Si

∂x i

− λ iT = 0T i =1,K ,I −1

⇒ λ i = Φ i+1i[ ]

Tλ i+1 + ∂hi

∂x i

⎡ ⎣ ⎢

⎤ ⎦ ⎥

T

R j−1(hi(x i) − z j )δ ij ;

Page 34: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

F = (Si − λ iT

i=1

I −1

∑ x i) + S0 + (x0 − x0o)T P

x 0o

−1(x0 − x0o)

− λ IT x I + (hI (x I ) − z j )

T R j−1(hI (x I ) − z j )δIj

∂F∂ui

= ∂Si

∂ui

= (ui − uio)T P

u io

−1 + λ i+1T G i

⇒ ∇u i= P

u io

−1(ui − uio) + G i

T λ i+1, i =1,...,I −1

∂F∂x0

= ∂S0

∂x0

+ (x0 − x0o)T P

x 0o

−1

= (h0(x 0) − z j )T R j

−1 ∂h0

∂x0

δ0 j + λ1TΦ1

0 + (x0 − x0o)T P

x 0o

−1

⇒ ∇ x 0= Φ1

0[ ]T

λ1 + ∂h0

∂x0

⎡ ⎣ ⎢

⎤ ⎦ ⎥

T

R j−1(h0(x 0) − z j )δ0 j + P

x 0o

−1(x0 − x0o)

with Si ≡ (hi(x i) − z j )T R j

−1(hi(x i) − z j )δij

+(ui − uio)T Pu i

o−1(ui − ui

o) + λ i+1T (Φ i+1

i x i + G iui)

Page 35: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

°

°°

°

0

2

1

3

x2

x1

x3

x0

AdjointTransport

ForwardTransport

ForwardTransport

MeasurementSampling

MeasurementSampling

“True”Fluxes

EstimatedFluxes

ModeledConcentrations

“True”Concentrations

ModeledMeasurements

“True”Measurements

AssumedMeasurement

Errors

WeightedMeasurement

Residuals

/(Error)2

AdjointFluxes=

FluxUpdate

4-D Var Iterative Optimization Procedure

Minimum of cost function J

Page 36: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

BFGS Minimization Approach• Variable metric method -- accumulates 2nd order

information (approximate covariance matrix) as minimization proceeds

• Search direction: sk= Hkk

• Hk approximates [2J/u2]-1= Pu

• Precondition with H0=Pu°

• H never stored, but reconstructed from p and q as needed

Hk +1 = Hk + (1+ qTHkqpTq

) ppT

pTq− (pqTHk + HkqpT ) /pTq

where p ≡ uk+1 − uk; q ≡∇ k +1 −∇ k

Page 37: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

“Soft” Dynamical Constraint

• Allows for dynamical errors• Add dynamical mismatches i to J as

• Note that

• Then solve for i from i and add these to the propagated state

J =K + Δ iTDi

−1Δ ii= 0

I −1

∑where Δ i = x i+1 − (Φ i+1

i x i + G iui)

i+1 = Di−1Δ i

Page 38: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Atmospheric Transport Model• Parameterized Chemical Transport Model (PCTM; Kawa, et al,

2005)– Driven by reanalyzed met fields from NASA/Goddard’s GEOS-DAS3

scheme– Lin-Rood finite volume advection scheme– Vertical mixing: diffusion plus a simple cloud convection scheme– Exact adjoint for linear advection case

• Basic resolution 2x 2.5, 25 layers, t30 min, with ability to reduce resolution to– 4x 5, t60 min – 6x 10, t120 min < we’ll use this in the example– 12x 15, t180 min

• Measurements binned at t resolution

Page 39: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Data Assimilation Experiments• Monday: Test performance of 4DVar method in a simulation framework, with dense

data (6°x10°, lowest level of model, every t, 1 ppm (1) error)– Case 1 -- no data noise added, no prior– Case 2 -- w/ data noise added, no prior– Case 3 -- w/ data noise added, w/ prior– Case 4 -- no data noise added, w/ prior

• Tuesday: with Case 3 above, do OSSEs for– Case 5 -- dense data, but OCO column average– Case 6 -- OCO ground track and column average– Case 7 -- extended version of current network– Case 8 -- current in situ network

• Possibilities for projects: examine the importance of– Data coverage and accuracy vs. targeted flux resolution– Prior error pattern and correlation structures– Measurement correlations in time/space– Errors in the setup assumptions (“mistuning”)– Effect of biases in the measurements

Page 40: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

How to run the 4D-Var code• Home directory: /project/projectdirs/m598/dfb/4DVar_Example/scripts/case1/BFGS• Work directories: /scratch/scratchdirs/dfb/case1/work_fwd &

/scratch/scratchdirs/dfb/case1/work_adj• Submit batch job by typing ‘llsubmit runBFGS2_LL’ while in

/project/projectdirs/m598/dfb/4DVar_Example/scripts/case1/BFGS/• This executes the main driver script, found in BFGSdriver4d.F, in same

directory, which controls setting up all the files and running FWD and ADJ inside the minimization loop

• The scripts that execute the FWD and ADJ runs of the model are found in …/4DVar_Example/scripts/case1/, named run.co2.fvdas_bf_fwd(adj)_trupri997_hourly

• Check progress of job by typing ‘llqs’ • Jobs currently set up to do a 1 year-long run (360 days), solving for the fluxes

in 5-day long chunks, at 6.4x10 resolution, with t =2 hours

Page 41: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

How to monitor job while running

• In /scratch/scratchdirs/dfb/case1/work_fwd/costfuncval_history/temp– Column 1 -- measurement part of cost function– Column 2 -- flux prior part of cost function– Column 4 -- total cost function value– Column 10 -- weighted mismatch from true flux– Column 11 -- unweighted mismatch from true flux

• Columns 4, 10, and 11 ought to be decreasing as the run proceeds

• Columns X and Y give the iteration count and 1-D search count

Page 42: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

How to view detailed results

• A results file in netCDF format written to: /scratch/scratchdirs/dfb/case1/work_fwd/estim_truth.nc

• sftp this to davinci.nersc.gov (rename it, so that you don’t overwrite another group’s file)

• Pull up an X-window to davinci and ssh -X davinci.nersc.gov

• On davinci, ‘module load ncview’• Then ‘ncview estim_truth.nc’• Click on a field to look at it• Hint: set ‘Range’ to +/- 2e-8 for most fields

Page 43: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Other code details• The code for the FWD and ADJ model is in

../4DVar_Example/src_fwd_varres and src_adj_varres• Measurement files are located in

/scratch/scratchdirs/dfb/case1/meas• Two files controlling the tightness of the prior and whether or

not noise is added to the measurements are /scratch/scratchdirs/dfb/case1/work_fwd/ferror and /scratch/scratchdirs/dfb/case1/work_fwd/measnoise_on.dat

Page 44: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Monday’s Experiment• 2-hourly measurements in the lowest model level at 6.4x 10, 1

ppm error (1) • Iterate 30 descent steps, 1-year-long run, starts 1/1• 4 cases

– Case 1 -- No measurement noise added, no prior– Case 2 -- Add measurement noise added, no prior– Case 3 -- Add noise, and apply a prior– Case 4 -- No noise, but apply a prior

• Designed to test the method and understand the impact of data errors and the usefulness of the prior

• Case 3 is the most realistic and will be used to do OSSEs for several possible future networks for Tuesday’s problem set

Page 45: David Baker NCAR / Terrestrial Sciences Section 23 July 2006
Page 46: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

Tuesday’s Experiment

• Use Case 3 from above to test more-realistic measurement networks:– Current in situ network– Extended version of current network– OCO satellite– Hourly 6.4x 10 column measurements

• Essentially an “OSSE” (observing system simulation experiment) -- tells you how well your instrument should do in constraining the fluxes. Only gives the random part of the error, not biases

Page 47: David Baker NCAR / Terrestrial Sciences Section 23 July 2006

OCO Groundtrack,Jan 1st

(Boxes at 6x 10)

Across 1 day

5 days 2 days