INVERSE MODELING TECHNIQUES
Daniel J. Jacob
GENERAL APPROACH FOR COMPLEX SYSTEM ANALYSIS
Construct mathematical “forward” model describing system as function of limited number of state variables (state vector x)
Assemble a priori knowledge: $x_a \pm S_a$
Use model to relate state variables to observables y: $y = f(x) \pm S_y$
Use observations to improve knowledge of x: $\min_x J(x) = F(y_{obs} - f(x),\ x - x_a,\ S_y,\ S_a)$
Solution: a posteriori $\hat{x} \pm \hat{\sigma}$
Feedback from the solution: improve observation system, improve model, improve a priori
EXAMPLE: INVERSE ESTIMATE OF SURFACE CO2 FLUXES (x) FROM ATMOSPHERIC MIXING RATIO MEASUREMENTS
Bottom-up constraint $x_a$ (a priori knowledge of fluxes): fuel consumption data; ecosystem model and data; ocean model and data
Chemical Transport Model (CTM): continuity equation with surface fluxes (state vector x) as boundary conditions
$\partial y / \partial t = -\nabla \cdot (y\,\mathbf{U})$
Observation vector y: measurements from aircraft, towers, satellites…
$y = Kx$, where the Jacobian matrix K describes the CTM; this is the top-down constraint on x
Optimal a posteriori estimate: fit x to top-down and bottom-up constraints
BAYES’ THEOREM: FOUNDATION FOR INVERSE MODELS
$P(x \mid y) = \dfrac{P(y \mid x)\,P(x)}{P(y)}$
P(x) = probability distribution function (pdf) of x (a priori pdf)
P(y|x) = pdf of y given x (observation pdf)
P(y) = normalizing factor (unimportant)
P(x|y) = a posteriori pdf
Maximum a posteriori (MAP) is the solution to $\nabla_x P(x \mid y) = 0$
SIMPLE LINEAR INVERSE PROBLEM FOR A SCALAR
Consider a single measurement used to quantify a single source
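A priori bottom-up estimate: (fuel burned) × (emission factor), giving $x_a \pm \sigma_a$
Monitoring site measures concentration y
Forward model gives y = kx
“Observational error” $\sigma_y$ includes:
• instrument
• fwd model
$y = kx \pm \sigma_y$
Bayes’ theorem (Gaussian errors):
$\ln P(x \mid y) = \ln P(y \mid x) + \ln P(x) + \text{const} = -\dfrac{1}{2}\left[\dfrac{(y - kx)^2}{\sigma_y^2} + \dfrac{(x - x_a)^2}{\sigma_a^2}\right] + \text{const}$
Max of P(x|y) is given by minimum of cost function
$J(x) = \dfrac{(x - x_a)^2}{\sigma_a^2} + \dfrac{(y - kx)^2}{\sigma_y^2}$
Solution: $\hat{x} = x_a + g\,(y - k x_a)$, where g is a gain factor
$g = \dfrac{\partial \hat{x}}{\partial y} = \dfrac{k \sigma_a^2}{k^2 \sigma_a^2 + \sigma_y^2}$
A posteriori error: $\hat{\sigma} = \left(\sigma_a^{-2} + (\sigma_y / k)^{-2}\right)^{-1/2}$
Let x be the true value: $y = kx + \varepsilon$ gives $\hat{x} = a x + (1 - a)\,x_a + g \varepsilon$
where a is an averaging kernel: $a = \dfrac{\partial \hat{x}}{\partial x} = gk$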
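The scalar solution above can be checked numerically. The following is a minimal sketch in plain Python; the function names and all input numbers are hypothetical, chosen only to illustrate the formulas for g, the a posteriori error, and the averaging kernel a.

```python
def cost(x, y, k, x_a, sigma_a, sigma_y):
    """Cost function J(x) for the scalar problem."""
    return (x - x_a) ** 2 / sigma_a ** 2 + (y - k * x) ** 2 / sigma_y ** 2


def scalar_inverse(y, k, x_a, sigma_a, sigma_y):
    """Return (x_hat, sigma_hat, g, a) for the scalar linear inverse problem."""
    g = k * sigma_a ** 2 / (k ** 2 * sigma_a ** 2 + sigma_y ** 2)  # gain factor
    x_hat = x_a + g * (y - k * x_a)                                # a posteriori estimate
    sigma_hat = (sigma_a ** -2 + (sigma_y / k) ** -2) ** -0.5      # a posteriori error
    a = g * k                                                      # averaging kernel
    return x_hat, sigma_hat, g, a


# Hypothetical inputs: prior 10 +/- 5, one observation y = 24 with error 2, k = 2.
y, k, x_a, sigma_a, sigma_y = 24.0, 2.0, 10.0, 5.0, 2.0
x_hat, sigma_hat, g, a = scalar_inverse(y, k, x_a, sigma_a, sigma_y)
print(f"x_hat = {x_hat:.3f}, sigma_hat = {sigma_hat:.3f}, g = {g:.3f}, a = {a:.3f}")
```

With these assumed numbers the averaging kernel is a = 100/104, about 0.96, so the estimate is drawn almost entirely from the observation rather than from the prior.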
GENERALIZATION: CONSTRAINING n SOURCES WITH m OBSERVATIONS
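Forward model:
$y_j = \sum_{i=1}^{n} k_{ij}\, x_i$
A cost function defined as
$J(x_1, \ldots, x_n) = \sum_{i=1}^{n} \dfrac{(x_i - x_{a,i})^2}{\sigma_{a,i}^2} + \sum_{j=1}^{m} \dfrac{\left(y_j - \sum_{i=1}^{n} k_{ij}\, x_i\right)^2}{\sigma_{y,j}^2}$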
is generally not adequate because it does not account for correlation between sources or between observations. Need to go to vector-matrix formalism:
$x = (x_1, \ldots, x_n)^T$, $y = (y_1, \ldots, y_m)^T$, $y = Kx + \varepsilon$
with Jacobian matrix K (elements $k_{ij}$) and error covariance matrices
$S_a = E\{(x - x_a)(x - x_a)^T\}$, $S_y = E\{(y - Kx)(y - Kx)^T\}$
leading to formulation of cost function:
$J(x) = (x - x_a)^T S_a^{-1} (x - x_a) + (y - Kx)^T S_y^{-1} (y - Kx)$
VECTOR-MATRIX REPRESENTATION OF LINEAR INVERSE PROBLEM
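Optimal a posteriori solution (retrieval):
  Scalar problem: $\hat{x} = x_a + g\,(y - k x_a)$
  Vector-matrix problem: $\hat{x} = x_a + G\,(y - K x_a)$
Gain factor:
  Scalar problem: $g = \dfrac{k \sigma_a^2}{k^2 \sigma_a^2 + \sigma_y^2}$
  Vector-matrix problem: $G = S_a K^T (K S_a K^T + S_y)^{-1}$
A posteriori error:
  Scalar problem: $\hat{\sigma} = \left(\sigma_a^{-2} + (\sigma_y / k)^{-2}\right)^{-1/2}$
  Vector-matrix problem: $\hat{S} = (K^T S_y^{-1} K + S_a^{-1})^{-1}$
Averaging kernel:
  Scalar problem: $\hat{x} = a x + (1 - a)\,x_a + g \varepsilon$, with $a = gk$
  Vector-matrix problem: $\hat{x} = A x + (I_n - A)\,x_a + G \varepsilon$, with $A = GK = I_n - \hat{S} S_a^{-1}$
Jacobian matrix $K = \partial y / \partial x$: sensitivity of observations to true state (fwd model)
Gain matrix $G = \partial \hat{x} / \partial y$: sensitivity of retrieval to observations
Averaging kernel matrix $A = \partial \hat{x} / \partial x$: sensitivity of retrieval to true state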
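The vector-matrix formulas above translate almost line for line into NumPy. This is a minimal sketch, not a production retrieval: the function name `linear_inverse`, the small 3-observation, 2-source Jacobian, and all numbers are hypothetical. Explicit matrix inverses are used for readability; for large problems a linear solver would normally be preferred.

```python
import numpy as np


def linear_inverse(y, K, x_a, S_a, S_y):
    """Analytical solution of the linear Gaussian inverse problem."""
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_y)      # gain matrix
    x_hat = x_a + G @ (y - K @ x_a)                          # a posteriori estimate
    S_hat = np.linalg.inv(K.T @ np.linalg.inv(S_y) @ K + np.linalg.inv(S_a))
    A = G @ K                                                # averaging kernel matrix
    return x_hat, S_hat, G, A


# Hypothetical problem: m = 3 observations constraining n = 2 sources.
K = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_a = np.array([1.0, 1.0])
S_a = np.eye(2)             # prior error covariance
S_y = 0.25 * np.eye(3)      # observational error covariance
y = np.array([2.1, 2.9, 5.2])

x_hat, S_hat, G, A = linear_inverse(y, K, x_a, S_a, S_y)
print("x_hat =", x_hat)
print("Tr(A) =", np.trace(A))
```

The two expressions for the averaging kernel, GK and $I_n - \hat{S} S_a^{-1}$, should agree to rounding error, which is a useful consistency check on any implementation.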
APPLICATION OF BAYESIAN INVERSION TO SATELLITE RETRIEVALS
Tr(A), the trace of the averaging kernel matrix, gives the number of independent pieces of information in the retrieved profile (1-2 for MOPITT)
Here y is the vector of wavelength-dependent radiances (radiance spectrum);
x is the state vector of concentrations;
forward model y = Kx is the radiative transfer model
Illustrative MOPITT averaging kernel matrix
INVERSE ANALYSIS OF MOPITT AND TRACE-P DATA TO CONSTRAIN ASIAN SOURCES OF CO
Bottom-up emissions (customized for TRACE-P): fossil and biofuel; daily biomass burning (satellite fire counts). Streets et al. [2003], Heald et al. [2003a]
GEOS-CHEM Chemical Transport Model (CTM)
Observations: TRACE-P CO data (G.W. Sachse); MOPITT CO
Links between model and observations: validation, chemical forecasts, top-down constraints
Inverse analysis: OPTIMIZATION OF SOURCES
COMPARE TRACE-P OBSERVATIONS WITH CTM RESULTS USING A PRIORI SOURCES
Model is low in boundary layer north of 30°N: suggests Chinese source is low
Model is high in free troposphere south of 30°N: suggests biomass burning source is high
• Assume that Relative Residual Error (RRE) after bias is removed describes the observational error variance (20-30%)
• Assume that the difference between successive GEOS-CHEM CO forecasts during TRACE-P ($t_0$ + 48 h and $t_0$ + 24 h) describes the covariant error structure (“NMC method”)
Palmer et al. [2003], Jones et al. [2003]
CHARACTERIZING THE OBSERVATIONAL ERROR COVARIANCE MATRIX FOR MOPITT CO COLUMNS
Diagonal elements (error variances) obtained by relative residual error (RRE) method
Add covariant structure from NMC method
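One way the two ingredients above could be combined is sketched below: variances from a relative error applied to the observed columns, and a correlation structure from the differences between paired forecasts. This is an illustrative assumption about the bookkeeping, not the procedure of Palmer et al. [2003] or Jones et al. [2003]; the function name, array shapes, and synthetic data are all hypothetical.

```python
import numpy as np


def build_obs_covariance(y_obs, rre, forecast_48h, forecast_24h):
    """Sketch: S_y from RRE variances plus NMC-style correlations.

    y_obs:        observed values, shape (m,)
    rre:          relative residual error (e.g. 0.2 to 0.3), scalar
    forecast_*:   paired forecasts valid at the same times, shape (n_times, m)
    """
    sigma = rre * y_obs                          # error standard deviations
    diff = forecast_48h - forecast_24h           # forecast differences
    corr = np.corrcoef(diff, rowvar=False)       # (m, m) correlation matrix
    return np.outer(sigma, sigma) * corr         # S_y[i, j] = sigma_i * sigma_j * corr[i, j]


# Synthetic, hypothetical data: 4 observation locations, 30 forecast pairs.
rng = np.random.default_rng(0)
y_obs = np.array([100.0, 120.0, 90.0, 110.0])
f24 = rng.normal(100.0, 10.0, size=(30, 4))
f48 = f24 + rng.normal(0.0, 5.0, size=(30, 4))
S_y = build_obs_covariance(y_obs, 0.25, f48, f24)
print(np.round(S_y, 1))
```

By construction the diagonal of the result carries the RRE variances, while the off-diagonal elements come only from the forecast-difference correlations.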
SELECTING THE STATE VECTOR OF CO SOURCES
Start from possible 16-element vector
Try separating fuel vs. biomass burning sources in 4 regions
Do singular value decomposition of normalized Jacobian $\tilde{K} = S_y^{-1/2}\, K\, S_a^{1/2}$
n singular values > 1 identify modes for which obs system gives useful constraints
MOPITT: n = 10; 11-component state vector
• Don’t separate fuel from biomass burning sources
• Merge Korea and Japan
TRACE-P: n = 4; 6-component state vector
• Don’t separate fuel from biomass burning sources
• Merge Korea, Japan, N. China
• Merge central and western China
• Merge India and Indonesia w/ rest of world
Assume that spatial distribution within region, temporal variation are known (“hard” constraints)
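The singular value test described above can be sketched as follows. To keep the matrix square roots trivial, this example assumes diagonal covariance matrices; with full covariances a proper matrix square root would be needed. The function name and the toy Jacobian are hypothetical.

```python
import numpy as np


def count_constrained_modes(K, S_a, S_y):
    """Count singular values > 1 of the normalized Jacobian S_y^(-1/2) K S_a^(1/2).

    Assumes S_a and S_y are diagonal, so their square roots are elementwise.
    """
    Sy_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(S_y)))
    Sa_sqrt = np.diag(np.sqrt(np.diag(S_a)))
    K_norm = Sy_inv_sqrt @ K @ Sa_sqrt
    singular_values = np.linalg.svd(K_norm, compute_uv=False)
    return int(np.sum(singular_values > 1.0)), singular_values


# Hypothetical system: the first source is strongly observed, the second barely.
K = np.array([[10.0, 0.0],
              [0.0, 0.1],
              [0.0, 0.0]])
n_modes, s = count_constrained_modes(K, np.eye(2), np.eye(3))
print("singular values:", s, "-> useful modes:", n_modes)
```

Here only one mode exceeds the threshold, which would argue for merging or dropping the weakly observed source in the state vector.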
COMPARATIVE INVERSE ANALYSIS OF ASIAN CO SOURCES USING DAILY MOPITT AND TRACE-P DATA
• MOPITT and TRACE-P both show underestimate of anthropogenic emissions (40% for China, likely due to under-reporting of industrial coal use)
• MOPITT and TRACE-P both show overestimate of biomass burning emissions in southeast Asia; very low values from TRACE-P could reflect transport bias
• MOPITT has higher information content than TRACE-P because it observes source regions and Indian outflow
• MOPITT information degrades if data are averaged weekly or monthly
• Ensemble modeling of MOPITT data indicates 10-40% uncertainty on retrieved sources
Heald et al. [2004]
CO observations from Spring 2001, GEOS-CHEM CTM as forward model
TRACE-P aircraft CO: 4 degrees of freedom
MOPITT CO columns: 10 degrees of freedom
(from validation)
BASIC KALMAN FILTER TO OPTIMIZE TEMPORAL VARIATION OF SOURCES
Consider vector of observations at discrete times $y_t$, to be used in a sequential manner to optimize the time-evolving source $x_t$
• Assume that we have previously obtained a best estimate $\hat{x}_{t-1} \pm \hat{S}_{t-1}$ at time t-1
• Assume a source model $x_t = M x_{t-1} + \eta$ for the evolution of x from t-1 to t, which gives the a priori value $x_{a,t} = M \hat{x}_{t-1}$ of x at time t with an error $S_{a,t} = M \hat{S}_{t-1} M^T + S_\eta$, where $S_\eta$ is the error covariance of the source model
• Run forward model from t-1 to t, optimize $x_t$ using the observations $y_t$
• Repeat the process for t+1
Filter can also be run backward
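The sequential procedure above can be sketched as a single update function applied in a loop. This is a minimal illustration under stated assumptions: a linear observation operator K, a persistence source model (M = identity), and made-up covariances and observations. The names `kalman_step` and `S_model` (the source-model error covariance) are hypothetical.

```python
import numpy as np


def kalman_step(x_prev, S_prev, y_t, K, M, S_model, S_y):
    """One forecast-plus-update cycle of a basic Kalman filter."""
    x_a = M @ x_prev                                  # a priori state at time t
    S_a = M @ S_prev @ M.T + S_model                  # a priori error covariance
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_y)
    x_hat = x_a + G @ (y_t - K @ x_a)                 # optimized state at time t
    S_hat = (np.eye(len(x_a)) - G @ K) @ S_a          # a posteriori error covariance
    return x_hat, S_hat


# Hypothetical setup: 2 sources observed directly, persistence source model.
rng = np.random.default_rng(1)
K = np.eye(2)
M = np.eye(2)
S_model = 0.01 * np.eye(2)
S_y = 0.04 * np.eye(2)
x_true = np.array([3.0, 5.0])

x_est, S_est = np.array([1.0, 1.0]), np.eye(2)        # initial best estimate
S_initial = S_est.copy()
for t in range(10):
    y_t = K @ x_true + rng.normal(0.0, 0.2, size=2)   # synthetic observations
    x_est, S_est = kalman_step(x_est, S_est, y_t, K, M, S_model, S_y)

print("estimate:", x_est)
print("error variances:", np.diag(S_est))
```

The update uses $\hat{S} = (I_n - GK)\,S_a$, which is the averaging kernel relation $A = I_n - \hat{S} S_a^{-1}$ rearranged, so it is equivalent to the inverse-sum form of the a posteriori error.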
ITERATIVE SOURCE OPTIMIZATION IN 4-D VAR
[Figure: iterates x0, x1, x2, x3 at iterations 0, 1, 2, 3]
Minimum of cost function J
Estimate $\nabla_x J$ with a numerical method involving Lagrange multipliers and the model adjoint, rather than analytically
Advantage: computationally efficient for large state vectors
Problem: errors on x are not characterized
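The iterative idea can be illustrated on a small linear problem, where the transpose of K plays the role of the model adjoint in the gradient of J. This is a sketch only: it uses plain steepest descent with an exact line search, a simplification swapped in for the more elaborate minimizers used in practice, and the function names and all numbers are hypothetical.

```python
import numpy as np


def grad_J(x, y, K, x_a, Sa_inv, Sy_inv):
    """Gradient of J(x); K.T acts as the adjoint of the linear forward model."""
    return 2.0 * Sa_inv @ (x - x_a) - 2.0 * K.T @ (Sy_inv @ (y - K @ x))


def minimize_cost(y, K, x_a, S_a, S_y, max_iter=200, tol=1e-10):
    """Steepest descent with exact line search, starting from the a priori."""
    Sa_inv, Sy_inv = np.linalg.inv(S_a), np.linalg.inv(S_y)
    x = x_a.copy()
    path = [x.copy()]                                  # iterates x0, x1, x2, ...
    for _ in range(max_iter):
        g = grad_J(x, y, K, x_a, Sa_inv, Sy_inv)
        if np.linalg.norm(g) < tol:
            break
        # Hessian-vector product built from forward (K) and adjoint (K.T) applications only.
        Hg = 2.0 * (Sa_inv @ g + K.T @ (Sy_inv @ (K @ g)))
        alpha = (g @ g) / (g @ Hg)                     # exact step for a quadratic J
        x = x - alpha * g
        path.append(x.copy())
    return x, path


# Hypothetical problem, small enough to compare against the analytical solution.
K = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_a = np.array([1.0, 1.0])
S_a, S_y = np.eye(2), 0.25 * np.eye(3)
y = np.array([2.1, 2.9, 5.2])

x_opt, path = minimize_cost(y, K, x_a, S_a, S_y)
print("iterations:", len(path) - 1, "x_opt =", x_opt)
```

The loop returns only the optimized state; unlike the analytical solution, it produces no a posteriori error covariance, which matches the limitation noted above.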
APPLICATION OF 4-D VAR TO SYNTHETIC CO2 FLUX INVERSION (D. Baker, NCAR)
4-D VAR allows optimization of the surface flux field on the native grid resolution of the forward model