Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2014-2429C
Using LASSO to infer a high-order eddy viscosity model for k-ε RANS simulation of transonic flows
S. Lefantzi, J. Ray, S. Arunajatesan and L. Dechant
SAND2016-4661C
The problem
§ Aim: Develop a predictive k-ε RANS model for transonic jet-in-crossflow (JinC) simulations
§ Drawback: RANS simulations are simply not predictive
  § They have "model-form" error, i.e., missing physics
  § The numerical constants/parameters in the k-ε model are usually derived from canonical flows
§ Hypothesis
  § One can calibrate RANS to jet-in-crossflow experiments; thereafter the residual error is mostly model-form error
  § Due to model-form error and limited experimental measurements, the parameter estimates will be approximate
  § We will estimate parameters as probability density functions (PDF)
§ We then address the model-form error with an enriched eddy viscosity model for the missing physics
The equations
§ The model
§ Devising a method to calibrate k-ε parameters from experimental data
§ Sources of errors
  § Parameters {C2, C1} are obtained from canonical flows
  § Cµ is deemed constant throughout the flowfield
  § Linear stress-strain rate relationship τij = -2/3 k δij + µT Sij
    § Called a linear eddy viscosity model (LEVM)
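$$\frac{\partial \rho k}{\partial t} + \frac{\partial}{\partial x_i}\left[\rho u_i k - \left(\mu + \frac{\mu_T}{\sigma_k}\right)\frac{\partial k}{\partial x_i}\right] = P_k - \rho\varepsilon + S_k$$
$$\frac{\partial \rho \varepsilon}{\partial t} + \frac{\partial}{\partial x_i}\left[\rho u_i \varepsilon - \left(\mu + \frac{\mu_T}{\sigma_\varepsilon}\right)\frac{\partial \varepsilon}{\partial x_i}\right] = \frac{\varepsilon}{k}\left(C_1 f_1 P_k - C_2 f_2 \rho\varepsilon\right) + S_\varepsilon$$
$$\mu_T = C_\mu f_\mu \rho \frac{k^2}{\varepsilon}$$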
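As a quick numerical check of the eddy viscosity closure above, the following Python sketch evaluates µT = Cµ fµ ρ k²/ε. The function name, the sample inputs and the defaults are assumptions for illustration: Cµ = 0.09 is the value commonly quoted for the standard k-ε model and fµ = 1 means no damping; neither is taken from these slides.

```python
def eddy_viscosity(rho, k, eps, c_mu=0.09, f_mu=1.0):
    """Turbulent viscosity mu_T = C_mu * f_mu * rho * k**2 / eps."""
    if eps <= 0.0:
        raise ValueError("dissipation rate eps must be positive")
    return c_mu * f_mu * rho * k ** 2 / eps


# Hypothetical local values of density, turbulent KE and dissipation rate.
mu_t = eddy_viscosity(rho=1.2, k=50.0, eps=2.0e4)
```

Because µT scales linearly with Cµ, any spatial variation in Cµ that the model ignores feeds directly into the predicted turbulent stresses.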
Target problem - jet-in-crossflow
§ A canonical problem for spin-rocket maneuvering, fuel-air mixing etc.
§ We have experimental data (PIV measurements) on the cross- and mid-plane
§ Will calibrate to vorticity on the crossplane and test against mid-plane
[Figure: crossplane contour plot; axes Z (m) and Y (m); color scale from −4000 to 5000]
RANS (k-ω) simulations - crossplane results
§ Crossplane results for stream
§ Computational results (SST) are too round; Kw98 doesn't have the mushroom shape; non-symmetric!
§ Less intense regions; boundary layer too weak
Reducing errors
§ Model-form errors
  § The linear turbulent stress-strain rate relationship (LEVM) can be enriched with quadratic and cubic terms (QEVM / CEVM)
    § Includes terms with vorticity and cross terms with vorticity and strain rate
  § However the high-order models have parameters in them
    § What are the appropriate values for those parameters?
§ Parametric uncertainty
  § (C2, C1) can be estimated (somewhat) from experimental data
  § But because of model-form errors and limited experimental data, these cannot be estimated with much certainty
  § We'll use Bayesian inversion and estimate them as PDFs
    § Quantifies uncertainty in the estimate of the parameters
§ Calibration process
  § Identify which of the CEVM parameters can actually be estimated from experimental data
  § Then calibrate those along with (C2, C1); call the full set C = (:, C2, C1)
Calibration details
§ Aims of the calibration
  § Calibrate to an M = 0.8, J = 10.2 interaction
  § Learn the form of the high-order eddy viscosity model by fitting to turbulent stress measurements on the mid-plane
  § Calibrate to crossplane data; check by matching the midplane velocity profiles
§ Technical challenges
  § Computational cost of 3D JinC RANS simulation
    § Replace 3D RANS with a surrogate model, i.e., model crossplane streamwise vorticity ωx(RANS)(y) = f(y; C), where f(:; C) is a curve-fit
    § Surrogate models = emulators
  § Arbitrary combinations of C may be nonphysical
    § How to build emulators when C are nonsensical?
  § What functional form to use for f(:; C)?
High-order eddy-viscosity model
§ Craft 95 describes a cubic eddy viscosity model (CEVM)
  § τij = -2/3 k δij + Cµ F(Sij, ε) + c1 f1(Sij, Ωij, ε) + c2 f2(Sij, Ωij, ε) + … + c7 f7(Sij, Ωij, ε)
  § F(Sij) is linear in Sij; f1(:, :, :) through f3(:, :, :) are quadratic in Sij & Ωij
  § f4(:, :, :) through f7(:, :, :) are cubic in Sij & Ωij
§ Our experimental data, on the midplane, consists of:
  § Sij & Ωij obtained from the measured velocity field
  § τij and k, also measured
  § ε (dissipation rate of turbulent KE) cannot be measured
    § It is approximated by assuming equilibrium of production and dissipation of turbulent KE.
§ Craft's model prescribes {c1 … c7}
  § Parameter values obtained from a simple, incompressible turning flow
  § May not be valid for transonic JinC interaction
Estimation of CEVM parameters
§ The 180 measurements that we have may not have info that informs c1 … c7
§ Cast the estimation problem as
$$\min_{c} \; \lVert Y - Ac \rVert_2^2 + \lambda \lVert c \rVert_1$$
  § The first term estimates c = {ci} that provide CEVM predictions near Y
  § The second term, the λ penalty, tries to set as many ci to zero as possible
  § Called Shrinkage Regression
§ The penalty λ is the linchpin
  § If it is too small, we get over-fitting (too many ci survive)
  § The best way to get λ is via k-fold cross-validation
§ The method for solving the optimization problem is LASSO
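To make the shrinkage behaviour concrete, here is a minimal pure-Python sketch that minimizes ||Y - Ac||² + λ||c||₁ by cyclic coordinate descent with soft-thresholding. This is one common way to solve the LASSO problem; the slides do not say which solver was used, and the function names and toy data are assumptions for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(A, Y, lam, n_sweeps=200):
    """Minimize ||Y - A c||_2^2 + lam * ||c||_1 by cyclic coordinate descent.

    A is a list of rows (n x p); Y is a list of n observations.
    """
    n, p = len(A), len(A[0])
    c = [0.0] * p
    residual = list(Y)  # residual = Y - A c, with c = 0 initially
    col_sq = [sum(A[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes c_j.
            rho = sum(A[i][j] * (residual[i] + A[i][j] * c[j]) for i in range(n))
            new_cj = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new_cj - c[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= A[i][j] * delta
                c[j] = new_cj
    return c


# Toy data (hypothetical): Y depends on the first column only.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
Y = [3.0, 0.0, 3.0, 6.0]
c_small = lasso_coordinate_descent(A, Y, lam=1e-6)   # close to least squares
c_large = lasso_coordinate_descent(A, Y, lam=100.0)  # penalty zeroes every term
```

With a tiny λ the fit recovers the coefficient that generated the data and leaves the unused term at zero; with a large λ every coefficient is driven to zero, which is the over-shrunk end of the trade-off.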
k-fold cross-validation
§ Divide the 180 measurements into 8 "folds" (subsets of near-equal size)
§ Pick a value of λ'
  § Pick fold #1 as the testing set, folds 2-8 as the learning set
  § Solve the optimization problem (solve for c) using Y constructed from the learning set
  § Predict the data in the testing set
  § Repeat with folds #2, #3 … as the testing sets
  § Obtain the mean error and error bars for λ'
§ Ultimately you get error as a function of λ
  § Pick the λ with min error
§ For higher values of λ, expect to see lots of ci becoming zero
  § And predictive errors becoming large
§ Nomenclature: The norm of the difference (Y(obs) - Ac) is called the 'deviance'
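The cross-validation loop described above can be sketched as follows. The data are synthetic, and `fit_one_term` is a hypothetical stand-in (the closed-form LASSO solution for a single-column A) used only to keep the example short; any solver with the same `fit(A, Y, lam)` signature could be passed instead.

```python
import math
import random


def make_folds(n_samples, n_folds, seed=0):
    """Shuffle sample indices and deal them into n_folds near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]


def cross_validate(A, Y, lam, fit, n_folds=8, seed=0):
    """Return (mean, standard error) of the per-fold test MSE for one lambda.

    fit(A_learn, Y_learn, lam) must return a coefficient list c.
    """
    fold_errors = []
    for test_idx in make_folds(len(Y), n_folds, seed):
        held_out = set(test_idx)
        learn_idx = [i for i in range(len(Y)) if i not in held_out]
        c = fit([A[i] for i in learn_idx], [Y[i] for i in learn_idx], lam)
        sq_err = [
            (Y[i] - sum(a * cj for a, cj in zip(A[i], c))) ** 2 for i in test_idx
        ]
        fold_errors.append(sum(sq_err) / len(sq_err))
    mean = sum(fold_errors) / n_folds
    var = sum((e - mean) ** 2 for e in fold_errors) / (n_folds - 1)
    return mean, math.sqrt(var / n_folds)


def fit_one_term(A_learn, Y_learn, lam):
    """Closed-form LASSO for a single-column A (stand-in for a full solver)."""
    rho = sum(row[0] * y for row, y in zip(A_learn, Y_learn))
    z = sum(row[0] ** 2 for row in A_learn)
    shrunk = max(abs(rho) - lam / 2.0, 0.0)
    return [math.copysign(shrunk, rho) / z]


# Hypothetical data: 180 noisy samples of Y = 2 * a.
rng = random.Random(1)
A = [[rng.uniform(-1.0, 1.0)] for _ in range(180)]
Y = [2.0 * row[0] + rng.gauss(0.0, 0.1) for row in A]
cv_curve = {lam: cross_validate(A, Y, lam, fit_one_term) for lam in (0.01, 1.0, 100.0)}
```

Each entry of `cv_curve` holds the mean test error and its standard error for one λ, which is the information needed to draw the error-versus-λ curve with error bars.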
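LASSO results
§ Craft's model explains around 28% of the deviance
§ As log(λ) increases and the number of terms retained decreases, the CEVM fit worsens
§ One gets λmin (the λ with minimum cross-validated error) and λ1se (the largest λ whose error is within one standard error of that minimum)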
Looking for a good λ
[Figure panels: relative deviance vs. log(λ); number of coefficients vs. log(λ); mean-squared error vs. log(λ)]
Tabulate coefficients and MSE
Method   c1       c2      c3     c4     c5   c6    c7      MSE
Craft    -0.1     0.1     0.26   -10    0    -5    5       0.662
λmin     -0.065   -0.103  1.68   -4.02  5.7  5.4   -3.64   0.386
λ1se     0.0      0.0     0.455  0.0    0    0     0       0.483
LM       -0.0789  -0.149  2.02   -5.88  0    6.68  -11.87  0.382
§ ln(λmin) = -5.11, ln(λ1se) = -1.75
§ Craft's default parameters change when we regress the model to data
  § Results called 'LM'
§ When we LASSO the model using λ1se, we're left with just 1 quadratic term
  § But the model loses much accuracy
§ Let's choose λ1se.
  § Provides a simple model, and keeps the Ω² term
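The choice between λmin and λ1se follows the usual one-standard-error rule, which can be sketched as below. The table of cross-validation results is made up for illustration and is not the data behind these slides; `select_lambdas` is a hypothetical helper name.

```python
def select_lambdas(cv_table):
    """Pick lambda_min and lambda_1se from (lambda, mean_error, std_error) rows.

    lambda_min minimizes the mean cross-validated error; lambda_1se is the
    largest lambda whose mean error is within one standard error of that minimum.
    """
    lam_min, err_min, se_min = min(cv_table, key=lambda row: row[1])
    within_one_se = [lam for lam, err, _ in cv_table if err <= err_min + se_min]
    return lam_min, max(within_one_se)


# Hypothetical cross-validation results: (lambda, mean error, standard error).
cv_table = [
    (0.001, 0.40, 0.05),
    (0.01, 0.38, 0.04),
    (0.1, 0.41, 0.04),
    (1.0, 0.60, 0.06),
]
lam_min, lam_1se = select_lambdas(cv_table)
```

Choosing λ1se trades a statistically insignificant loss of accuracy for a sparser model, which is the motivation for preferring it when a simple closure is wanted.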
Calibration of {c3, C2, C1}
§ We will calibrate C = (c3, C2, C1)
  § Our model is really a quadratic eddy viscosity model (QEVM)
§ Approach:
  § Data: Use vorticity measurements on crossplane to estimate C
    § Useful measurements available at 225 locations ("probes")
  § Estimation procedure: Bayesian calibration using MCMC
  § Model: Use surrogate models (emulators) of the RANS simulator
    § Set of 1275 runs in the parameter space C to make the training data
    § Identify a physically realistic space R, use SVMs to model R
    § Make emulators ω(C) = f(c3, C2, C1) with polynomials; valid in R
    § Use MCMC to create the posterior PDF of C
§ Checking results
  § Draw 100 samples from the posterior PDF
  § Develop an ensemble of predictions of vorticity and velocity; compare against measurements
The Bayesian calibration problem
• Model experimental values at probe j as
$$\omega_{ex}^{(j)} = \omega^{(j)}(C) + \epsilon^{(j)}, \qquad \epsilon^{(j)} \sim N(0, \sigma^2)$$
• Given prior beliefs π on C, the posterior density ('the PDF') is
$$\Lambda\left(\omega_{ex} \mid C, \sigma\right) \propto \prod_{j \in P} \exp\left(-\frac{\left(\omega_{ex}^{(j)} - \omega^{(j)}(C)\right)^2}{2\sigma^2}\right)$$
$$P\left(C, \sigma \mid \omega_{ex}\right) \propto \Lambda\left(\omega_{ex} \mid C, \sigma\right)\, \pi(c_3, C_2, C_1)\, \pi_\sigma(\sigma)$$
• P(C | ωex) is a complicated distribution that has to be described/visualized by drawing samples from it
• This is done by MCMC
  – MCMC describes a random walk in the parameter space to identify good parameter combinations
  – Each step of the walk requires a model run to check out the new parameter combination
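In practice the posterior is evaluated in log form to avoid underflow in the product over probes. The sketch below is one way to write it, under stated assumptions: `emulators`, `log_prior_C` and `log_prior_sigma` are hypothetical callables, and the `-n*log(sigma)` term is the Gaussian normalization, which the proportionality on the slide omits but which matters once σ is sampled along with C.

```python
import math


def log_posterior(C, sigma, omega_obs, emulators, log_prior_C, log_prior_sigma):
    """Unnormalized log of P(C, sigma | omega_ex) for independent Gaussian errors.

    emulators is a list of callables, one per probe, each mapping C to a
    predicted vorticity at that probe.
    """
    if sigma <= 0.0:
        return -math.inf
    sq_misfit = sum((obs - f(C)) ** 2 for obs, f in zip(omega_obs, emulators))
    log_like = -sq_misfit / (2.0 * sigma ** 2) - len(omega_obs) * math.log(sigma)
    return log_like + log_prior_C(C) + log_prior_sigma(sigma)


# Hypothetical two-probe example with made-up linear emulators and flat priors.
emulators = [lambda C: C[0] + C[1], lambda C: C[0] - C[2]]
omega_obs = [3.0, -1.0]
flat = lambda _: 0.0
good = log_posterior((1.0, 2.0, 2.0), 0.5, omega_obs, emulators, flat, flat)
bad = log_posterior((0.0, 0.0, 0.0), 0.5, omega_obs, emulators, flat, flat)
```

A parameter set that reproduces the observations (`good`) scores higher than one that does not (`bad`); MCMC only ever needs such differences in log density.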
Making emulators - 1
§ Training data
  § Sample the parameter space C = {c3, C2, C1}; bounds are known
  § Run RANS models at 1275 samples; save vorticity on cross-plane
  § Select the top 25% of the training runs
    § Call this subspace R
    § Keeps us out of non-physical parts of the parameter space C
§ Making emulators in R
  § Model vorticity at probe j, ω(j), as a polynomial in C
$$\omega^{(j)} \cong a_0 + a_1 c_3 + a_2 C_2 + a_3 C_1 + a_4 c_3 C_2 + a_5 c_3 C_1 + a_6 C_2 C_1 + \ldots$$
  § Simplify using AIC; cross-validate using repeated random sub-sampling (100 rounds)
    § RMSE in learning and testing sets should be equal
  § Accept all surrogate models that have < 10% error
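A polynomial emulator of this kind is cheap to evaluate once its coefficients are known. The sketch below builds the monomial terms and checks an error threshold; the quadratic truncation, the term ordering (which differs from the ordering on the slide), the coefficient values and the relative-RMSE definition are all assumptions for illustration. Fitting the coefficients and pruning terms with AIC are not shown.

```python
import math
from itertools import combinations_with_replacement


def polynomial_features(C, degree=2):
    """Return all monomials of the entries of C up to the given total degree."""
    features = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(C, d):
            features.append(math.prod(combo))
    return features


def emulate(coeffs, C, degree=2):
    """Evaluate the polynomial emulator sum_i a_i * monomial_i(C)."""
    return sum(a * phi for a, phi in zip(coeffs, polynomial_features(C, degree)))


def relative_rmse(predicted, reference):
    """RMSE of the predictions, normalized by the RMS of the reference values."""
    num = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    den = sum(r ** 2 for r in reference)
    return math.sqrt(num / den)


# Hypothetical coefficients for [1, c3, C2, C1, c3^2, c3*C2, c3*C1, C2^2, C2*C1, C1^2].
coeffs = [1.0, 2.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
omega_hat = emulate(coeffs, (0.5, 1.9, 1.4))
accepted = relative_rmse([1.05, 2.0], [1.0, 2.0]) < 0.10
```

An acceptance test like the last line is what limits the number of probes that end up with a usable emulator.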
Making emulators - 2
§ Emulators with 10% accuracy could only be made for 55 / 224 probes
  § 90 with large vorticity (circles)
  § 55 with emulators (+)
§ Also, the emulators are only applicable in the R section of the parameter space C
Making the informative prior
§ Our emulators are valid only inside R in the parameter space C
§ During the optimization (MCMC) we have to reject parameter combinations outside R (this is our prior belief πprior(C))
§ We define ζ(C) = 1 for C in R and ζ(C) = -1 for C outside R
  § Then the level set ζ(C) = 0 is the boundary of R
§ The training set of RANS runs is used to populate ζ(C)
§ We have to "learn" the discriminating function ζ(C) = 0
  § We do that using support vector machine (SVM) classifiers
[Figure: Runs in the top 25th percentile; axes Cµ, C2 and C1]
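Once a discriminating function ζ(C) is available, the informative prior reduces to an indicator: flat inside R and zero outside. A minimal sketch follows, in which `zeta_ball`, the center, the radius and the box bounds are hypothetical stand-ins for the trained SVM classifier and the known parameter bounds.

```python
import math


def make_log_prior(zeta, bounds):
    """Build log pi_prior(C): flat inside the bounds where zeta(C) > 0, else -inf.

    zeta is any callable returning a positive score inside R and a negative
    score outside it (for example a trained classifier's decision function).
    """
    def log_prior(C):
        inside_box = all(lo <= value <= hi for value, (lo, hi) in zip(C, bounds))
        if inside_box and zeta(C) > 0.0:
            return 0.0
        return -math.inf
    return log_prior


# Hypothetical stand-in for a trained classifier: R is a ball around a center.
def zeta_ball(C, center=(0.5, 1.9, 1.4), radius=0.2):
    return radius ** 2 - sum((c - c0) ** 2 for c, c0 in zip(C, center))


log_prior = make_log_prior(zeta_ball, bounds=[(0.0, 1.0), (1.7, 2.1), (1.2, 1.55)])
```

Returning negative infinity for the log prior means MCMC rejects any proposal outside R without ever calling the emulators there.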
PDFs from the calibration
§ About 60,000 MCMC steps to convergence
§ Calibrated values of C quite different from the ones from literature
  § Vertical lines are the "canonical" values of the parameters
§ Next step
  § Draw 100 samples from the posterior distribution and perform RANS simulations
  § Compare with experimental measurements
QEVM point vortex metrics
§ Compare measured and simulated vorticity fields using the circulation, the centroid and radius of gyration of the vorticity distribution
  § Called the "point vortex metrics"
§ Comparable results using existing LEVM models have 20%-70% errors
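The three metrics can be computed from a gridded vorticity field as vorticity-weighted moments. The sketch below assumes a uniform cell area and these particular definitions; the slides do not spell out the formulas, so treat them as one plausible reading. For a counter-rotating vortex pair the net circulation over the whole plane is near zero, so such a routine would be applied to one vortex (one half of the plane) at a time.

```python
import math


def point_vortex_metrics(omega, y, z, cell_area):
    """Circulation, centroid and radius of gyration of a sampled vorticity field.

    omega, y and z are flat lists of equal length (one entry per grid cell).
    """
    circulation = sum(w * cell_area for w in omega)
    y_c = sum(w * yi * cell_area for w, yi in zip(omega, y)) / circulation
    z_c = sum(w * zi * cell_area for w, zi in zip(omega, z)) / circulation
    second_moment = sum(
        w * ((yi - y_c) ** 2 + (zi - z_c) ** 2) * cell_area
        for w, yi, zi in zip(omega, y, z)
    )
    return circulation, (y_c, z_c), math.sqrt(second_moment / circulation)


# Hypothetical field: four cells of equal vorticity at the corners of a unit square.
gamma, centroid, r_gyr = point_vortex_metrics(
    omega=[2.0, 2.0, 2.0, 2.0],
    y=[0.0, 0.0, 1.0, 1.0],
    z=[0.0, 1.0, 0.0, 1.0],
    cell_area=0.25,
)
```

Comparing these three numbers between experiment and simulation gives a compact error measure that does not depend on matching the field cell by cell.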
QEVM PPT predictions on midplane
§ Use the 100 RANS simulations to obtain the velocity field on the mid-plane
§ Compare experimental measurements and simulated predictions
Conclusions
§ We are beginning to "fix up" engineering models with observational data
  § Includes both estimating model parameters and enriching closure models (inferring missing physics in models)
  § Methods are Bayesian; fully probabilistic inference (of parameters, at least)
    § Accommodates uncertainty in estimates due to limited data and shortcomings of the RANS model (model-form error)
§ We can tackle rather complicated problems using Bayesian inference
  § Computational costs are immense, but only for generating training data
  § Brittle: we depend on emulators, which can't always be made
  § Can tackle peculiarities of non-physical parameter spaces using informative priors (classifiers)
§ Tools and theories: A mixture of statistics and machine learning
  § Bayesian inference, emulators, shrinkage are conventionally statistical
  § Classifiers etc. are purely ML
  § As we scale up and confront large data (simulated flowfields etc.) to infer model-form error, expect MapReduce implementations of these tools
BONEYARD
RANS (k-ω) simulations - midplane results
§ Experimental results in black
§ All models are pretty inaccurate (blue and red lines are the non-symmetric results)
[Figure panels: U-defect; V-velocity]
What is MCMC?
§ A way of sampling from an arbitrary distribution
  § The samples, if histogrammed, recover the distribution
§ Efficient and adaptive
  § Given a starting point (1 sample), the MCMC chain will sequentially find the peaks and valleys in the distribution and sample proportionally
§ Ergodic
  § Guaranteed that samples will be taken from the entire range of the distribution
§ Drawback
  § Generating each sample requires one to evaluate the expression for the density π
  § Not a good idea if π involves evaluating a computationally expensive model
An example, using MCMC
§ Given: (Yobs, X), a bunch of n observations
§ Believed: y = ax + b
§ Model: yi(obs) = a xi + b + εi, ε ~ N(0, σ)
§ We also know a range where a, b and σ might lie
  § i.e. we will use uniform distributions as prior beliefs for a, b, σ
§ For a given value of (a, b, σ), compute "error" εi = yi(obs) - (a xi + b)
§ Probability of the set (a, b, σ) = Π exp(-εi²/σ²)
§ Solution: π(a, b, σ | Yobs, X) = Π exp(-εi²/σ²) * (bunch of uniform priors)
§ Solution method:
  § Sample from π(a, b, σ | Yobs, X) using MCMC; save the samples
  § Generate a "3D histogram" from the samples to determine which region in the (a, b, σ) space gives best fit
  § Histogram values of a, b and σ, to get individual PDFs for them
  § Estimation of model parameters, with confidence intervals!
MCMC, pictorially
§ Choose a starting point, Pn = (acurr, bcurr)
§ Propose a new a, aprop ~ N(acurr, σa)
§ Evaluate π(aprop, bcurr | …) / π(acurr, bcurr | …) = m
§ Accept aprop (i.e. acurr <- aprop) with probability min(1, m)
§ Repeat with b
§ Loop over till you have enough samples
[Figure: proposal distribution and "good" values of (a, b) in the (a, b) plane]
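The propose/accept loop above can be sketched for the straight-line example as follows. This is a minimal random-walk Metropolis sampler that updates a and b one at a time; to keep it short, σ is treated as known and fixed (an assumption, whereas the example slide also infers σ), and the data, bounds, step sizes and seed are made up for illustration.

```python
import math
import random


def log_density(a, b, xs, ys, sigma, bounds):
    """Log of pi(a, b | Yobs, X): Gaussian misfit times uniform priors."""
    (a_lo, a_hi), (b_lo, b_hi) = bounds
    if not (a_lo <= a <= a_hi and b_lo <= b <= b_hi):
        return -math.inf
    misfit = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return -misfit / (2.0 * sigma ** 2)


def metropolis(xs, ys, sigma, bounds, start, step, n_steps, seed=0):
    """Random-walk Metropolis, updating a and b one at a time."""
    rng = random.Random(seed)
    current = list(start)
    log_p = log_density(current[0], current[1], xs, ys, sigma, bounds)
    samples = []
    for _ in range(n_steps):
        for idx in (0, 1):  # propose a new a, then a new b
            proposal = list(current)
            proposal[idx] = rng.gauss(current[idx], step[idx])
            log_p_prop = log_density(proposal[0], proposal[1], xs, ys, sigma, bounds)
            # Accept with probability min(1, m), m = pi(proposal) / pi(current).
            log_m = log_p_prop - log_p
            if log_m >= 0.0 or rng.random() < math.exp(log_m):
                current, log_p = proposal, log_p_prop
        samples.append(tuple(current))
    return samples


# Hypothetical data: y = 2x + 1 with Gaussian noise of known sigma = 0.3.
data_rng = random.Random(42)
xs = [(i - 25) / 10.0 for i in range(50)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.3) for x in xs]
chain = metropolis(xs, ys, sigma=0.3, bounds=((0.0, 5.0), (-5.0, 5.0)),
                   start=(0.5, 0.0), step=(0.05, 0.05), n_steps=5000)
kept = chain[1000:]  # discard burn-in
a_mean = sum(s[0] for s in kept) / len(kept)
b_mean = sum(s[1] for s in kept) / len(kept)
```

Working with log densities avoids underflow in the product over observations, and histogramming `kept` column by column gives the individual PDFs for a and b.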
What is an SVM classifier?
§ Given a binary function y = f(x) as a set of points (yi, xi), yi ∈ {0, 1}
§ Find the hyperplane y + Ax = 0 that separates the x-space into y = 0 and y = 1 parts
§ Posed as an optimization problem that maximizes the margin
§ In case of a curved discriminator, need a transformation first
  § Achieved using kernels
  § We use a cubic kernel
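For reference, the margin-maximization problem and the cubic kernel can be written in the standard soft-margin form below. This is the textbook formulation, not necessarily the exact variant used for these slides: the labels are remapped to ỹi = 2yi - 1 ∈ {-1, +1}, and the symbols C_svm (misclassification penalty), γ and r (kernel scale and offset) and φ (feature map) are assumptions introduced here.

```latex
\min_{w,\,b,\,\xi}\; \tfrac{1}{2}\lVert w\rVert^2 + C_{svm}\sum_i \xi_i
\quad\text{s.t.}\quad
\tilde{y}_i\left(w^\top \phi(x_i) + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0

K(x, x') = \phi(x)^\top \phi(x') = \left(\gamma\, x^\top x' + r\right)^3

\hat{y}(x) = \operatorname{sign}\left(\sum_i \alpha_i\, \tilde{y}_i\, K(x_i, x) + b\right)
```

The kernel lets the separating hyperplane be linear in the transformed space φ(x) while the boundary in the original parameter space is a cubic surface; only the training points with αi > 0 (the support vectors) contribute to the decision function.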