Non-hydrostatic dynamics: Algorithms, software, and high-resolution science
Sandia National Laboratories: Peter Bosler, Andrew Bradley, Oksana Guba, Andrew Salinger, Mark Taylor
Los Alamos National Laboratory: Balu Nadiga, Xiaoming Sun
SciDAC PI Meeting, July 17, 2019
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525. SAND2018-12624 PE.
Also at this meeting
• Poster (Tuesday): Characterizing Non-Hydrostatic Effects in the Next-Generation E3SM Atmosphere Model
• Poster (Thursday): Algorithms and Software for Fast E3SM Atmosphere Tracer Transport
Outline
• Primary results & impact
  – Performance & accuracy improvements to E3SM Atmosphere (EAM) transport
  – Fully integrated into E3SM v2
  – New diagnostic methods for high-resolution experiments
• Methods and highlights
  – Algorithms and software
  – High-resolution science: Non-hydrostatic dynamics
• Ongoing & proposed work
  – Leveraging EAM results for MPAS Ocean
  – Model fidelity improvements with minimal performance impact
  – High resolution science campaigns
SciDAC Pilot Phase 1: Before & after
• Transport speedup (0.25 deg, 40 tracers): ~6x
• Dynamical core (v1 dynamics + transport, no physics) speedup: ~3x
• Full atmosphere model speedup: ~2.1x
[Figure: Computation time by component (Coupler/Other, Physics/Chem, Dynamics, Tracer Advection) for EAM v1 (2015) and for E3SM v1 with SL transport (now); the time saved is labeled "Now Available".]
2019: EAM/SL ready for v2
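A rough consistency check on the three reported speedups (~6x transport, ~3x dycore, ~2.1x full atmosphere), as a sketch only. If one component that originally took a fraction f of the runtime is accelerated by a factor s and everything else is unchanged, the overall speedup S satisfies 1/S = f/s + (1 - f). Solving for f gives the share of the original runtime implied by the numbers. The function name and the "everything else unchanged" assumption are illustrative and not from the slides; the deck also reports a modest speedup in the dynamics solver, so the result is approximate.

```python
def implied_fraction(part_speedup, whole_speedup):
    """Fraction of the original runtime spent in the accelerated part.

    Assumes (hypothetically) that the rest of the run is unchanged, so that
    1/whole_speedup = f/part_speedup + (1 - f).
    """
    return (1.0 - 1.0 / whole_speedup) / (1.0 - 1.0 / part_speedup)


# Transport sped up ~6x, dycore (dynamics + transport) sped up ~3x.
transport_share_of_dycore = implied_fraction(6.0, 3.0)

# Dycore sped up ~3x, full atmosphere model sped up ~2.1x.
dycore_share_of_atmosphere = implied_fraction(3.0, 2.1)

print(f"transport share of v1 dycore time:   {transport_share_of_dycore:.2f}")
print(f"dycore share of v1 atmosphere time:  {dycore_share_of_atmosphere:.2f}")
```

Under these assumptions, transport accounts for roughly 80% of the v1 dycore time and the dycore for roughly 79% of the v1 atmosphere time, which is consistent with tracer advection dominating the "before" breakdown.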
E3SM Atm. Dycore Performance
Configuration: 40 tracers, 25 km resolution, 72 vertical levels
[Figure: Dycore throughput in SYPD (simulated years per day; higher is better). Solid: Eulerian SE transport w/lim9. Dashed: Pointwise SL + QLT. Red: Cori (KNL). Green: Edison (HSW). Annotation: 3.2x speedup (Edison).]
• 6x transport speedup
  – 3x from SL: larger time step, smaller data movement requirements
  – 2x from MPI (NGD): reduced communication volume
• 3x atm. dycore speedup
  – At scale
  – Approx. same for KNL
• 2x full atm. speedup
  – Physics now largest component
Full EAM (dyn., trans., & phys.) performance
[Figure: Compute time (min) for Total, Transport, and Dynamics, EAM v1 vs. EAM/SL, "Max" timers. Panel (a): Δx = 1°, 5 years. Panel (b): Δx = 0.25°, 5 days. Annotation: 2.1x speedup (Anvil, 2700 ranks).]
More details in Thursday’s poster session
• Comparison of costs vs. resolution
• And more!
Unexpected result: ~10% to ~20% speedup in dynamics solver
• Better use of fast memory (L3 cache)
• Processes spend less time in MPI_Wait
Algorithms & software development
• Cell-integrated remap SL transport
• Highlight:
  – Conservative multi-moment transport along characteristics for discontinuous Galerkin methods, SIAM J. Sci. Comput., 2019 (Accepted)
• Insight: High-order, remap SL …
  – Requires at least 1 global collective per timestep to achieve tracer consistency and shape preservation
  – Still lower communication volume than SL flux-form
  – Can efficiency be improved further?
[Figure: EAM Transport. Normalized wall time vs. NQ (0 to 40) for SE and JCT. Annotation: 2.24x speedup (2400 ranks, Mutrino).]
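To make "cell-integrated remap SL" concrete, here is a deliberately minimal sketch: first-order (piecewise-constant), 1D, periodic, uniform grid, constant velocity. It is not the high-order multi-moment method cited above; the function name and the test field are hypothetical. Each new cell average is the integral of the old field over the cell's departure interval, so mass is conserved by construction and the Courant number may exceed 1.

```python
import numpy as np


def remap_step(qbar, courant):
    """One first-order cell-integrated SL step on a periodic uniform 1D grid.

    qbar    : cell averages at the old time
    courant : u*dt/dx for a constant velocity u (may exceed 1 in magnitude)
    """
    k = int(np.floor(courant))   # whole cells travelled
    alpha = courant - k          # remaining fraction of a cell, in [0, 1)
    # The departure interval of target cell i overlaps old cell i-k-1 over a
    # fraction alpha and old cell i-k over a fraction 1-alpha.
    return alpha * np.roll(qbar, k + 1) + (1.0 - alpha) * np.roll(qbar, k)


# Hypothetical test field: a square pulse on 32 cells.
q0 = np.zeros(32)
q0[8:14] = 1.0

q1 = remap_step(q0, courant=2.5)   # time step well beyond the Eulerian CFL limit
print("mass before:", q0.sum(), "mass after:", q1.sum())
print("min/max after:", q1.min(), q1.max())
```

Because the weights are a convex combination, this low-order version is also shape preserving; the slide's point is that high-order remap loses that property and needs extra (global) work to recover it.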
Cell-integrated SL vs. Pointwise interpolation SL
• Method: Cell-integrated remap
  – Provides: (1) Conservation, (2) Accuracy, (5) Efficiency (~2x)
  – Needs (Problem A): (3) Shape preservation, (4) Tracer consistency
• Method: Pointwise interpolation
  – Provides: (2) Accuracy, (5) Efficiency (~3x)
  – Needs (Problem B): (1) Conservation, (3) Shape preservation, (4) Tracer consistency
Claim 1: An algorithm that provides locally bounds-preserving shape preservation is sufficient to also provide tracer consistency.
Claim 2: Solving Problem A without losing conservation is equivalent to solving Problem B.
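One way to read Claim 1, sketched under stated assumptions: tracer consistency means a spatially uniform mixing ratio stays uniform. If a limiter confines each new value to the minimum and maximum of the old values in its neighborhood, then for a uniform field both bounds collapse to the same constant and the limited field must equal it. The function name, the periodic 1D neighborhood, and the perturbation are hypothetical; in an actual SL scheme the neighborhood would be the departure region.

```python
import numpy as np


def clamp_to_local_bounds(q_provisional, q_old, halo=1):
    """Clamp provisional values to the local min/max of q_old.

    The neighborhood of cell i is cells i-halo..i+halo on a periodic 1D grid
    (a stand-in for the departure region of a real scheme).
    """
    neighbors = np.stack([np.roll(q_old, s) for s in range(-halo, halo + 1)])
    lo = neighbors.min(axis=0)
    hi = neighbors.max(axis=0)
    return np.clip(q_provisional, lo, hi)


# Uniform old field; the provisional update carries an inconsistency error.
q_old = np.full(8, 0.3)
q_provisional = q_old + 0.05 * np.sin(np.arange(8))

q_new = clamp_to_local_bounds(q_provisional, q_old)
print("uniform field recovered:", np.array_equal(q_new, q_old))
```

The clamp alone changes total mass in general, which is where Claim 2 comes in: the bounds must be enforced without losing conservation.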
Algorithms & software
• Stabilized interpolation SL transport with Communication-efficient Density Reconstruction (CDR)
• Highlight:
  – Communication-efficient property preservation in tracer transport, SIAM J. Sci. Comput. 41(3), C161–C193, 2019.
  – Mass conservation, tracer consistency, and shape preservation solved in smallest possible number of global collectives
  – Verification:
    • Improved accuracy in space
    • Convergence in time with simplified physics
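The kind of problem a density reconstruction step addresses can be sketched as follows: given provisional mixing ratios from interpolation SL, find nearby values that respect per-cell bounds and reproduce a prescribed total mass. The clip-and-redistribute routine below is a simple illustrative approach, not the COMPOSE implementation or the algorithm of the cited paper; all names are hypothetical. It shows why a global reduction is needed: the mass discrepancy and the available capacity are sums over all cells.

```python
import numpy as np


def clip_and_redistribute(q, w, lo, hi, target_mass):
    """Enforce lo <= q <= hi and sum(w*q) == target_mass.

    q : provisional mixing ratios; w : positive cell weights (masses);
    lo, hi : scalar or per-cell bounds. Raises if the target is unreachable.
    """
    w = np.asarray(w, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), lo, hi)
    # In a distributed code, these sums are where the global collective
    # would occur (they could be packed into a single reduction).
    deficit = target_mass - np.dot(w, q)
    capacity = w * (hi - q) if deficit > 0 else w * (q - lo)
    total = capacity.sum()
    if abs(deficit) > total + 1e-14 * max(1.0, abs(target_mass)):
        raise ValueError("target mass is not reachable within the bounds")
    if total > 0.0:
        # Spread the discrepancy in proportion to each cell's spare capacity.
        q = q + deficit * capacity / (total * w)
    return q


# Provisional values with an overshoot and an undershoot; unit weights.
q_prov = np.array([1.2, -0.1, 0.5, 0.3])
weights = np.ones(4)
q_fixed = clip_and_redistribute(q_prov, weights, 0.0, 1.0, target_mass=1.9)
print(q_fixed, "mass:", np.dot(weights, q_fixed))
```

With bounds taken from the local extrema of the previous field, the same routine returns a uniform field unchanged, which ties this step back to tracer consistency.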
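[Figure: EAM Dycore SYPD. Annotation: 3.2x speedup (Edison).]
[Figure: Convergence in time. l2 error in ps vs. Δt (50 down to 5) for configurations f0r1, f0r2, f0r4, f4r1, f4r4, with an O(Δt) reference line.]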
• Highlight:
  – Software release
  – COMPOSE: Library for communication-efficient, property-preserving, semi-Lagrangian tracer transport
• Climate validation
• Integration for E3SM v2
  – CDR/ISL transport
  – Time step coupling
  – Already in E3SM Master
Algorithms & software
[Figure: EAM v1 vs. EAM/SL. Relative mean difference and relative RMSE, computed as 100 × {mean difference, RMSE}/(max - min), in %, for RESTOM, RESTOM (DJF), LWCF, SWCF, NETCF, NETCF (DJF), PRECT, PRECT (DJF), PSL, TREFHT, T850, U850, U200, U200 (DJF), OMEGA850, OMEGA200. Vertical axis from -1 to 5.]
Hi-res science: Physics for hi-res
• BER Call for Proposals, March 2019
  – Aerosol & aerosol-cloud interaction identified as “crucial” for E3SM v3, v4
  – Accurate representation may require additional tracer species
  – Multi-tracer efficiency of new SL methods provides significant advantages over E3SM v1
Hi-res science: Quantifying non-hydrostatic effects
• Vertical resolution requirements in hydrostatic vs. non-hydrostatic models (right)
• Multiscale interactions
  – Physically accurate or numerical artifacts?
• Results: EAM v3, v4 may require increased number of levels to reduce numerically-triggered small-scale features
[Figure: latitude vs. z (km) cross-sections.]
More details in Tuesday’s poster session
• Comparison of NH to H with baroclinic instability
• And more …
Symmetric instability
• Small-scale process
• Intensifies rain bands in extratropical cyclones
  – Large effects on regional precip.
• Diagnosed by change in sign of EPV (right)
  – EPV now in EAM
• Results: EAM-NH capable of representing sym. inst., EAM-H is not
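A sketch of how a sign-change diagnostic might be applied once EPV (Ertel potential vorticity) is available as a model field. It uses the textbook criterion that symmetric instability is possible where the product of the Coriolis parameter f and EPV is negative, i.e. where EPV has the "wrong" sign for its hemisphere; whether EAM applies exactly this test is an assumption. The function name and sample values are hypothetical.

```python
import numpy as np

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s


def symmetric_instability_mask(epv, lat_deg):
    """True where f * EPV < 0 (textbook symmetric-instability criterion).

    epv     : Ertel potential vorticity values
    lat_deg : latitude in degrees (scalar or array broadcastable to epv)
    """
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))  # Coriolis parameter
    return f * np.asarray(epv, dtype=float) < 0.0


# Hypothetical EPV samples at 45N: one normal, one with reversed sign.
epv_samples = np.array([1.0e-6, -2.0e-7])
mask_north = symmetric_instability_mask(epv_samples, 45.0)
mask_south = symmetric_instability_mask(epv_samples, -45.0)
print("45N:", mask_north, " 45S:", mask_south)
```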
Proposed for Phase 2
• Algorithms/Software
  – SL Trans. for MPAS Ocn (right)
    • No degradation in accuracy, conservation, consistency
    • Significant performance gains, esp. BGC
  – High-order tracer transport
    • Improved resolution of small scales
    • No timestep penalty
• NH Science
  – Radiative-Convective Eq.
    • RCEMIP contribution
  – Tropical cyclone climo.
  – Condensate loading
[Figure: MPAS Ocn. time step coupling, panels (a) and (b). Time steps shown: barotropic dyn., Δt_bt; baroclinic dyn., Δt_bc; remap, Δt_rmp; physics, Δt_phys; transport, Δt_trn. Intervals marked: t_n ↦ t_n + R Δt_bc and t_n ↦ t_n + RC Δt_bc. Other legend entries: adv. time scale; T, V tends.; tracer tends.; ISL step.]
[Figure: ISL vs. FCT (1d).]
Summary
• Phase 1 impacts:
  – Model speedup @ 0.25 deg: Transport up to 6x, Dycore 3x, EAM 2x
  – EAM v2 with 120 tracers now runs as fast as v1 with 40 tracers
  – Aerosol modeling no longer constrained by transport
  – Flexible time-step coupling methods
  – COMPOSE Software already employed by other projects (LDRD, SciDAC)
  – Better understanding of NH effects in EAM, how to diagnose them
• Phase 2 impacts:
  – Transfer algorithms & tailor an implementation for MPAS Ocean
  – Improve perf. of BGC campaign
  – Add resolution to tracers without time step penalty
  – RCEMIP experiments
  – Hi-res climatology
[Figure: Atmosphere w/SL Transport, by component (Coupler/Other, Physics/Chem, Dynamics, Tracer Advection), with the time saved labeled "Now Available".]