VAPOR: A desktop environment for interactive exploration of large scale CFD simulation data
John Clyne, Alan Norton
National Center for Atmospheric Research
Boulder, CO USA
This work is funded in part through U.S. National Science Foundation grants 03-25934 and 09-06379, and through a TeraGrid GIG award.
Computational and Information Systems Laboratory, National Center for Atmospheric Research
9/23/2011
FRHPCS Sept. 23, 2011
Outline
• Problem motivation
• VAPOR overview – what makes VAPOR unique?
  – Data model
  – Earth and space sciences focus
  – Analysis capabilities
• Laptop demonstration
• Future directions
Solar thermal starting plume
Computed at the dawn of terascale computing
Mark Rast, NCAR/CU, 2003
• 2003 - Simulation
  – 6 months run time
  – 504x504x2048 grid
  – 5 variables (u, v, w, rho, temp)
  – ~500 time steps saved
  – 9 TBs storage (4 GBs/var/time step)
  – 112 IBM SP RS/6000 processors
• 2004 - Post-processing
  – 3 months
  – 3 derived variables (vorticity components)
• 2004 - Analysis
  – Abandoned!!!
Why did Mark give up?
1. Analysis tools didn’t scale
2. Analysis tools didn’t have features needed for in-depth analysis
• 2006 - Analysis resumed
• 2007 - New Journal of Physics publication
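The storage figures quoted for the starting plume run can be checked with a little arithmetic. The sketch below assumes 8-byte (double precision) values and binary units (1 GB = 2^30 bytes); neither is stated on the slide, but both are consistent with the quoted 4 GBs per variable per time step.

```python
# Back-of-envelope check of the starting plume storage figures.
# Assumption (not stated on the slide): each value is an 8-byte double,
# and "GB"/"TB" are binary units (2**30 and 2**40 bytes).
BYTES_PER_VALUE = 8
GIB = 2**30
TIB = 2**40

nx, ny, nz = 504, 504, 2048   # grid from the slide
n_vars = 5                    # u, v, w, rho, temp
n_steps = 500                 # "~500 time steps saved"

bytes_per_var_step = nx * ny * nz * BYTES_PER_VALUE
total_bytes = bytes_per_var_step * n_vars * n_steps

gib_per_var_step = bytes_per_var_step / GIB
total_tib = total_bytes / TIB

print(f"{gib_per_var_step:.2f} GB per variable per time step")
print(f"{total_tib:.2f} TB for {n_vars} variables x {n_steps} time steps")
```

Under these assumptions the result lands close to the slide's rounded figures of 4 GBs per variable per time step and 9 TBs in total.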
VAPOR Project Goals
• Improve scientific productivity by facilitating interactive analysis and exploration of the largest numerical simulation outputs without the need for Herculean interactive computing resources
And…
• Change role of Advanced 3D Visualization in sciences
From: a scientific finale
• Pictures for publication and presentation
• Performed by visualization experts
To: an integral part of the scientific discovery process
• Visual data analysis aiding investigation
• Performed by scientists
Key Components: What makes VAPOR different from other tools?
1. Domain specific: earth and space sciences CFD
2. Data analysis: qualitative and quantitative data interrogation and manipulation capabilities
3. Terabytes from the desktop: operates on terascale sized simulations with only desktop computing
Key component (1) Earth and space sciences CFD focus
• Scientific steering committee guides development
  – 18 scientists from a broad set of disciplines
• Algorithms
  – General purpose
    • E.g. volume rendering, isosurfaces, cutting planes, histograms
  – Specialized for earth sciences CFD
    • E.g. steady and unsteady flow visualization
    • Geo-referenced data (e.g. map projections, lat-lon coordinates)
    • Physically based feature tracking
Key component (1) Earth sciences CFD focus
• Data types and grids:
  – Regular grids
    • Uniform, terrain following, staggered
  – Block structured AMR
  – Spherical (prototype)
  – Temporal data with non-uniform sampling
• Domain specific Graphical User Interface
  1. Features you need are there (hopefully!)
  2. Features you don’t need are not there
  • => improved ease-of-use
Specialized algorithms: Spherical shell data volume rendering
• Simulation of deep convection in convection zones of solar-like stars
• Grid geometry is a spherical shell, covering all latitudes and longitudes and spanning a depth of 0.72-0.96 solar radii
• Non-uniform grid spacing in latitude and radial axes
• Image courtesy of Ben Brown, University of Colorado
Specialized algorithms: Magnetic field line advection
• Combines steady and unsteady flow integration to advect field lines in a time-varying velocity field
  – Algorithm proposed by Aake Nordlund, Niels Bohr Institute and Pablo Mininni, U. Buenos Aires [Mininni et al, 2008]
[Figure: a field line and its seed points S and P at Time T, and the advected points S’ and P’ at Time T+1]
Data courtesy Pablo Mininni
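The slide does not spell out the algorithm's details, so the following is only a minimal sketch of the general idea of combining the two kinds of integration: trace a field line through the magnetic field at a frozen time (steady integration), carry a seed point forward with the time-varying velocity (unsteady integration), then re-trace the line at the new time. The analytic fields `B` and `v`, the function names, and the choice to advect only the seed point are all illustrative assumptions, not VAPOR's implementation.

```python
import numpy as np

def rk4_step(f, x, t, h):
    """One classical 4th order Runge-Kutta step for dx/dt = f(x, t)."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
    k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
    k4 = f(x + h * k3, t + h)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def trace_field_line(B, seed, t, n_steps=20, ds=0.1):
    """Steady integration: follow B at a frozen time t, starting at seed."""
    pts = [np.asarray(seed, dtype=float)]
    frozen = lambda x, _s: B(x, t)  # time is held fixed while tracing
    for _ in range(n_steps):
        pts.append(rk4_step(frozen, pts[-1], 0.0, ds))
    return np.array(pts)

def advect_seed(v, seed, t0, t1, n_steps=10):
    """Unsteady integration: carry the seed with the time-varying velocity v."""
    x = np.asarray(seed, dtype=float)
    h = (t1 - t0) / n_steps
    for i in range(n_steps):
        x = rk4_step(v, x, t0 + i * h, h)
    return x

# Hypothetical analytic fields, chosen only so the result is easy to verify:
# a uniform field along x and a uniform flow along y.
B = lambda x, t: np.array([1.0, 0.0, 0.0])
v = lambda x, t: np.array([0.0, 1.0, 0.0])

S = np.array([0.0, 0.0, 0.0])
line_T = trace_field_line(B, S, t=0.0)          # field line at time T
S_prime = advect_seed(v, S, 0.0, 1.0)           # seed carried to time T+1
line_T1 = trace_field_line(B, S_prime, t=1.0)   # field line at time T+1
```

With these toy fields the line at time T lies along the x axis, and the line at time T+1 is the same line shifted by one unit in y.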
Key component (2) Data analysis
1. Quantitative information available throughout GUI
  – E.g. histograms, probes, annotation, user coordinates
2. GUI supports visualization-aided analysis
3. Coupled with IDL® to calculate and visualize derived quantities in region-of-interest
  – Immediate analysis applied to data identified in visualization
  – Immediate visualization of derived quantities calculated in IDL
    • Identify region of interest
    • Export to IDL session
    • Import result into visualization
4. Integrated Python (numpy/scipy) calculation engine
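As an illustration of the kind of derived quantity such a calculation engine might compute on a region of interest, here is a minimal numpy sketch that evaluates the vorticity components from velocity components u, v, w. It is not VAPOR's API: the function name, the `[x, y, z]` array layout, the uniform grid spacing, and the test flow are assumptions made for the example.

```python
import numpy as np

def vorticity(u, v, w, dx=1.0, dy=1.0, dz=1.0):
    """Curl of (u, v, w) on a uniform grid; arrays are indexed [x, y, z]."""
    du_dy = np.gradient(u, dy, axis=1)
    du_dz = np.gradient(u, dz, axis=2)
    dv_dx = np.gradient(v, dx, axis=0)
    dv_dz = np.gradient(v, dz, axis=2)
    dw_dx = np.gradient(w, dx, axis=0)
    dw_dy = np.gradient(w, dy, axis=1)
    return dw_dy - dv_dz, du_dz - dw_dx, dv_dx - du_dy

# Hypothetical test flow: solid body rotation about the z axis,
# whose vorticity is (0, 0, 2) everywhere.
x = np.arange(8.0)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
u, v, w = -Y, X, np.zeros_like(X)

# Restrict the calculation to a region of interest before deriving anything.
roi = (slice(2, 6), slice(2, 6), slice(2, 6))
wx, wy, wz = vorticity(u[roi], v[roi], w[roi])
```

Cutting out the region of interest first keeps the derived-variable calculation proportional to the ROI size rather than to the full volume.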
Key Component (3) Terabyte data handling from a desktop PC (or laptop)
• Progressive data access
  – Permit speed/quality tradeoffs
  – Region of Interest (ROI) identification and isolation
• Two wavelet-based models:
  – VDC1: multiresolution
  – VDC2: multiresolution and coefficient prioritization
The combination of visualization, ROI isolation, and multiresolution data representation provides sufficient data reduction to enable interactive work
[Diagram: cycle of visual data browsing, data manipulation, and quantitative analysis, with refine and coarsen steps]
Think Google Earth!!!
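To give a feel for how much data reduction ROI isolation and power-of-two coarsening can deliver together, here is a small sketch. The function `reduced_bytes` is hypothetical, and 4-byte values are an assumption.

```python
def reduced_bytes(dims, roi_fraction, level, bytes_per_value=4):
    """Bytes to read for an ROI at a given coarsening level.

    dims:         native grid dimensions, e.g. (1024, 1024, 1024)
    roi_fraction: fraction of each axis covered by the ROI (0 < f <= 1)
    level:        number of power-of-two coarsenings (0 = native resolution)
    Assumption: 4-byte values unless bytes_per_value says otherwise.
    """
    n = 1
    for d in dims:
        # Each coarsening level halves the sample count along every axis.
        n *= max(1, int(d * roi_fraction) >> level)
    return n * bytes_per_value

native = reduced_bytes((1024, 1024, 1024), 1.0, 0)
browse = reduced_bytes((1024, 1024, 1024), 0.25, 2)
print(native // browse)  # reduction factor for a quarter-axis ROI, two levels coarser
```

In three dimensions each coarsening level cuts the data by a factor of 8, and shrinking the ROI to a quarter of each axis cuts it by a further factor of 64, so the two effects multiply.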
Computing technology performance increases from 1977 to 2006
Orders of magnitude difference between improvements in CPU speed and IO bandwidth
Balance between compute and IO is changing rapidly
Moore’s Law does not apply to these!!!
[Chart: storage capacity & bandwidth versus time]
What does this mean for data analysis and visualization?
What is meant by interactive analysis?
Definition: A system is interactive if the time between a user event and the response to that event is short enough to maintain my full attention
If the response time is…
1-5 seconds: I’m engaged
5-60 seconds: I’m tapping my foot
1-3 minutes: I’m reading email
> 3 minutes: I’ve forgotten why I asked the question!
Mark Rast, University of Colorado, 2005
[Chart: Wait time in seconds for reading a 3D scalar volume. X axis: volume resolution (512, 804, 1024, 4096, 12288). Y axis: time in seconds, log scale from 0.01 to 100000. Curves: 100 MB/sec, 400 MB/sec, 4000 MB/sec, 24,000 MB/sec. Platform labels: Desktop, Workstation, Vis cluster, 10% Jaguar. Annotations: Starting plume; ES, Kaneda 2003; NSF Track 1 turbulence app.; Interactive. Volume sizes: (7 TBs) (2 GBs) (4 GBs) (262 GBs)]
Inadequate IO performance alone can preclude interactive work such as visualization and analysis!!! What can be done?
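The wait times on the chart follow from dividing volume size by read bandwidth. The sketch below recomputes them for the resolutions and bandwidths listed on the chart. It assumes 4-byte values and 1 MB = 10^6 bytes, which is roughly consistent with the volume sizes quoted on the chart but is not stated there. The value 804 is approximately the cube root of 504 x 504 x 2048, the starting plume grid.

```python
# Assumption: 4-byte (single precision) values, 1 MB = 10**6 bytes.
BYTES_PER_VALUE = 4

def read_time_seconds(n, mb_per_sec):
    """Seconds to read one n^3 scalar volume at the given bandwidth."""
    return n**3 * BYTES_PER_VALUE / (mb_per_sec * 1e6)

resolutions = [512, 804, 1024, 4096, 12288]   # from the chart's x axis
bandwidths = [100, 400, 4000, 24000]          # MB/sec, from the chart's curves

for n in resolutions:
    times = [read_time_seconds(n, bw) for bw in bandwidths]
    print(n, ["%.2f" % t for t in times])
```

By the definition of interactivity given earlier, a 4096^3 volume read at 100 MB/sec is far past the 3 minute mark, while a 512^3 volume at 4000 MB/sec comes back in a fraction of a second.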
Discrete Wavelet Transforms
• Discrete Fourier transform
\[ f(t) = \frac{1}{N} \sum_{n=0}^{N-1} a_n e^{j 2\pi n t / N}, \quad 0 \le t \le N-1 \]
• Discrete Wavelet Transform
\[ f(t) = \sum_k c(k)\,\phi_k(t) + \sum_k \sum_{j=0}^{\log_2 N} d_j(k)\,\psi_{j,k}(t) \]
  – First sum: scaling term (coarse representation of signal)
  – Second sum: detail term (high frequency components of signal)
\[ \phi(t) = \sum_k h_\phi(k)\,\sqrt{2}\,\phi(2t - k), \quad k \in \mathbb{Z} \quad \text{(scaling function)} \]
\[ \psi(t) = \sum_k h_\psi(k)\,\sqrt{2}\,\phi(2t - k), \quad k \in \mathbb{Z} \quad \text{(wavelet function)} \]
  – Properties
    • Multiresolution representation
    • Efficient: linear time complexity
    • Adaptable: can represent functions with discontinuities, bounded domains, and arbitrary topology
    • Time frequency localization: many coefficients are zero or close to zero
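A concrete instance of the expansion is easiest to see with the Haar wavelet. The sketch below is a minimal orthonormal Haar transform in numpy; the choice of Haar and the function names are assumptions for illustration, since the slides do not say which wavelet VAPOR uses. It shows the scaling and detail coefficients, perfect reconstruction, and how a piecewise constant signal leaves most detail coefficients at zero.

```python
import numpy as np

def haar_forward(signal):
    """Full orthonormal Haar decomposition of a signal of length 2^J.

    Returns (c, details): c holds the single coarsest scaling coefficient,
    details is a list of detail coefficient arrays, coarsest to finest.
    """
    c = np.asarray(signal, dtype=float)
    details = []
    while len(c) > 1:
        even, odd = c[0::2], c[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail term d_j(k)
        c = (even + odd) / np.sqrt(2.0)              # scaling term c(k)
    return c, details[::-1]

def haar_inverse(c, details):
    """Rebuild the signal from scaling and detail coefficients."""
    c = np.asarray(c, dtype=float)
    for d in details:  # coarsest to finest
        out = np.empty(2 * len(c))
        out[0::2] = (c + d) / np.sqrt(2.0)
        out[1::2] = (c - d) / np.sqrt(2.0)
        c = out
    return c

signal = np.array([4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0])
c, details = haar_forward(signal)
all_coeffs = np.concatenate([c] + details)
n_nonzero = int(np.sum(~np.isclose(all_coeffs, 0.0)))
reconstructed = haar_inverse(c, details)
```

Each pass over the data touches half as many samples as the previous one, which is where the linear time complexity comes from.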
Wavelet compression and progressive access (VDC1): Frequency truncation method
• Truncate “j” parameter of expansion:
\[ f(t) = \sum_k c(k)\,\phi_k(t) + \sum_k \sum_{j=0}^{\log_2 N} d_j(k)\,\psi_{j,k}(t) \]
• Provides coarsened approximation at power-of-two increments
• Good:
  – Simple
  – Fast
  – Maintains structure of original grid
• Bad:
  – Limited to power-of-two reductions
  – Compression quality
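Frequency truncation can be sketched for the simplest case. With the Haar wavelet (an assumption here; the slides do not name the wavelet), dropping the finest detail bands leaves scaling coefficients that are, up to normalization, 2x2x2 block averages of the original grid. The function `coarsen` below is hypothetical and requires every dimension to be divisible by 2 at each level.

```python
import numpy as np

def coarsen(volume, levels):
    """Drop the `levels` finest Haar detail bands of a 3D array.

    Assumption: Haar wavelet, so the surviving scaling coefficients are
    (up to normalization) 2x2x2 block averages. Each level halves every
    dimension, giving the power-of-two reductions noted on the slide.
    """
    v = np.asarray(volume, dtype=float)
    for _ in range(levels):
        nx, ny, nz = (s // 2 for s in v.shape)
        v = v.reshape(nx, 2, ny, 2, nz, 2).mean(axis=(1, 3, 5))
    return v

vol = np.arange(64.0).reshape(4, 4, 4)
half = coarsen(vol, 1)      # 2 x 2 x 2
eighth = coarsen(vol, 2)    # 1 x 1 x 1
```

The coarsened array is still a regular grid, which is the "maintains structure of original grid" advantage; the cost is that the only available reductions are factors of 8 in 3D.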
Wavelet compression and progressive access (VDC2): Coefficient prioritization method
• Goal: prioritize coefficients used in linear expansion
\[ f(t) = \sum_{n=0}^{N-1} a_n u_n(t), \quad \text{original } f(t) \]
\[ \hat{f}(t) = \sum_{m=0}^{M-1} a_{\pi(m)} u_{\pi(m)}(t), \quad (M < N), \quad \text{compressed } f(t) \]
where \(\pi\) orders the coefficients so that the first M are the ones kept.
L2 error given by:
\[ L_2 = \left\| f(t) - \hat{f}(t) \right\|_2^2 \]
If u(t) (φ(t) and ψ(t) in case of wavelet expansion functions) are orthonormal, then
\[ L_2 = \left\| f(t) - \hat{f}(t) \right\|_2^2 = \sum_{i=M}^{N-1} \left( a_{\pi(i)} \right)^2, \quad \text{where } a_{\pi(i)} \text{ are discarded coefficients} \]
\[ \text{orthonormal:} \quad \langle u_k(t), u_l(t) \rangle = \int u_k(t)\,u_l(t)\,dt = \begin{cases} 0, & k \ne l \\ 1, & k = l \end{cases} \]
• The error is the sum of the squares of the coefficients we leave out!
• So to minimize the L2 error, we simply discard (or delay transfer of) the smallest coefficients!
• If discarded coefficients are zero, there is no information loss!
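The identity between the L2 error and the energy of the discarded coefficients can be checked numerically. The sketch below builds an orthonormal Haar basis as a matrix, keeps the M largest-magnitude coefficients of a test signal, and compares the reconstruction error with the sum of squares of what was dropped. The Haar basis, the random test signal, and all names are assumptions for illustration.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar basis as the rows of an n x n matrix (n a power of two)."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                    # coarser basis functions
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])   # finest wavelets
    return np.vstack([top, bottom]) / np.sqrt(2.0)

rng = np.random.default_rng(0)
N, M = 64, 16
f = np.cumsum(rng.standard_normal(N))   # hypothetical test signal

U = haar_matrix(N)
a = U @ f                                # expansion coefficients a_n

order = np.argsort(-np.abs(a))           # pi: largest magnitude first
kept, discarded = order[:M], order[M:]

a_hat = np.zeros(N)
a_hat[kept] = a[kept]
f_hat = U.T @ a_hat                      # compressed reconstruction

l2_error = np.sum((f - f_hat) ** 2)
discarded_energy = np.sum(a[discarded] ** 2)

# For comparison: keep the first M coefficients in natural order instead.
natural_error = np.sum(a[M:] ** 2)
```

Because the basis is orthonormal, `l2_error` and `discarded_energy` agree to rounding, and no other choice of M coefficients (such as the natural-order one) can give a smaller error.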
Solar thermal plume at varying resolutions (VDC1) [M. Rast, 2006]
504^2 x 2048 (native), 252^2 x 1024, 126^2 x 512, 63^2 x 256
What have we lost???
Magnetic field line integration resolution comparison (VDC1)
1536^3, 768^3, 384^3, 192^3
• 1536^3 MHD simulation
• 4th order Runge-Kutta
• Mininni et al. (2007)
100:1 Compression: 1024^3 Taylor-Green turbulence (enstrophy field) [P. Mininni, 2006]
No compression versus coefficient prioritization (VDC2)
4096^3 homogeneous turbulence simulation output visualized with VAPOR 2.0
Volume rendering of original enstrophy field and 800:1 compressed field
Data provided by P.K. Yeung at Georgia Tech and Diego Donzis at Texas A&M
Original: 275 GBs/field
800:1 compressed: 0.34 GBs/field
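The per-field sizes quoted for the 4096^3 turbulence data follow directly from the grid size and the compression ratio. This quick check assumes 4-byte values and 1 GB = 10^9 bytes, neither of which is stated on the slide.

```python
# Assumption: 4-byte (single precision) values, 1 GB = 10**9 bytes.
n = 4096
bytes_per_value = 4
compression_ratio = 800

original_gb = n**3 * bytes_per_value / 1e9
compressed_gb = original_gb / compression_ratio

print(f"original:   {original_gb:.0f} GB per field")
print(f"compressed: {compressed_gb:.2f} GB per field")
```

At roughly a third of a gigabyte, the compressed field fits comfortably in laptop memory, which is the point of the comparison.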
Live demos on a laptop
MHD decay [Mininni et al., PRL 97, 244503 (2006)]
  – 1536^3, pseudo-spectral method
  – 12 GBs/variable/time-step
  – Exhibits new finding in MHD: current “folding” and “roll-up”
Compressible convection simulation [Rast et al., 2000]
  – 512^2 x 256
  – Horizontally periodic, dimensions 6x6x1 (deep), constant heat flux into the bottom, constant temperature on top
Future directions
• Broadening scientific end user community
  – E.g. weather researchers, ocean modelers
    • Geo-referenced 2D & 3D data
    • Stretched grids
    • Missing & undefined data
• VAPOR progressive access data model
  – VAPOR data importers for VisIt and ParaView
  – Fortran-callable, distributed memory (MPI) API
• Extensible architecture
  – Facilitate 3rd party development
VAPOR Summary
• Progressive data access
  – Enables interactive exploration of massive datasets
  – Hypotheses may be interactively explored with coarsened data and later validated (perhaps non-interactively) with native data
• Visualization aided data analysis
  – Intended to be used by scientists, not visualization specialists
  – Requirements defined by a steering committee of scientists
• Narrow focus: Earth & space CFD simulations
  – Algorithms
  – Data types
• Emphasis on desktop/laptop platforms, not on visualization supercomputers
Acknowledgements
• Steering Committee
  – Benjamin Brown - U. of Wisconsin
  – Nic Brummell - CU
  – Gerry Creager - Texas A&M
  – Yuhong Fan - NCAR, HAO
  – Aimé Fournier - NCAR, IMAGe
  – Pablo Mininni - NCAR, IMAGe
  – Aake Nordlund - University of Copenhagen
  – Leigh Orf - Central Michigan U.
  – Yannick Ponty - Observatoire de la Cote d'Azur
  – Thara Prabhakaran - U. of Georgia
  – Annick Pouquet - NCAR, ESSL
  – Mark Rast - CU
  – Duane Rosenberg - NCAR, IMAGe
  – Matthias Rempel - NCAR, HAO
  – Geoff Vasil - CU
• Developers
  – John Clyne - NCAR, CISL
  – Dan Lagreca - NCAR, CISL
  – Alan Norton - NCAR, CISL
  – Kenny Gruchalla - NREL
  – Victor Snyder - CSM
  – Kendal Southwick - NCAR, CISL
• Research Collaborators
  – Kwan-Liu Ma - U.C. Davis
  – Hiroshi Akiba - U.C. Davis
  – Han-Wei Shen - Ohio State
  – Liya Li - Ohio State
• Systems Support
  – Joey Mendoza - NCAR, CISL
  – Pam Gilman - NCAR, CISL
Questions???
www.vapor.ucar.edu [email protected]