Mining Turbulence Data
Ivan Marusic
Department of Aerospace Engineering and Mechanics, University of Minnesota
Collaborators: Victoria Interrante, George Karypis, Vipin Kumar, Graham Candler, Ellen Longmire, Sean Garrick
Acknowledgement: National Science Foundation
Mathematical Challenges in Scientific Data Mining, IPAM, 14-18 January 2002
Flow direction
Solid surface
Turbulent Boundary Layer (Flow visualization using Al flakes in water channel)
Outline
• Turbulent boundary layers: introduction and background
  - Need for both simulation and experimental datasets
• Visualization and feature extraction
  - What are the important features? What is to be “data mined”?
• Difficulties with present analysis approach
• New analysis strategy to investigate causal relationships
• Data mining issues and challenges
Turbulent Boundary Layer
Responsible for heat transfer, skin friction (drag), mixing of scalars
Issues in wall turbulence
• Described by Navier-Stokes equations (non-linear PDEs)
• Direct numerical simulation is restricted to low Re (Reynolds number)
  - Re = ratio of inertial to viscous forces (UL/ν)
  - No. of simulation grid points ~ Re^{9/4}, cost ~ Re^3
  - Present simulations: Re = O(10^3); required: Re = O(10^6)
• Also need experimental datasets to investigate high Re flows
• Better understanding of physics/causal relationships would lead to more accurate modeled simulation tools (CFD) and analytical scaling laws
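The scaling quoted above (grid points ~ Re^{9/4}, cost ~ Re^3) can be turned into a quick worked estimate of the gap between present and required Reynolds numbers. This is a minimal sketch; the function name `dns_scaling` is hypothetical, and only growth factors are computed, since the slides give no absolute prefactors.

```python
def dns_scaling(re_now, re_target):
    """Growth factors when raising the Reynolds number of a DNS.

    Uses only the scalings quoted on the slide:
    grid points ~ Re^(9/4), cost ~ Re^3.
    """
    ratio = re_target / re_now
    grid_factor = ratio ** (9 / 4)
    cost_factor = ratio ** 3
    return grid_factor, cost_factor


# Present simulations Re = O(10^3), required Re = O(10^6).
grid_factor, cost_factor = dns_scaling(1e3, 1e6)
print(f"grid points grow by ~{grid_factor:.1e}x, cost by ~{cost_factor:.1e}x")
```

Under these scalings, a thousandfold increase in Re costs roughly a billion times more computation, which is why experimental datasets are needed alongside simulation.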
What features do we extract?
• Flow field information in (x, y, z, t): velocity u, pressure p, temperature, etc.
• Good candidate = Coherent vortex structures
Vortex identification using velocity gradient tensor
Flow topology classification
Isosurfaces of enstrophy and discriminant, at decreasing threshold levels
Volume rendered visualizations (DNS data, Re = 700)
Discriminant
Cross-section of “blue” vortex
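The two isosurface quantities named in these slides, enstrophy and the discriminant of the velocity gradient tensor, can be sketched as follows. This assumes incompressible flow (zero trace), the convention A[i][j] = du_i/dx_j, the discriminant D = (27/4)R^2 + Q^3 of the characteristic equation, and enstrophy taken as |ω|^2 (some authors include a factor 1/2). The function name `gradient_invariants` is hypothetical.

```python
import numpy as np


def gradient_invariants(A):
    """Invariants of the velocity gradient tensor A[..., i, j] = du_i/dx_j.

    Assumes incompressible flow, so the first invariant (the trace) is zero.
    Returns Q, R, the discriminant D, and the enstrophy |omega|^2.
    """
    A = np.asarray(A, dtype=float)
    Q = -0.5 * np.trace(A @ A, axis1=-2, axis2=-1)
    R = -np.linalg.det(A)
    # D > 0 means one real and two complex-conjugate eigenvalues,
    # i.e. locally swirling (focal) flow topology.
    D = 6.75 * R**2 + Q**3
    omega = np.stack(
        [
            A[..., 2, 1] - A[..., 1, 2],
            A[..., 0, 2] - A[..., 2, 0],
            A[..., 1, 0] - A[..., 0, 1],
        ],
        axis=-1,
    )
    enstrophy = np.sum(omega**2, axis=-1)
    return Q, R, D, enstrophy


# Solid-body rotation about the third axis versus pure strain.
rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
strain = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
Q_rot, R_rot, D_rot, ens_rot = gradient_invariants(rotation)
Q_str, R_str, D_str, ens_str = gradient_invariants(strain)
```

In this toy case the rotation gives a positive discriminant and non-zero enstrophy, while the pure strain gives a negative discriminant and zero enstrophy.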
EXPERIMENTAL WIND TUNNEL FACILITY
PIV SETUP
Kodak Megaplus cameras, 1024 x 1024 pixels
Pulsed Nd:YAG lasers
= 15
In-plane Vorticity
In-plane Swirl
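For a single PIV plane only the two in-plane velocity components are available, so vorticity and swirl are computed from the 2x2 in-plane velocity gradient tensor. The sketch below assumes a uniform grid with arrays indexed [iy, ix], and takes swirl as the imaginary part of the complex eigenvalue of that 2x2 tensor. The function name and argument layout are hypothetical.

```python
import numpy as np


def inplane_vorticity_and_swirl(u, v, dx, dy):
    """In-plane vorticity and swirl strength on a uniform grid.

    u, v : 2D arrays indexed [iy, ix] (velocity components along x and y).
    Swirl is the imaginary part of the eigenvalues of the 2x2 gradient
    tensor, and is zero where the eigenvalues are real.
    """
    du_dy, du_dx = np.gradient(u, dy, dx)
    dv_dy, dv_dx = np.gradient(v, dy, dx)
    vorticity = dv_dx - du_dy
    half_trace = 0.5 * (du_dx + dv_dy)
    det = du_dx * dv_dy - du_dy * dv_dx
    # Eigenvalues are half_trace +/- sqrt(half_trace^2 - det).
    swirl = np.sqrt(np.maximum(det - half_trace**2, 0.0))
    return vorticity, swirl


x = np.linspace(-1.0, 1.0, 5)
y = np.linspace(-1.0, 1.0, 4)
X, Y = np.meshgrid(x, y)
dx, dy = x[1] - x[0], y[1] - y[0]

# Solid-body rotation: both vorticity and swirl are non-zero.
vort_rot, swirl_rot = inplane_vorticity_and_swirl(-Y, X, dx, dy)
# Pure shear: vorticity is non-zero but swirl vanishes.
vort_shear, swirl_shear = inplane_vorticity_and_swirl(Y, np.zeros_like(Y), dx, dy)
```

The shear case illustrates why swirl is a useful complement to vorticity: it responds to rotation but not to plain shear, which is strong near a wall.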
Difficulties with present analysis approach
Typical Turbulent Boundary Layer Simulation
• O(10^8) grid points
• Generates >10 Terabytes per day (every day)
• Write to disk once every 1000 time steps (99.9% discarded)
• Final database ~1 Terabyte
• All analysis is done after final database is obtained
Present approach
New analysis approach
Some important trigger events associated with drag
• “Bursting”
• High values of Reynolds shear stress (-uw) (associated with momentum transport)
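A trigger based on high Reynolds shear stress could be sketched as below. The slides do not say how "high" is defined, so the threshold used here (a multiple of the product of the rms fluctuations, similar to the hole size used in quadrant analysis) is an assumption, as are the function name and the `hole_size` parameter.

```python
import numpy as np


def shear_stress_events(u, w, hole_size=3.0):
    """Flag samples where -u'w' is high.

    u, w : samples of streamwise and wall-normal velocity at a fixed
    wall distance. Fluctuations are taken about the sample mean.
    Threshold (an assumption): hole_size * u_rms * w_rms.
    """
    u = np.asarray(u, dtype=float)
    w = np.asarray(w, dtype=float)
    u_fluct = u - u.mean()
    w_fluct = w - w.mean()
    minus_uw = -u_fluct * w_fluct
    threshold = hole_size * u_fluct.std() * w_fluct.std()
    return minus_uw, minus_uw > threshold


# Toy signal: one strong ejection-like sample (u' < 0, w' > 0).
u = [0.0] * 9 + [-9.0]
w = [0.0] * 9 + [9.0]
minus_uw, is_event = shear_stress_events(u, w)
```

In the new analysis approach, a flag like `is_event` would be the signal that decides when surrounding data is worth keeping.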
Example of bursting events
N.B. High –uw region
Panels: swirl (|λ_ci|), Reynolds shear stress, vorticity, wall-normal velocity
Consistent with “packets of vortices” (together with other evidence):
SIMPLE SEARCH ALGORITHM
Dual threshold search routine
Define connected region only if 8 neighboring points
To search for “packets of hairpin vortices”, define a region if there is positive vorticity at the bottom and negative vorticity at the top.
Additional search, in the region adjoining the patches of vorticity, for:
(a) Low streamwise velocity (low momentum)
(b) High Reynolds shear stress
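One way to read the search routine above is as hysteresis thresholding: a region is seeded wherever the field exceeds a high threshold and is grown through neighbors that exceed a low threshold. The sketch below makes that assumption, and also assumes that "8 neighboring points" means 8-connectivity on the grid. The function name and thresholds are hypothetical, and the hairpin-packet criterion is not implemented.

```python
from collections import deque


def dual_threshold_regions(field, low, high):
    """Find connected regions using two thresholds.

    A region is kept only if it contains at least one point >= high;
    it is grown through 8-connected neighbors with values >= low.
    Returns a list of regions, each a list of (row, col) tuples.
    """
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or field[r0][c0] < high:
                continue
            region = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        inside = 0 <= rr < rows and 0 <= cc < cols
                        if inside and not seen[rr][cc] and field[rr][cc] >= low:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            regions.append(region)
    return regions


# Toy field: one strong patch with a diagonal neighbor, one weak patch.
field = [
    [0, 0, 0, 0, 0],
    [0, 5, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 2],
]
regions = dual_threshold_regions(field, low=2, high=4)
```

The weak patch in the right-hand column is discarded because it never reaches the high threshold, while the diagonal neighbor of the strong point is kept through 8-connectivity. Running the same routine on the positive and negative parts of the vorticity field would give the patches needed for the packet criterion.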
z+ = 92 (all quantities non-dimensionalized using U_τ and ν)
Panels: vorticity, momentum, swirl strength, u’w’
Adrian, Meinhart & Tomkins (2000)
Modeling Data With Graphs: Beyond Transactions
Graphs are suitable for capturing arbitrary relations between the various objects.
Data Instance -> Graph Instance
Object -> Vertex
Object’s Attributes -> Vertex Label
Relation Between Two Objects -> Edge
Type Of Relation -> Edge Label
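The mapping from data instance to graph instance can be sketched with a small container in which objects become labeled vertices and relations become labeled edges. The class name and all labels below ("vortex+", "vertical-pair", and so on) are hypothetical illustrations, not labels taken from the slides, and edges are assumed to be undirected.

```python
from dataclasses import dataclass, field


@dataclass
class GraphTransaction:
    """One data instance modeled as a labeled, undirected simple graph."""

    vertex_labels: dict = field(default_factory=dict)  # vertex id -> label
    edge_labels: dict = field(default_factory=dict)    # frozenset({u, v}) -> label

    def add_object(self, vertex_id, label):
        # Object -> vertex, object's attributes -> vertex label.
        self.vertex_labels[vertex_id] = label

    def add_relation(self, u, v, label):
        # Relation between two objects -> edge, type of relation -> edge label.
        if u == v:
            raise ValueError("simple graph: self-loops are not allowed")
        if u not in self.vertex_labels or v not in self.vertex_labels:
            raise KeyError("add both objects before relating them")
        self.edge_labels[frozenset((u, v))] = label


# Hypothetical snapshot: two vorticity patches and a low-momentum zone.
snapshot = GraphTransaction()
snapshot.add_object(0, "vortex+")
snapshot.add_object(1, "vortex-")
snapshot.add_object(2, "low-momentum")
snapshot.add_relation(0, 1, "vertical-pair")
snapshot.add_relation(0, 2, "adjacent")
```

Keying edges by an unordered pair keeps the graph undirected and allows at most one edge between any two vertices.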
Frequent Subgraph Discovery (FSG, Karypis & Kuramochi 2001)
Interesting patterns -> frequent subgraphs
Discovering interesting patterns -> finding frequent, recurrent subgraphs
Efficient algorithms must be developed that operate on and take advantage of the new representation.
Finding Frequent Subgraphs: Input and Output
Problem setting: similar to finding frequent itemsets for association rule discovery
Input
• Database of graph transactions
  - Undirected simple graphs (no loops, no multiple edges)
  - Each graph transaction has labeled edges/vertices
  - Transactions may not be connected
• Minimum support threshold σ
Output
• Frequent subgraphs that satisfy the support threshold
  - Each frequent subgraph is connected
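To make the notion of support concrete, the sketch below counts support for the smallest connected subgraphs only, namely single labeled edges. This is not the FSG algorithm itself; it corresponds at most to the first level of a level-wise search, and growing candidates to larger subgraphs (with isomorphism checks) is omitted. The data layout, function names, and labels are hypothetical.

```python
def edge_patterns(graph):
    """Distinct one-edge patterns in a graph transaction.

    graph = (vertex_labels, edges), where vertex_labels maps id -> label
    and edges maps (u, v) -> edge label. Endpoint labels are sorted so
    that an undirected edge yields the same pattern in either direction.
    """
    vertex_labels, edges = graph
    patterns = set()
    for (u, v), edge_label in edges.items():
        a, b = sorted((vertex_labels[u], vertex_labels[v]))
        patterns.add((a, edge_label, b))
    return patterns


def frequent_edges(transactions, min_support):
    """One-edge subgraphs whose support is at least min_support.

    Support = fraction of transactions containing the pattern.
    """
    counts = {}
    for graph in transactions:
        for pattern in edge_patterns(graph):
            counts[pattern] = counts.get(pattern, 0) + 1
    n = len(transactions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


transactions = [
    ({1: "A", 2: "B", 3: "C"}, {(1, 2): "x", (2, 3): "y"}),
    ({1: "A", 2: "B"}, {(1, 2): "x"}),
    ({1: "B", 2: "A", 3: "C"}, {(1, 2): "x", (1, 3): "y"}),
]
frequent = frequent_edges(transactions, min_support=0.6)
```

Here the A-x-B edge appears in all three transactions and the B-y-C edge in two of three, so both pass a threshold of 0.6 but only the first passes 0.7.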
Finding Frequent Subgraphs: Input and Output
Support = 100%
Support = 66%
Support = 66%
Input: Graph Transactions Output: Frequent Connected Subgraphs
Example
Example of datasets (Database type-B) for investigation using a Frequent Subgraph Discovery scheme:
- PIV data : In-plane swirl S(x,y) for multiple timesteps (with and without trigger signal)
- Full 3D data from simulation
Further Challenges
• Temporally and Spatially evolving structures (objects change)
• Interactions of vortex structures