autonomic applications autonomic computational science...

10
Autonomic Computing Tutorial, ICAC 2004 1 Autonomic Applications Autonomic Computational Science and Engineering Manish Parashar The Applied Software Systems Laboratory Rutgers, The State University of New Jersey http://automate.rutgers.edu Salim Hariri High Performance Distributed Computing Laboratory The University of Arizona http://www.ece.arizona.edu/~hpdc ICAC 2004 Autonomic Computing Tutorial May 16, 2004 Autonomic Computing Tutorial, ICAC 2004 2 Computational Modeling of Physical Phenomenon Realistic, physically accurate computational modeling Large computation requirements e.g. simulation of the core-collapse of supernovae in 3D with reasonable resolution (500 3 ) would require ~ 10-20 teraflops for 1.5 months (i.e. ~100 Million CPUs!) and about 200 terabytes of storage e.g. turbulent flow simulations using active flow control in aerospace and biomedical engineering requires 5000x1000x500=2.5 109 points and approximately 107 time steps, i.e. with 1GFlop processors requires a runtime of ~7 106 CPU hours, or about one month on 10,000 CPUs! (with perfect speedup). Also with 700B/pt the memory requirement is ~1.75TB of run time memory and ~800TB of storage. Complex couplings multi-physics, multi-model, multi-resolution, …. Complex interactions application – application, application – resource, application – data, application – user, … Software/systems engineering/programmability volume and complexity of code, community of developers, … scores of models, hundreds of components, millions of lines of code, …

Upload: duongbao

Post on 31-Jan-2018

225 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 1

Autonomic Applications Autonomic Computational Science

and Engineering

Manish ParasharThe Applied Software Systems Laboratory

Rutgers, The State University of New Jerseyhttp://automate.rutgers.edu

Salim HaririHigh Performance Distributed Computing Laboratory

The University of Arizonahttp://www.ece.arizona.edu/~hpdc

ICAC 2004 Autonomic Computing TutorialMay 16, 2004

Autonomic Computing Tutorial, ICAC 2004 2

Computational Modeling of Physical Phenomenon

• Realistic, physically accurate computational modeling– Large computation requirements

• e.g. simulation of the core-collapse of supernovae in 3D with reasonable resolution (5003) would require ~ 10-20 teraflops for 1.5 months (i.e. ~100 Million CPUs!) and about 200 terabytes of storage

• e.g. turbulent flow simulations using active flow control in aerospace and biomedical engineering requires 5000x1000x500=2.5·109 points and approximately 107 time steps, i.e. with 1GFlop processors requires a runtime of ~7·106 CPU hours, or about one month on 10,000 CPUs! (with perfect speedup). Also with 700B/pt the memory requirement is ~1.75TB of run time memory and ~800TB of storage.

– Complex couplings• multi-physics, multi-model, multi-resolution, ….

– Complex interactions• application – application, application – resource, application – data, application –

user, …– Software/systems engineering/programmability

• volume and complexity of code, community of developers, …– scores of models, hundreds of components, millions of lines of code, …

Page 2: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 2

Autonomic Computing Tutorial, ICAC 2004 3

The Grid• The Computational Grid

– Potential for aggregating resources • computational requirements

– Potential for seamless interactions• new applications formulations

• Developing application to utilize and exploit the Grid remains a significant challenge

– The problem: a level of complexity, heterogeneity, and dynamism for which our programming environments and infrastructure are becoming unmanageable, brittle and insecure

• System size, heterogeneity, dynamics, reliability, availability, usability• Currently typically proof-of-concept demos by “hero programmers”

– Requires fundamental changes in how applications are formulated, composed and managed

• Breaks current paradigms based on passive components and static compositions• autonomic components and their dynamic composition, opportunistic interactions, virtual

runtime, …– Resonance - heterogeneity and dynamics must match and exploit the heterogeneous

and dynamic nature of the Grid• Autonomic, adaptive, interactive application on the Grid offer the potential

solutions– Autonomic: context aware, self configuring, self adapting, self optimizing, self healing,...– Adaptive: resolution, algorithms, execution, scheduling, …– Interactive: peer interactions between computational objects and users, data,

resources, …

Autonomic Computing Tutorial, ICAC 2004 4

Autonomic Grid Computing: Seamless Interaction for Global Scientific Investigation

Models dynamically composed.

“WebServices” discovered &

invoked.

Resources discovered, negotiated, co-allocated

on-the-fly. Model/Simulation

deployed

Experts query, configure resources

Experts interact and collaborate using ubiquitous and

pervasive portals

Applications& Services

Model A

Model BLaptop

PDA

ComputerScientist

Scientist

Resources

Computers, Storage,Instruments, ...

Data Archive &Sensors

DataArchives

Sensors, Non-Traditional Data

Sources

Experts mine archive, match

real-time data with history

Real-time data injection (sensors, laboratory

investigations, instruments, data

archives),

Automated mining & matching

Models write into

the archive

Experts monitor/interact with/interrogate/steer models (“what if”

scenarios,…). Application notifies experts of interesting phenomenon.

– Oil Reservoir Design• Applications/Services:

– Reservoir models, economic models, data analysis services, visualization services,…

• Data Sources:– History data archives, real-time market data, real-time oil field data, ….

• Resources: – Instrumented oilfield (sensors/actuators), thin clients, compute servers, data

servers, …• Experts:

– Field engineer, Petroleum engineer, Scientist, Mathematician, Economist, etc.– Computational Epidemiology

• Applications/Services: – Epidemiology model, economic model, weather models/services, …

• Data Sources:– Conventional: Historical data, transmission rates, secondary injection rate,

pathogen characteristics (strain, strength, reproductive rate, virulence), pathogen mutations, temperature wind speed/direction, humidity, vaccine effectiveness

– Unconventional: drug sales, prescription records, … (IBM)• Resources/Instruments:

– Microscopes, sensors, thin clients, compute servers, data servers, …• Experts:

– Emergency worker, epidemiologist, doctor, microbiologist, mathematician, etc …

• Secure/Seamless Interactions– Client – Client -> Collaboration,– Client – Application -> Monitoring/Interaction/Steering,– Application – Application -> Application components, services, ….– Application – Data -> Dynamic Data Injection, Sensor networks,

I/O, Checkpoint/Restart, …– Client – Data -> Visualization, mining, …– Application – Resource -> Staging, dynamic resource allocation,

execution migration• Autonomic Infrastructure

– Autonomous components and dynamic composition, configuration, optimizations

– Self defining, self self-defining, self-configuring, self-optimizing, self-protecting, self-healing, context aware and anticipatory

– “Deductive” computational middleware– Dynamic, rule-based, component configuration, optimization,

and composition, constraint-based dynamic interactions– P2P Grid infrastructure

– Security, dynamic access control, discovery, messaging, resource management, QoS, collaboration, interaction, ….

Page 3: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 3

Autonomic Computing Tutorial, ICAC 2004 5

A Data Intense Challenge: The Instrumented Oil Field of the Future (UT-CSM, UT-IG, RU, OSU, UMD, ANL)

• Production simulation via reservoir modeling• Monitor production by acquiring time lapse observation

of seismic data• Revise knowledge of reservoir model via imaging and

inversion of seismic data• Modify production strategy using an optimization

criteria

Detect and track changes in data during productionInvert data for reservoir properties

Assimilate data & reservoir properties intothe evolving reservoir model

Use simulation and optimization to guide future production

Autonomic Computing Tutorial, ICAC 2004 6

AutonomicAutonomic OilOil Well Placement (UT-CSM, UT-IG)

• Optimization algorithm: use VFSA (Very Fast Simulated Annealing) – requires function evaluation only, no gradients

• IPARS delivers – fast-forward model (guess->objective function value) – post-processing

• Formulate a parameter space – well position and pressure (y,z,P)

• Formulate an objective function: – maximize economic value Eval(y,z,P)(T)

• Normalize the objective function NEval(y,z,P) so that:

( ) ( )min max Neval y,z,P Eval y,z,P⇔

Page 4: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 4

Autonomic Computing Tutorial, ICAC 2004 7

Components of the AORO Application

• IPARS : Integrated Parallel Accurate Reservoir Simulator– Parallel reservoir simulation framework

• IPARS Factory– Configures instances of IPARS simulations– Deploys them on resources on the Grid– Manages their execution

• VFSA : Very Fast Simulated Annealing– Optimizes the placement of wells and the inputs (pressure, temperature)

to IPARS simulations. • Economic Modeling Service

– Uses IPARS simulations outputs and current market parameters (oil prices, costs, etc.) to compute estimated revenues for a particular reservoir configuration.

• DISCOVER Computational Collaboratory– Interaction & Collaboration – Distributed Interactive Object Substrate (DIOS)– Collaborative Portals

Autonomic Computing Tutorial, ICAC 2004 8

VFSA Optimizatio

n

PAWN Substrate

IPARS FactoryClients

IPARS Instances

1

Client configures and

launches IPARS

Factory and VFSA

Optimization peers on

resource of choice

IPARS Factory discovers and

initializes VFSA Optimization

Service

2

3

Client can configure IPARS params

4

IPARS Factory gets initial guess from

VFSA Optimization Service launches

IPARS instance on resource of choice

5

IPARS connects to VFSA Optimization

Services and presents revenue

6VFSA

Optimization Service generates

new well placement

One optimal well placement is determined, IPARS Factory

launches IPARS run

7

Scientists/Engineers collaboratively

interact with IPARS

8

Current oil price,

market state, etc.

Page 5: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 5

Autonomic Computing Tutorial, ICAC 2004 9

AutonomicAutonomic OilOil Well Placement

permeability Pressure contours3 wells, 2D profile

Contours of NEval(y,z,500)(10)

Requires NYxNZ (450)evaluations. Minimumappears here.

VFSA solution: “walk”: found after 20 (81) evaluations

Autonomic Computing Tutorial, ICAC 2004 10

Sample Results

Page 6: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 6

Autonomic Computing Tutorial, ICAC 2004 11

Computational Modeling of Physical Phenomenon

• Realistic, physically accurate computational modeling– Large computation requirements

• e.g. simulation of the core-collapse of supernovae in 3D with reasonable resolution (5003) would require ~ 10-20 teraflops for 1.5 months (i.e. ~100 Million CPUs!) and about 200 terabytes of storage

• e.g. turbulent flow simulations using active flow control in aerospace and biomedical engineering requires 5000x1000x500=2.5·109 points and approximately 107 time steps, i.e. with 1GFlop processors requires a runtime of ~7·106 CPU hours, or about one month on 10,000 CPUs! (with perfect speedup). Also with 700B/pt the memory requirement is ~1.75TB of run time memory and ~800TB of storage.

– Dynamically adaptive behaviors– Complex couplings

• multi-physics, multi-model, multi-resolution, ….– Complex interactions

• application – application, application – resource, application – data, application – user, …– Software/systems engineering/programmability

• volume and complexity of code, community of developers, …– scores of models, hundreds of components, millions of lines of code, …

Autonomic Computing Tutorial, ICAC 2004 12

A Selection of SAMR Applications

Multi-block grid structure and oil concentrations contours (IPARS, M. Peszynska, UT Austin)

Blast wave in the presence of a uniform magnetic field) – 3 levels of refinement. (Zeus +

GrACE + Cactus, P. Li, NCSA, UCSD)

Mixture of H2 and Air in stoichiometricproportions with a non-uniform temperature field(GrACE + CCA, Jaideep Ray, SNL, Livermore)

Richtmyer-Meshkov - detonation in a deforming tube - 3 levels. Z=0 plane visualized on the right

(VTF + GrACE, R. Samtaney, CIT)

Page 7: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 7

Autonomic Computing Tutorial, ICAC 2004 13

Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 14

Application Runtime Management in V-Grid

Grid Resource Hierarchy

Application Domain Hierarchy

Virtual Grid Resource Autonomic Runtime Manager (ARM)

Loop for each level of Grid/Application hierarchy

V-Grid Monitoring( Self-observation, Context-awareness )

System states (CPU, Memory, Bandwidth, Availability etc.)Application states (Computation/Communication Ratio, Nature of Applications, Application Dynamics)

V-Grid Deduction( Self-adaptation, Self-optimization, Self-healing)

Identify and characterize natural regionsDefine objective functions and management strategyDefine VCUs

V-Grid ExecutionPartition, Map and Tune

NR1 NR2 NR5

VCU10

VCU11 VCU2

1VCUn

1

VCU12 VCU2

2 VCUi2 VCUj

2

Self-

lear

ning

NR: Application Natural RegionsVCU: Virtual Computational Unit

V-Grid

VirtualResource

Unit

SP2 ClusterBeowulf

Linux

VirtualResource

Unit

E10K

Workstation

Beowulf

Workstation

VirtualResource

Unit

IBM SP2

t

NCM

t

NCM

t

NCM

t

NCM

t

NCM

t

NCM

t

NCM

IBM SP2

WAN

Institutional

Divisional/Departmental

ComputingNode

Cap

abili

ty

Page 8: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 8

Autonomic Computing Tutorial, ICAC 2004 15

Autonomic Runtime Management

Self-Optimization& Execution

Self-Observation

& Analysis

AutonomicPartitioning

Partition/ComposeRepartition/Recompose

VCUVCUVirtual Computation

Unit

VCUVirtualResource

Unit

Dynamic Driver Application

Monitoring &Context-Aware

Services

Application

Monitoring

Service

Resource

Monitoring

Service

Heterogeneous, Dynamic Computational Environment

Natural Region Characterization

PerformancePrediction

Module

CPUSystem

CapabilityModule

MemoryBandwidthAvailability

Access Policy

Resource History Module

System StateSynthesizer

Application State Characterization

Nature of Adaptation

ApplicationDynamics

Computation/Communication

ObjectiveFunction

Synthesizer

Prescriptions

MappingDistribution

Redistribution

Execution

NRM NWM

Normalized Work Metric

NormalizedResource Metric

Autonomic Scheduling

VGTS: Virtual Grid Time SchedulingVGSS: Virtual Grid Space Scheduling

Global GridScheduling

Local GridScheduling

VGTS VGSS VGTS VGSS

Virtual GridAutonomic

RuntimeManager

Current System State

Current Application State

DeductionEngine

DeductionEngine

DeductionEngine

Autonomic Computing Tutorial, ICAC 2004 16

ARMaDA: Application-sensitive Adaptations

• PAC tuple, 5-component metric• Octant approach: app. runtime state• GrACE (ISP), Vampire (pBD-ISP,

GMISP+SP) partitioners• ARMaDA framework

– Computation/communication– Application dynamics– Nature of adaptation

• RM3D, 64 procs on “Blue Horizon”– 100 steps, base grid 128*32*32– 3 levels, RF = 2, regrid 4 steps

Page 9: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 9

Autonomic Computing Tutorial, ICAC 2004 17

ARMaDA: System-sensitive Adaptations

• System characteristics using NWS• RM3D compressible turbulence

application– 128x64x64 base (coarse) grid– 3 levels, factor 2 refinement

• System/Environment– University of Texas at Austin (32

nodes), Rutgers (16 nodes)

Capacity Calculator

System Sensitive Partitioner

CPU

Memory

Bandwidth

Capacity

Applications

Weights

Partitions

ResourceMonitoringTool

kbkmkpk BwMwPwC ++=

0100200300400500600700800900

Execution time (secs)

4 8 16 32

Number of processors

Non System-SensitiveSystem-Sensitive

430225842427264502924

805.5423.72

Static Sensing (s)

Dynamic Sensing (s)Procs

Autonomic Computing Tutorial, ICAC 2004 18

Autonomic Forest Fire Simulation

High computationzone

Predicts fire spread (the speed, direction and intensity of forest fire front) as the fire propagates, based on both dynamic and static environmental and vegetation conditions.

Page 10: Autonomic Applications Autonomic Computational Science …automate.rutgers.edu/tutorials/icac04-tutorial-slides/s3a-ac-acse.pdf · Autonomic Computational Science and Engineering

Autonomic Computing Tutorial, ICAC 2004 10

Autonomic Computing Tutorial, ICAC 2004 19

Optimization: Optimize metastable nanocompositionbased on atomistic, nanoscale and macroscale properties.

Nanoscale: Model the evolution and interaction of topologically complex nanosize

metastable structures and their effective behavior.

Atomistics: Simulate non-equilibrium solidification process, crystallization, diffusion

and growth

Computational Infrastructure: To develop autonomic computational infrastructures and runtime management techniquesfor scalable parallel/distributing computing, automated interaction and data exchange between scales, real time sensing and computational response, collaborative monitoring and steering.

Microscale: Model the collective behavior of assemblies of nanostructured particles

Interface Kinetics Thermodynamics

CompositionAtomistic Structure

Force Fields

Topological Characteristics of Nanostructured Particles

INPUT: COMPOSITION/PROCESSING

Strength and Ductility ofNanostuctured Particles

Effective Particle Behavior Hardening Mechanisms

OUTPUT: OPTIMIZED METASTABLE NANOCOMPOSITES

Autonomic Design of Nanomaterials

Autonomic Computing Tutorial, ICAC 2004 20

Conclusion

• Autonomic applications are necessary to address scale/complexity/heterogeneity/dynamism/reliability challenges

• AutoMate addresses key issues to enable the development of autonomic Grid applications – ACCORD: Autonomic application framework– RUDDER: Decentralized deductive engine – SESAME: Dynamic access control engine– Pawn: P2P messaging substrate– SQUID: P2P discovery service

• Application scenarios– Autonomic optimization of oil reservoirs– Autonomic runtime management

• More Information, publications, software, conference– http://automate.rutgers.edu– [email protected] / [email protected]– http://www.autonomic-conference.org