
PNC, “Collaboration: Tools and Infrastructure” December 7, 2012 Michael Frenklach Supported by AFOSR, Fung PrIMe: Integrated Infrastructures for Data and Analysis



Page 1: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PNC, “Collaboration: Tools and Infrastructure”, December 7, 2012

Michael Frenklach

Supported by AFOSR, Fung

PrIMe: Integrated Infrastructures for Data and Analysis

Page 2: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

COMBUSTION IS CENTRAL TO ENERGY

• IMPACT ON SOCIETY
– Energy (power plants, car and jet engines, rockets, …)
– Defense (engines, rockets, …)
– Environment (pollutants, global modeling, …)
– Space exploration
– Astrophysics
– Material synthesis

• ESTABLISHED PRACTICE OF COLLABORATION
– Across different disciplines
– Across different countries

• THERE IS AN ACCUMULATING EXPERIMENTAL PORTFOLIO

• THEORY/MODELING LINKS FUNDAMENTAL TO APPLIED LEVEL

Page 3: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

[Figure: diagram linking experiments, theory, and numerical simulations to individual reactions, a model, analysis (sensitivity, reaction path, …), and model reduction, leading to the mechanism of: ignition, laminar flames, NOx, soot, ...]

Page 4: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O

1970’s: 15 reactions, 12 species

1980’s: 75 reactions, 25 species

1990’s: 300+ reactions, 50+ species

Larger molecular-size fuels:

2000’s: 1,000+ reactions, 100+ species

2010’s: 10,000+ reactions, 1,000+ species

Page 5: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O

The networks are complex, but the governing equations (rate laws) are known

Uncertainty exists, but much is known where the uncertainty lies (rate parameters)

Numerical simulations with parameters fixed to certain values may be performed “reliably”

There is an accumulating experimental portfolio on the system
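To make the point about known rate laws and uncertain rate parameters concrete, here is a minimal Python sketch. The modified Arrhenius form is standard; the parameter values and the uncertainty factor are illustrative assumptions, not values from the talk.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def arrhenius(A, n, Ea, T):
    """Modified Arrhenius rate coefficient k = A * T**n * exp(-Ea / (R*T))."""
    return A * T**n * math.exp(-Ea / (R * T))


def rate_bounds(A, n, Ea, T, factor):
    """Bounds on k when the pre-exponential A is known only within a
    multiplicative uncertainty factor (k/factor .. k*factor)."""
    k = arrhenius(A, n, Ea, T)
    return k / factor, k * factor


# Hypothetical parameters for a single reaction (illustrative only).
A, n, Ea = 1.0e13, 0.0, 6.0e4
k_nominal = arrhenius(A, n, Ea, 1500.0)
k_low, k_high = rate_bounds(A, n, Ea, 1500.0, factor=2.0)
print(f"k(1500 K) = {k_nominal:.3e}, bounds = [{k_low:.3e}, {k_high:.3e}]")
```

A simulation run at `k_nominal` alone is the "parameters fixed to certain values" case; carrying `k_low` and `k_high` through the analysis is what uncertainty reporting would add.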

and yet

Page 6: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

Methane Combustion: CH4 + 2 O2 → CO2 + 2 H2O

Lack of predictability

Lack of consensus

but still

Page 7: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PROBLEMS

• current lack of truly predictive modeling
– conflicting data in/among sources
– poor documentation of data/models
– no uncertainty reporting or analysis
– not much focus on integration of data

• resistance to data sharing
– no personal incentives
– no easy-to-use technology

• no recognition of the problem

Page 8: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

• models are not additive

• data are not additive

• need a system for synthesis of data

Page 9: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PrIMe: Process Informatics Model

INFRASTRUCTURE FOR UQ-PREDICTIVE MODELING

http://primekinetics.org

• Data sharing
• App sharing
• Automation

Page 10: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

CURRENT STATUS

• registered members ~400

• countries ~15

• data records ~100,000

• apps ~20

• active “players”: UCB (lead), NCSU, Stanford, MIT, Cambridge, KAUST, Tsinghua

Page 11: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PrIMe

Portal
– Access to distributed resources
– User authorization
– Social networking
– User forums
– Data evaluation panels
– Help, tutorials, examples
– Customized Drupal (PHP)
– platform independent

Workflow
– “Browser-based” software
– User building projects
– Data/app linking
– Binary XML interfaces
– Remote-server support
– Project sharing
– C#, Windows, IE
– apps: C#, Matlab

Warehouse
– Data collections
– Models and Experiments
– Controlled by schemas
– Submission forms
– Multiple-mode access
– WebDAV
– XML

Page 12: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

DATA ORGANIZATION:

• conceptual abstraction

• practical realization

Page 13: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

CONCEPTUAL ABSTRACTION: DATA MODEL

Chemical Kinetics Model
– composed of Chemical Reactions, which have rate law data (parameter values, uncertainties, reference)

Chemical Reactions
– composed of Chemical Species, which have thermo data and transport data

Chemical Species
– composed of Chemical Elements, which have atomic masses
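The hierarchy on this slide maps naturally onto nested record types. The following is a minimal Python sketch under stated assumptions: all class and field names are hypothetical and do not reproduce the PrIMe schemas, and thermo and transport data are left as untyped placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Element:
    symbol: str
    atomic_mass: float  # g/mol


@dataclass
class Species:
    name: str
    composition: Dict[str, int]        # element symbol -> atom count
    thermo: Optional[dict] = None      # placeholder for thermo data
    transport: Optional[dict] = None   # placeholder for transport data


@dataclass
class RateLaw:
    parameters: Dict[str, float]
    uncertainties: Dict[str, float]
    reference: str


@dataclass
class Reaction:
    reactants: Dict[str, int]          # species name -> stoichiometric coefficient
    products: Dict[str, int]
    rate_law: Optional[RateLaw] = None


@dataclass
class KineticsModel:
    elements: Dict[str, Element]
    species: Dict[str, Species]
    reactions: List[Reaction] = field(default_factory=list)

    def molar_mass(self, name: str) -> float:
        """Sum atomic masses over the species' elemental composition."""
        comp = self.species[name].composition
        return sum(self.elements[sym].atomic_mass * n for sym, n in comp.items())

    def is_balanced(self, reaction: Reaction) -> bool:
        """True if every element appears equally on both sides."""
        net: Dict[str, int] = {}
        for side, sign in ((reaction.reactants, -1), (reaction.products, 1)):
            for name, coeff in side.items():
                for sym, n in self.species[name].composition.items():
                    net[sym] = net.get(sym, 0) + sign * coeff * n
        return all(v == 0 for v in net.values())


elements = {s: Element(s, m) for s, m in (("H", 1.008), ("C", 12.011), ("O", 15.999))}
species = {
    "CH4": Species("CH4", {"C": 1, "H": 4}),
    "O2": Species("O2", {"O": 2}),
    "CO2": Species("CO2", {"C": 1, "O": 2}),
    "H2O": Species("H2O", {"H": 2, "O": 1}),
}
overall = Reaction({"CH4": 1, "O2": 2}, {"CO2": 1, "H2O": 2})
model = KineticsModel(elements, species, [overall])
print(model.is_balanced(overall), round(model.molar_mass("CH4"), 3))
```

Because each level only references the level below it, a check such as element balance can be derived from the data rather than stored separately.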

Page 14: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PRACTICAL OUTCOME: TRANS-DISCIPLINARY COLLABORATION

[Figure: shared data (reactions, thermo, molecular structure, spectra, absorption coefficient) linking combustion modeling, quantum chemistry, diagnostics, and thermosciences]

Page 15: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PRIME DATA MODEL: EXPERIMENTS

Experimental Record
• reference
• apparatus
• conditions
• observations
– inner: XML
– remote: HDF5, …
• uncertainties
• additional items
– links, docs, …
– video files, …

Data Attribute (QOI, ‘target’)

a specific feature extracted for modeling:
– peak value
– peak location
– induction time
– ratio of peaks (from multiple experiments)
– …

[Diagram labels: archival record, VVUQ data, instrumental model]
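The distinction between the archival record and the extracted target can be sketched in Python. The record fields follow the slide; the class name, function names, and the half-peak definition of induction time are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ExperimentalRecord:
    reference: str
    apparatus: str
    conditions: Dict[str, float]
    observations: List[Tuple[float, float]]  # (time, signal) pairs
    uncertainties: Dict[str, float] = field(default_factory=dict)


def peak(record: ExperimentalRecord) -> Tuple[float, float]:
    """Return (peak location, peak value) of the observed trace."""
    return max(record.observations, key=lambda point: point[1])


def induction_time(record: ExperimentalRecord, fraction: float = 0.5) -> float:
    """First time the signal reaches `fraction` of its peak value.

    This is one of several possible definitions, chosen for illustration.
    """
    _, peak_value = peak(record)
    for t, y in record.observations:
        if y >= fraction * peak_value:
            return t
    raise ValueError("signal never reaches the threshold")


# Hypothetical record; values are made up.
record = ExperimentalRecord(
    reference="(hypothetical reference)",
    apparatus="shock tube",
    conditions={"T": 1500.0, "P": 1.0},
    observations=[(0.0, 0.0), (1.0, 0.1), (2.0, 0.6), (3.0, 1.0), (4.0, 0.4)],
)
print(peak(record), induction_time(record))
```

The record stays untouched as the archive, while each target is a small derived quantity that a model can be compared against.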

Page 16: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

PRIME DATA MODEL

Initial Model: “Upload your data to PrIMe Warehouse” (“give me your data”)

New, Distributed Model: “You may, if you choose, connect your data to the communal system”
• with a switch in the OFF position: “you can use the communal data and tools but your own data is private to you only”
• “but please flip the switch to the ON position when you are ready to share your own data”
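The ON/OFF switch amounts to a per-owner visibility flag. A minimal sketch follows, assuming hypothetical class names (`DataConnection`, `CommunalSystem`) that are not part of PrIMe:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataConnection:
    owner: str
    records: List[str]
    shared: bool = False  # the "switch": OFF (private) by default


class CommunalSystem:
    def __init__(self) -> None:
        self._connections: Dict[str, DataConnection] = {}

    def connect(self, connection: DataConnection) -> None:
        """Register a user's data without copying it into a central store."""
        self._connections[connection.owner] = connection

    def visible_records(self, user: str) -> List[str]:
        """Everything that is shared, plus the user's own private data."""
        visible: List[str] = []
        for owner, conn in self._connections.items():
            if conn.shared or owner == user:
                visible.extend(conn.records)
        return visible


system = CommunalSystem()
alice = DataConnection("alice", ["a1"])             # switch OFF
bob = DataConnection("bob", ["b1"], shared=True)    # switch ON
system.connect(alice)
system.connect(bob)
print(system.visible_records("alice"), system.visible_records("bob"))
```

Flipping `alice.shared = True` later makes her records visible to everyone without any re-upload, which is the point of the distributed model.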

Page 17: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

SAME FOR APPS

“Connect your code to the communal system”
– you control your own code:
• release version
• user access, licenses
• collect fees, if desired

Page 18: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

TECHNOLOGY: HOW

Remote server app: PrIMe Web Services (PWS)
• no restrictions on platform
• no restrictions on data formats
• no restrictions on local programming language(s)

PrIMe Workflow Interface (PWI) is the only “standard”
• developed, maintained, and controlled by the community

Page 19: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

[Figure: PrIMe Data Flow Network, showing client machine, client data, PrIMe web services, and PrIMe Dispatcher]

Page 20: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

BIG DATA

excessively large data sets
• do not move the data
• but use “smart agents” (e.g., HTML5 walkers)

web services with user-reloaded tasks: fetch data features for user-requested analysis
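The "do not move the data" idea can be illustrated by sending small tasks to the data and returning only the extracted features. This is a toy in-process sketch: `DataHost` is a hypothetical stand-in for a remote server, and no actual web-service protocol is modeled.

```python
import statistics
from typing import Callable, Dict, List, Sequence

Task = Callable[[Sequence[float]], float]


class DataHost:
    """Stands in for a remote server that holds a large data set."""

    def __init__(self, samples: List[float]) -> None:
        self._samples = samples  # stays on the host

    def run_tasks(self, tasks: Dict[str, Task]) -> Dict[str, float]:
        """Run user-supplied feature extractors locally and return
        only the small results, never the raw samples."""
        return {name: task(self._samples) for name, task in tasks.items()}


host = DataHost([float(i) for i in range(1000)])
features = host.run_tasks({"peak": max, "mean": statistics.fmean})
print(features)
```

The caller receives two numbers instead of a thousand samples; with genuinely large data sets the same pattern keeps network transfer proportional to the number of requested features.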

Page 21: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

workflow project

user specifies conditions of interest

workflow component retrieves archived data:
– a set of relevant targets: target values and their uncertainty ranges
– surrogate models developed for relevant targets: active variables and their uncertainty ranges

data warehouse

workflow component:
• retrieves the pertinent kinetics model (via link in the dataset)
• performs simulations on the fly for the conditions specified and builds a new surrogate model
• performs UQ analysis combining the new surrogate model with the archived ones and the rest of the pertinent data
• reports results
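The workflow steps above can be sketched for a single active variable. This is a toy illustration under stated assumptions, not the PrIMe implementation: the variable is normalized to the prior range [-1, 1], the "simulations" are cheap stand-in functions, the surrogate is a quadratic through three design points, and a grid search replaces the constrained optimization a real UQ analysis would use.

```python
from typing import Callable, List, Tuple

Surrogate = Callable[[float], float]


def quadratic_surrogate(simulate: Callable[[float], float]) -> Surrogate:
    """Fit s(x) = c0 + c1*x + c2*x**2 through simulate(-1), simulate(0), simulate(1)."""
    f_lo, f_mid, f_hi = simulate(-1.0), simulate(0.0), simulate(1.0)
    c0 = f_mid
    c1 = (f_hi - f_lo) / 2.0
    c2 = (f_hi + f_lo) / 2.0 - f_mid
    return lambda x: c0 + c1 * x + c2 * x * x


def feasible_points(
    archived: List[Tuple[Surrogate, float, float]], n: int = 2001
) -> List[float]:
    """Grid points in [-1, 1] where every archived surrogate stays
    within its target's uncertainty range (lower, upper)."""
    grid = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [x for x in grid if all(lo <= s(x) <= hi for s, lo, hi in archived)]


def predict_interval(new: Surrogate, feasible: List[float]) -> Tuple[float, float]:
    """Range of the new surrogate over the parameter values the data allow."""
    if not feasible:
        raise ValueError("archived data are mutually inconsistent on this grid")
    values = [new(x) for x in feasible]
    return min(values), max(values)


# Hypothetical stand-ins for expensive kinetics simulations.
archived = [(quadratic_surrogate(lambda x: 2.0 + x), 2.0, 2.5)]
new_surrogate = quadratic_surrogate(lambda x: 1.0 + 0.5 * x + 0.25 * x * x)
low, high = predict_interval(new_surrogate, feasible_points(archived))
print(low, high)
```

The archived target restricts the variable to part of its prior range, so the predicted interval for the new condition is narrower than it would be from the prior alone.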

Page 22: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

workflow project

user specifies a new set of data

workflow component retrieves archived data:
– a set of relevant targets: target values and their uncertainty ranges
– surrogate models developed for relevant targets: active variables and their uncertainty ranges

data warehouse enrichment

workflow component:
• retrieves the pertinent kinetics model (via link in the dataset)
• performs simulations on the fly for the new data and builds a new surrogate model
• performs UQ analysis combining the new surrogate model with the archived ones and the rest of the pertinent data
• reports results
• adds the new data to the dataset and archives in Warehouse

Page 23: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

FOCUS ON ANSWERING QUESTIONS: prediction of (un)known observations

Page 24: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

FOCUS ON ANSWERING QUESTIONS: prediction of an (un)known parameter

Page 25: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

FOCUS ON ANSWERING QUESTIONS: prediction of multi-D correlations

Page 26: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

ANSWER QUESTIONS

• What causes/skews model predictiveness?

• Are there new experiments to be performed, old ones to be repeated, or theoretical studies to be carried out?

• What impact could a planned experiment have?

• What is the information content of the data?

• What would it take to bring a given model to a desired level of accuracy?

Page 27: PNC, “Collaboration: Tools and Infrastructure” December 7, 2012

A PARADIGM SHIFT

from algorithm-centric view

to data-centric view

[Diagram labels: input, code, output; data, data]