Transcript
Page 1: Data Science Solutions by Materials Scientists: The Early Case Studies

Data Science Solutions by Materials Scientists

The Early Case Studies

Tony Fast

Materials Data Analyst

Materials Informatics for Engineering Design

Woodruff School of Mechanical Engineering

Georgia Institute of Technology

*Any MINED shield is a link to a resource.

Page 2: Data Science Solutions by Materials Scientists: The Early Case Studies

An Archival and Self Describing Data Format using HDF5

Data and metadata are stored in one file, with support in many languages and ideal support for high-dimensional data.
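A minimal sketch of the one-file idea, assuming the third-party h5py package is installed. The file, group, dataset, and attribute names are hypothetical illustrations and are not taken from MXADataModel.

```python
import numpy as np
import h5py

# In-memory HDF5 file so the sketch leaves nothing on disk;
# remove driver/backing_store to write a real archive file.
with h5py.File("archive_sketch.h5", "w", driver="core", backing_store=False) as f:
    # Hypothetical layout: one group per experiment.
    experiment = f.create_group("experiment_01")
    experiment.attrs["material"] = "beta-titanium"  # metadata attached to the group

    # A small 3-D array standing in for a volumetric measurement.
    volume = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
    dataset = experiment.create_dataset("intensity", data=volume, compression="gzip")
    dataset.attrs["voxel_size_um"] = 10.0  # metadata attached to the dataset

    # Data and metadata are read back from the same file.
    stored_shape = f["experiment_01/intensity"].shape
    stored_voxel = float(f["experiment_01/intensity"].attrs["voxel_size_um"])
    stored_material = f["experiment_01"].attrs["material"]

print(stored_shape, stored_voxel, stored_material)
```

Groups behave like folders and attributes travel with the arrays they describe, which is what makes the file self-describing.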

*MXADataModel – Archival Data Format – ONR/DARPA Dynamic 3-D Digital Structures Program

Page 3: Data Science Solutions by Materials Scientists: The Early Case Studies

HDF5 - The little zip file that could…

One Dataset – 1.6 GB – 4 Experiments – with 160 Datasets each … no long-term value.

Page 4: Data Science Solutions by Materials Scientists: The Early Case Studies

Volume, Variety, Velocity = Big Data

Materials Science

Polymer MD (Jacobs, GaTech)

Titanium (Frasier, OSU)

Bamboo (Wegst, Dartmouth)

Martensitic Steel (Gumbusch)

SiC/SiC (Ritchie, LLNL)

Al-Cu Solidification (Voorhees, NW)

The velocity at which data is generated will rise, and the speed at which it is analyzed will decrease.

Page 5: Data Science Solutions by Materials Scientists: The Early Case Studies

Rowenhorst, Lewis, Spanos, Acta Mat, 2010

β-Titanium

REDUCED OUTPUT: Grain Size, Grain Faces, Number of Grains, Mean Curvature, Nearest Grain Analysis

10 micron resolution with 4300 grains

Compare with empirical models

Materials Science is a Big Data domain, but it is not treated that way.

Page 6: Data Science Solutions by Materials Scientists: The Early Case Studies

Scalable, objective, parametric materials descriptors

Manage data with care for the future

Interoperability, Sharing, and Collaboration

Educate data scientists who can extract value from data using statistics, computation, and materials domain knowledge

Embrace complexity in big materials data

Example Databases

AFLOW, Curtarolo Group

Harvard Clean Energy Project Database

Page 7: Data Science Solutions by Materials Scientists: The Early Case Studies

STRUCTURE INFORMATICS WORKFLOW

PHYSICS BASED MODELS

SIMULATION

EXPERIMENT

MICROSTRUCTURE (MATERIAL) SIGNAL PROCESSING

ADVANCED & OBJECTIVE STATISTICAL ENCODING

DATA SCIENCE MODULES

INNOVATION ACCOUNTING

INTELLIGENT DESIGN OF EXPERIMENTS

Microstructure Informatics is a scalable, data-driven system to mine structure-property/processing connections from experimental and simulation materials science information, with structure as the independent variable. The system is agnostic to material system and length scale, objectively quantifiable, and iterates rapidly, in fewer cycles, for both materials improvement and discovery.

Page 8: Data Science Solutions by Materials Scientists: The Early Case Studies

DATA SCIENCE MODULES

Microstructure

Material Structure

Processing

Property

Data science modules are machine learning and statistical tools to extract rich bi-directional structure-property/processing linkages from encodings of materials & microstructure datasets. Mining modules create structure taxonomies, homogenization and localization relationships, ground truth comparison between simulation and experiment, materials discovery, and materials improvement.

Page 9: Data Science Solutions by Materials Scientists: The Early Case Studies

ADVANCED & OBJECTIVE STATISTICAL ENCODING

THE MICROSTRUCTURE IS A SAMPLE IN AN IMMENSE STATISTICAL POPULATION.

α-β Titanium

Page 10: Data Science Solutions by Materials Scientists: The Early Case Studies

SPATIAL STATISTICS

Statistical correlations between random points in space/time reveal systematic patterns in the microstructure. They contain the original microstructure (μS) to within a translation and an inversion, and they provide an objective encoding for most materials datasets.
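As a rough illustration, assuming a periodic two-phase microstructure stored as a 0/1 indicator array, the two-point autocorrelation can be computed with FFTs. The function name and the random test structure are hypothetical. The sketch also checks the translation and inversion property: shifting or inverting the structure leaves the statistics unchanged.

```python
import numpy as np

def two_point_autocorrelation(phase):
    """Periodic two-point autocorrelation of a phase-indicator array via FFT."""
    transform = np.fft.fftn(phase)
    return np.real(np.fft.ifftn(transform * np.conj(transform))) / phase.size

rng = np.random.default_rng(0)
# Synthetic 2-D structure: 1 marks the phase of interest, 0 everything else.
micro = (rng.random((32, 32)) < 0.3).astype(float)

stats = two_point_autocorrelation(micro)

# A periodic translation of the structure gives the same statistics.
shifted = two_point_autocorrelation(np.roll(micro, shift=(5, -7), axis=(0, 1)))

# Flipping every axis is an inversion (combined with a shift) and also gives the same statistics.
inverted = two_point_autocorrelation(np.flip(micro))

# The value at the zero vector is the volume fraction of the phase.
zero_vector_value = stats[0, 0]
print(np.allclose(stats, shifted), np.allclose(stats, inverted), zero_vector_value)
```

Because the statistics are normalized by the number of cells, structures of different sizes can be compared on the same footing.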

Page 11: Data Science Solutions by Materials Scientists: The Early Case Studies

CURRENT APPLICATIONS: metals, polymers, fuel cells, CMC, MD, & a bunch of other things

TYPES OF SIGNALS: sparse, experimental, simulation, heterogeneous, surface, bulk

The fidelity of the spatial statistics is affected by how the material structure is parameterized as a signal.

Page 13: Data Science Solutions by Materials Scientists: The Early Case Studies

Mechanical Deformation of Polymer Chains

Molecular Dynamics of Aluminum Atoms

Page 14: Data Science Solutions by Materials Scientists: The Early Case Studies

MPL

GDL

X-CT

Finite Element Modeling

Statistics

Regression to connect the statistics with diffusivity values from FEM

Bottom-up Homogenization Relationships

exact fit

simulation

model
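The statistics-to-property regression on this slide can be sketched roughly as follows. Everything here is an assumption for illustration: the microstructures are synthetic, the "diffusivity" is a Bruggeman-type placeholder rather than FEM output, and principal component analysis is one common way to reduce the statistics before regression (the slide does not say which reduction was used).

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, size = 60, 32

def two_point_autocorrelation(phase):
    """Periodic two-point autocorrelation of a phase-indicator array via FFT."""
    transform = np.fft.fftn(phase)
    return np.real(np.fft.ifftn(transform * np.conj(transform))) / phase.size

# Synthetic ensemble: random two-phase structures with varying solid fraction.
solid_fraction = rng.uniform(0.2, 0.8, n_samples)
micro = (rng.random((n_samples, size, size)) < solid_fraction[:, None, None]).astype(float)

# Placeholder property (NOT FEM output): a power law in porosity.
porosity = 1.0 - micro.mean(axis=(1, 2))
diffusivity = porosity ** 1.5

# Encode every structure as a flattened vector of spatial statistics.
stats = np.array([two_point_autocorrelation(m).ravel() for m in micro])

# Reduce dimensionality with PCA, computed through an SVD of the centered data.
centered = stats - stats.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = (u * s)[:, :3]  # first three principal component scores

# Ordinary least squares from the scores (plus an intercept) to the property.
design = np.column_stack([np.ones(n_samples), scores])
coef, *_ = np.linalg.lstsq(design, diffusivity, rcond=None)
predicted = design @ coef

residual = np.sum((diffusivity - predicted) ** 2)
total = np.sum((diffusivity - diffusivity.mean()) ** 2)
r_squared = 1.0 - residual / total
print(r_squared)
```

The reported fit is in-sample; a real study would hold out structures to check that the linkage generalizes.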

Page 15: Data Science Solutions by Materials Scientists: The Early Case Studies

FEM, ε = 5e-4

Meta-modeling with Materials Knowledge Systems

Top-down localization relationships

The MKS designs filters that capture the effect of the local arrangement of the microstructure on the response. The filters are learned from physics-based models and can only be as accurate as the model, never better.
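A minimal sketch of how such a filter might be learned, under strong simplifying assumptions: a 1-D periodic two-phase microstructure, a single first-order filter, and a hypothetical `toy_model` that stands in for the physics-based model. Because the toy model is exactly linear, the learned filter reproduces it. A real physics-based model is not, so the filter would only approximate it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_cells = 50, 64

# Hypothetical stand-in for a physics-based model: the local response is a
# periodic convolution of the microstructure with a short-range kernel.
distance = np.minimum(np.arange(n_cells), n_cells - np.arange(n_cells))
true_kernel = np.exp(-0.5 * (distance / 2.0) ** 2)

def toy_model(micro):
    return np.real(np.fft.ifft(np.fft.fft(micro) * np.fft.fft(true_kernel)))

# Calibration set: random microstructures and their simulated local responses.
micro = rng.integers(0, 2, size=(n_samples, n_cells)).astype(float)
response = np.array([toy_model(m) for m in micro])

# In Fourier space the convolution decouples, so each frequency of the filter
# is fitted by an independent least-squares problem across the samples.
micro_hat = np.fft.fft(micro, axis=1)
response_hat = np.fft.fft(response, axis=1)
numerator = np.sum(np.conj(micro_hat) * response_hat, axis=0)
denominator = np.sum(np.abs(micro_hat) ** 2, axis=0)
filter_hat = numerator / denominator

# Apply the learned filter to a microstructure not used in calibration.
new_micro = rng.integers(0, 2, size=n_cells).astype(float)
predicted = np.real(np.fft.ifft(np.fft.fft(new_micro) * filter_hat))
expected = toy_model(new_micro)
print(np.max(np.abs(predicted - expected)))
```

Once calibrated, prediction costs only a pair of FFTs per microstructure, with no further calls to the underlying model.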

Page 16: Data Science Solutions by Materials Scientists: The Early Case Studies

INPUT

OUTPUT

Control

Any Model

Page 17: Data Science Solutions by Materials Scientists: The Early Case Studies

OTHER APPLICATIONS: Spinodal Decomposition, Grain Coarsening, Thermo-mechanical, Polycrystalline

Top-Down Localization Relationships for High Contrast Composites

The MKS is a scalable, parallel meta-model that learns from physics-based models to enable rapid simulation at a cost in accuracy.

N² vs. N log(N) complexity. It learns top-down localization relationships to extract extreme value events and enables multiscale integration.
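The N² versus N log(N) contrast can be illustrated with a periodic convolution, assuming that applying a filter amounts to one. The direct sum touches every pair of cells, while the FFT route gives the same result in N log(N) operations. Function names and the random inputs are hypothetical.

```python
import numpy as np

def circular_convolve_direct(signal, kernel):
    """Direct periodic convolution: two nested loops over N cells, O(N^2)."""
    n = signal.size
    out = np.zeros(n)
    for s in range(n):
        for t in range(n):
            out[s] += kernel[t] * signal[(s - t) % n]
    return out

def circular_convolve_fft(signal, kernel):
    """Same periodic convolution through the convolution theorem, O(N log N)."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))

rng = np.random.default_rng(2)
signal = rng.random(128)
kernel = rng.random(128)

direct = circular_convolve_direct(signal, kernel)
fast = circular_convolve_fft(signal, kernel)
print(np.allclose(direct, fast))
```

The gap widens quickly with size: doubling N roughly quadruples the direct cost but only slightly more than doubles the FFT cost.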

Page 18: Data Science Solutions by Materials Scientists: The Early Case Studies

Structure-Processing MKS

Processing History

Structure-Property

Homogenization

Structure-Property

Localization

Objective parametric descriptors and data science enable integration of bidirectional structure-property/processing linkages.

Page 19: Data Science Solutions by Materials Scientists: The Early Case Studies

Data enables bidirectional S-P/P, multiscale integration, and higher throughput

CORE TECHNOLOGIES TO FUEL THE DATA AGE OF MATERIALS SCIENCE

Open Access, Open Source Software, Scalable Databases, High-Statistical Throughput Simulation and Experiment, Image Segmentation, Machine Learning, Metadata Integration, Mobile Technology, Visualization, High Performance Computing, Cyberinfrastructure/Collaboratories, Collaboration & Sharing

Page 20: Data Science Solutions by Materials Scientists: The Early Case Studies

Selected Links

Any shield in this presentation is a link

HDF5 http://www.hdfgroup.org/HDF5/whatishdf5.html
HDFView http://www.hdfgroup.org/hdf-java-html/hdfview/
MXADataModel http://mxa.web.cmu.edu/Background.html
Curtarolo Group http://www.mems.duke.edu/faculty/stefano-curtarolo
AFLOW http://materials.duke.edu/apool.html
Harvard Clean Energy Project http://www.molecularspace.org/
Serial Sectioned Titanium https://cosmicweb.mse.iastate.edu/wiki/pages/viewpage.action?pageId=753830
MATIN http://www.materials.gatech.edu/matin
Materials Genome Initiative http://www.whitehouse.gov/mgi

