community accessible datastore of high-throughput ...datasys.cs.iit.edu/events/mtags12/s05.pdfof...

22
Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project Dan Gunter, Shreyas Cholia, Anubhav Jain, Michael Kocher, Kristin Persson, Lavanya Ramakrishnan Shyue Ping Ong, Gerbrand Ceder

Upload: others

Post on 27-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials

Project Dan Gunter, Shreyas Cholia, Anubhav Jain, Michael Kocher, Kristin Persson,

Lavanya Ramakrishnan

Shyue Ping Ong, Gerbrand Ceder

Page 2: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

BACKGROUND

November  12,  2012   Slide  1  

Page 3: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

November  12,  2012   2  

Our energy future relies on the rapid development of novel functional materials.

But it takes almost twenty years to develop new materials. How can we do it faster?

Solar cells, advanced batteries, TCOs, and fuel cells will all play a role in our energy future.

Page 4: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Materials Genome Initiative

November  12,  2012   3  

June  2011:  Materials  Genome  Ini/a/ve  which  aims  to  “fund  computa(onal  tools,  so-ware,  new  methods  for  material  characteriza2on,  and  the  development  of  open  standards  and  databases  that  will  make  the  process  of  discovery  and  development  of  advanced  materials  faster,  less  expensive,  and  more  predictable”  

Source:  "Materials  Genome  IniBaBve  for  Global  CompeBBveness"  hFp://www.whitehouse.gov/sites/default/files/microsites/ostp/materials_genome_iniBaBve-­‐final.pdf  

Page 5: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

It's the , stupid!

November  12,  2012  

Really  hard  work  on  some  computaBons  

FantasBc  paper  in  a  journal  

Really  hard  work  on  some  computaBons  

FantasBc  paper  in  a  journal  

Black  Hole  

data

data

Drink  margaritas  

FantasBc  paper  in  a  journal  

DB  

data

Brilliant  analysis  

Brilliant  analysis  

Brilliant  analysis  

Escape  velocity?  

data data

data

data

Page 6: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Very specialized skill-set

November  12,  2012   5  

Physics  

Deep  dive  on  specific  soYware  

Computer  Science  

Really hard work on

some computations

Page 7: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Example

November  12,  2012   6  

Predicted and measured performance of of Li9V3(P2O7)3(PO4)2 during cell cycling.

The Materials Project used quantum chemistry calculations to screen over 20,000 materials as potential cathodes for Li ion batteries. From the results, three new materials were identified, tested, and currently have patents pending.

Page 8: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

COMPONENTS

November  12,  2012   7  

Page 9: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

November  12,  2012   8  

Parallel computation

Parallel HPC resources

Datastore

Data dissemination

Collaborative toolsWeb

server

Analysis library

Science apps

Data V&V

Midrange compute resources

Workflow

HPC storage

Data

Data analytics

Page 10: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

NoSQL Datastore

November  12,  2012   9  

Powerful but simple query language Ease of administration Good performance on read-heavy workloads where most of the data can fit into memory. Poor performance at huge scale Bad for write-heavy workloads

Page 11: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

FireWorks workflow engine

November  12,  2012   10  

Programmability. Scripting, not GUIs and DSL’s. Administration overhead. No extra servers. Flexibility. DB support, reconfiguring running workflows.

Re-runs Detours Duplicates Iteration

Why?!

Page 12: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Dissemination with REST

November  12,  2012   11  

https://www.materialsproject.org/rest/v1/materials/Fe2O3/vasp/energyPreamble Version Application I.D. Datatype Property

Page 13: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Web UI

November  12,  2012   12  

3-D model of unit cell

Disqus comment

button

Detailed structure

X-ray diffraction

pattern (interactive)

Bandstructure and Density of

states (interactive)

Calculation iterations

Comments

Page 14: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

November  12,  2012   13  

WE'RE DOING IT WRONG?

Page 15: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Running on HPC

•  Batch queues and large numbers of jobs with unpredictable runtimes

•  Talking to the database

November  12,  2012   14  

Page 16: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Data analytics

•  Scaling community contributions to code •  Scaling analytic functions

November  12,  2012   15  

Page 17: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Data V&V

•  Loading new data into a production resource •  Constant validation and verification

November  12,  2012   16  

Page 18: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Data dissemination

•  Security and privacy •  Query performance

November  12,  2012   17  

Page 19: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

FUTURE WORK

November  12,  2012   18  

Page 20: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Opening up data access

November  12,  2012   19  

Page 21: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

November  12,  2012   20  

Compute properties

Stability and

synthesis

Materials Project Source

ideas

User sandboxes

MP Workflow

(b)

(a)

(c)

(d)

(e)

pymatgen

MP

datastore

(f)

Towards materials design

Page 22: Community Accessible Datastore of High-Throughput ...datasys.cs.iit.edu/events/MTAGS12/s05.pdfof High-Throughput Calculations: Experiences from the Materials Project! Dan Gunter, Shreyas

Questions?

November  12,  2012   21