challenges and opportunities in using workflow technology

33
INSTITUTE OF NANOTECHNOLOGY KIT – The Research University in the Helmholtz Association www.kit.edu Challenges and Opportunities in using Workflow Technology for Reproducible and Reusable Simulation Protocols

Upload: others

Post on 19-Oct-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Challenges and Opportunities in using Workflow Technology

INSTITUTE OF NANOTECHNOLOGY

KIT – The Research University in the Helmholtz Association www.kit.edu

Challenges and Opportunities in using Workflow Technologyfor Reproducible and Reusable Simulation Protocols

Page 2: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges and Opportunities of Workflow Technology

Multiscale Materials Modelling and Virtual Design @KIT

Challenges in Materials Modelling

Opportunities of Workflow Technology

Challenges of Workflow Technology

9 July, 20192

Page 3: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual Design @KIT

Challenges in Materials Modelling

Opportunities of Workflow Technology

Challenges of Workflow Technology

9 July, 20193

Challenges and Opportunities of Workflow Technology

Page 4: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual DesignAG-Wenzel (INT)

9 July 20194

Development and application of methods for multi-scale simulations of nanoscale materials and devices

nanoscale electronics

• single molecule electronics

• organic electronics

carbon based systems

• graphene and carbon nanotubes

Workflows

• Rapid Prototyping with Graphical Workflow editor (SimStack)

Prof. Dr. Wolfgang Wenzel

à materials design and discovery

Page 5: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual DesignAG-Wenzel (INT)

9 July 20195

Page 6: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale simulation in Organic Electronics

1. Single molecule parametrization (QM)Geometry optimizationCustomized force-fields

2. Generation of atomistic morphologiesMolecules parametrized on quantum mechanical levelSimulation of physical vapor deposition

3. Calculation of charge hopping ratesFull quantum mechanical electronic structure analysisElectronic couplings, reorganization and orbital energies

4. Charge transport simulationsTime resolved charge carrier/exciton dynamicsIVs, IQEs, carrier balance, quenching, ...

9 July 20196

Content adapted from:

Page 7: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual Design @KIT

Challenges in Materials Modelling

Opportunities of Workflow Technology

Challenges of Workflow Technology

9 July, 20197

Challenges and Opportunities of Workflow Technology

Page 8: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges - Integration of new employeesNew student in the office

Reproduction of a given result as first task

Application of the solution to another data

set (Bachelor level)

Improvement of the given method (Master

level)

Development of a new method

(PhD Cand. level)

Reality

Bachelor Thesis – 3 months

Barely enough time to get familiar with the

work environment (command line, ssh,

HPC, software)

9 July 20198

Training of new Group Members

Content adapted from:

Page 9: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges – Reproducability/Replicability

MotivationContinue ongoing projectsBuild on work of others (Publications)Use existing data for Benchmarking/Datamining (Repository)Increase impact of own publications (citations)

9 July 20199

Reproducibility

xkcd.com Content adapted from:

Page 10: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Quality of Published Data (Journal/Database)

9 July 201910

Scientifically sound

Scientifically sound but missing

metadata

missing replicas

methodologigal errors

works only in one lab

Scientifically sound

vs.

fictive data

Challenges – Reproducability/Replicability

Page 11: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology9 July 201911

Friederich, Advanced Materials (2017)

Schaarschmidt, PLOS ONE (2014)

Preparation of input data

Missing steps/

customizations

e.g. Data sources

Hardware

Compiler options

Software Versions

à For Reproducibility, all

important aspects of a simulation

need to be captured

Challenges – Reproducability/Replicability

Page 12: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

4. Challenges - Scalability

RequirementsAvailability of ResourcesAutomatization of the simulation protocolHomogeneous input data

9 July 201912

Scalability

xkcd.com

Page 13: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges - Competence Drain

Highly specialized employee (PhD cand., etc.) implements method

Results in highly specialized hard to use software tool

Employee leavesKnowledge about usage of software tool leaves with employeeUsage/Maintenance/Support of software tool hard to impossible

9 July 201913

//Peter wrote this, nobody knows what it does, don't change it!Content adapted from:

Page 14: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges - InteroperabilitySoftware needs to be inter-operable

Multi-scale environmentsData-sharing

Software DevelopmentSoftware needs to be usable by other groupsSoftware aggregates need to be

preparedsharedarchivedpresented (Deliverables)

9 July 201914

Interoperability

//When I wrote this, only God and I understood what I was doing//Now, God only knows

Content adapted from:

Page 15: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Challenges – Scientific Group

9 July 201915

Training of new group members

Interoperability

Reproducibility

Scalability

Workflows

Content adapted from:

Page 16: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual Design @KIT

Challenges in Materials Modelling

Opportunities of Workflow Technology

Challenges of Workflow Technology

9 July, 201916

Challenges and Opportunities of Workflow Technology

Page 17: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Workflows

9 July 201917

A workflow represents the coordinated execution of repeatable computational steps while accounting for dependencies and concurrency of tasks.

https://github.com/MD-Studio/MDStudio/

Focus

• Automation• Speedup• Provenance• Innovation

Components

• Applications• (Web)services/Utilities• Libraries• Data• Control Elements

Abstraction Level

• Command Line• Script based• Workflow file• GUI

Backend

• HPC Clusters• Cloud/Grid Resources• Workstation

Page 18: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Workflows

9 July 201918

Focus

• Automation• Speedup• Provenance• Innovation

Components

• Applications• (Web)services/Utilities• Libraries• Data• Control Elements

Abstraction Level

• Command Line• Script based• Workflow file• GUI

Backend

• HPC Clusters• Cloud/Grid Resources• Workstation

Page 19: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

The multiscale simulation workflow for Organic Electronics

1. Single molecule parametrization (QM)Geometry optimizationCustomized force-fields

2. Generation of atomistic morphologiesMolecules parametrized on quantum mechanicallevelSimulation of physical vapor deposition

3. Calculation of charge hopping ratesFull quantum mechanical electronic structureanalysisElectronic couplings, reorganization and orbital energies

4. Charge transport simulationsTime resolved charge carrier/exciton dynamicsIVs, IQEs, carrier balance, quenching, ...

9 July 201919 Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

The multiscale simulation workflow for Organic Electronics

1. Single molecule parametrization (QM)Geometry optimizationCustomized force-fields

2. Generation of atomistic morphologiesMolecules parametrized on quantum mechanicallevelSimulation of physical vapor deposition

3. Calculation of charge hopping ratesFull quantum mechanical electronic structureanalysisElectronic couplings, reorganization and orbital energies

4. Charge transport simulationsTime resolved charge carrier/exciton dynamicsIVs, IQEs, carrier balance, quenching, ...

7 July 20193

Content adapted from:

Page 20: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

The multiscale simulation workflow for OrganicElectronics

à Traditional approach to multiscale modeling

High level of complexityof multiscale modeling:

Input preparationData transfer tocomputing resourcesJob submission andmonitoring,

9 July 201920

HPC

Content adapted from:

Page 21: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Translation enabled by SimStack

9 July 201921

Embedded Scientific modules = „WaNos“

Saved workflows forreproducible multiscalesimulations

Connect to remote computationalresources

Define input filesand parameters foreach module

Construct a workflowby drag & drop Content adapted from:

Page 22: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Translation enabled by SimStack

Open to arbitrary software modulesRapid prototyping: 30min to include new modules, 1 h to construct functional workflowsMaximal reusability and scalabilityModule interoperability:

Fully automated, file based data transfer between modulesSchema based data transfer in developmentCompatible with OWL ontologies (e.g. EMMO, once developed)

9 July, 201922

… a generic workflow platform conquering complexity

Content adapted from:

Page 23: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Deposit

Deposit is a Monte-Carlo tool to

generate organic thin-film

morphologies

Complex input language (roughly 80

parameters up front)

Minimum number of input files: 2

Minimum number of parameters,

which you have to actually know: 7

Learning curveLearn bash

Find out about the important parameters

Learn ssh

Scp files over

Learn specific qsub commands

Write (or at least adapt) a submission script

9 July 201923

Workflows @INT - Example Deposit

Content adapted from:

Page 24: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

The multiscale simulation workflow for Organic Electronics

9 July 201924

HPC

Translation by experts intoReusable workflow elements

Content adapted from:

Page 25: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Incorporation of Modules (Workflow Active Nodes - WaNos)

9 July 201925

Content adapted from:

Page 26: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology9 July 201926

Incorporation of Modules (Workflow Active Nodes - WaNos)

Content adapted from:

Page 27: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Deposit

Deposit is a Monte-Carlo tool to generate organic thin-film morphologies

Complex input language (roughly 80 parameters up front)

Minimum number of input files: 2

Minimum number of parameters, which you have to actually know: 7

9 July 201927

Workflows @INT - Example Deposit

Content adapted from:

Page 28: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

The multiscale simulation workflow for Organic Electronics

1. Single molecule parametrization (QM)Geometry optimizationCustomized force-fields

2. Generation of atomistic morphologiesMolecules parametrized on quantum mechanical levelSimulation of physical vapor deposition

3. Calculation of charge hopping ratesFull quantum mechanical electronic structure analysisElectronic couplings, reorganization and orbital energies

4. Charge transport simulationsTime resolved charge carrier/exciton dynamicsIVs, IQEs, carrier balance, quenching, ...

9 July 201928

Content adapted from:

Page 29: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Connection to Compute Resources

9 July 201929

HPC

Middleware actively developed in FZ JülichHandles User AuthenticationCan connect to all common schedulers

Handles data transfer between workflow stepsSetup in < 1h with installer provided by Nanomatch

Content adapted from:

Page 30: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Multiscale Materials Modelling and Virtual Design @KIT

Challenges in Materials Modelling

Opportunities of Workflow Technology

Challenges of Workflow Technology

9 July, 201930

Challenges and Opportunities of Workflow Technology

Page 31: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Reproducability of WorkflowsWhat needs to be captured?

Software, Tools, Scripts

WaNos

Workflow templates

Executed Workflows

Author

How to capture it

Containerization (e.g. udocker)

Versioning

Tags

Repositories

9 July 201931

https://github.com/indigo-dc/udocker

Page 32: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

Interoperability of workflow frameworks

9 July 201932

It is unlikely that one workflow framework will meet all the needs of the community

à Shared data Formats and/or Converters will be required

Page 33: Challenges and Opportunities in using Workflow Technology

Institute of NanotechnologyJörg Schaarschmidt - Challenges and Opportunities in using Workflow technology

AcknowledgementsProf. Dr. Wolfgang WenzelDr. Ivan Kondov (SCC)

Dr. Timo StrunkDr. Tobias Neumann

9 July 201933

Thank You!

[email protected]

http://www.nanomatch.com

http://www.simstack.eu