From Provenance Standards and Tools to Queries and Actionable Provenance Bertram Ludäscher et al. DataONE (Jones, Budden, Vieglais, … ) SKOPE (Bocinsky, Kintigh, Kohler, McPhillips, ...) KURATOR (Morris, McPhillips, Zhang, ...) WHOLE-TALE (Turk, Stodden, ...) YesWorkflow (McPhillips, ...)

Upload: bertram-ludaescher

Post on 17-Mar-2018

Page 1: From Provenance Standards and Tools to Queries and Actionable Provenance

From Provenance Standards and Tools to Queries and Actionable Provenance

Bertram Ludäscher et al.
DataONE (Jones, Budden, Vieglais, …)
SKOPE (Bocinsky, Kintigh, Kohler, McPhillips, ...)
KURATOR (Morris, McPhillips, Zhang, ...)
WHOLE-TALE (Turk, Stodden, ...)
YesWorkflow (McPhillips, ...)

Page 2: From Provenance Standards and Tools to Queries and Actionable Provenance

Provenance (Lineage) matters …

• One of these sold for $180M, the other one for $22K (but could be worth more ... definitely maybe ...)

• Which one would you like to own?

Ludäscher: Queries & Actionable Provenance 2

Page 3: From Provenance Standards and Tools to Queries and Actionable Provenance

Provenance (Lineage) matters …

• One of these sold for $180M, the other one for …
• … $450M!!!

Page 4: From Provenance Standards and Tools to Queries and Actionable Provenance

Provenance is: keeping records …

• Grand Canyon's rock layers are a record of the early geologic history of North America. The Ancestral Puebloan granaries at Nankoweap Creek tell archaeologists about more recent human history. (By Drenaline, licensed under CC BY-SA 3.0)

• Not shown: computational archaeologists reconstructing past climate from multiple tree-ring databases → computational provenance is key for transparency & reproducibility

Page 5: From Provenance Standards and Tools to Queries and Actionable Provenance

... and provenance is: Understanding what happened!

Zrzavý, Jan, David Storch, and Stanislav Mihulka. Evolution: Ein Lese-Lehrbuch. Springer-Verlag, 2009.

Author: Jkwchui (Based on drawing by Truth-seeker2004)

Page 6: From Provenance Standards and Tools to Queries and Actionable Provenance

"The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the village watchman, who just puts down what he damn pleases.”

Why we need data lineage and computational provenance

Page 7: From Provenance Standards and Tools to Queries and Actionable Provenance

Computational Provenance …

• Origin, processing history of artifacts
  – data products, figures, ...
  – also: underlying workflow → understand methods, dataflow, and dependencies

Climate Change Impacts in the United States
U.S. National Climate Assessment, U.S. Global Change Research Program

Page 8: From Provenance Standards and Tools to Queries and Actionable Provenance

Evolution towards the Living Paper

• 1st Generation:
  – narrative (prose)

• 2nd Generation: plus …
  – name .. identify .. include (access to) data

• 3rd Generation: plus …
  – name .. reference .. include code (software) ..
  – and provenance … and exec environment (containers)

Whole Tale

Whole Tale Dashboard

Page 9: From Provenance Standards and Tools to Queries and Actionable Provenance

DataONE: Search and Provenance Display

Page 10: From Provenance Standards and Tools to Queries and Actionable Provenance

DataONE: Search and Provenance Display

Page 11: From Provenance Standards and Tools to Queries and Actionable Provenance

DataONE: Search and Provenance Display

Page 12: From Provenance Standards and Tools to Queries and Actionable Provenance

Adding YesWorkflow to DataONE

Yaxing's script with inputs & output products

Christopher's YesWorkflow model

Christopher using Yaxing's outputs as inputs for his script

Christopher's results can be traced back all the way to Yaxing's input

Page 13: From Provenance Standards and Tools to Queries and Actionable Provenance

Workflows ↔ Provenance: an important link!

Runtime Provenance (a.k.a. traces, logs, retrospective provenance, "Trace-land")

Workflow Modeling & Design (a.k.a. prospective provenance, "Workflow-land")

Page 14: From Provenance Standards and Tools to Queries and Actionable Provenance

[Figure: model diagram with three parts: Trace, Workflow, and Data (extensible)]

See purl.dataone.org/provone-v1-dev

Page 15: From Provenance Standards and Tools to Queries and Actionable Provenance

Provenance Support for Reproducible Science
Example: Paleoclimate Reconstruction

Science paper (OA) uses:
• open source code:
  – R, PaleoCAR, …

• Is that all we need?
• What was the "workflow"?
• Is there prospective and/or retrospective provenance?

Page 16: From Provenance Standards and Tools to Queries and Actionable Provenance

SKOPE: Synthesized Knowledge Of Past Environments

Bocinsky, Kohler et al. study rain-fed maize of Anasazi
– Four Corners; AD 600–1500.
– Climate change influenced Mesa Verde migrations; late 13th century AD.
– Uses a network of tree-ring chronologies to reconstruct a spatio-temporal climate field at a fairly high resolution (~800 m) from AD 1–2000.
– Algorithm estimates joint information in tree rings and a climate signal to identify the "best" tree-ring chronologies for climate reconstruction.

K. Bocinsky, T. Kohler, A 2000-year reconstruction of the rain-fed maize agricultural niche in the US Southwest. Nature Communications. doi:10.1038/ncomms6618

… implemented as an R Script …

Page 17: From Provenance Standards and Tools to Queries and Actionable Provenance

YesWorkflow: Prospective & Retrospective Provenance … (almost) for free!

• YW annotations in a (Python, R, …) script recreate a workflow view from the script …

[Figure: YesWorkflow (YW) graph of the data-collection script.
Steps: initialize_run, load_screening_results, calculate_strategy, log_rejected_sample, collect_data_set, transform_images, log_average_image_intensity.
Data: cassette_id; sample_score_cutoff; sample_spreadsheet (file:cassette_{cassette_id}_spreadsheet.csv); calibration_image (file:calibration.img); run_log (file:run/run_log.txt); sample_name; sample_quality; rejected_sample; accepted_sample; num_images; energies; rejection_log (file:/run/rejected_samples.txt); sample_id; energy; frame_number; raw_image (file:run/raw/{cassette_id}/{sample_id}/e{energy}/image_{frame_number}.raw); corrected_image (file:data/{sample_id}/{sample_id}_{energy}eV_{frame_number}.img); total_intensity; pixel_count; corrected_image_path; collection_log (file:run/collected_images.csv)]

YW annotation tags: @BEGIN .. @END .. @IN .. @OUT .. @URI .. @LOG ..
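To make the annotation idea concrete, here is a minimal sketch of how comment tags like these could be pulled out of a script and turned into a step-level view. It is not the YesWorkflow toolkit: the regular expression, the `extract_blocks` helper, and the inputs listed for `collect_data_set` are assumptions for illustration, and nested blocks and the @LOG tag are ignored.

```python
import re

# Hypothetical annotated script text. Step and data names follow the workflow
# figure; the inputs listed for collect_data_set are an assumption.
SCRIPT = '''
# @BEGIN collect_data_set
# @IN accepted_sample
# @IN num_images
# @IN energies
# @OUT raw_image @URI file:run/raw/{cassette_id}/{sample_id}/e{energy}/image_{frame_number}.raw
# @END collect_data_set

# @BEGIN transform_images
# @IN raw_image
# @IN calibration_image @URI file:calibration.img
# @OUT corrected_image @URI file:data/{sample_id}/{sample_id}_{energy}eV_{frame_number}.img
# @END transform_images
'''

# One tag per comment line: the tag, a name, and an optional @URI template.
TAG = re.compile(r"#\s*@(BEGIN|END|IN|OUT)\s+(\S+)(?:\s+@URI\s+(\S+))?")

def extract_blocks(text):
    """Return {block: {"in": [...], "out": [...], "uri": {name: template}}}."""
    blocks, current = {}, None
    for line in text.splitlines():
        match = TAG.search(line)
        if not match:
            continue
        tag, name, uri = match.groups()
        if tag == "BEGIN":
            current = blocks.setdefault(name, {"in": [], "out": [], "uri": {}})
        elif tag == "END":
            current = None
        elif current is not None:
            current[tag.lower()].append(name)
            if uri:
                current["uri"][name] = uri
    return blocks

blocks = extract_blocks(SCRIPT)
for name, block in blocks.items():
    print(name, "reads", block["in"], "writes", block["out"])
```

Because the annotations live in comments, the same extraction works regardless of the host language, which is the point of the "(Python, R, …)" remark above.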

Page 18: From Provenance Standards and Tools to Queries and Actionable Provenance

Paleoclimate Reconstruction (openSKOPE.org)
• … explained using YesWorkflow!

Kyle B., (computational) archaeologist: "It took me about 20 minutes to comment. Less than an hour to learn and YW-annotate, all-told."

[Figure: YesWorkflow graph of the paleoclimate reconstruction script.
Node labels (steps and data): master_data_directory, prism_directory, tree_ring_data, calibration_years, retrodiction_years, GetModernClimate, PRISM_annual_growing_season_precipitation, SubsetAllData, dendro_series_for_calibration, dendro_series_for_reconstruction, CAR_Analysis_unique, cellwise_unique_selected_linear_models, CAR_Analysis_union, cellwise_union_selected_linear_models, CAR_Reconstruction_union, raster_brick_spatial_reconstruction, raster_brick_spatial_reconstruction_errors, CAR_Reconstruction_union_output, ZuniCibola_PRISM_grow_prcp_ols_loocv_union_recons.tif, ZuniCibola_PRISM_grow_prcp_ols_loocv_union_errors.tif]

Page 19: From Provenance Standards and Tools to Queries and Actionable Provenance

YW Demo Use Cases (IDCC'17)

Domain | Use case | Programming language | Provenance methods
Climate science | C3C4 | MATLAB | YW + MATLAB RunManager
Astrophysics | LIGO | Python | YW + NW (code-level)
Protein crystal samples | Simulate data collection | Python | YW + NW (code-level)
Biodiversity data curation | kurator-SPNHC | Python | YW-recon + YW-logging
Social network analysis | Twitter | Python | YW + NW (file-level)
Oceanography | OHIBC Howe Sound (multi-run multi-script) | R | YW + R RunManager

Page 20: From Provenance Standards and Tools to Queries and Actionable Provenance

YW-RECON: Prospective & Retrospective Provenance … (almost) for free!

Provenance @ DUG-2017 20

[Figure: the YesWorkflow graph of the data-collection script (see Page 17), shown next to the files that a run of the script left behind:]

run/
├── raw
│   └── q55
│       ├── DRT240
│       │   ├── e10000
│       │   │   ├── image_001.raw
│       │   │   ...
│       │   │   └── image_037.raw
│       │   └── e11000
│       │       ├── image_001.raw
│       │       ...
│       │       └── image_037.raw
│       └── DRT322
│           ├── e10000
│           │   ├── image_001.raw
│           │   ...
│           │   └── image_030.raw
│           └── e11000
│               ├── image_001.raw
│               ...
│               └── image_030.raw
├── data
│   ├── DRT240
│   │   ├── DRT240_10000eV_001.img
│   │   ...
│   │   └── DRT240_11000eV_037.img
│   └── DRT322
│       ├── DRT322_10000eV_001.img
│       ...
│       └── DRT322_11000eV_030.img
├── collected_images.csv
├── rejected_samples.txt
└── run_log.txt

• URI templates link conceptual entities to runtime provenance "left behind" by the script author …

• … facilitating provenance reconstruction
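As a rough illustration of this reconstruction step, the sketch below matches file paths against a URI template and recovers the variable bindings encoded in each path. It is a Python stand-in, not the YW-recon implementation; `template_to_regex`, `bindings`, and the short `FILES` list (a few paths taken from the directory listing) are assumptions for illustration.

```python
import re

RAW_TEMPLATE = "run/raw/{cassette_id}/{sample_id}/e{energy}/image_{frame_number}.raw"

def template_to_regex(template):
    """Compile a URI template into a regex with one named group per {variable}."""
    parts = re.split(r"\{(\w+)\}", template)
    seen, pattern = set(), ""
    for i, part in enumerate(parts):
        if i % 2 == 0:            # literal text between variables
            pattern += re.escape(part)
        elif part in seen:        # repeated variable must have the same value
            pattern += f"(?P={part})"
        else:                     # first occurrence: capture one path or name piece
            seen.add(part)
            pattern += f"(?P<{part}>[^/_]+)"
    return re.compile(pattern)

def bindings(template, paths):
    """Return one {variable: value} dict for every path that fits the template."""
    regex = template_to_regex(template)
    return [m.groupdict() for m in map(regex.fullmatch, paths) if m]

# A few of the files from the listing above (illustrative subset).
FILES = [
    "run/raw/q55/DRT240/e10000/image_001.raw",
    "run/raw/q55/DRT240/e11000/image_037.raw",
    "run/raw/q55/DRT322/e10000/image_001.raw",
    "run/raw/q55/DRT322/e11000/image_030.raw",
    "run/run_log.txt",
]

rows = bindings(RAW_TEMPLATE, FILES)
samples = sorted({r["sample_id"] for r in rows})
energies = sorted({r["energy"] for r in rows if r["sample_id"] == "DRT322"})
print(samples)
print(energies)
```

Once every matching file has been turned into a row of bindings, questions such as "which samples?" or "which energies for a given sample?" reduce to simple selections over those rows.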

Page 21: From Provenance Standards and Tools to Queries and Actionable Provenance

Q1: What samples did the script run collect images from?

[Figure: same YesWorkflow graph and run/ directory listing as on Page 20]

Page 22: From Provenance Standards and Tools to Queries and Actionable Provenance

Q2: What energies were used for image collection from sample DRT322?

[Figure: same YesWorkflow graph and run/ directory listing as on Page 20]

Page 23: From Provenance Standards and Tools to Queries and Actionable Provenance

Q3: Where is the raw image of the corrected image DRT322_11000eV_030.img?

[Figure: same YesWorkflow graph and run/ directory listing as on Page 20]

Page 24: From Provenance Standards and Tools to Queries and Actionable Provenance

Q5: What cassette-id had the sample leading to DRT240_10000eV_001.img?

[Figure: same YesWorkflow graph and run/ directory listing as on Page 20]
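Q3 and Q5 go from a corrected image back to its raw image. One way to sketch this, assuming the two URI templates from the workflow graph and file names spelled as in the directory listing, is to read the bindings off the corrected file name and substitute them into the raw-image template. The helper `raw_images_for` and the `PATHS` list are hypothetical; the cassette id does not occur in the corrected file name, so it is left as a wildcard and read off the matching raw path.

```python
import re
from fnmatch import fnmatchcase

RAW = "run/raw/{cassette_id}/{sample_id}/e{energy}/image_{frame_number}.raw"
# Regex form of the template {sample_id}_{energy}eV_{frame_number}.img
CORRECTED = re.compile(
    r"(?P<sample_id>[^/_]+)_(?P<energy>\d+)eV_(?P<frame_number>\d+)\.img"
)

def raw_images_for(corrected_name, existing_paths):
    """Return raw paths sharing sample, energy and frame with a corrected image."""
    match = CORRECTED.fullmatch(corrected_name)
    if match is None:
        return []
    # cassette_id is unknown at this point, so it becomes a wildcard.
    pattern = RAW.format(cassette_id="*", **match.groupdict())
    return [p for p in existing_paths if fnmatchcase(p, pattern)]

# Illustrative subset of the raw files in the listing.
PATHS = [
    "run/raw/q55/DRT240/e10000/image_001.raw",
    "run/raw/q55/DRT240/e11000/image_037.raw",
    "run/raw/q55/DRT322/e10000/image_001.raw",
    "run/raw/q55/DRT322/e11000/image_030.raw",
]

q3 = raw_images_for("DRT322_11000eV_030.img", PATHS)
q5 = {p.split("/")[2] for p in raw_images_for("DRT240_10000eV_001.img", PATHS)}
print(q3)
print(q5)
```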

Page 25: From Provenance Standards and Tools to Queries and Actionable Provenance

Hybrid Provenance: YW Model + Runtime Observables (file level)

• The YW model can be connected with runtime observables

• → YW recon (prov reconstruction)

• Here: What specific files were read, written and where do they occur in the workflow?

Page 26: From Provenance Standards and Tools to Queries and Actionable Provenance

C3-C4 Prospective Provenance

[Figure: YesWorkflow graph of C3_C4_map_present_NA.
Steps: fetch_SYNMAP_land_cover_map_variable, fetch_monthly_mean_air_temperature_data, fetch_monthly_mean_precipitation_data, initialize_Grass_Matrix, examine_pixels_for_grass, generate_netcdf_file_for_C3_fraction, generate_netcdf_file_for_C4_fraction, generate_netcdf_file_for_Grass_fraction.
Inputs: SYNMAP_land_cover_map_data (inputs/land_cover/SYNMAP_NA_QD.nc); mean_airtemp (file:inputs/narr_air.2m_monthly/air.2m_monthly_{start_year}_{end_year}_mean.{month}.nc); mean_precip (file:inputs/narr_apcp_rescaled_monthly/apcp_monthly_{start_year}_{end_year}_mean.{month}.nc).
Intermediate data: lon_variable, lat_variable, lon_bnds_variable, lat_bnds_variable, Tair_Matrix, Rain_Matrix, Grass_variable, C3_Data, C4_Data.
Outputs: C3_fraction_data (file:outputs/SYNMAP_PRESENTVEG_C3Grass_RelaFrac_NA_v2.0.nc); C4_fraction_data (file:outputs/SYNMAP_PRESENTVEG_C4Grass_RelaFrac_NA_v2.0.nc); Grass_fraction_data (file:outputs/SYNMAP_PRESENTVEG_Grass_Fraction_NA_v2.0.nc)]

Page 27: From Provenance Standards and Tools to Queries and Actionable Provenance

What does C4_fraction_data depend on?

[Figure: upstream subgraph of C4_fraction_data next to the overall workflow graph of C3_C4_map_present_NA (see Page 26).
Upstream steps: fetch_SYNMAP_land_cover_map_variable, fetch_monthly_mean_air_temperature_data, fetch_monthly_mean_precipitation_data, examine_pixels_for_grass, generate_netcdf_file_for_C4_fraction.
Upstream data: lon_variable, lat_variable, lon_bnds_variable, lat_bnds_variable, Tair_Matrix, Rain_Matrix, C4_Data.
Upstream inputs: SYNMAP_land_cover_map_data, mean_airtemp, mean_precip]

C4_fraction_data lineage very similar to overall workflow graph!

Page 28: From Provenance Standards and Tools to Queries and Actionable Provenance

What does Grass_fraction_data depend on?

[Figure: overall workflow graph of C3_C4_map_present_NA (see Page 26) next to the upstream subgraph of Grass_fraction_data.
Upstream steps: fetch_SYNMAP_land_cover_map_variable, initialize_Grass_Matrix, generate_netcdf_file_for_Grass_fraction.
Upstream data: lon_variable, lat_variable, lon_bnds_variable, lat_bnds_variable, Grass_variable.
Upstream inputs: SYNMAP_land_cover_map_data]

Grass_fraction_data lineage different from overall workflow graph!
- Smaller subgraph
- Depends on only 1 of 3 inputs!
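The two "depends on" queries amount to a backward reachability search over the workflow graph. The sketch below encodes the workflow as a step table and walks it upstream. The node names come from the slides; the edges in `STEPS` are an assumption, chosen so that they agree with the two query results shown, since the slide transcript does not record the edges themselves.

```python
# Assumed wiring: step -> (inputs, outputs). Names are from the slides,
# the edges are a guess consistent with the two lineage results shown.
COORDS = ["lon_variable", "lat_variable", "lon_bnds_variable", "lat_bnds_variable"]
STEPS = {
    "fetch_SYNMAP_land_cover_map_variable": (["SYNMAP_land_cover_map_data"], COORDS),
    "fetch_monthly_mean_air_temperature_data": (["mean_airtemp"], ["Tair_Matrix"]),
    "fetch_monthly_mean_precipitation_data": (["mean_precip"], ["Rain_Matrix"]),
    "initialize_Grass_Matrix": (["lon_variable", "lat_variable"], ["Grass_variable"]),
    "examine_pixels_for_grass": (
        ["Tair_Matrix", "Rain_Matrix", "lon_variable", "lat_variable"],
        ["C3_Data", "C4_Data"],
    ),
    "generate_netcdf_file_for_C3_fraction": (["C3_Data"] + COORDS, ["C3_fraction_data"]),
    "generate_netcdf_file_for_C4_fraction": (["C4_Data"] + COORDS, ["C4_fraction_data"]),
    "generate_netcdf_file_for_Grass_fraction": (
        ["Grass_variable"] + COORDS,
        ["Grass_fraction_data"],
    ),
}
WORKFLOW_INPUTS = {"SYNMAP_land_cover_map_data", "mean_airtemp", "mean_precip"}

def upstream(target):
    """Return (steps, data) that `target` transitively depends on."""
    producer = {out: step for step, (_, outs) in STEPS.items() for out in outs}
    steps, data, todo = set(), set(), [target]
    while todo:
        item = todo.pop()
        if item in data:
            continue
        data.add(item)
        step = producer.get(item)   # None for workflow inputs
        if step is not None:
            steps.add(step)
            todo.extend(STEPS[step][0])
    return steps, data

for product in ("C4_fraction_data", "Grass_fraction_data"):
    steps, data = upstream(product)
    print(product, "<-", sorted(data & WORKFLOW_INPUTS))
```

Under these assumed edges, C4_fraction_data reaches all three workflow inputs, while Grass_fraction_data reaches only SYNMAP_land_cover_map_data, matching the "1 of 3 inputs" observation.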

Page 29: From Provenance Standards and Tools to Queries and Actionable Provenance

What happens after running the script? Hybrid provenance graph!

• 3 inputs spread across 25 (= 2×12 + 1) files

• Do all 3 output files depend on all 25 inputs?
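The file count follows from expanding the two monthly URI templates over twelve months and adding the single land-cover file. A small sketch, assuming the templates from the prospective graph and the 2000 to 2010 year range that appears in the observed file names (the `monthly` helper is hypothetical):

```python
AIR = "inputs/narr_air.2m_monthly/air.2m_monthly_{start_year}_{end_year}_mean.{month}.nc"
PRECIP = "inputs/narr_apcp_rescaled_monthly/apcp_monthly_{start_year}_{end_year}_mean.{month}.nc"
LAND_COVER = "inputs/land_cover/SYNMAP_NA_QD.nc"

def monthly(template, start_year=2000, end_year=2010):
    """Expand a monthly URI template into its twelve concrete file names."""
    return [
        template.format(start_year=start_year, end_year=end_year, month=month)
        for month in range(1, 13)
    ]

input_files = {
    "SYNMAP_land_cover_map_data": [LAND_COVER],
    "mean_airtemp": monthly(AIR),
    "mean_precip": monthly(PRECIP),
}
total = sum(len(files) for files in input_files.values())
print(total)  # 25
```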

[Figure: hybrid provenance graph. The overall workflow graph of C3_C4_map_present_NA (see Page 26) with the files observed at run time attached to its inputs and outputs.
SYNMAP_land_cover_map_data: inputs/land_cover/SYNMAP_NA_QD.nc
mean_airtemp: inputs/narr_air.2m_monthly/air.2m_monthly_2000_2010_mean.{1..12}.nc (12 files)
mean_precip: inputs/narr_apcp_rescaled_monthly/apcp_monthly_2000_2010_mean.{1..12}.nc (12 files)
C3_fraction_data: outputs/SYNMAP_PRESENTVEG_C3Grass_RelaFrac_NA_v2.0.nc
C4_fraction_data: outputs/SYNMAP_PRESENTVEG_C4Grass_RelaFrac_NA_v2.0.nc
Grass_fraction_data: outputs/SYNMAP_PRESENTVEG_Grass_Fraction_NA_v2.0.nc]

Page 30: From Provenance Standards and Tools to Queries and Actionable Provenance

What C4_fraction_data depends on (hybrid) …

[Figure, left: earlier prospective query result, the upstream subgraph of C4_fraction_data (see Page 27).
Figure, right: the same subgraph with the files observed at run time attached.
C4_fraction_data: outputs/SYNMAP_PRESENTVEG_C4Grass_RelaFrac_NA_v2.0.nc
SYNMAP_land_cover_map_data: inputs/land_cover/SYNMAP_NA_QD.nc
mean_airtemp: inputs/narr_air.2m_monthly/air.2m_monthly_2000_2010_mean.{1..12}.nc (12 files)
mean_precip: inputs/narr_apcp_rescaled_monthly/apcp_monthly_2000_2010_mean.{1..12}.nc (12 files)]

Page 31: From Provenance Standards and Tools to Queries and Actionable Provenance

What Grass_fraction_data depends on (hybrid) …

[Figure, three panels:
Overall workflow (see Page 26).
Upstream of Grass_fraction_data (prospective): steps fetch_SYNMAP_land_cover_map_variable, initialize_Grass_Matrix, generate_netcdf_file_for_Grass_fraction; data lon_variable, lat_variable, lon_bnds_variable, lat_bnds_variable, Grass_variable; input SYNMAP_land_cover_map_data.
Upstream of Grass_fraction_data (hybrid): the same subgraph with files attached: Grass_fraction_data: outputs/SYNMAP_PRESENTVEG_Grass_Fraction_NA_v2.0.nc; SYNMAP_land_cover_map_data: inputs/land_cover/SYNMAP_NA_QD.nc]

# @BEGIN Gravitational_Wave_Detection
# @IN fn_d @as FN_Detector
# @IN fn_sr @as FN_Sampling_Rate
# @OUT shifted.wav @as shifted_wave
# @OUT whitenbp.wav @as whitened_bandpass

import numpy as np
from scipy import signal

# @BEGIN Amplitude_Spectral_Density
# @IN strain_H1
# @IN strain_L1
# @PARAM fs
# @OUT psd_H1
# @OUT psd_L1
# @OUT GW150914_ASDs.png @URI …

NFFT = 1*fs
fmin, fmax = 10, 2000

YesWorkflow-annotated scripts

File I/O Events

Log files

Logic rules for reconstructing, querying, and visualizing prospective and retrospective provenance together

upstream(strain_L1_whitenbp) [NW-recon]

Steps: LOAD_DATA, AMPLITUDE_SPECTRAL_DENSITY, WHITENING, BANDPASSING
FN_Detector: fn_d = L-L1_LOSC_4_V1-1126259446-32.hdf5
fs: fs = 4096
strain_L1: strain_L1 = array([-1.779e-18, -1.765e-18, ..., -1.719e-18])
PSD_L1: psd_L1 = scipy.interpolate.interpolate.interp1d object at 0x113969418
strain_L1_whiten: strain_L1_whiten = array([8.494, -1.672, ..., 72.156])
strain_L1_whitenbp: strain_L1_whitenbp = array([8.184, 19.935, ..., -0.684])

upstream(strain_L1_whitenbp) [prospective]

Steps: LOAD_DATA, AMPLITUDE_SPECTRAL_DENSITY, WHITENING, BANDPASSING
FN_Detector: file:{Detector}_LOSC_4_V1-...
FN_Sampling_rate: file:H-H1_LOSC_{Rate}_V1-...
fs
strain_H1, strain_L1
PSD_H1, PSD_L1
strain_H1_whiten, strain_L1_whiten
strain_L1_whitenbp

upstream(strain_L1_whitenbp) [URI-recon]

Steps: LOAD_DATA, AMPLITUDE_SPECTRAL_DENSITY, WHITENING, BANDPASSING
FN_Detector: L-L1_LOSC_4_V1-1126259446-32.hdf5, H-H1_LOSC_4_V1-1126259446-32.hdf5
FN_Sampling_rate: H-H1_LOSC_4_V1-1126259446-32.hdf5, H-H1_LOSC_16_V1-1126259446-32.hdf5
fs
strain_H1, strain_L1
PSD_H1, PSD_L1
strain_H1_whiten, strain_L1_whiten
strain_L1_whitenbp

Provenance Recorders

Function call graph and variable dependencies

Raw runtime observations

YesWorkflow toolkit: Extract annotations and model script as a workflow

YesWorkflow toolkit: Reconstruct script run and retrospective provenance

YesWorkflow toolkit: Render workflow model graphically

Prospective Provenance: user-defined workflow models

Hybrid Provenance

General purpose provenance bridges

Provenance queries: Query provenance (esp. graphs) and visualize results

Provenance Exporters: Query and visualize provenance

noWorkflow toolkit: Query and visualize provenance

Retrospective Provenance: Python runtime observables

prospective + code-level runtime observables subgraph

NW_FILTERED_LINEAGE_GRAPH_FOR_STRAIN_L1_WHITENBP

[Figure: noWorkflow lineage graph; each node is labeled with a script line number and the value observed there. Function node: whiten]

141 fn_d = 'L-L1_LOSC_4_V1-1126259446-32.hdf5'
142 loaddata = (array([ -1.77955839e-18, ... 1, 1, 1], dtype=uint32)})
142 time_L1 = array([ 1.12625945e+09, ... 8e+09, 1.12625948e+09])
142 strain_L1 = array([ -1.77955839e-18, ... 6e-18, -1.71969299e-18])
151 fs = 4096
153 time = array([ 1.12625945e+09, ... 8e+09, 1.12625948e+09])
155 dt = 0.000244140625
266 NFFT = 4096
270 psd = (array([ 2.22851728e-36, ... e+03, 2.04800000e+03]))
270 freqs = array([ 0.00000000e+00, ... 0e+03, 2.04800000e+03])
270 Pxx_L1 = array([ 2.22851728e-36, ... 5e-46, 1.77059496e-46])
274 psd_L1 = <scipy.interpolate.interp ... 1d object at 0x1095b0260>
325 strain = array([ -1.77955839e-18, ... 6e-18, -1.71969299e-18])
325 interp_psd = <scipy.interpolate.interp ... 1d object at 0x1095b0260>
325 dt = 0.000244140625
326 len
326 Nt = 131072
327 rfftfreq = array([ 0.00000000e+00, ... 5e+03, 2.04800000e+03])
327 freqs = array([ 0.00000000e+00, ... 5e+03, 2.04800000e+03])
331 rfft = array([ -2.39692348e-13 + ... 54e-19 +0.00000000e+00j])
331 hf = array([ -2.39692348e-13 + ... 54e-19 +0.00000000e+00j])
332 (np.sqrt(interp_psd(freqs) /dt/2.))
332 white_hf = array([ -3.54798023e+03 + ... 58e+02 +0.00000000e+00j])
333 irfft = array([ 8.49413154, -1. ... .39942945, 72.15659253])
333 white_ht = array([ 8.49413154, -1. ... .39942945, 72.15659253])
334 return = array([ 8.49413154, -1. ... .39942945, 72.15659253])
338 strain_L1_whiten = array([ 8.49413154, -1. ... .39942945, 72.15659253])
362 butter = (array([ 0.0012848 , 0. ... 9166733 , 0.32217438]))
362 ab = array([ 1. , -6. ... .9166733 , 0.32217438])
362 bb = array([ 0.0012848 , 0. ... 0. , 0.0012848 ])
364 filtfilt = array([ 8.18464884, 19. ... .18198039, -0.68432653])
364 strain_L1_whitenbp = array([ 8.18464884, 19. ... .18198039, -0.68432653])

[Figure: full noWorkflow retrospective provenance graph of the GW150914 script run. Nodes are function activations and variables labeled with script line numbers (135 to 1015); clusters are drawn for the calls to whiten, get_filter_coefs, iir_bandstops, filter_data, write_wavfile, and reqshift. Other recorded calls include loaddata, psd, butter, filtfilt, specgram, decimate, and dq_channel_to_seglist. Output files shown: GW150914_strain.png, GW150914_ASDs.png, GW150914_strain_whitened.png, GW150914_H1_spectrogram.png, GW150914_L1_spectrogram.png, GW150914_H1_spectrogram_whitened.png, GW150914_L1_spectrogram_whitened.png, GW150914_filter.png, GW150914_H1_strain_unfiltered.png, GW150914_H1_strain_filtered.png, GW150914_H1_whitenbp.wav, GW150914_L1_whitenbp.wav, GW150914_NR_whitenbp.wav, GW150914_H1_shifted.wav, GW150914_L1_shifted.wav, GW150914_NR_shifted.wav, GW150914_H1_ASD_16384.png, GW150914_H1_ASD_16384_zoom.png, GW150914_H1_ASD_4096_zoom.png.]

• Workflow model (graph): Facts (Prolog)
• Reconstructed provenance: Facts (Prolog)
• Run observations: Facts (Prolog)
• prospective + file-level runtime observables


Page 32: From Provenance Standards and Tools to Queries and Actionable Provenance

LIGO example: What strain_L1_whitenbp depends on …

Overall workflow

Upstream of strain_L1_whitenbp (prospective)

GRAVITATIONAL_WAVE_DETECTION
• LOAD_DATA: Load hdf5 data. Data: strain_H1, strain_L1, strain_16, strain_4
• AMPLITUDE_SPECTRAL_DENSITY: Amplitude spectral density. Data: ASDs (file:GW150914_ASDs.png), PSD_H1, PSD_L1
• WHITENING: Suppress low-frequency noise. Data: strain_H1_whiten, strain_L1_whiten
• BANDPASSING: Remove high-frequency noise. Data: strain_H1_whitenbp, strain_L1_whitenbp
• STRAIN_WAVEFORM_FOR_WHITENED_DATA: Plot whitened data. Data: WHITENED_strain_data (file:GW150914_strain_whitened.png)
• SPECTROGRAMS_FOR_STRAIN_DATA: Plot spectrogram for strain data. Data: spectrogram (file:GW150914_{detector}_spectrogram.png)
• SPECTROGRAMS_FOR_WHITEND_DATA: Plot spectrogram for whitened data. Data: spectrogram_whitened (file:GW150914_{detector}_spectrogram_whitened.png)
• FILTER_COEFS: Filter signal in time domain (bandpassing). Data: COEFFICIENTS
• FILTER_DATA: Filter data. Data: filtered_white_noise_data (file:GW150914_filter.png), strain_H1_filt, strain_L1_filt
• STRAIN_WAVEFORM_FOR_FILTERED_DATA: Plot the filtered data. Data: H1_strain_filtered (file:GW150914_H1_strain_filtered.png), H1_strain_unfiltered (file:GW150914_H1_strain_unfiltered.png)
• WAVE_FILE_GENERATOR_FOR_WHITENED_DATA: Make sound files for whitened data. Data: whitened_bandpass_wavefile (file:GW150914_{detector}_whitenbp.wav)
• SHIFT_FREQUENCY_BANDPASSED: Shift frequency of bandpassed signal. Data: strain_H1_shifted, strain_L1_shifted
• WAVE_FILE_GENERATOR_FOR_SHIFTED_DATA: Make sound files for shifted data. Data: shifted_wavefile (file:GW150914_{detector}_shifted.wav)
• DOWNSAMPLING: Downsampling from 16384 Hz to 4096 Hz. Data: H1_ASD_SamplingRate (file:GW150914_H1_ASD_{SamplingRate}.png)
• Inputs: FN_Detector (file:{Detector}_LOSC_4_V1-1126259446-32.hdf5), FN_Sampling_rate (file:H-H1_LOSC_{DownSampling}_V1-1126259446-32.hdf5), fs

upstream(strain_L1_whitenbp) [prospective]
• WHITENING: strain_H1_whiten, strain_L1_whiten
• AMPLITUDE_SPECTRAL_DENSITY: PSD_H1, PSD_L1
• LOAD_DATA: strain_H1, strain_L1
• BANDPASSING: strain_L1_whitenbp
• Inputs: FN_Detector (file:{Detector}_LOSC_4_V1-...), FN_Sampling_rate (file:H-H1_LOSC_{Rate}_V1-...), fs

upstream(strain_L1_whitenbp) [URI-recon]
• WHITENING: strain_H1_whiten, strain_L1_whiten
• AMPLITUDE_SPECTRAL_DENSITY: PSD_H1, PSD_L1
• LOAD_DATA: strain_H1, strain_L1
• BANDPASSING: strain_L1_whitenbp
• FN_Detector: L-L1_LOSC_4_V1-1126259446-32.hdf5, H-H1_LOSC_4_V1-1126259446-32.hdf5
• FN_Sampling_rate: H-H1_LOSC_4_V1-1126259446-32.hdf5, H-H1_LOSC_16_V1-1126259446-32.hdf5
• fs

upstream(strain_L1_whitenbp) [NW-recon]
• WHITENING: strain_L1_whiten (strain_L1_whiten = array([8.494, -1.672, ..., 72.156]))
• AMPLITUDE_SPECTRAL_DENSITY: PSD_L1 (psd_L1 = scipy.interpolate.interpolate.interp1d object at 0x113969418)
• LOAD_DATA: strain_L1 (strain_L1 = array([-1.779e-18, -1.765e-18, ..., -1.719e-18]))
• BANDPASSING: strain_L1_whitenbp (strain_L1_whitenbp = array([8.184, 19.935, ..., -0.684]))
• FN_Detector (fn_d = L-L1_LOSC_4_V1-1126259446-32.hdf5)
• fs (fs = 4096)
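The URI-recon panel pairs each URI template with the concrete files that match it. A minimal sketch of that matching step is given below, assuming a template variable such as {Detector} can stand for any run of characters within a file name. The function name bind_template and the regular-expression approach are assumptions of this sketch and not the YesWorkflow implementation; the templates and file names are the ones shown on the slide.

```python
import re

def bind_template(template, filename):
    """Return {variable: value} if filename matches the URI template, else None."""
    # re.split with a capture group alternates literal text (even indexes)
    # and template variable names (odd indexes).
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+?)"
        for i, part in enumerate(parts)
    )
    match = re.fullmatch(pattern, filename)
    return match.groupdict() if match else None

# Templates and file names as shown on the slide (without the "file:" prefix).
templates = {
    "FN_Detector": "{Detector}_LOSC_4_V1-1126259446-32.hdf5",
    "FN_Sampling_rate": "H-H1_LOSC_{DownSampling}_V1-1126259446-32.hdf5",
}
files = [
    "L-L1_LOSC_4_V1-1126259446-32.hdf5",
    "H-H1_LOSC_4_V1-1126259446-32.hdf5",
    "H-H1_LOSC_16_V1-1126259446-32.hdf5",
]

# For every template, collect the files that match and the variable bindings.
bindings = {}
for name, template in templates.items():
    for filename in files:
        bound = bind_template(template, filename)
        if bound is not None:
            bindings.setdefault(name, {})[filename] = bound

for name, matched in bindings.items():
    print(name, matched)
```

With these inputs, each template matches two of the three files, which agrees with the two file names listed under FN_Detector and under FN_Sampling_rate in the panel.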

Upstream of strain_L1_whitenbp (hybrid YW-NW at the code level)

Upstream of strain_L1_whitenbp (hybrid YW-NW at the file level)

3 inputs spread across 5 (= 2x2+1) files

Does intermediate data strain_L1_whitenbp depend on all 5 inputs?

• Intermediate data strain_L1_whitenbp depends on only 2 out of 5 inputs!
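The upstream question can be sketched as a transitive closure over "depends on" edges. The example below is a minimal illustration and not the YesWorkflow or noWorkflow query code: the step-level and variable-level edge lists are hand-simplified assumptions read off the figure labels, and the function names upstream and inputs_of are hypothetical.

```python
def upstream(deps, target):
    """Return every node that target transitively depends on."""
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in deps.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def inputs_of(deps, target):
    """Upstream nodes that nothing produces, i.e. the workflow inputs."""
    return {node for node in upstream(deps, target) if node not in deps}

# Step-level (prospective) view: ASSUMED simplification of the workflow model.
# Each step maps to (inputs, outputs); every output depends on every input.
steps = {
    "LOAD_DATA": (["FN_Detector", "FN_Sampling_rate"],
                  ["strain_H1", "strain_L1", "strain_16", "strain_4"]),
    "AMPLITUDE_SPECTRAL_DENSITY": (["strain_H1", "strain_L1", "fs"],
                                   ["PSD_H1", "PSD_L1"]),
    "WHITENING": (["strain_H1", "strain_L1", "PSD_H1", "PSD_L1"],
                  ["strain_H1_whiten", "strain_L1_whiten"]),
    "BANDPASSING": (["strain_H1_whiten", "strain_L1_whiten", "fs"],
                    ["strain_H1_whitenbp", "strain_L1_whitenbp"]),
}
coarse = {}
for step_inputs, step_outputs in steps.values():
    for out in step_outputs:
        coarse.setdefault(out, set()).update(step_inputs)

# Variable-level (retrospective) view: ASSUMED simplification of the
# filtered lineage, keeping only the main variables.
fine = {
    "strain_L1": {"fn_d"},
    "psd_L1": {"strain_L1", "fs"},
    "strain_L1_whiten": {"strain_L1", "psd_L1", "fs"},
    "strain_L1_whitenbp": {"strain_L1_whiten", "fs"},
}

print(sorted(inputs_of(coarse, "strain_L1_whitenbp")))
print(sorted(inputs_of(fine, "strain_L1_whitenbp")))
```

Under these assumed edges, the step-level view reports all three declared inputs as upstream, while the variable-level view narrows the answer to the single L1 file name variable and fs. This is the kind of refinement the hybrid query on the slide is after.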


Page 33: From Provenance Standards and Tools to Queries and Actionable Provenance

DwCA Taxon Lookup Workflow

• Declare inputs, outputs, and steps of a script (or workflow) with YW annotations to ...
– communicate provenance graphically (via graphviz)
– combine different forms of provenance
– query provenance
• Simple YW annotations in comments:
– @BEGIN Step, @END Step
– @IN Data, @OUT Data
– @URI Template, @LOG Pattern
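To make the annotation idea concrete, here is a minimal sketch of a script whose comments carry begin/end, in/out, and URI annotations, together with a toy extractor that lists them. The step names, data names, and file names are hypothetical, the lowercase keyword spelling is an assumption of this sketch, and the regular expression is not the YesWorkflow parser.

```python
import re

# A small annotated script, held as text. All names in it are hypothetical.
SCRIPT = '''\
# @begin TaxonLookup
# @in input_records @uri file:occurrences.csv
# @out resolved_records @uri file:resolved.csv

# @begin LookupName
# @in input_records
# @out resolved_records
def lookup(records):
    return [r.strip().title() for r in records]
# @end LookupName

# @end TaxonLookup
'''

# Matches "@keyword argument" pairs for the handful of keywords used above.
ANNOTATION = re.compile(r"@(begin|end|in|out|uri)\s+(\S+)", re.IGNORECASE)

def extract_annotations(source):
    """Return (keyword, argument) pairs found in comment lines, in order."""
    found = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            for keyword, argument in ANNOTATION.findall(stripped):
                found.append((keyword.lower(), argument))
    return found

annotations = extract_annotations(SCRIPT)
for keyword, argument in annotations:
    print(keyword, argument)
```

Because the annotations live in comments, the script itself runs unchanged; only a tool that reads the comments sees the workflow structure.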


Page 34: From Provenance Standards and Tools to Queries and Actionable Provenance

Taxon Lookup Workflow: Data View and Process View


Page 35: From Provenance Standards and Tools to Queries and Actionable Provenance

The story of two individual records


• One took the GBIF route, while …

• … the other went all WORMS!

Page 36: From Provenance Standards and Tools to Queries and Actionable Provenance

The aggregate story ...


• How many records were observed as inputs or outputs of workflow steps?

• Were there any NULL values? How many?
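Both aggregate questions reduce to counting over logged observations. The sketch below assumes, purely for illustration, that each observation is a tuple of (step, direction, record id, value); the step names, record ids, and values are hypothetical and do not come from the actual workflow run.

```python
from collections import Counter

# Hypothetical observations: (step, direction, record_id, value).
# None stands for a NULL value.
observations = [
    ("clean_names", "in", "rec1", "Aus bus"),
    ("clean_names", "out", "rec1", "Aus bus"),
    ("clean_names", "in", "rec2", None),
    ("clean_names", "out", "rec2", None),
    ("lookup_gbif", "in", "rec1", "Aus bus"),
    ("lookup_gbif", "out", "rec1", "Aus bus"),
    ("lookup_worms", "in", "rec2", None),
    ("lookup_worms", "out", "rec2", None),
]

# How many records were observed as inputs or outputs of each step?
records_per_port = Counter(
    (step, direction) for step, direction, _, _ in observations
)
distinct_records = {record_id for _, _, record_id, _ in observations}

# Were there any NULL values, and how many per step?
nulls_per_step = Counter(
    step for step, _, _, value in observations if value is None
)

print(len(distinct_records), "distinct records")
for port, count in sorted(records_per_port.items()):
    print(port, count)
print(sum(nulls_per_step.values()), "NULL values:", dict(nulls_per_step))
```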

Page 37: From Provenance Standards and Tools to Queries and Actionable Provenance

Summary

• YW annotations can be added easily to your scripts to reap workflow benefits
– Documentation of what's important
– Visualization of dependencies
– Querying provenance (prospective, retrospective, and hybrid)

=> make provenance actionable => provenance for self!

=> github.com/yesworkflow-org/yw
=> try.yesworkflow.org


Page 38: From Provenance Standards and Tools to Queries and Actionable Provenance

Demo Time


(Disclaimer)
https://github.com/idaks/dataone-ahm-2016-poster
https://github.com/idaks/wt-prov-summer-2017
https://github.com/yesworkflow-org/yw-idcc-17