Third Provenance Challenge
University of Texas at El Paso Team’s Presentation
Team: Paulo Pinheiro da Silva, Nicholas Del Rio, Leonardo Salayandia
Presenter: James Michaelis (RPI)
http://trust.utep.edu
Overview
UTEP Approach: Process and Provenance Separation
Process: Workflow-Driven Ontologies (WDO) and Semantic Abstract Workflows (SAW)
◦ PC3 WDO and SAWs
Provenance: Proof Markup Language (PML)
◦ PC3 PML
◦ Capturing PC3 PML
Answering PC3 Questions
Conclusions
UTEP Approach
Unlike OPM, which treats process and provenance knowledge together, UTEP uses Inference Web technology, which explicitly separates process knowledge from provenance knowledge
◦ Inference Web work on provenance was originally developed in the context of theorem provers rather than scientific workflows
◦ Inference Web has been expanded to include support for scientific workflows
◦ The separation between process and provenance has been preserved (and is considered beneficial, since many provenance scenarios come without process knowledge)
Process knowledge: Workflow-Driven Ontology (WDO) and Semantic Abstract Workflow (SAW)
Provenance knowledge: Proof Markup Language (PML)
WDOs and SAWs
WDOs are OWL-based ontologies used to represent process-related concepts, which are classified as either Data or Methods
WDO concepts can be created or reused from other domain ontologies as needed during the specification of processes
SAWs are built using instances of the WDO concepts connected through isInputTo and isOutputOf relations (and their inverses)
WDO-It! is a graphic editor for WDOs and SAWs
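The SAW structure described above can be sketched in a few lines of Python. This is only an illustration, not WDO-It! output: the `SAW` class and the instance names (`DerbyDB`, `CompactDB`, `CompactedDerbyDB`) are hypothetical, chosen to resemble the PC3 database-compaction step.

```python
from collections import defaultdict


class SAW:
    """Toy Semantic Abstract Workflow: Data and Method instances plus relations."""

    def __init__(self):
        self.is_input_to = defaultdict(set)   # Data -> Methods (isInputTo)
        self.is_output_of = defaultdict(set)  # Data -> Methods (isOutputOf)

    def add_input(self, data, method):
        """Assert: data isInputTo method."""
        self.is_input_to[data].add(method)

    def add_output(self, data, method):
        """Assert: data isOutputOf method."""
        self.is_output_of[data].add(method)

    def inputs_of(self, method):
        """Inverse of isInputTo: all Data consumed by the method."""
        return {d for d, ms in self.is_input_to.items() if method in ms}

    def outputs_of(self, method):
        """Inverse of isOutputOf: all Data produced by the method."""
        return {d for d, ms in self.is_output_of.items() if method in ms}


# Hypothetical instances resembling one PC3 step.
saw = SAW()
saw.add_input("DerbyDB", "CompactDB")
saw.add_output("CompactedDerbyDB", "CompactDB")
```

Storing only the forward relations and computing the inverses on demand keeps the two directions from drifting apart.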
PC3 Semantic Abstract Workflow
[Figure: SAW diagram for PC3. Legend: WDO Data instances; WDO Method instances; PML-P Agent instances (data comes from or goes to a PML-P Agent); Data isOutputOf Method; Data isInputTo Method; abstraction at multiple levels of detail]
Proof Markup Language (PML)
PML is an OWL-based ontology composed of three modules:
◦ PML-J (justifications): used to build information manipulation traces (or justifications) for a given response (or result)
◦ PML-P (provenance): used to annotate PML-J documents with metadata about sources, methods (called inference rules), and agents
◦ PML-T (trust): used to annotate PML-J documents with trust and belief metadata about agents and conclusions
PC3 PML Encoding
<rdf:RDF>
  <NodeSet rdf:about="http://iw.utep.edu/pml/compactedDerbyDB_.owl#answer">
    <hasConclusion>
      <pmlp:Information>
        <pmlp:hasURL rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://iw.cs.utep.edu/pc3/databases/J062941_LoadDB_022030949845896586</pmlp:hasURL>
        <pmlp:hasFormat rdf:resource="http://iw.utep.edu/registry/FMT/derbyDB.owl#derbyDB"/>
      </pmlp:Information>
    </hasConclusion>
    <isConsequentOf>
      <InferenceStep>
        <hasInferenceEngine rdf:resource="http://iw.utep.edu/registry/IE/PC3-PSLoadExecutable.owl#PC3"/>
        <hasInferenceRule rdf:resource="http://iw.utep.edu/registry/RUL/compactDB.owl#compactDB"/>
        <hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</hasIndex>
        <hasAntecedentList>
          <NodeSetList>
            <ds:first rdf:resource="http://iw.utep.edu/pml/derbyDB_3.owl#answer"/>
          </NodeSetList>
        </hasAntecedentList>
      </InferenceStep>
    </isConsequentOf>
  </NodeSet>
</rdf:RDF>
OPM concepts labeled on the slide alongside this encoding: OPM:Artifact, OPM:Process, OPM:WasGeneratedBy, OPM:WasControlledBy
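As a rough illustration of how such a node set can be read programmatically, the sketch below parses a trimmed copy of the encoding with Python's standard `xml.etree.ElementTree`. The slide omits the `xmlns` declarations, so the PML-J, PML-P, and `ds` namespace URIs used here are assumptions, and `summarize_node_set` is a hypothetical helper, not part of any Inference Web tool.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace URIs: the slide does not show the xmlns declarations.
PMLJ = "http://inference-web.org/2.0/pml-justification.owl#"
PMLP = "http://inference-web.org/2.0/pml-provenance.owl#"
DS = "http://inference-web.org/2.0/ds.owl#"

# Trimmed copy of the PC3 node set, with the assumed namespaces declared.
PML_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="{PMLJ}" xmlns:pmlp="{PMLP}" xmlns:ds="{DS}">
  <NodeSet rdf:about="http://iw.utep.edu/pml/compactedDerbyDB_.owl#answer">
    <hasConclusion>
      <pmlp:Information>
        <pmlp:hasURL>http://iw.cs.utep.edu/pc3/databases/J062941_LoadDB_022030949845896586</pmlp:hasURL>
      </pmlp:Information>
    </hasConclusion>
    <isConsequentOf>
      <InferenceStep>
        <hasInferenceEngine rdf:resource="http://iw.utep.edu/registry/IE/PC3-PSLoadExecutable.owl#PC3"/>
        <hasInferenceRule rdf:resource="http://iw.utep.edu/registry/RUL/compactDB.owl#compactDB"/>
        <hasAntecedentList>
          <NodeSetList>
            <ds:first rdf:resource="http://iw.utep.edu/pml/derbyDB_3.owl#answer"/>
          </NodeSetList>
        </hasAntecedentList>
      </InferenceStep>
    </isConsequentOf>
  </NodeSet>
</rdf:RDF>"""


def summarize_node_set(xml_text):
    """Extract the conclusion, engine, rule, and antecedents of the first NodeSet."""
    root = ET.fromstring(xml_text)
    node = root.find(f"{{{PMLJ}}}NodeSet")
    step = node.find(f".//{{{PMLJ}}}InferenceStep")
    resource = f"{{{RDF}}}resource"
    return {
        "node_set": node.get(f"{{{RDF}}}about"),
        "conclusion_url": node.findtext(f".//{{{PMLP}}}hasURL").strip(),
        "engine": step.find(f"{{{PMLJ}}}hasInferenceEngine").get(resource),
        "rule": step.find(f"{{{PMLJ}}}hasInferenceRule").get(resource),
        "antecedents": [e.get(resource) for e in step.iter(f"{{{DS}}}first")],
    }


info = summarize_node_set(PML_DOC)
```

The conclusion plays the role of the produced data, the inference rule names the method applied, and the antecedent list points at the node sets it was derived from.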
PML Capture
From a given SAW, WDO-It! has two options to generate code capable of capturing provenance:
◦ Generate PML wrappers, used for run-time capture of provenance
◦ Generate PML data annotators, used for post-execution generation of provenance
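The idea behind run-time capture can be sketched as a Python decorator that records a provenance entry each time a wrapped method runs. This is not the code WDO-It! generates: `pml_wrapper`, `PROVENANCE_LOG`, and `compact_db` are hypothetical names, and a plain dictionary stands in for a real PML node set.

```python
import functools

# In-memory stand-in for PML node sets that a real wrapper would serialize.
PROVENANCE_LOG = []


def pml_wrapper(rule, engine):
    """Wrap a method so that each call records what it produced and from what."""
    def decorate(func):
        @functools.wraps(func)
        def wrapped(*inputs):
            result = func(*inputs)
            PROVENANCE_LOG.append({
                "conclusion": result,           # data produced by this step
                "inference_rule": rule,         # method that was applied
                "inference_engine": engine,     # system that applied it
                "antecedents": list(inputs),    # data the step consumed
            })
            return result
        return wrapped
    return decorate


@pml_wrapper(rule="compactDB", engine="PC3")
def compact_db(db_name):
    """Hypothetical workflow step; only returns a new name for illustration."""
    return db_name + "_compacted"


compacted = compact_db("derbyDB_3")
```

A data annotator would build the same kind of record after execution, from the inputs and outputs left behind, rather than at call time.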
Answering PC3 Questions: What processing steps were used?
SPARQL can be used to query the PML provenance graph.
This example shows how a SPARQL query could use the PML graph to answer what processing steps were used to generate some artifact.
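The traversal that such a query performs can be illustrated in plain Python, without a SPARQL engine: start at the node set for the artifact, collect its inference rule, and follow the antecedents backward. The graph below is a hypothetical, simplified stand-in for the PC3 PML graph; the node names and the `loadCSVFileIntoTable` rule are assumptions made for the example.

```python
# node set -> (inference rule, antecedent node sets); hypothetical PC3-like chain
PROVENANCE = {
    "compactedDerbyDB": ("compactDB", ["derbyDB_3"]),
    "derbyDB_3": ("loadCSVFileIntoTable", ["csvFile"]),
    "csvFile": (None, []),  # asserted input: no inference step behind it
}


def processing_steps(node_set, graph):
    """Return the rules used to derive node_set, nearest step first."""
    steps = []
    seen = set()
    stack = [node_set]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a node set may be an antecedent of several steps
        seen.add(current)
        rule, antecedents = graph[current]
        if rule is not None:
            steps.append(rule)
        stack.extend(antecedents)
    return steps


steps = processing_steps("compactedDerbyDB", PROVENANCE)
```

The `seen` set matters once a graph has shared antecedents, since the same node set would otherwise contribute its rule more than once.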
Conclusion
The full encoding of the WDO, SAWs, and PML for PC3 was done in 36 hours
UTEP’s approach relies on tools to:
◦ Understand and speed up the encoding of process knowledge (as WDOs and SAWs)
◦ Use process knowledge to create PML wrappers and/or PML data annotators
◦ Visualize and browse provenance
◦ Use provenance for explanations, trust computation, data discovery, etc.
Acknowledgements
UTEP would like to thank James Michaelis for his effort to understand our work and represent our team at the 3rd Provenance Challenge
UTEP would like to thank the 3rd Provenance Challenge organizers and Paul Groth in particular for creating an opportunity for our team to be represented at the event