Network resource selection for data transfer processes in scientific workflows

Network resource selection for data transfer processes in scientific workflows
Zhiming Zhao, Paola Grosso, Ralph Koning, Jeroen van der Ham, Cees de Laat
System and Network Engineering (SNE), University of Amsterdam (UvA)
Z. Zhao et al., Network resource selection for data transfer processes in scientific workflows, WORKS10, New Orleans, 2010.


Post on 23-Feb-2016



TRANSCRIPT

Slide 1

Network resource selection for data transfer processes in scientific workflows

Zhiming Zhao, Paola Grosso, Ralph Koning, Jeroen van der Ham, Cees de Laat

System and Network Engineering (SNE), University of Amsterdam (UvA)

Z. Zhao et al., Network resource selection for data transfer processes in scientific workflows, WORKS10, New Orleans, 2010.

Outline
- Background: e-Science, scientific workflows and advanced network infrastructure
- Research problem: including network QoS in scientific workflows
- NEWQoSPlanner: an agent based solution
- A use case: quality guaranteed video delivery on demand
- Discussion
- Conclusions and future work

Background: e-Science and scientific workflow
E-Science applications are characterized by
- Massive data (acquiring and storing)
- Intensive computing (simulation, visualization and data processing)
- Large scale collaboration (among processes, resources and domain scientists)
A workflow management system
- Automates the execution of experiment processes
- Controls the flow (data and control) between processes
- Allows scientists to focus on experiments at different levels of abstraction
- Hides the low level technical details from scientists
- Has been recognized as a core e-Science service

Workflow execution: mapping between resources

[Figure: layers of workflow execution (abstract processes, concrete workflow, storage and computing elements, network), with example processes: data acquisition, processing, visualization, storing results]

Quality tuning in scientific workflow
- Abstract processes: refine application logic
- Concrete workflow: select optimal services and components
- Storage, computing elements: select high performance resources
- Network: network path selection
[Figure: the same layered workflow (data acquisition, processing, visualization, storing results), with labels "In traditional loop" and "New loop"]

Why include advanced networks in the loop?
- Data movement causes a performance bottleneck for workflows
  - Scientific workflows are often data intensive
  - Quality control at the high level is not sufficient
- Existing workflow systems do not take network services into account
- Existing network infrastructure provides limited flexibility for application level control
- Advanced networks, e.g., multi-layer and programmable networks, offer high level applications new opportunities:
  - Path selection
  - Provisioning
  - Allocation

Related work: QoS in the workflow lifecycle
- QoS in workflow description
  - QoS taxonomy [Sabata, 97], QoS ontology [Gramm, 03], QML [Frolund, 98], Vienna composition language (VCL) [Rosenberg, 09]
- Resource broker
  - Budget based scheduling, Nimrod-G, GRACE [Buyya, 02]
  - Constraints between quality parameters (such as execution time, reliability etc.) and economic cost
- Service selection
  - Composition: requirement specification [Jia 05], service selection [Zeng 04], [Brandic 05]
  - Enactment and scheduling [Yash, 06], planning, and resource reservation [Benkner, 04]
- Network control in workflow
  - VLAM and interactive network [Belloum et al., 09]
- QoS constraint solving
  - Shortest path finding algorithm
  - Multi objective optimization problem: ant colony optimization (ACO)

What did we observe?
- Most workflow systems do not include network quality parameters in workflow scheduling and execution control.
- The work on VLAM and interactive network integrates the workflow engine with a special network using a customized solution, which does not promote reuse of the solution.
- We need a new solution!

Research context and approach
- CineGrid project
  - Main mission: dedicated network, share large quantities of very high quality media material
  - What has been developed: semantic description of the resources
    - Network description language (NDL)
    - CineGrid description language (CDL)
- Approach
  - Propose an independent service, which can be plugged into existing workflow systems to provide network QoS features

Network for Workflow QoS planner (NEWQoSPlanner)
- A planner for optimizing data movement related workflow processes
  - Select network resources
  - Make provisioning plans
  - Generate network QoS aware sub workflow
[Figure: workflow (data acquisition, processing, visualization, storing results) with the NEWQoSPlanner attached to the data movement step]

NEtwork aware Workflow QoS Planner (NEWQoSPlanner)
[Figure: multi agent system for QoS aware workflow management. Components: Resource Discovery Agent (RDA), QoS aware Workflow Planner (QoSWP), Resource Provision Planner (RPP), Workflow Composer Agent (WCA), QoS Monitoring Agent (QMA), Provenance Service Agent (PSA), workflow engine, resources. Labelled flows: user request, network resource descriptions, data delivery workflow requirements, resource candidates, selected candidate, provisioning plan, media delivery workflow. The slide is shown in successive builds that add numbered steps 1 to 7.]

Implementation issues
- QoS requirements
- Resource selection
- Workflow composition
- Resource monitoring
- Adaptable network resource planning

Network and CineGrid description language
- CineGrid resource Description Language
  - Content: video/audio/data
  - Services: storage, visualization, streaming etc.
  - Devices: host, screen, projector, etc.
- Network Description Language
  - Interface
  - Devices
  - Connection points
- Ontologies are integrated via properties
  - owl:equivalentClass
  - owl:equivalentProperty
  - owl:sameAs
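As an illustration of how terms from two ontologies can be linked through owl:equivalentClass, owl:equivalentProperty and owl:sameAs, the following minimal Python sketch groups terms that are declared equivalent. It is not the prototype's code: the triples, the prefixed names (`cdl:Host`, `ndl:Device`, and so on) and the function name are hypothetical, and treating all three properties as one symmetric, transitive relation is a simplification of OWL semantics.

```python
from collections import defaultdict

# Properties that the slides name as the integration points between ontologies.
EQUIVALENCE_PREDICATES = {
    "owl:equivalentClass",
    "owl:equivalentProperty",
    "owl:sameAs",
}


def equivalence_groups(triples):
    """Group terms linked (directly or transitively) by an equivalence predicate.

    `triples` is an iterable of (subject, predicate, object) strings.
    Uses a small union-find structure; returns a list of sets of terms.
    """
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for subject, predicate, obj in triples:
        if predicate in EQUIVALENCE_PREDICATES:
            root_s, root_o = find(subject), find(obj)
            if root_s != root_o:
                parent[root_o] = root_s

    groups = defaultdict(set)
    for term in list(parent):
        groups[find(term)].add(term)
    return list(groups.values())


# Hypothetical example triples, not taken from the actual CDL or NDL files.
example_triples = [
    ("cdl:Host", "owl:equivalentClass", "ndl:Device"),
    ("cdl:hasInterface", "owl:equivalentProperty", "ndl:hasInterface"),
    ("cdl:node42", "owl:sameAs", "ndl:node42"),
    ("cdl:node42", "rdf:type", "cdl:Host"),  # ignored: not an equivalence
]
groups = equivalence_groups(example_triples)
```

A query layer could use such groups to accept a term from either vocabulary when matching resource descriptions.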

QoS abstract workflow process description schema
- Data related process
- Pre/Execution/Post condition
- QoS (attributes)
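To make the schema above concrete, here is a minimal Python sketch of a data related process with pre, execution and post conditions. The class names, field names and example values are assumptions made for illustration; the actual schema is an OWL ontology and is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Condition:
    """A pre or post condition: the data involved plus quality-of-data attributes."""
    data: List[str]
    qod: Dict[str, str] = field(default_factory=dict)


@dataclass
class ProcessDescription:
    """Hypothetical in-memory form of a data related workflow process."""
    name: str
    pre_condition: Optional[Condition] = None
    execution_qos: Dict[str, float] = field(default_factory=dict)
    post_condition: Optional[Condition] = None

    def is_data_transfer(self) -> bool:
        # In this sketch a process counts as a transfer when both the data it
        # requires and the data it produces are described.
        return self.pre_condition is not None and self.post_condition is not None


# Illustrative request; the attribute names and values are made up.
request = ProcessDescription(
    name="deliver_movie",
    pre_condition=Condition(data=["Video"], qod={"Resolution": "4K"}),
    execution_qos={"Throughput": 500.0},
    post_condition=Condition(data=["Video"], qod={"Resolution": "4K"}),
)
```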

Ontology mapping

Resource selection
- From resource descriptions and requirements, derive a set of candidates (data sources, destinations and network paths)
  - Data sources are derived from the pre conditions of the process
  - Data destinations are derived from the process and post condition
  - Network paths: paths between source and destination
- Ranking: order the candidates based on quality

Searching procedure
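The candidate derivation and ranking described above might be sketched as follows. This is a simplified illustration, not the prototype's implementation: the function name, the dictionary layout, the host names and the ranking criterion (bandwidth first, then hop count) are all assumptions, since the slides do not specify how quality is computed.

```python
def select_candidates(sources, destinations, paths, min_bandwidth_mbps):
    """Build (source, destination, path) candidates and rank them.

    sources / destinations: lists of host names.
    paths: dict mapping (source, destination) to a list of dicts with
           "hops" (list of node names) and "bandwidth_mbps" (number).
    Candidates below the bandwidth requirement are dropped; the rest are
    ordered by bandwidth (descending), then by hop count (ascending).
    """
    candidates = []
    for src in sources:
        for dst in destinations:
            if src == dst:
                continue
            for path in paths.get((src, dst), []):
                if path["bandwidth_mbps"] >= min_bandwidth_mbps:
                    candidates.append({
                        "source": src,
                        "destination": dst,
                        "hops": path["hops"],
                        "bandwidth_mbps": path["bandwidth_mbps"],
                    })
    candidates.sort(key=lambda c: (-c["bandwidth_mbps"], len(c["hops"])))
    return candidates


# Hypothetical resources for a media delivery request.
example_paths = {
    ("storage-a", "display-1"): [
        {"hops": ["storage-a", "sw1", "display-1"], "bandwidth_mbps": 1000},
        {"hops": ["storage-a", "sw1", "sw2", "display-1"], "bandwidth_mbps": 10000},
    ],
    ("storage-b", "display-1"): [
        {"hops": ["storage-b", "sw2", "display-1"], "bandwidth_mbps": 400},
    ],
}
ranked = select_candidates(
    ["storage-a", "storage-b"], ["display-1"], example_paths, min_bandwidth_mbps=500
)
```

With these made-up inputs, the path from `storage-b` is filtered out because it is below the requested bandwidth, and the two `storage-a` paths are ordered with the higher bandwidth one first.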

Current prototype
- SWI-Prolog / Semantic web library
  - RDF triples manipulations
  - Graph finding algorithm -> network path
  - Solving constraints
- Java Prolog interface (JPL)
  - Manipulate Prolog functions via Java
- Java Agent Development Framework
  - Agent communication language (ACL) between agents
  - XML-RPC: between agent and web portal

Use case: QoS guaranteed media delivery on demand
- Media delivery on demand
  - Search movie
  - Propose network path
  - Playback the movie
- Portal + search engine (RDA)
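The prototype derives network paths with a graph finding algorithm over RDF triples in SWI-Prolog. As a rough stand-in in Python, the sketch below enumerates loop-free paths between two nodes from a list of triples. The predicate name `connectedTo`, the node names, the hop limit and the depth-first strategy are assumptions for illustration and are not taken from the prototype.

```python
def find_paths(triples, source, target, max_hops=6):
    """Enumerate loop-free paths from source to target, shortest first.

    triples: iterable of (subject, predicate, object) strings; only triples
             whose predicate is "connectedTo" (a hypothetical name) are used,
             and each link is treated as bidirectional.
    max_hops: upper bound on the number of links in a path.
    """
    graph = {}
    for subject, predicate, obj in triples:
        if predicate == "connectedTo":
            graph.setdefault(subject, set()).add(obj)
            graph.setdefault(obj, set()).add(subject)

    paths = []

    def walk(node, visited):
        if node == target:
            paths.append(list(visited))
            return
        if len(visited) - 1 >= max_hops:
            return  # hop budget used up
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in visited:
                visited.append(neighbour)
                walk(neighbour, visited)
                visited.pop()

    walk(source, [source])
    return sorted(paths, key=len)


# Hypothetical topology with two routes between A and C.
topology = [
    ("A", "connectedTo", "B"),
    ("B", "connectedTo", "C"),
    ("A", "connectedTo", "C"),
    ("A", "rdf:type", "Device"),  # ignored: not a link
]
routes = find_paths(topology, "A", "C")
```

Exhaustive enumeration like this is only practical for small topologies; a constraint solver or a shortest path algorithm would be the usual choice for larger ones.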

Query time and triples
The figure shows the time cost of a query as the number of triples loaded in the search engine increases. It was measured while all previous queries were kept in memory, so the result indicates the cost when concurrent queries are made. In the actual situation, the server cleans up the history of a query after it expires. A query usually contains 20 to 30 triples.

Query time cost
The figure shows the time costs for some typical queries. The cost of a query depends on the number of constraints and the quantity of available meta information about the resource.

Discussion
- The QoSAWF can describe most of the cases we need in the use case.
- Quality evaluation of the candidate
  - How precise are the descriptions?
  - The monitoring of the actual state of the network
  - Static analysis

Conclusions
- Network quality tuning is crucial for improving the performance of data movement processes in scientific workflows.
- Using semantic web technology, the QoSAWF ontology provides a lightweight solution for describing QoS requirements of data operation related workflow processes.
- The network resource discovery agent provides the necessary service for tuning data transfer processes from the application level.

Future work
- Semantic search of movie data
- From single process searching to multiple processes
- Automatic composition of provisioning plan and workflow

References
- QoSAWF: http://cinegrid.uvalight.nl/owl/qosawf.owl
- CDL: http://cinegrid.uvalight.nl/owl/cdl/2.0
- NDL domain: http://cinegrid.uvalight.nl/owl/ndl-domain.owl
- NDL topology: http://cinegrid.uvalight.nl/owl/ndl-topology.owl
- Portal: http://cinegrid.uvalight.nl/
- Booth at SC10: Dutch research, #4049

[Figure: QoSAWF ontology. Node labels: QosAWF_Thing, Reliability, CodecQuality, Resolution, Precision, Throughput, Security_Level, Timeliness, ArchiveData, PlaybackData, Quality_Attribute, RequestData, Condition, Process, CompressionRate, Framerate, Or_Condition, And_Condition, Media, Scientific_Data, Video, Audio, Sensor_Data, Simulation_Results, MoveData. Nodes are linked by Is-a relations. Properties: require_Functionality, contain_Condition, create_Data, require_Data, require_Quality, execution_Condition, post_Condition, pre_Condition.]

[Figure: searching procedure flowchart]
- Parse requirement of the Process:
  - pre_condition {({Data}, {QoD})}
  - execution_condition {(QoS)}
  - post_condition {({Data}, {QoD})}
- If pre_condition is not empty: select Cpre = ({Data}, {QoD}) from pre_condition, with {Rdata}: (D1 D2 Dn) and {Rqod}: (Qa1 Qa2 Qan)
- Search data sources: {Dsource} which meets Cpre
- If execution_condition is not empty: select Cexe = ({QoS}) from execution_condition, with {RQoS}: (Qs1 Qs2 Qsn)
- If post_condition is not empty: select Cpost = ({Data}, {QoD}) from post_condition, with {Rdata}: (D1 D2 Dn) and {Rqod}: (Qa1 Qa2 Qan)
- Search data destination: {Ddestination}, which provides services with Cexe and produces Cpost
- Search network paths for {Hsource}, {Hdestination}: {(Dsource, Ddestination, {Path(Dsource, Ddestination)})}
- Compute candidate quality rank and validate quality
- Candidates = {(Dsource, Ddestination, {Path(Dsource, Ddestination), Quality})}