


• Supercomputing facilitates and accelerates scientific discovery

• BIG DATA: from terabytes (T) to petabytes (P), exabytes (E), zettabytes (Z), yottabytes (Y), and beyond…

Simulation, Experimental, Observational

• Collaborative research and discovery

A team of scientists with complementary expertise working jointly

• Massively distributed resources

Hardware - Computing facilities, storage systems (e.g., cloud), special rendering engines, display devices (tiled displays, powerwalls, etc.)

Software - Domain-specific analysis/processing tools/programs

Data - Data centers in the cloud

• Data-, computation-, and network-intensive scientific applications pose new challenges

• An increasing trend in many big data scientific applications is to move computing workflow executions to a cloud environment

INTRODUCTION

PROBLEMS AND OBJECTIVES

TECHNICAL SOLUTIONS

PERFORMANCE EVALUATION AND COMPARISON

CONCLUSION AND FUTURE WORK

SELECTED REFERENCES

• Othman, J. Nicod, L. Philippe, and V. Sonigo, “Optimal energy consumption and throughput for workflow applications on distributed architectures,” J. of Sustainable Computing - Informatics and Systems, vol. 4, no. 1, pp. 55–74, 2014.

• G. Aupy, A. Benoit, and Y. Robert, “Energy-aware scheduling under reliability and makespan constraints,” in Proc. of the 19th Int. Conf. on High Performance Computing, December 18-22, 2012.

• Agarwalla, N. Ahmed, D. Hilley, and U. Ramachandran, “Streamline: a scheduling heuristic for streaming application on the grid,” in Proc. of the 13th Multimedia Comp. and Net. Conf., San Jose, CA, 2006.

• Y. Gu, Q. Wu, X. Liu, and D. Yu, “Distributed throughput optimization for large-scale scientific workflows under fault-tolerance constraint,” J. of Grid Computing, vol. 11, no. 3, pp. 361–379, 2013.

• A. K. Gupta, S. Saxena, and S. Khakhil, “A Journey towards Workflow Scheduling of Cloud Computing,” International Journal of Computer Applications, vol. 123, no. 4, Aug 2015.

• E. Deelman, G. Singh, M. Livny, B. Berriman, and J. Good, “The cost of doing science on the cloud: The montage example,” in Proc. of the Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC 2008), NJ, 2008.

ACKNOWLEDGEMENT

Most of the implementations were done by my Ph.D. students, Huda Khalid Al Rammah and Hugh Matlock.

This research is sponsored by the National Science Foundation under Grant No. 1525537 with Middle Tennessee State University.

• A three-layer architecture for performance modeling and optimization of big data scientific workflows in distributed environments

Resource and Performance Optimization of Big Data Scientific Workflows in Distributed Network Environments

Yi Gu, Dept. of Computer Science, Middle Tennessee State University

• Problem 1

Given a cloud environment modeled as a faulty heterogeneous computer network $G_{cn}(V_{cn}, E_{cn})$, where each node or link is associated with an independent failure rate, and a bound $F$ for the overall failure rate (OFR);

Each computer node is DVFS enabled;

A user submits a request of a streaming application modeled as a DAG-structured computing workflow $G_w(V_w, E_w)$, and a bound $FR$ on the frame rate (FR) or throughput.

Objective: To find a workflow mapping schedule such that the total energy consumption (TEC) is minimized and the overall performance satisfies the $F$ and $FR$ constraints.

• Problem 2

Given a computing workflow, a cloud model with different types of VMs, and a cost objective function. Each VM type is characterized by:

o processing power (instructions/second)

o storage access bandwidth (bytes/second)

o charge rate ($/second)

Objective: To assign workflow tasks to VM instances such that the total execution time and total cost are minimized.
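
As a concrete illustration of how the three VM parameters above translate into per-task estimates, here is a minimal Python sketch; the VM catalog, task sizes, and field names are hypothetical placeholders, not the actual cloud model used in this work.

```python
# Hypothetical illustration of the Problem 2 VM model: each VM type is
# described by processing power (instructions/s), storage access
# bandwidth (bytes/s), and charge rate ($/s).

vm_types = {
    "small": {"power": 2.0e9, "bandwidth": 50e6,  "rate": 0.0001},
    "large": {"power": 8.0e9, "bandwidth": 200e6, "rate": 0.0005},
}

def estimate_task(vm, instructions, io_bytes):
    """Estimate run time, I/O time, and VM charge for one task on one VM type."""
    runtime = instructions / vm["power"]    # seconds of computation
    iotime = io_bytes / vm["bandwidth"]     # seconds of storage access
    total_time = runtime + iotime
    charge = total_time * vm["rate"]        # $ billed for the VM time
    return total_time, charge

# Example: a task with 1e12 instructions and 4 GB of input/output data.
for name, vm in vm_types.items():
    t, c = estimate_task(vm, instructions=1e12, io_bytes=4e9)
    print(f"{name}: {t:.1f} s, ${c:.4f}")
```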

• Problem 1

Module execution time

Data transfer time

Reliability and overall failure rate (OFR)

Power consumption/cost model (DVFS)

o Three power modes

o EC on a server

o TEC

Execution time, CPU frequency and reliability

o Execution time at frequency f

o Execution time at frequency f under failure rate

o Transfer time under failure rate
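
The interplay between CPU frequency, execution time, reliability, and energy listed above can be sketched in a few lines of Python. The constants, failure rates, and the CPU-boundedness parameter below are made-up placeholders; the formulas follow the DVFS and exponential-reliability relations collected later in this transcript.

```python
import math

# Illustrative sketch (made-up numbers) of the Problem 1 model:
# execution time scales with CPU frequency, reliability follows an
# exponential failure model, and active power grows with f^3.

def exec_time(base_time, f, f_max, beta_cpu=0.8):
    """Execution time at frequency f, given the time at f_max.
    beta_cpu is the assumed CPU-bound fraction of the module's work."""
    return base_time * (beta_cpu * (f_max / f - 1.0) + 1.0)

def reliability(failure_rate, t):
    """R(t) = exp(-lambda * t) for a node or link with the given failure rate."""
    return math.exp(-failure_rate * t)

def reliable_exec_time(base_time, f, f_max, failure_rate):
    """Execution time adjusted by the reliability of the hosting node (RT = T / R(t))."""
    t = exec_time(base_time, f, f_max)
    return t / reliability(failure_rate, t)

def active_energy(power_at_fmax, f, f_max, runtime):
    """Energy in active mode, with power assumed proportional to f^3."""
    power = power_at_fmax * (f / f_max) ** 3
    return power * runtime

# Example: a module that takes 10 s at 2.6 GHz, run at 2.2 GHz instead.
t = reliable_exec_time(10.0, f=2.2e9, f_max=2.6e9, failure_rate=1e-4)
e = active_energy(power_at_fmax=95.0, f=2.2e9, f_max=2.6e9, runtime=t)
print(f"time = {t:.2f} s, energy = {e:.1f} J")
```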

[Figure: a DAG-structured computing workflow (modules $w_0, w_1, \ldots, w_{m-1}$ arranged in layers $0$ through $l-1$) mapped onto a heterogeneous computer network (nodes $v_s, v_1, v_2, \ldots, v_d$).]

• Problem 2

Run time of each task (seconds)

I/O time of each task (seconds)

Makespan Cost ($)

The total cost ($)
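
A minimal sketch of how the makespan cost and total cost quantities above combine, assuming per-task finish times and VM usage have already been computed by a scheduler; all names and numbers are illustrative only.

```python
# Illustrative aggregation of the Problem 2 cost terms (made-up inputs):
# total cost = makespan cost + sum over VM instances of (charge rate x VM time).

cost_of_time = 0.01                          # $ per second of overall makespan (assumed)
task_finish_times = [120.0, 300.0, 450.0]    # seconds, from the schedule

# (charge rate $/s, billed VM time in seconds) for each VM instance used
vm_usage = [(0.0001, 500.0), (0.0005, 450.0)]

makespan = max(task_finish_times)
makespan_cost = makespan * cost_of_time
vm_cost = sum(rate * vmtime for rate, vmtime in vm_usage)

total_cost = makespan_cost + vm_cost
print(f"makespan = {makespan} s, total cost = ${total_cost:.4f}")
```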

• Problem 1

[Figure: Energy and FR with the optimal frequency f = 2.2 GHz]

• Problem 2

[Figures: Breeze logo; EC comparison of four algorithms; LEES vs. Othman]

• Tackled a tri-objective optimization problem

• Developed a Breeze simulator to illustrate the workflow execution using different scheduling algorithms

• To conduct more experiments using real-life large-scale application workflows in a real cloud environment

• To define a more realistic cloud model and pricing structure:

multi-core VMs

modeling local storage performance

allotting some time to provision a VM

Breeze framework

o AutoCloud Java library

decides a virtual machine instance type to run each task (a sketch of this kind of decision follows the list below)

o Breeze Simulator utilizes two libraries

WorkflowSim

CloudSim

o Breeze Visualizer

runs in the user’s web browser

depicts the workflow structure statically and dynamically
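
The AutoCloud decision described above, choosing an instance type for each task, can be illustrated with a short sketch. The interface below is hypothetical and does not reflect the actual AutoCloud Java API; it only shows one plausible selection rule: pick the cheapest VM type whose estimated run time meets a per-task deadline.

```python
# Hypothetical sketch of an AutoCloud-style decision rule (not the real API):
# for each task, pick the cheapest VM type that can finish within the deadline.

vm_catalog = {                # processing power (instr/s), charge rate ($/s)
    "small": (2.0e9, 0.0001),
    "medium": (4.0e9, 0.0002),
    "large": (8.0e9, 0.0005),
}

def choose_instance(instructions, deadline_s):
    """Return the cheapest instance type meeting the deadline, or None."""
    best = None
    for name, (power, rate) in vm_catalog.items():
        runtime = instructions / power
        if runtime > deadline_s:
            continue
        cost = runtime * rate
        if best is None or cost < best[1]:
            best = (name, cost)
    return best[0] if best else None

print(choose_instance(instructions=1e12, deadline_s=300.0))  # -> "medium"
```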

• Problem 1 formulas

Mapping objective (minimize energy subject to the throughput and failure-rate bounds):

$$\min_{\text{all possible mappings},\, f} EC, \quad \text{where } \frac{1}{BT} \geq FR \ \text{ and } \ F_{OFR} \leq F$$

Module execution time on node $v_k$:

$$T_{exec}(m_i, v_k) = \frac{C_{m_i}\!\left(\sum_{m_a \in pre(m_i)} z_{a,i}\right)}{p_{v_k}}$$

Data transfer time of edge $e_{ij}$ over link $l_{hk}$:

$$T_{trans}(z_{ij}, e_{ij}, l_{hk}) = \frac{z_{ij}}{b_{l_{hk}}} + d_{l_{hk}}$$

Execution time at CPU frequency $f$ (DVFS), given the time at the maximum frequency $f_{h,max}$:

$$T_{exec}(m_i, v_h, f) = T_{exec}(m_i, v_h, f_{h,max}) \cdot \left(\beta_{cpu}\left(\frac{f_{h,max}}{f} - 1\right) + 1\right)$$

Execution and transfer times under failure rate:

$$RT_{exec}(m_i, v_h, f) = \frac{T_{exec}(m_i, v_h, f)}{R_{v_h}(t)}, \qquad RT_{trans}(e_{ij}, l_{hk}) = \frac{T_{trans}(e_{ij}, l_{hk})}{R_{l_{hk}}(t)}$$

Power consumption of node $j$ (three power modes, with active power growing as $f_j^3$):

$$P_j(t) = \begin{cases} \alpha_j f_j^3 & \text{active mode} \\ P_j^{idle} & \text{idle mode} \\ P_j^{sleep} & \text{sleep mode} \end{cases}$$

Energy consumption on a server and total energy consumption:

$$E_{v_h,Active} = P_{v_h,Active} \sum_{i=1}^{k} RT_{exec}(m_i, v_h), \qquad TEC = \sum_{h=0}^{N-1} E_{v_h,Active}$$

Reliability and overall failure rate, where $\lambda_b$ is the failure rate of node or link $b$ and the product runs over the nodes and links used by the mapping:

$$R_b(t) = e^{-\lambda_b t}, \qquad R_{sd} = \prod_{b} R_b(t), \qquad F_{OFR} = 1 - R_{sd}$$

• Problem 2 formulas

Run time and I/O time of task $i$ on its assigned VM instance type $I(M(i))$:

$$runtime_i = \frac{c_i}{p_{I(M(i))}}, \qquad iotime_i = \frac{\sum_k f_{k,i} + \sum_j f_{i,j}}{b_{I(M(i))}}$$

Makespan cost and total cost:

$$makespanCost = makespan \cdot costOfTime, \qquad cost = makespanCost + \sum_{n \in V} r_{I(n)} \cdot vmtime_n$$
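
For example, the OFR relation above can be checked numerically with a short Python snippet; the failure rates and duration are arbitrary placeholders.

```python
import math

# Placeholder failure rates (per second) for the nodes and links used by a mapping.
failure_rates = [1e-5, 2e-5, 5e-6, 1e-5]
t = 600.0  # seconds the mapping is active

# R_b(t) = exp(-lambda_b * t), R_sd = product of component reliabilities,
# and the overall failure rate is F_OFR = 1 - R_sd.
reliabilities = [math.exp(-lam * t) for lam in failure_rates]
r_sd = math.prod(reliabilities)
f_ofr = 1.0 - r_sd
print(f"R_sd = {r_sd:.4f}, F_OFR = {f_ofr:.4f}")
```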