Data Parallel Application Development and Performance with Windows Azure
Advisor: Professor Gagan Agrawal
Presented by: Yu Zhang


Post on 24-Feb-2016


Page 1: Data Parallel Application Development and Performance with Windows Azure

Data Parallel Application Development and Performance with Windows Azure

Advisor: Professor Gagan Agrawal
Presented by: Yu Zhang

Page 2

Agenda

• Introduction to Windows Azure
• Parallel Model in Azure
• Implementation with Queue
• Implementation with WCF
• Experimental Evaluation
• Conclusion

Page 3

Motivation

Emergence of Cloud Computing
• Windows Azure
• Amazon EC2
• Google App Engine

Main Target of Clouds
• Changing the way we provision hardware and software for on-demand capacity fulfillment
• Hosting web services
• Interest from the scientific community

Page 4

Goals

Show that developing data parallel applications on Azure is feasible
• How to develop parallel applications on Azure?
• What is the resulting performance?

Specific Aims
• Simulate MPI reduce and all-reduce on Azure
• Build data parallel applications

Page 5

Introduction to Windows Azure

The same facilities that a desktop OS provides, but on a set of connected servers:
• Abstract execution environment
• Shared file system
• Resource allocation
• Programming environments

Utility computing
• 24/7 operation
• Pay for what you use
• Simpler, transparent administration

Page 6

What Is Windows Azure?

• It is an operating system for the cloud
• It is designed for utility computing
• It has four primary features:
  • Write your apps (developer experience)
  • Host your apps (compute)
  • Manage your apps (service management)
  • Store your data (storage)

Page 7

Windows Azure Components

Windows Azure PaaS

Applications: Windows Azure Service Model
Runtimes: .NET 3.5/4, ASP.NET, PHP
Operating System: Windows Server 2008/R2-Compatible OS
Virtualization: Windows Azure Hypervisor
Server: Microsoft Blades
Database: SQL Azure
Storage: Windows Azure Storage (Blob, Queue, Table)
Networking: Windows Azure-Configured Networking

Page 8

The Windows Azure Service Model

A Windows Azure application is called a “service”. It consists of:
• Definition information
• Configuration information
• At least one “role”

The service definition is in ServiceDefinition.csdef. It defines aspects of a service that cannot be changed without redeployment:
• Types of roles and static role configuration
• Set of configuration settings for a role
• Contract with the environment in which the code runs

Page 9

The Windows Azure Service Model

The service configuration is in ServiceConfiguration.cscfg. It defines values for properties that can be dynamically updated for a running deployment:
• Values of a configuration parameter
• Number of running instances

Page 10

The Windows Azure Service Model: Role Content

Definition:
• Role name
• Role type
• VM size (e.g. small, medium, etc.)
• Network endpoints

Code:
• Web/Worker Role: hosted DLL and other executables
• VM Role: VHD

Configuration:
• Number of instances
• Number of update and fault domains

Page 11

Desktop And Related Azure Concepts

Desktop
• EXE
• Application configuration
• Manifest
• DLL
  • Windows forms library
  • Windows service
• Local data stores

Windows Azure
• Service package
• Service configuration
• Service definition
• Service role
  • Web role
  • Worker role
• Internet data stores

Page 12

Web Role

[Diagram: Public Internet, Load Balancer, Web Role, Storage Services]

• Web Role handles requests from the internet
• IIS7 hosted web core
• Hosts ASP.NET
• XML-based configuration of IIS7
• Integrated managed pipeline
• Supports SSL

Page 13

Worker Role

• No inbound network connections
• Can read requests from a queue in storage or through Windows Communication Foundation

[Diagram: Web Role, Storage Service, and several Worker Roles]

Page 14

Windows Azure Storage Abstractions

• Blobs: provide a simple interface for storing named files along with metadata for the file
• Tables: provide structured storage. A table is a set of entities, which contain a set of properties
• Queues: provide reliable storage and delivery of messages for an application

Page 15

Windows Azure Queues

• Queues are highly scalable and available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• A storage account can create any number of queues
• 8 KB message size limit and default expiry of 7 days
• Programming semantics ensure that a message is processed at least once:
  • Get message makes the message invisible
  • Delete message removes the message

Page 16

Queues Tips

• Messages > 8 KB: store the payload in a blob or table, and let the message contain a reference to the blob or table entity
• VisibilityTimeout: a queue message that has been retrieved but not deleted reappears after the VisibilityTimeout (default 30 sec)

[Figure: Queue Usage Example, with producers P1 and P2, consumers C1 and C2, and queued messages 1 to 4]
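The get/delete semantics and the visibility timeout can be illustrated with a small in-memory simulation. This is a minimal Python sketch, not the Azure storage API: the class `SimulatedQueue`, its method names, and the explicit `now` clock argument are all assumptions made for illustration.

```python
import itertools


class SimulatedQueue:
    """In-memory stand-in for a cloud queue with a visibility timeout (hypothetical)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def add_message(self, body, now=0.0):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, now]
        return msg_id

    def get_message(self, now):
        """Return (id, body) of the oldest visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout   # hide, do not remove
                return msg_id, body
        return None

    def delete_message(self, msg_id):
        self._messages.pop(msg_id, None)


queue = SimulatedQueue(visibility_timeout=30.0)
queue.add_message("task-1", now=0.0)

first = queue.get_message(now=1.0)      # consumer C1 takes the message, then crashes
hidden = queue.get_message(now=10.0)    # C2 sees nothing: the message is invisible
retry = queue.get_message(now=31.0)     # never deleted, so it reappears for C2
queue.delete_message(retry[0])          # C2 finishes the work and deletes it
after_delete = queue.get_message(now=100.0)
```

Because a message is only removed by an explicit delete, a consumer that fails mid-task causes redelivery rather than loss, which is why processing is at least once rather than exactly once.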

Page 17

MPI programming model

• Communicating sequential processes
• Each process runs in its own local address space.
• Processes exchange data and synchronize via message passing. (Usually, but not always, the same code is executed by all processes.)
• Locality must be taken care of in order to achieve performance; message passing does this explicitly.

Page 18

Azure Parallel Programming Model

[Diagram: load balancer (LB) and IIS in front of the Web Role VMs; Worker Role VMs; the roles are connected by Queue or WCF]

• Web role hosts the IIS service to accept outside requests
• Web role distributes the workload to the worker roles
• Worker roles run and compute simultaneously
• Communication between roles: Queue or WCF

Page 19

Simulation of MPI_Reduce in Azure

MPI_Reduce(inbuf, outbuf, count, type, op, root, comm)

• inbuf: address of input buffer
• outbuf: address of output buffer
• count: number of elements in input buffer
• type: datatype of input buffer elements
• op: operation
• root: process id of root process

Web role (root): poll every worker queue until none of them exists, then compute the result

    while (true)
    {
        if (queue1.Exists())
        {
            var msg = queue1.GetMessage();
            if (msg != null)
            {
                DoWork();
                queue1.DeleteMessage(msg);
            }
        }
        if (queue2.Exists())
        {
            var msg = queue2.GetMessage();
            if (msg != null)
            {
                DoWork();
                queue2.DeleteMessage(msg);
            }
        }
        ...
        if (!queue1.Exists() && !queue2.Exists() && !queue3.Exists() && ...)
        {
            break;
        }
    }
    Compute();
    ...

Worker role: do the local work, then post the result to its queue

    public class WorkerRole : RoleEntryPoint
    {
        public override void Run()
        {
            DoWork();
            var msg = new CloudQueueMessage();   // message content is not shown on the slide
            queue.AddMessage(msg);
        }
    }
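The reduce pattern above (each worker posts a partial result to its own queue, the root collects one message per queue and combines them) can be sketched without the Azure SDK. The Python below is only an illustration under stated assumptions: the standard-library `queue.Queue` stands in for an Azure queue, threads stand in for worker roles, and the names `worker` and `simulated_reduce` are hypothetical.

```python
import operator
import queue
import threading
from functools import reduce


def worker(data, op, out_queue):
    """Worker role stand-in: reduce the local chunk and post the partial result."""
    out_queue.put(reduce(op, data))


def simulated_reduce(chunks, op=operator.add):
    """Root stand-in: one queue per worker; take one message from each and combine."""
    queues = [queue.Queue() for _ in chunks]
    threads = [threading.Thread(target=worker, args=(chunk, op, q))
               for chunk, q in zip(chunks, queues)]
    for t in threads:
        t.start()
    partials = [q.get() for q in queues]   # blocks until each worker has posted
    for t in threads:
        t.join()
    return reduce(op, partials)


total = simulated_reduce([[1, 2, 3], [4, 5], [6]])
largest = simulated_reduce([[1, 9], [4, 5], [6]], op=max)
```

Unlike the slide's polling loop, this sketch uses a blocking `get`, which avoids busy-waiting; an Azure queue would instead be polled, typically with a delay between attempts.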

Page 20

Simulation of MPI_Allreduce in Azure

MPI_Allreduce(inbuf, outbuf, count, type, op, comm)

• inbuf: address of input buffer
• outbuf: address of output buffer
• count: number of elements in input buffer
• type: datatype of input buffer elements
• op: operation

Web role: collect the partial results as in the reduce, compute, then send the result back to every worker queue

    while (true)
    {
        if (queue1.Exists())
        {
            var msg = queue1.GetMessage();
            if (msg != null)
            {
                DoWork();
                queue1.DeleteMessage(msg);
            }
        }
        if (queue2.Exists())
        {
            var msg = queue2.GetMessage();
            if (msg != null)
            {
                DoWork();
                queue2.DeleteMessage(msg);
            }
        }
        ...
        if (!queue1.Exists() && !queue2.Exists() && !queue3.Exists() && ...)
        {
            break;
        }
    }
    Compute();
    var msg = new CloudQueueMessage();   // message content is not shown on the slide
    queue1.AddMessage(msg);
    queue2.AddMessage(msg);
    ...

Worker role: read a message from its queue if one is present, do the local work, then post the result

    public class WorkerRole : RoleEntryPoint
    {
        public override void Run()
        {
            if (queue.Exists())
            {
                var msg = queue.GetMessage();
                if (msg != null)
                {
                    DoWork();
                    queue.DeleteMessage(msg);
                }
            }
            DoWork();
            var result = new CloudQueueMessage();   // message content is not shown on the slide
            queue.AddMessage(result);
        }
    }
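All-reduce adds a broadcast phase to the reduce: after combining the partial results, the root sends the global value back to every worker. The following Python sketch illustrates that two-phase exchange under assumptions of its own: standard-library queues and threads replace Azure queues and worker roles, each worker gets a separate inbound and outbound queue (the slide uses one queue per worker), and all names are hypothetical.

```python
import operator
import queue
import threading
from functools import reduce


def allreduce_worker(rank, local_value, to_root, from_root, results):
    """Worker stand-in: contribute a local value, then wait for the global result."""
    to_root.put(local_value)
    results[rank] = from_root.get()


def simulated_allreduce(local_values, op=operator.add):
    n = len(local_values)
    to_root = [queue.Queue() for _ in range(n)]
    from_root = [queue.Queue() for _ in range(n)]
    results = [None] * n
    threads = [
        threading.Thread(target=allreduce_worker,
                         args=(r, local_values[r], to_root[r], from_root[r], results))
        for r in range(n)
    ]
    for t in threads:
        t.start()
    combined = reduce(op, [q.get() for q in to_root])   # reduce phase at the root
    for q in from_root:
        q.put(combined)                                  # broadcast phase
    for t in threads:
        t.join()
    return results


all_results = simulated_allreduce([3, 5, 7])
```

Separate queues per direction keep a worker from reading back its own contribution, which is a risk when a single queue carries traffic both ways.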

Page 21

Matrix Multiplication

• Each worker role reads the data of matrix B.
• Matrix A is partitioned into n parts, where n is the number of worker roles.
• Each worker role gets one part of matrix A. For an N×N matrix, each worker role therefore holds two data sets: matrix B and one part of matrix A, say Ak (1≤k≤n).
• Each worker role computes Ak×B and adds the result to its queue.
• The web role performs the reduce operation to get the final result.

[Figure: Matrix A and Matrix B]
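The partitioning scheme above can be sketched in a few lines of Python. This is an illustration only: it assumes A is split into contiguous row blocks (the slides do not say how A is partitioned), runs the "workers" sequentially, and uses hypothetical helper names.

```python
def matmul(a, b):
    """Plain product of two matrices given as lists of rows."""
    return [[sum(a_row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for a_row in a]


def split_rows(a, n_workers):
    """Split the rows of A into n_workers contiguous blocks of near-equal size."""
    base, extra = divmod(len(a), n_workers)
    blocks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        blocks.append(a[start:start + size])
        start += size
    return blocks


def parallel_matmul(a, b, n_workers):
    # Each worker holds all of B and one block Ak, and computes Ak x B.
    partials = [matmul(block, b) for block in split_rows(a, n_workers)]
    # The "reduce" at the web role is a concatenation of the blocks in order.
    return [row for part in partials for row in part]


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10]]
C = parallel_matmul(A, B, n_workers=2)
```

With a row-block split, no worker needs data from another worker, so the only communication is the initial distribution and the final collection.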

Page 22

K Means

1. Web role calculates the initial means.
2. Broadcast the k centroids to all worker roles.
3. Each worker role computes the distance of each local document vector to the centroids.
4. Assign points to the closest centroid and compute the local MSE (Mean Squared Error).
5. Perform a reduction for the global centroids and the global MSE value.
6. Web role broadcasts the new centroids to all worker roles; repeat until no points move.
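One iteration of steps 3 to 5 can be sketched as follows. The Python below is a simplified illustration, not the presented implementation: workers are simulated sequentially, each worker returns per-centroid sums and counts (an assumed message layout) so that the reduction can form exact global centroids, and the function names are hypothetical.

```python
def local_step(points, centroids):
    """Worker stand-in: assign local points and return sums, counts, and local SSE."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    sse = 0.0
    for p in points:
        dists = [sum((x - c) ** 2 for x, c in zip(p, cen)) for cen in centroids]
        j = dists.index(min(dists))              # index of the closest centroid
        counts[j] += 1
        sums[j] = [s + x for s, x in zip(sums[j], p)]
        sse += dists[j]
    return sums, counts, sse


def reduce_step(partials, centroids):
    """Web role stand-in: combine partials into global centroids and global MSE."""
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    total_sse = 0.0
    for sums, counts, sse in partials:
        for j in range(k):
            total_counts[j] += counts[j]
            total_sums[j] = [a + b for a, b in zip(total_sums[j], sums[j])]
        total_sse += sse
    new_centroids = [
        [s / total_counts[j] for s in total_sums[j]] if total_counts[j]
        else list(centroids[j])                  # keep an empty cluster's centroid
        for j in range(k)
    ]
    return new_centroids, total_sse / sum(total_counts)


worker_data = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
centroids = [(0.0, 0.0), (10.0, 0.0)]
partials = [local_step(points, centroids) for points in worker_data]
centroids, mse = reduce_step(partials, centroids)
```

Sending sums and counts rather than local centroids matters: averaging local centroids would weight every worker equally even when the workers hold different numbers of points per cluster.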

Page 23

KNN

1. The web role is the master; the N worker roles are slaves.
2. The master divides the training samples into N subsets and distributes one subset to each worker role.
3. Each worker role computes the distance measures independently and stores the computed measures in a local array.
4. When a worker role finishes its distance calculation, it sends a message to the web role indicating the end of processing.
5. The web role notes the end of processing for the sender and acquires the computed measures by reduction.
6. After the web role has collected the distance measures from all worker roles, it performs the following steps:
• Sort all distance measures in ascending order
• Select the top k measures
• Count the occurrences of each class among the top k measures
• Assign the input element to the class with the highest count among the top k measures
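The master/worker split described above can be sketched in Python. This is an illustrative simplification: workers run sequentially, squared Euclidean distance is an assumed metric (the slides do not name one), and the function names are hypothetical.

```python
from collections import Counter


def local_distances(train_subset, query):
    """Worker stand-in: squared Euclidean distance from the query to each local sample."""
    return [(sum((a - b) ** 2 for a, b in zip(features, query)), label)
            for features, label in train_subset]


def classify(worker_subsets, query, k):
    """Web role stand-in: gather distances, keep the k smallest, take the majority label."""
    gathered = []
    for subset in worker_subsets:                 # one subset per worker role
        gathered.extend(local_distances(subset, query))
    nearest = sorted(gathered, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


subsets = [
    [((0.0, 0.0), "a"), ((0.0, 1.0), "a")],
    [((5.0, 5.0), "b"), ((1.0, 0.0), "a"), ((6.0, 5.0), "b")],
]
predicted = classify(subsets, query=(0.5, 0.5), k=3)
```

A worker could also send only its own k smallest distances instead of the full array, which would shrink the messages without changing the result; the sketch sends everything to stay close to the steps listed above.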

Page 24

An Optimized Solution: WCF

What is Windows Communication Foundation (WCF)?

• WCF is Microsoft’s implementation of industry standards to provide a communication subsystem that enables applications on one machine (across process boundaries) or across multiple machines to communicate.
• WCF is a core component of the .NET Framework 3.0 and later versions, which are included with Windows 7 and Vista as well as the future version of Windows Server.
• The WCF API unifies ASMX Web Services, .NET Remoting, distributed transactions and messaging into a single service-oriented programming model.
• Fundamental to the .NET Framework.

[Diagram: WCF unifies ASMX, WSE, .NET Remoting, COM+ (Enterprise Services), and MSMQ]

Page 25

WCF: Address, Binding, Contract

[Diagram: a Client and a Service exchange a Message through endpoints; each endpoint consists of an Address (A), a Binding (B), and a Contract (C)]

• Address: Where?
• Binding: How?
• Contract: What?

WCF services are deployed, discovered and consumed as endpoints.

Page 26

WCF: Endpoint Contract

All services expose a contract. WCF uses 5 types of contracts, including:
• Service Contract: exposes the service.
• Operation Contract: exposes the service members.
• Data Contract: describes service parameters.

    <!-- configuration file used by above code -->
    <configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">
      <system.serviceModel>
        <services>
          <!-- service element references the service type -->
          <service name="MM">
            <!-- endpoint element defines the ABC's of the endpoint;
                 the address scheme must match the binding's transport -->
            <endpoint address="net.tcp://localhost/MM/Ep1"
                      binding="netTcpBinding"
                      contract="IMM"/>
          </service>
        </services>
      </system.serviceModel>
    </configuration>

Address
• An address uniquely identifies a service.
• It provides the transport protocol, the name of the target machine (host), and the port if applicable.
• It is expressed as an explicit path or URI: [transport]://[machine][:optional port]
  • http://localhost:8081/Service
  • net.tcp://localhost:8082/Service

Binding
• Bindings provide “canned” choices for the transport protocol, message encoding, communication pattern, reliability, and security policies, that is, the WCF features required to support the design goals of the service.
• Some common bindings include:
  • BasicHttpBinding
  • NetTcpBinding
  • WSHttpBinding
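The address format [transport]://[machine][:optional port] follows ordinary URI syntax, so its parts can be pulled apart with a generic URI parser. The short Python sketch below uses the standard-library `urllib.parse.urlsplit` on the two example addresses; the helper name `parse_endpoint_address` and the dictionary keys are assumptions for illustration and have nothing to do with the WCF API itself.

```python
from urllib.parse import urlsplit


def parse_endpoint_address(address):
    """Split an endpoint address into transport, machine, optional port, and path."""
    parts = urlsplit(address)
    return {
        "transport": parts.scheme,
        "machine": parts.hostname,
        "port": parts.port,        # None when the address omits the port
        "path": parts.path,
    }


http_addr = parse_endpoint_address("http://localhost:8081/Service")
tcp_addr = parse_endpoint_address("net.tcp://localhost:8082/Service")
no_port = parse_endpoint_address("net.tcp://localhost/MM/Ep1")
```

The transport part is what has to agree with the binding: an address starting with net.tcp pairs with NetTcpBinding, and one starting with http pairs with an HTTP-based binding such as BasicHttpBinding.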

Page 27

WCF in Azure

Worker Role

    [ServiceContract]
    public interface IService
    {
        [OperationContract]
        string compute();
    }

    // ServiceHost needs the implementing class, not the interface.
    // "Service" stands for a class implementing IService (not shown on the slide).
    ServiceHost sh = new ServiceHost(typeof(Service));

    // use the AddServiceEndpoint helper method to create the
    // ServiceEndpoint and add it to the ServiceDescription
    sh.AddServiceEndpoint(
        typeof(IService),                        // contract type
        new NetTcpBinding(SecurityMode.None),    // one of the built-in bindings
        "net.tcp://localhost/IService/Ep1");     // the endpoint's address
    sh.Open();                                   // start listening

Web Role

    NetTcpBinding b = new NetTcpBinding(SecurityMode.None);
    var factory = new ChannelFactory<WorkerRole.IService>(b);
    var channel = factory.CreateChannel(GetEndpoint());
    channel.compute();   // call the service hosted on the worker role

Binding settings: maxBufferSize="10485760" maxReceivedMessageSize="10485760"

Page 28

From Objects to Services

• 1980s, Object-Oriented: Polymorphism, Encapsulation, Subclassing
• 1990s, Component-Based: Interface-based, Dynamic Loading, Runtime Metadata
• 2000s, Service-Oriented: Message-based, Schema+Contract, Binding via Policy

Diagram labels for this work: C & C++ with MPI, Queue with Azure, WCF with Azure

Page 29

Experimental Evaluation

[Charts: Time (sec) for MPI, Queue, and WCF on 2, 4, and 8 processors; panels: Matrix Multiplication, Kmeans, KNN]

Matrix Multiplication (time in sec)

                MPI      Queue     WCF
8 processors    0.0993   8.8726    4.4533
4 processors    0.1656   13.9872   6.349
2 processors    0.4723   20.6536   11.5783

Kmeans (time in sec)

                MPI      Queue     WCF
8 processors    0.1023   2.8902    1.9234
4 processors    0.2512   4.1224    3.4267
2 processors    0.5420   7.6238    5.5263

KNN (time in sec)

                MPI      Queue     WCF
8 processors    0.4272   1.0623    0.8976
4 processors    1.2567   2.3457    1.5214
2 processors    2.0233   5.2356    4.1218

Queue Performance

          Fastest   Slowest
Read      31 ms     203 ms
Write     31 ms     234 ms
Delete    0 ms      593 ms

The queue is simply a reliable method of delivering messages between processes.
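The timings above can be turned into speedup figures with a little arithmetic. The Python sketch below computes, for each application and implementation, the ratio of the 2-processor time to the 8-processor time, and the ratio of Queue time to WCF time at 8 processors. It assumes the three result tables correspond, in order, to Matrix Multiplication, Kmeans, and KNN, following the order of the panel titles; the variable names are illustrative.

```python
# Times in seconds as (2 processors, 8 processors), copied from the results above.
times = {
    "Matrix Multiplication": {
        "MPI": (0.4723, 0.0993), "Queue": (20.6536, 8.8726), "WCF": (11.5783, 4.4533),
    },
    "Kmeans": {
        "MPI": (0.5420, 0.1023), "Queue": (7.6238, 2.8902), "WCF": (5.5263, 1.9234),
    },
    "KNN": {
        "MPI": (2.0233, 0.4272), "Queue": (5.2356, 1.0623), "WCF": (4.1218, 0.8976),
    },
}

# Speedup when going from 2 to 8 processors (ideal value would be 4).
speedups = {
    app: {impl: t2 / t8 for impl, (t2, t8) in impls.items()}
    for app, impls in times.items()
}

# How many times slower Queue is than WCF at 8 processors.
queue_vs_wcf_at_8 = {
    app: impls["Queue"][1] / impls["WCF"][1] for app, impls in times.items()
}

for app, row in speedups.items():
    print(app, {impl: round(s, 2) for impl, s in row.items()})
```

Read this way, the tables separate two questions: how well each implementation scales with more processors, and how much of the gap to MPI comes from the communication mechanism (Queue versus WCF) at a fixed processor count.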

Page 30

Azure VS Traditional Cluster

Hardware

          CPU       RAM     Bandwidth
Glenn     2.7 GHz   8 GB    20 Gbps
Azure     1.6 GHz   2 GB    10 Gbps

Operating System
The OS running on Glenn is Linux, which has a lightweight kernel and can make full use of the hardware resources.

Programming Language
C is only one level of abstraction away from machine language. C# running on the .NET Framework is at a minimum 3 levels of abstraction away from assembler.

Page 31

Conclusion

• MPI applications can harness the advantages of cloud computing.
• Applications running on the cloud can achieve high efficiency by simulating MPI parallelization on the Windows Azure Platform.
• This work introduces different inter-role communication methods for parallel execution. They can be considered a prototype of an Azure MPI library, which will most likely be developed and used in the near future.