SC'07 -- Nov 13th, 2007

Virtual Machine Aware Communication Libraries for High Performance Computing

Wei Huang, Matthew Koop, Qi Gao, and Dhabaleswar K. Panda

Network Based Computing Laboratory The Ohio State University

In this presentation …

• We target high performance computing with virtual machines

• Why do we want to do this?
• What is missing?
  – Performance concerns
  – Efficient inter-VM communication
• What do we do?
  – IVC: Inter-VM Communication library
  – MPI: hiding the design complexities
  – Performance evaluation

• Conclusion

Virtual machine environment

• Virtual machine (VM) technologies allow running OSes on virtualized hardware instead of native hardware

• Wide adoption of VM environments:
  – Server consolidation: efficiently utilize the resources
  – Debugging and development: safety and efficiency

[Figure: native computing environment (OS services and applications on an OS kernel on native hardware) next to a VM-based computing environment (guest VMs, each with its own OS kernel, OS services, and applications, on a virtual machine monitor on native hardware).]

VM for HPC: how and why?

• Applications run on virtual clusters consisting of multiple VMs
• VMs can be migrated among physical hosts
• Why VM-based environments?
  – Management: hardware maintenance …
  – Fault tolerance
  – And many others: customized OSes, load balancing, performance isolation …

[Figure: a virtual cluster of guest VMs spread over physical resources; each physical host runs a virtual machine monitor on native hardware and hosts one or more guest VMs.]

VM for HPC: why not?

• Despite many promising features, VMs have not yet been widely used for HPC
• One of the most important reasons: perceived overhead from the virtualization layer
• Is this true?
  – CPU & memory virtualization:
    • Not really: HPC codes consist mostly of non-privileged instructions, which can be executed natively
  – Communication I/O virtualization:
    • VMM-bypass I/O for network communication
    • Is that all?

A closer look at communication I/O

• Native environment running an MPI job:
  – Inter-node communication through high speed interconnect
  – Intra-node communication through shared memory
    • More efficient: no network contention
    • Supported by MVAPICH/MVAPICH2, OMPI, etc.

[Figure: native computing environment with two computing processes on one OS and native hardware; arrows mark inter-node communication and intra-node communication.]

A closer look at communication I/O

• VM-based environment:
  – Computing processes are hosted on separate VMs for scheduling flexibility
  – Inter-node communication through high speed interconnect
    • Support from VMM-bypass I/O*: native-level performance
  – Intra-node communication has to go through network loopback as well
    • Extremely undesirable, especially with the widespread adoption of multi-core architectures!

* Jiuxing Liu, Wei Huang, Bulent Abali, and Dhabaleswar Panda. High Performance VMM-bypass I/O in Virtual Machines. In USENIX'06

[Figure: VM-based computing environment with two guest VMs (each with its own OS and computing process) on a virtual machine monitor and native hardware; inter-node communication is marked, and intra-node communication between the VMs is marked with a "?".]

Our contributions

• Design IVC, an Inter-VM Communication library providing efficient intra-physical node communication through shared memory

• Hide all design complexities by designing MVAPICH2-ivc, a VM-aware MPI library

• Evaluate our design on multi-core computing systems, showing great potential for VM-based HPC

Inter-VM Communication

• Objectives
  – Providing efficient inter-VM (intra-physical node) communication through shared memory
    • How to set up the shared memory region?
    • How to find peers on the same node?
  – Handling VM migration
    • How to tear down/establish inter-VM communication?

Shared memory setup: a client-server model

• Step 0: register – kernel drivers help computing processes find peers on the same computing node
• Step 1: a user process initiates the setup process
  – Calls into the IVC user communication library
• Steps 2 & 3: the IVC user library allocates shared memory space and grants page access to the remote VM through the VMM
• Step 4: reference information is sent to the remote IVC library
• Steps 5 & 6: the shared memory pages are mapped into the process's address space
• Step 7: computing processes get notified

[Figure: two guest VMs on one virtual machine monitor and native hardware. Each VM runs a computing process (end-user application), the IVC library (OS services, IVC user library), and the IVC kernel driver (OS kernel); shared memory pages appear in both VMs. Numbered arrows 0 to 7 mark the setup steps.]
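The setup steps can be illustrated with a small in-process model. This is only a sketch under stated assumptions: `ToyVmm`, `setup_channel`, and the `registry` dictionary are hypothetical names invented for illustration, not part of IVC or Xen, and a shared `bytearray` stands in for granted memory pages. It also assumes that the remote side is the one that maps the pages in steps 5 and 6.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVmm:
    """Hypothetical stand-in for the VMM: records which VM may map which pages."""
    grants: dict = field(default_factory=dict)  # ref -> (owner, peer, pages)
    next_ref: int = 0

    def grant(self, owner, peer, pages):
        ref = self.next_ref
        self.next_ref += 1
        self.grants[ref] = (owner, peer, pages)
        return ref

    def map(self, ref, requester):
        owner, peer, pages = self.grants[ref]
        if requester != peer:
            raise PermissionError("VM %s was not granted ref %d" % (requester, ref))
        return pages  # same object: both sides now see the same memory


def setup_channel(vmm, registry, server_vm, client_vm, size=4096):
    # Step 0: both sides registered earlier, so co-located peers are discoverable.
    if registry[server_vm] != registry[client_vm]:
        raise ValueError("peers are not on the same physical node")
    # Steps 1-3: the initiating side allocates pages and grants access via the VMM.
    pages = bytearray(size)
    ref = vmm.grant(server_vm, client_vm, pages)
    # Step 4: only the small reference travels to the remote library.
    # Steps 5-6: the remote side maps the granted pages.
    remote_view = vmm.map(ref, client_vm)
    # Step 7: both processes are notified; here we simply return both views.
    return pages, remote_view


vmm = ToyVmm()
registry = {"vm1": "hostA", "vm2": "hostA", "vm3": "hostB"}  # VM -> physical host
local, remote = setup_channel(vmm, registry, "vm1", "vm2")
local[0:5] = b"hello"          # written by one VM ...
print(bytes(remote[0:5]))      # ... and visible to the other
```

The point of the model is that only a reference crosses between the two libraries, while the data itself is written once into pages both sides can see.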

When VM migrates …

• IVC is an intra-node (physical node) communication library:
  – IVC connections to VMs on the original host must be torn down
  – IVC connections can be established to VMs on the new host
• Requires peer coordination

When VM migrates …

• Step 1: the IVC kernel driver on the migrating VM receives a callback when the VM is about to migrate
• Steps 2 & 3: all peers stop send operations and acknowledge
• Step 4: computing processes get notified
• Step 5: return from callback
• Step 6: migrate to the new host and establish new IVC connections

[Figure: a three process parallel job. Computing processes 1 and 2 run in VMs on one physical host and computing process 3 runs on a second host; every VM has the IVC library and kernel driver on top of a virtual machine monitor and native hardware. Numbered arrows 1 to 6 mark the migration steps.]
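The coordination in the migration steps can be sketched as a toy simulation of the three process job. The `Peer` class and the `connect_colocated` and `migrate` helpers are hypothetical, chosen only to show the ordering: peers stop sending and acknowledge before the VM moves, and new connections are formed afterwards.

```python
class Peer:
    """Hypothetical model of one computing process and its IVC bookkeeping."""

    def __init__(self, name, host):
        self.name = name
        self.host = host
        self.ivc_peers = set()  # names currently reachable over shared memory

    def suspend_ivc_to(self, other):
        # Steps 2-3: stop sending over IVC to the migrating VM, then acknowledge.
        self.ivc_peers.discard(other.name)
        return True  # the acknowledgement


def connect_colocated(peers):
    # (Re)build IVC connections between processes on the same physical host.
    for p in peers:
        p.ivc_peers = {q.name for q in peers if q is not p and q.host == p.host}


def migrate(vm, peers, new_host):
    # Step 1: the migrating VM's driver receives the pre-migration callback.
    old_neighbours = [p for p in peers if vm.name in p.ivc_peers]
    acks = [p.suspend_ivc_to(vm) for p in old_neighbours]
    assert all(acks)
    # Step 4: local computing processes learn that IVC is no longer available.
    vm.ivc_peers.clear()
    # Step 5: the callback returns. Step 6: migrate, then reconnect on the new host.
    vm.host = new_host
    connect_colocated(peers)


p1, p2, p3 = Peer("p1", "hostA"), Peer("p2", "hostA"), Peer("p3", "hostB")
peers = [p1, p2, p3]
connect_colocated(peers)
migrate(p2, peers, "hostB")
print(sorted(p1.ivc_peers), sorted(p2.ivc_peers), sorted(p3.ivc_peers))
# prints: [] ['p3'] ['p2']
```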

Now we have IVC …

• Benefits:
  – Applications can communicate efficiently over shared memory, even when the computing processes are not in the same guest VM
  – Possible to support VM migration
• Concerns: the application needs to
  – Be written with our API
  – Take care of both intra- and inter-node communication
• Not a big deal!
  – Most applications are written in standardized APIs, like MPI
  – We can integrate our design into those API implementations

MVAPICH2-ivc: hiding the complexities

• Choosing MPI: the de facto standard for parallel programming

• MVAPICH2: a popular MPI-2 library over InfiniBand from our lab, used by 580 organizations worldwide

• MVAPICH2-ivc: extends MVAPICH2, automatically choosing between IVC and network (IB) communication

• Hiding the complexities of IVC-specific APIs: transparently benefits user applications
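The automatic choice between IVC and the network can be pictured as a per-peer decision based on co-residency. The sketch below is a simplified assumption about that decision, not the MVAPICH2-ivc code; `choose_channel` and the `placement` map are hypothetical names.

```python
def choose_channel(my_host, peer_host):
    """Pick the channel a VM-aware MPI library might use for one peer."""
    return "ivc" if my_host == peer_host else "network"


# Hypothetical placement of three MPI ranks onto physical hosts.
placement = {0: "hostA", 1: "hostA", 2: "hostB"}

channels = {
    (src, dst): choose_channel(placement[src], placement[dst])
    for src in placement
    for dst in placement
    if src != dst
}
print(channels[(0, 1)], channels[(0, 2)])  # prints: ivc network
```

The application only ever calls MPI; the decision stays inside the library, which is what lets unmodified programs benefit.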

Architectural overview

Native MVAPICH2:
• ADI3 manages message delivery
• Shared memory and ADI3 channels are statically set up
• Shared memory channel communicates over shared memory (OS provides mapping service)
• Network channel communicates over InfiniBand

Modified: MVAPICH2-ivc:
• ADI3 manages message delivery
• Communication coordinator manages IVC and network channel setup (dynamic)
• IVC channel communicates over shared memory (IVC library/driver provide mapping)
• Network channel communicates over InfiniBand through VMM-bypass I/O (transparent)

[Figure: two layered stacks. Native MVAPICH2: Application; MPI library (MPI layer, ADI3 layer, shared memory channel and network channel); communication device API (shared memory, InfiniBand API); native hardware. MVAPICH2-ivc: Application; virtual machine aware MPI library (MPI layer, ADI3 layer, communication coordinator, IVC channel and network channel); communication device API (IVC, VMM-bypass I/O); virtualized hardware.]

Handling VM migration

• Key issue: ensuring in-order message delivery when setting up and tearing down IVC connections during migration
• VC: virtual connections encapsulating communication mechanisms:
  – Network
  – IVC
• VC has four states:
  – IVC_CLOSE: all VCs start in this state
  – IVC_ACTIVE: IVC connection is ready to use
  – IVC_SUSPEND: IVC connection is being torn down
  – IVC_READY: IVC connection is set up, but not ready to use due to in-flight messages over the network

[Figure: VC state transition diagram over IVC_CLOSE, IVC_ACTIVE, IVC_SUSPEND, and IVC_READY. Edge labels: "Establish IVC connection (init stage)"; "Migration call back: IVC no longer available"; "drain receive buffers"; "Establish IVC connection (during migration)"; "Network messages flushed (receive a flush message)".]
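The four VC states can be written down as a small table-driven state machine. The state names come from the slide; the event names and the source and target of each arrow are one reading of the diagram labels and should be treated as assumptions. `channel_for_send` is a simplification that sends over IVC only in IVC_ACTIVE.

```python
from enum import Enum


class VcState(Enum):
    IVC_CLOSE = "IVC_CLOSE"
    IVC_ACTIVE = "IVC_ACTIVE"
    IVC_SUSPEND = "IVC_SUSPEND"
    IVC_READY = "IVC_READY"


# (state, event) -> next state. Event names are hypothetical labels, and the
# arrow endpoints are an interpretation of the diagram, not confirmed by it.
TRANSITIONS = {
    (VcState.IVC_CLOSE, "connect_at_init"): VcState.IVC_ACTIVE,
    (VcState.IVC_ACTIVE, "migration_callback"): VcState.IVC_SUSPEND,
    (VcState.IVC_SUSPEND, "receive_buffers_drained"): VcState.IVC_CLOSE,
    (VcState.IVC_CLOSE, "connect_during_migration"): VcState.IVC_READY,
    (VcState.IVC_READY, "flush_message_received"): VcState.IVC_ACTIVE,
}


class VirtualConnection:
    def __init__(self):
        self.state = VcState.IVC_CLOSE  # all VCs start in this state

    def on_event(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in {self.state.name}")
        self.state = TRANSITIONS[key]

    def channel_for_send(self):
        # Simplification: new messages use IVC only when the VC is ACTIVE, so
        # they cannot overtake messages still in flight on the network.
        return "ivc" if self.state is VcState.IVC_ACTIVE else "network"


vc = VirtualConnection()
trace = [vc.state.name]
for event in ("connect_at_init", "migration_callback", "receive_buffers_drained",
              "connect_during_migration", "flush_message_received"):
    vc.on_event(event)
    trace.append(vc.state.name)
print(" -> ".join(trace))
```

Keeping IVC_READY separate from IVC_ACTIVE is what protects ordering in this sketch: the connection exists, but it is not used until the flush message confirms the network path is empty.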

Now we have an MPI …

• Unmodified MPI applications can benefit from our design

• Regarding the performance concerns:
  – What's the benefit of IVC?
  – How does a VM-based environment with IVC compare with a native environment?

Experimental setup

• Testbed A: dual socket Intel Clovertown (quad-core) processors, 4 GB memory, PCI-Express InfiniBand HCA
• Testbed B: 64 node dual socket single core cluster (32 Xeon and 32 Opteron), 2 GB memory, PCI-Express InfiniBand HCA
• Xen-3.0, dom0 running RHEL 4
• DomU using ttylinux (tiny Linux distribution)
• Configurations:
  – IVC: mvapich2-ivc running in VM-based environment
  – No-IVC: unmodified mvapich2 in VM-based environment
  – Native: unmodified mvapich2 in native Linux environment

Latency and bandwidth

[Figure: Latency. Latency (us) versus message size (bytes) for Native, No-IVC, and IVC. Annotations: "Very close to native"; "~3.2us through IB loopback"; "Sub-1us through shared memory".]

[Figure: Bandwidth. Bandwidth (MB/s) versus message size (bytes) for Native, No-IVC, and IVC. Annotations: "Native-level performance"; "Getting better for large messages"; "Much higher for mid-size messages".]

VM migration

• MVAPICH2-ivc automatically switches to IVC whenever the target peers are on the same physical node

• The two graphs show decreased latency and increased bandwidth when two processes in separate VMs are migrated to the same physical node

[Figure: 2K message latency (us) and 2K message bandwidth (MB/s) over the course of the migration experiment. Annotations: "Start with VMs on separate physical nodes"; "Migrating to the same physical node"; "Switch to IVC, latency 9us → 3us"; "Migrate again, switch back to network".]

Collectives

• With inter-VM communication, mvapich2-ivc largely closes the gap between native and VM-based environments

• Results collected on 8-core systems using Intel MPI Benchmarks (IMB) (8x2)

[Figure: three panels, "Collectives (16 bytes)", "Collectives (16 KBytes)", and "Collectives (256 KBytes)", each showing normalized time of Allgather, Allreduce, Alltoall, Bcast, and Reduce for no-ivc, ivc, and native.]

Application-level benchmarks

• Numbers taken with 16 processes
• Benefits of IVC show for several benchmarks, e.g. IS (11%), CG (9%), LAMMPS (5.9%), SMG2000 (11.8%), and NAMD (3.4%)

[Figure: relative performance of No-IVC, IVC, and Native for cg.B, is.B, ep.B, bt.A, ft.B, mg.B, sp.A, lu.A, NAMD, SMG2000, and LAMMPS.]

Larger scale?

• Based on individual benchmarks, intra-node communication is still an important part!

• Percentage of intra-node communication is well above average

[Figure: % of intra-node communication for LAMMPS, SMG2000, NAMD2, BT, CG, EP, FT, IS, MG, LU, and SP in the 8x2, 8x8, and 8x64 configurations.]
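One way to read "well above average" is against a uniform baseline in which every process talks equally to every other process. The sketch below computes that baseline under two assumptions that the slides do not state outright: that "8xN" means 8 processes per node on N nodes, and that "average" refers to this uniform pattern. `uniform_intra_node_fraction` is a hypothetical helper.

```python
from fractions import Fraction


def uniform_intra_node_fraction(procs_per_node, nodes):
    """Share of a process's peers that sit on its own node, if all peers are equal."""
    total = procs_per_node * nodes
    return Fraction(procs_per_node - 1, total - 1)


for nodes in (2, 8, 64):
    frac = uniform_intra_node_fraction(8, nodes)
    print(f"8x{nodes}: {float(frac) * 100:.1f}%")
# prints:
# 8x2: 46.7%
# 8x8: 11.1%
# 8x64: 1.4%
```

Under these assumptions the uniform baseline drops quickly as nodes are added, so a benchmark that keeps a large intra-node share at 8x64 is relying heavily on communication locality.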

Overheads on 64 node cluster

• Performance comparison on a 64 node dual processor cluster
• We do see very close performance (~1%)
• NAS-FT shows around 5% overhead with its large message all-to-all communication pattern

[Figure: two panels of normalized execution time, ivc versus native. One panel covers bt.C, cg.C, lu.C, is.C, mg.C, ft.C, ep.C, and sp.C; the other, "HPL with different Configurations", covers 8x2, 16x2, 32x2, and 64x2.]

Conclusion

• We propose Inter-VM communication (IVC), allowing efficient shared memory communication between VMs

• We modify MVAPICH2 to hide all complexities and allow user applications to benefit transparently

• With our evaluation, we show: virtualization is NOT introducing much overhead

• With its benefits for system management, VMs are an attractive solution for HPC!

Future work

• More optimizations can be made to improve the performance of inter-VM communication
  – Dynamically map user buffers to achieve one-copy communication
• Looking more into management frameworks for VM-based computing environments (load balancing, fault tolerance …)

Acknowledgements

Our research at the Ohio State University is supported by the following organizations:

• Current Funding support by

• Current Equipment support by

Thank you!

Network-Based Computing Laboratory
http://nowlab.cse.ohio-state.edu/