Kernel Recipes 2014 - Performance Does Matter

Infrastructure Benchmarking Methodology & Tooling


DESCRIPTION

Deploying clouds is on everybody’s mind, but how do you make an efficient deployment? After setting up the hardware, it is mandatory to make a deep inspection of the servers’ performance. In a farm of supposedly identical servers, many mis-installations or mis-configurations can seriously degrade performance. If you want to discover such counter-performance before users complain about their VMs, you have to detect it before installing any software. Another performance metric to know is “how many VMs can I load on top of my servers?”. By using the same methodology it is possible to compare how a set of VMs performs against the bare-metal capabilities. The challenge is here: How do you automatically detect servers that under-perform? How do you ensure that a new server entering a farm will not degrade it? How do you measure the overhead of all the virtualization layers from the VM’s point of view? Erwan Velu – Performance Engineer @ eNovance

TRANSCRIPT

Page 1: Kernel Recipes 2014 - Performance Does Matter

Infrastructure Benchmarking

Methodology & Tooling

Page 2: Kernel Recipes 2014 - Performance Does Matter

Who am I?

Erwan Velu ([email protected])

● Currently Performance Engineer @ eNovance
● Previous Experiences

○ Release Manager @ SiT (In Flight Entertainment)
○ Presale Engineer @ Seanodes (Distributed Storage - Ceph-like)
○ Release Manager @ Mandriva (HPC product)

● Open Source Activity
○ Part of the Mageia Team
○ HDT (Hardware Detection Tool) Author and Syslinux Contributor
○ Fio contributor

Page 3: Kernel Recipes 2014 - Performance Does Matter

Why Benchmark an Infrastructure?

● Infrastructures deliver services to users, but ...

● Servers can be inter-dependent, like with Ceph or Swift
○ One server expects data from another
○ Servers will run at the speed of the slowest one

● Users expect the service to be constant
○ Servers of the same kind shall perform the same
○ Running on or moving to any hypervisor shall provide the same experience

● Don’t wait for customers to complain before checking the performance

● The goal is to get a quick view of a server farm’s performance
○ Does my server perform as expected?
○ Do all my servers perform the same?

Page 4: Kernel Recipes 2014 - Performance Does Matter

What to Benchmark ?

● Processor
○ Every logical CPU
○ All logical CPUs

● Memory Bandwidth
○ Small and big blocks
○ From 1K to 2G

● Storage (a sample fio sweep over these parameters is sketched after this list)
○ Sequential and random
○ 1MB and 4K
○ Read and write

● Networking
○ All-to-all communication
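As a hedged illustration of the storage matrix above (not AHC's actual job definitions, which live in the eDeploy repository), a fio sweep over the listed patterns, directions and block sizes could look like the following Python sketch; the device path, runtime and I/O engine are placeholders:

# Hypothetical sketch: run one fio job per cell of the storage matrix above
# (sequential/random x read/write x 4K/1MB). /dev/sdX, the runtime and the
# ioengine are placeholders, not AHC's real settings.
# Warning: the write jobs are destructive on a raw device.
import subprocess

DEVICE = "/dev/sdX"   # placeholder block device under test
RW_MODES = {("sequential", "read"): "read",   ("sequential", "write"): "write",
            ("random", "read"): "randread",   ("random", "write"): "randwrite"}

for (pattern, direction), fio_rw in RW_MODES.items():
    for bs in ("4k", "1m"):
        name = f"{pattern}-{direction}-{bs}"
        subprocess.run(["fio", f"--name={name}", f"--filename={DEVICE}",
                        f"--rw={fio_rw}", f"--bs={bs}", "--direct=1",
                        "--ioengine=libaio", "--runtime=30", "--time_based",
                        "--output-format=json"], check=True)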

Page 5: Kernel Recipes 2014 - Performance Does Matter

Hey! What did you expect?

● Processor
○ Understand the raw power of a single core
○ Understand the efficiency of all cores (~75% on a 6-core Intel E5 CPU)

● Memory Bandwidth
○ Understand the bandwidth you can get from a VM

● Storage
○ Estimate if you run at HDD or SSD speed
○ Ensure the block device is performing as expected

■ like 70K IOPS on an SSD
■ or 200 MB/s of bandwidth

● Networking
○ Validate that the switch isn't a limiting factor
○ Ensure each server is performing up to its limits (a simple threshold check against these expectations is sketched below)
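One way to turn the expectations above into an automatic verdict is a small threshold comparison, as in the following sketch; the metric names and the results dictionary are made up for the example and are not AHC's actual output format:

# Hedged sanity check against the rough expectations quoted on this slide.
# Metric names and the `results` dict are illustrative placeholders.
EXPECTED = {
    "ssd_random_4k_iops": 70000,      # "like 70K IOPS on an SSD"
    "disk_seq_1m_mbps": 200,          # "or 200 MB/s of bandwidth"
    "cpu_scaling_efficiency": 0.75,   # ~75% on a 6-core Intel E5
}

def check(results, expected=EXPECTED, tolerance=0.10):
    """Flag any metric more than `tolerance` below its expected value."""
    suspicious = {}
    for metric, target in expected.items():
        measured = results.get(metric)
        if measured is not None and measured < target * (1 - tolerance):
            suspicious[metric] = (measured, target)
    return suspicious

print(check({"ssd_random_4k_iops": 52000, "disk_seq_1m_mbps": 210}))
# -> {'ssd_random_4k_iops': (52000, 70000)}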

Page 6: Kernel Recipes 2014 - Performance Does Matter

Tooling

● eDeploy
○ eNovance project to generate reproducible operating system builds
○ http://github.com/enovance/edeploy

● Automatic Health Check (AHC)
○ Part of the eDeploy repository
○ Builds a small operating system with the tools & the benchmark procedure

● Benchmark tools (sample invocations are sketched below)
○ Sysbench (CPU & Memory)
○ Fio (Storage)
○ Netpipe / Netperf (Network)
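For reference, here is a hedged sketch of how the CPU, memory and network tools listed above are typically driven; the thread counts, block sizes and peer address are placeholders (the older sysbench "--test=" syntax is assumed), and the exact procedure AHC uses is the one shipped in the eDeploy repository:

# Illustrative invocations of the tools listed above; options are placeholders,
# not the exact AHC benchmark procedure.
import multiprocessing, subprocess

NCPU = multiprocessing.cpu_count()
PEER = "192.0.2.10"   # placeholder address of another server in the farm

# CPU: one logical CPU, then all logical CPUs
for threads in (1, NCPU):
    subprocess.run(["sysbench", "--test=cpu", "--cpu-max-prime=20000",
                    f"--num-threads={threads}", "run"], check=True)

# Memory bandwidth: small and big blocks
for block in ("1K", "1M", "128M"):
    subprocess.run(["sysbench", "--test=memory", f"--memory-block-size={block}",
                    "--memory-total-size=2G", "run"], check=True)

# Network: stream test towards another host (all-to-all repeats this per peer)
subprocess.run(["netperf", "-H", PEER, "-t", "TCP_STREAM", "-l", "30"], check=True)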

Page 7: Kernel Recipes 2014 - Performance Does Matter

Analyzing results

● Each host uploads its results as a Python data structure
○ Saved on the server side in a directory

● The Cardiff tool analyzes a series of results to get a clear picture of the run
○ The human brain cannot synthesize so much data easily

● We must compare apples to apples
○ Cardiff first groups similar hosts according to their hardware properties

● For each kind of test (CPU, memory, disk, network), and for each group of hosts
○ Compute min, max, average and standard deviation
○ If the stddev is too large, the group isn't stable enough
○ If some hosts are too far from the average (vs the stddev), the host is suspicious
■ It has to be inspected by a human to understand the variation
○ Otherwise, hosts & groups are "OK" (a minimal sketch of this logic follows)
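A minimal sketch of that per-group logic, under assumed thresholds (they are placeholders, not Cardiff's real ones): for one metric across one group of identical hosts, compute the average and standard deviation, report an unstable group, and flag hosts that sit too far from the average:

# Hedged sketch of the per-group analysis described above; thresholds and the
# input format are placeholders, not Cardiff's actual implementation.
import statistics

def analyze_group(results, max_rel_stddev=0.10, max_deviations=1.5):
    """results: {hostname: measured_value} for one metric in one hardware group."""
    values = list(results.values())
    mean = statistics.mean(values)
    stddev = statistics.pstdev(values)

    if mean and stddev / mean > max_rel_stddev:
        print(f"group unstable: stddev {stddev:.1f} vs mean {mean:.1f}")

    for host, value in results.items():
        if stddev and abs(value - mean) > max_deviations * stddev:
            print(f"{host} is suspicious: {value} (mean {mean:.1f}), inspect it")

analyze_group({"node-1": 1010, "node-2": 995, "node-3": 1005, "node-4": 610})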

Page 8: Kernel Recipes 2014 - Performance Does Matter

Getting into the cloud !

● We now understand the bare-metal performance

● We can deploy an OpenStack & run the same tooling inside VMs
○ Same tools, same metrics, same output

● Benchmarks have to be synchronized to get a simultaneous load on a component

● Results can be compared with the bare-metal performance
○ to estimate the loss induced by the virtualization layers (sketched below)
○ to check how a given number of VMs performs on a given infrastructure

● Can also be used to measure the resulting performance of
○ a patch
○ a tuning of the infrastructure
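A hedged sketch of the bare-metal vs VM comparison mentioned above: given the same metric measured on both sides, the overhead can be expressed as a relative loss; the function name and the numbers are illustrative only:

# Illustrative computation of the loss induced by the virtualization layers;
# the metric values are made-up numbers, not measured results.
def virtualization_loss(bare_metal, vm):
    """Relative loss of a 'higher is better' metric, e.g. IOPS or MB/s."""
    return (bare_metal - vm) / bare_metal

bm_iops, vm_iops = 70000, 61500
print(f"random 4K IOPS loss inside the VM: {virtualization_loss(bm_iops, vm_iops):.1%}")
# -> random 4K IOPS loss inside the VM: 12.1%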

Page 9: Kernel Recipes 2014 - Performance Does Matter

AHC & OCP

● OCP (Open Compute Project) is brand new hardware
○ it could even be at prototype level

● AHC is a quick path to understanding how the hardware performs
○ AHC is automated and reduces human errors
○ AHC provides a fixed comparison point between people/projects/companies

● AHC is open
○ New benchmarks can easily be added
○ It's fully open source
○ So let's hack it!

Page 10: Kernel Recipes 2014 - Performance Does Matter

THANK YOU

Erwan Velu <[email protected]>
erwan_taf on freenode or oftc