Data Center Workload Measurement and Analysis

TRANSCRIPT

  1. 1. Data Center Workload Measurement and Analysis. Presented by: Ankita Duggal, Gurkamal Deep Singh Rakhra, Keerthana Muniraj, Preeti Sawant
  2. 2. What is a data center? A large group of networked computer servers typically used by organizations for the remote storage, processing, or distribution of large amounts of data. It doesn't house only servers; it also contains backup power supplies, communication connections, air conditioning, fire suppression, etc. A data center is a factory that transforms and stores bits.
  3. 3. A few glimpses of the data centers of some organizations: Rackspace - Richardson, TX; Facebook - Lulea, Sweden; Google - Douglas County, Georgia; Amazon - Virginia, outside Washington D.C.
  4. 4. Google's floating data center; Aliyun (Alibaba) - Hangzhou, China
  5. 5. Data center workload: the amount of processing that the computer has been given to do at a given time. Workload in the form of web requests, data analysis, multimedia rendering, or other applications is placed in the data center. Ref: http://searchdatacenter.techtarget.com/definition/workload
  6. 6. Classification of workloads based on time criticality: critical workloads cannot tolerate even a few minutes of downtime; non-critical workloads can tolerate a wide range of outage times.
  7. 7. Ways to improve data protection. Prevent downtime by reducing resource contention: managers accommodate drastically changing demands on workloads by allowing easy creation of additional workloads without changing or customizing applications. Replicate workloads into the cloud to create asymmetric hot backups: clone the complete workload stack and import it into a public/private cloud. Use dissimilar infrastructure for off-premises redundancy: workloads are replicated off-site to different cloud providers. Reserve failover or failback for critical workloads: automate the switching of users or processes from production to recovery instances.
  8. 8. Characterizing data analysis workloads in data centers. Data analysis is important for improving the future performance of a data center. Data center workloads include services workloads (web search, media streaming) and data analysis workloads (business intelligence, machine learning); we concentrate on internet services workloads here. Data analysis workloads are diverse in speedup performance and micro-architectural characteristics, so there is a need to analyze many applications. Three important application domains in internet services are: 1) search engines, 2) social networks, 3) electronic commerce.
  9. 9. Workload requirements: 1) cover the most important application domains, 2) data is distributed and cannot be processed on a single node, 3) consider recently used data.
  10. 10. Breakdown of Executed Instructions
  11. 11. DCBench: benchmarks are used to evaluate whether new designs and systems are beneficial. DCBench is a benchmark suite for data center computing with an open source license. It includes online and offline workloads and different programming models, such as MPI versus MapReduce. It is helpful for architecture research and small- to medium-scale systems research for data center computing.
  12. 12. Methodologies
  13. 13. Workflow phases: Extract - looks for raw data and generates a stream of data; Partition - divides the stream into buckets; Aggregate - combines/reduces the buckets.
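A minimal sketch of these three phases in Python (the comma-separated record format and the per-key counting are illustrative assumptions, not details from the slides):

```python
from collections import defaultdict

def extract(raw_lines):
    """Extract: look for raw data and generate a stream of (key, value) records."""
    for line in raw_lines:
        key, value = line.split(",", 1)
        yield key.strip(), value.strip()

def partition(records, num_buckets):
    """Partition: divide the record stream into buckets by hashing the key."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[hash(key) % num_buckets].append((key, value))
    return buckets

def aggregate(bucket):
    """Aggregate: combine/reduce one bucket (here, a simple count per key)."""
    counts = defaultdict(int)
    for key, _ in bucket:
        counts[key] += 1
    return dict(counts)

raw = ["search,query=foo", "search,query=bar", "video,stream=xyz"]
buckets = partition(extract(raw), num_buckets=2)
print({b: aggregate(records) for b, records in buckets.items()})
```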
  14. 14. Patterns comprising traffic in a data center: work-seeks-bandwidth and the scatter-gather pattern.
  15. 15. Work-seeks-bandwidth: chip designers prefer placing components that interact often (e.g., CPU and L1 cache, multiple CPU cores) close by to get high-bandwidth interconnections on the cheap. Similarly, jobs that rely on heavy traffic exchanges with each other are placed in areas of the data center where high network bandwidth is available.
  16. 16. Contd.: This translates to the engineering decision of placing such jobs within the same server, within servers on the same rack, or within servers in the same VLAN, and so on, in decreasing order of preference; hence the work-seeks-bandwidth pattern.
  17. 17. Scatter-gather pattern: data is partitioned into small chunks, each of which is worked on by a different server, and the resulting answers are later aggregated.
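A minimal sketch of the scatter-gather pattern, with a thread pool standing in for the different servers (the chunking rule and the summing work function are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def scatter(data, num_chunks):
    """Split the data into roughly equal chunks, one per worker."""
    size = max(1, len(data) // num_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

def work(chunk):
    """Each 'server' works on its own chunk (here, a simple sum)."""
    return sum(chunk)

def gather(partial_results):
    """Aggregate the partial answers into the final result."""
    return sum(partial_results)

data = list(range(1_000))
chunks = scatter(data, num_chunks=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(work, chunks))
print(gather(partials))   # 499500
```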
  18. 18. Congestion: periods of low network utilization indicate either an application that demands more of other resources (CPU, disk) than the network, or an application that can be rewritten to make better use of the available bandwidth.
  19. 19. Evacuation event (congestion): when a server repeatedly experiences problems, the automated management system in the cluster evacuates all the usable blocks on that server prior to alerting a human that the server is ready to be re-imaged.
  20. 20. Read failure: when a job does not make any progress (e.g., it is unable to find input data or unable to connect to a machine), it is killed.
  21. 21. Contd.: To attribute network traffic to the applications that generate it, the network event logs were merged with application-level logs that describe which job and phase (e.g., map, reduce) were active at that time. The results showed that jobs in the reduce phase are responsible for a fair amount of the network traffic. Note that in the reduce phase of a map-reduce job, the data in each partition, which is present at multiple servers in the cluster (e.g., all personnel records that start with A), has to be pulled to the server that handles the reduce for that partition.
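A minimal sketch of how a partition might be assigned to the server that handles its reduce (the server names and the partition-by-first-letter rule are illustrative assumptions):

```python
def reduce_server_for(key, servers):
    """Map a record key to the server that handles its partition,
    e.g. all personnel records starting with 'A' go to the same server."""
    partition = key[0].upper()          # partition by first letter
    return servers[hash(partition) % len(servers)]

servers = ["reducer-01", "reducer-02", "reducer-03"]
for name in ["Adams", "Anderson", "Baker", "Chen"]:
    print(name, "->", reduce_server_for(name, servers))
```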
  22. 22. Monitoring data center workload: for coordinated monitoring and control of data centers, the most common approaches are based on Monitor, Analyze, Plan, and Execute (MAPE) control loops. Overview
  23. 23. Modern data center operation: workload in the form of web requests, data analysis, etc. is placed in the data center. An instrumentation infrastructure logs sensor readings. The results are fed into a policy engine that creates a plan to utilize resources. External interfaces or actuators implement the plan.
  24. 24. Workload monitoring using Splice: Splice aggregates sensor and performance data in a relational database. It also gathers data from many sources through different interfaces with different formats. Splice uses a change-of-value filter that retains only those values that differ significantly from the previously logged values, which reduces data volume with minimal loss of information.
  25. 25. Database Schema of Splice
  26. 26. Implementation: Splice uses a change-of-value filter that retains only those values that differ significantly from the previously logged values, which reduces data volume with minimal loss of information.
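A minimal sketch of such a change-of-value filter (the threshold and the sample readings are illustrative assumptions; Splice's actual implementation may differ):

```python
class ChangeOfValueFilter:
    """Log a sensor reading only when it differs from the last logged
    value by more than a threshold, so little information is lost."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.last_logged = None

    def should_log(self, value):
        if self.last_logged is None or abs(value - self.last_logged) > self.threshold:
            self.last_logged = value
            return True
        return False

cov = ChangeOfValueFilter(threshold=0.5)
readings = [21.0, 21.1, 21.2, 22.0, 22.1, 23.5]
print([r for r in readings if cov.should_log(r)])   # [21.0, 22.0, 23.5]
```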
  27. 27. Analysis: data analysis is done with two main classes of methods - attribute behavior and correlation. Attribute behavior describes the values of the observed readings and how those values change over time. Data correlation methods determine the strength of the correlations among the attributes affecting each other.
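A minimal sketch of both kinds of analysis on two hypothetical attributes (the attribute names and values are illustrative assumptions):

```python
import numpy as np

# Two hypothetical sensor attributes logged over the same interval.
cpu_util   = np.array([0.30, 0.45, 0.52, 0.70, 0.85, 0.90])
inlet_temp = np.array([21.0, 21.5, 22.0, 23.1, 24.0, 24.4])

# Attribute behavior: how the observed values change over time.
print("temp change per sample:", np.diff(inlet_temp))

# Data correlation: strength of the relationship between two attributes.
r = np.corrcoef(cpu_util, inlet_temp)[0, 1]
print("Pearson correlation:", round(r, 3))
```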
  28. 28. Virtualization in Data Centers Virtualization is a combination of software and hardware features that creates virtual CPUs (vCPU) or virtual systems-on-chip (vSoC). Virtualization provides the required level of isolation and partitioning of resources. Each VM is protected from interference from another VM. Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  29. 29. Why virtualization? It reduces power consumption and building space, provides high availability for critical applications, and streamlines application deployment and migration. It supports multiple operating systems and consolidation of services on a single server by defining multiple VMs. Because multiple VMs can run on a single server, the advantages are a reduced server inventory and better server utilization. Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  30. 30. Benefits Of Virtualization Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  31. 31. Multi Core Processing A multi-core processor is a single computing component with two or more independent actual processing units (called "cores"), which are the units that read and execute program instructions. Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  32. 32. Virtualization and Multicore Processing With multicore SoCs, given enough processing capacity and virtualization, control plane applications and data plane applications can be run without one affecting the other. Data or control traffic that is relevant to the customized application and operating system (OS) can be directed to the appropriate virtualized core without impacting or compromising the rest of the system. Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  33. 33. Control and Data Plane Application Consolidation in virtualized Multicore SoC
  34. 34. Functions that were previously implemented on different boards now can be consolidated onto a single card and a single multicore SoC. Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
  35. 35. Data center reliability - network reliability: characterizing the most failure-prone network elements, estimating the impact of failures, and analyzing the effectiveness of network redundancy. Reference: Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications By: Phillipa Gill, Navendu Jain, Microsoft Research
  36. 36. Key observations: data center networks are reliable; low-cost, commodity switches are highly reliable; load balancers experience a high number of software faults; failures potentially cause loss of a large number of small packets; network redundancy helps, but it is not entirely effective. Reference: Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications By: Phillipa Gill, Navendu Jain, Microsoft Research
  37. 37. Reasons to move beyond traditional studies: significant changes in computing power, network bandwidth, and network file system usage; few studies of network file system workloads; no CIFS protocol studies; limited file system workloads. Reference: Measurement and Analysis of Large-Scale Network File System Workloads by Andrew W. Leung, Shankar Pasupathy, Garth Goodson, Ethan L. Miller
  38. 38. File access pattern analysis: read-only, write-only, and read-and-write access. Reference: Measurement and Analysis of Large-Scale Network File System Workloads by Andrew W. Leung, Shankar Pasupathy, Garth Goodson, Ethan L. Miller
  39. 39. Sequentiality analysis: entire versus partial sequential access. Reference: Measurement and Analysis of Large-Scale Network File System Workloads by Andrew W. Leung, Shankar Pasupathy, Garth Goodson, Ethan L. Miller
  40. 40. File lifetime: in CIFS, files can be deleted either through an explicit delete request, which frees the entire file and its name, or through truncation, which only frees the data. CIFS users begin a connection to the file server by creating an authenticated user session and end it by eventually logging off. Reference: Measurement and Analysis of Large-Scale Network File System Workloads by Andrew W. Leung, Shankar Pasupathy, Garth Goodson, Ethan L. Miller
  41. 41. Architecture - load balancer: the IP address to which requests are sent is called a virtual IP address (VIP); the IP addresses of the servers over which the requests are spread are known as direct IP addresses (DIPs). Inside the data center, requests are spread among a pool of front-end servers that process the requests. This spreading is typically performed by a specialized load balancer. Reference: Towards a Next Generation Data Center Architecture: Scalability and Commoditization By Albert Greenberg, David A. Maltz, Microsoft Research, WA, USA
  42. 42. Challenges and requirements. Challenges: fragmentation of resources, poor server-to-server connectivity, proprietary hardware that scales up, not out. Requirements: placement anywhere, server-to-server bandwidth, commodity hardware that scales out, support for 100,000 servers. Reference: Towards a Next Generation Data Center Architecture: Scalability and Commoditization By Albert Greenberg, David A. Maltz, Microsoft Research, WA, USA
  43. 43. Load balancing. Load spreading: requests are spread evenly over a pool of servers. Load balancing: load balancers are placed in front of the actual servers. Reference: Towards a Next Generation Data Center Architecture: Scalability and Commoditization By Albert Greenberg, David A. Maltz, Microsoft Research, WA, USA
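A minimal sketch of spreading requests sent to a VIP over a pool of DIPs by hashing the client flow (the addresses and the hash-based scheme are illustrative assumptions, not the mechanism of any particular load balancer):

```python
import hashlib

VIP = "203.0.113.10"                               # virtual IP that clients connect to
DIPS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]     # direct IPs of the front-end servers

def spread(client_ip, client_port):
    """Pick a DIP for this flow by hashing the client endpoint, so
    requests arriving at the VIP are spread evenly over the pool."""
    h = hashlib.md5(f"{client_ip}:{client_port}".encode()).hexdigest()
    return DIPS[int(h, 16) % len(DIPS)]

print(spread("198.51.100.7", 52344))
```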
  44. 44. Case studies
  45. 45. A few real-world scenarios. Why build a data center in Virginia when there is one in California? To reduce the time to send a page to users on the East Coast; California was running out of space while Virginia had lots of room to grow; and restricting to one data center meant that in the event of a disaster (earthquake, power failure, Godzilla) Facebook could be unusable for an extended amount of time.
  46. 46. The hardware and network were soon set up, but how should cache consistency be handled? [Diagram: master DB and slave DBs]
  47. 47. Facebook's scheduling with Corona: with Facebook's user base expanding at an enormous rate, a new scheduling framework called Corona was developed. Initially, a MapReduce implementation of Apache Hadoop served as the foundation of the infrastructure, but over the years this system developed several issues: scheduling overhead, a pull-based scheduling model, and a static slot-based resource management model.
  48. 48. Facebook's solution: Corona introduces a cluster manager whose only purpose is to track the nodes in the cluster and the amount of free resources. Corona uses push-based scheduling, which reduces scheduling latency. The separation of duties allows Corona to manage many more jobs and achieve better cluster utilization. The cluster manager also implements fair-share scheduling.
  49. 49. Future of Corona: new features such as resource-based scheduling rather than the slot-based model, online upgrades to the cluster manager, and expansion of the user base by scheduling applications such as Peregrine.
  50. 50. Characterizing backend workloads (at Google). Ref: Towards Characterizing Cloud Backend Workloads: Insights from Google Compute Clusters (Asit K. Mishra, Joseph L. Hellerstein, Walfredo Cirne, Chita R. Das)
  51. 51. Prerequisites: capacity planning, to determine which machine resources must grow and by how much; and task scheduling, to achieve high machine utilization and to meet service-level objectives. Both require a good understanding of task resource consumption, i.e., CPU and memory usage.
  52. 52. The approaches: 1) make each task its own workload, which scales poorly since tens of thousands of tasks execute daily on Google compute clusters; 2) view all tasks as belonging to a single workload, which results in large variances in predicted resource consumption.
  53. 53. The proposed methodology: identifying the workload dimensions; constructing task classes using an off-the-shelf algorithm such as k-means; determining the break points for qualitative coordinates within the workload dimensions; and merging adjacent task classes to reduce the number of workloads.
  54. 54. Observations based on duration: the duration of task executions is bimodal, in that tasks have either a short duration or a long duration; most tasks have short durations; and most resources are consumed by a few long-duration tasks that have large demands for CPU and memory.
  55. 55. Objective: construct a small number of task classes such that tasks within each class have similar resource usage. Qualitative coordinates are used to distinguish workloads: small (s), medium (m), large (l).
  56. 56. First step: identify the workload dimensions. For example, in the analysis of the Google cloud backend, the workload dimensions are task duration, average core usage, and average memory usage.
  57. 57. Second step: construct preliminary task classes that have fairly homogeneous resource usage. This is done by using the workload dimensions as a feature vector and applying an off-the-shelf clustering algorithm such as k-means.
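A minimal sketch of this second step using scikit-learn's k-means on the three workload dimensions (the task values and the number of clusters are illustrative assumptions; in practice the dimensions would typically be normalized first):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical tasks described by the three workload dimensions:
# duration (hours), average core usage, average memory (GB).
tasks = np.array([
    [0.1,  0.2,  0.5],
    [0.2,  0.3,  0.7],
    [12.0, 3.5, 16.0],
    [15.0, 4.0, 20.0],
    [0.3,  0.1,  0.4],
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(tasks)
print(km.labels_)           # preliminary task class of each task
print(km.cluster_centers_)  # per-class mean resource usage
```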
  58. 58. Third step: determine the break points for the qualitative coordinates of the workload dimensions. There are two considerations. First, break points must be consistent across workloads; for example, the qualitative coordinate small for duration must have the same break point (e.g., 2 hours) for all workloads. Second, the result should produce low within-class variability.
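A minimal sketch of mapping each dimension to a qualitative coordinate using shared break points (apart from the 2-hour example, the break-point values are illustrative assumptions; for simplicity the sketch uses three coordinates for every dimension, although duration uses only small and large in the Google data):

```python
def qualitative(value, break_points):
    """Map a numeric workload dimension to s/m/l using break points
    that are shared across all workloads."""
    small_max, medium_max = break_points
    if value <= small_max:
        return "s"
    if value <= medium_max:
        return "m"
    return "l"

# Hypothetical break points: duration in hours, average cores, memory in GB.
duration_bp, cores_bp, memory_bp = (2.0, 12.0), (0.5, 2.0), (1.0, 8.0)
task = (0.3, 1.2, 10.0)      # duration, avg cores, avg memory
label = "".join([
    qualitative(task[0], duration_bp),
    qualitative(task[1], cores_bp),
    qualitative(task[2], memory_bp),
])
print(label)   # "sml"
```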
  59. 59. Fourth step: merge classes to form the final set of task classes; these classes define the workloads. This involves combining adjacent preliminary task classes, where adjacency is based on the qualitative coordinates of the class. For example, in the Google data, duration has the qualitative coordinates small and large, while for cores and memory the qualitative coordinates are small, medium, and large. Thus, the workload smm is adjacent to sms and sml in the third dimension. Two preliminary classes are merged if the CV (coefficient of variation) of the merged class does not differ much from the CVs of each of the preliminary classes. Merged classes are denoted by the wildcard *; for example, merging the classes sms, smm, and sml yields the class sm*.
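A minimal sketch of the CV-based merge test (the tolerance, the sample values, and the single-dimension simplification are illustrative assumptions):

```python
import numpy as np

def cv(samples):
    """Coefficient of variation: standard deviation divided by mean."""
    samples = np.asarray(samples, dtype=float)
    return samples.std() / samples.mean()

def can_merge(class_a, class_b, tolerance=0.10):
    """Merge two adjacent classes only if the merged CV stays close to
    the CVs of the individual classes."""
    merged = np.concatenate([class_a, class_b])
    return abs(cv(merged) - max(cv(class_a), cv(class_b))) <= tolerance

sms = [0.40, 0.50, 0.45]   # memory usage of tasks in class "sms" (hypothetical)
smm = [0.50, 0.55, 0.60]   # memory usage of tasks in class "smm" (hypothetical)
print(can_merge(sms, smm))
```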
  61. 61. Questions?
