iThome Cloud Summit: The Next Generation of Data Center: Machine Intelligent Cluster

Post on 22-Jan-2018

Machine Intelligent Cluster: The next generation of data center

Evan Lin @ Linker Networks

About me

Cloud Architect @ Linker Networks

Golang User Group - Co-Organizer

Top 5 Taiwan Golang open source contributor (githubaward)

Developer, Curator, Blogger

Recap Cloud Summit 2016

Agenda
• Problems in the data center
• How machine learning helps
• Machine Intelligent Cluster
• Applications
• Q&A

Data center

Efficiency
• Power consumption
• Low usage
• Unpredictable peak
• Noisy neighbors

Risk
• Physical damage
• Networking problem
• Anomaly
• Attack

Real data center

Power consumption

Low usage and Unpredictable peak

Noisy neighbor

Use machine learning to improve DC power consumption

None of your business?

Modern Data center: Machine Cluster

Before machine cluster

• DB Master (IP: 192.168.1.222)
• DB Slave (IP: 192.168.1.223)
• Web Server 1 (IP: 192.168.1.101)
• Web Server 2 (IP: 192.168.1.102)
• Web Server 3 (IP: 192.168.1.103)
• Load Balancer (IP: 1.2.3.4)

Container orchestration

Resource arrangement

Scalability

Portability

Automation migration

Resource management

• 3 Web App Servers
• 2 DB Servers
• 1 Load Balancer

Scalability

Automation migration

But... we need something better.

No prediction

How to define scale out threshold?

50 %?

75 %?

25 %?

Machine Intelligent Cluster

Efficiency
• Maximize Utilization
• Operation Optimization

Accident
• Risk Mitigation
• Serviceability Management

Machine Intelligence Cluster

How MIC helps

Operation Optimization
1. Reinforcement learning
2. Adjust thermostat
3. Check the reward (CPU performance)

[1]: Source: https://goo.gl/ly3zyX
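The three steps on this slide can be sketched as a tiny epsilon-greedy loop, the simplest form of reinforcement learning. This is only an illustration: the talk does not say which algorithm is used, and the setpoint list, the `simulated_reward` function (which pretends performance peaks near 22 C), and every name below are assumptions made up for this sketch.

```python
import random

# Hypothetical thermostat setpoints (degrees Celsius) the agent may choose from.
SETPOINTS = [18, 20, 22, 24, 26]


def simulated_reward(setpoint, rng):
    """Stand-in for a measured reward such as CPU performance.

    Assumption for illustration only: performance peaks near 22 C.
    """
    return -abs(setpoint - 22) + rng.gauss(0, 0.1)


def run_bandit(steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    counts = {s: 0 for s in SETPOINTS}
    values = {s: 0.0 for s in SETPOINTS}  # running mean reward per setpoint
    for _ in range(steps):
        # 1. Reinforcement learning: explore sometimes, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.choice(SETPOINTS)
        else:
            action = max(SETPOINTS, key=lambda s: values[s])
        # 2. Adjust thermostat (here: query the simulated environment).
        reward = simulated_reward(action, rng)
        # 3. Check the reward and update the running estimate.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    # The setpoint with the best estimated reward after training.
    return max(SETPOINTS, key=lambda s: values[s])


best_setpoint = run_bandit()
```

In a real data center the reward would come from sensors and performance counters rather than a formula, and the state (outside temperature, load) would also feed into the decision.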

Maximize Utilization

Analyze utilization and reduce the number of working machines to save our customers' budget.

- Predict the utilization trend
- Provide auto-scaling threshold adjustment

Prediction and dynamic threshold
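One way to picture "prediction and dynamic threshold" is a least-squares trend line over recent utilization samples, with the scale-out threshold lowered when utilization is rising quickly. This is a minimal sketch under stated assumptions: the talk does not describe its model, and `limit`, `lead_steps`, `floor`, and the sample series are hypothetical values chosen for illustration.

```python
def fit_trend(samples):
    """Least-squares line through (0, s0), (1, s1), ...

    Returns (slope, intercept). Needs at least two samples.
    """
    n = len(samples)
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def predict(samples, steps_ahead):
    """Extrapolate the fitted trend `steps_ahead` samples into the future."""
    slope, intercept = fit_trend(samples)
    return intercept + slope * (len(samples) - 1 + steps_ahead)


def dynamic_threshold(samples, limit=90.0, lead_steps=5, floor=25.0):
    """Threshold that triggers scale-out `lead_steps` samples before `limit`."""
    slope, _ = fit_trend(samples)
    # Faster growth means a lower threshold, so new capacity is ready in time.
    return max(floor, limit - max(slope, 0.0) * lead_steps)


# Hypothetical CPU utilization samples (percent), one per interval.
steady = [40, 41, 40, 39, 40, 41, 40, 39]
rising = [40, 44, 48, 52, 56, 60, 64, 68]

steady_threshold = dynamic_threshold(steady)
rising_threshold = dynamic_threshold(rising)
```

With these inputs the `steady` series keeps the threshold at 90, while the `rising` series (4 points per interval) lowers it to 70. A fixed 50% or 75% rule cannot make that distinction.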

Optimized Scheduler

[Figure: two placements of the same processes across Node 1, Node 2, and Node 3. Processes shown: Nginx (CPU 30%), DB - MySQL (IO 25%), DB - Mongo (IO 30%), Apache (CPU 30%), Backend Process (CPU 35%), DB - Oracle (IO 35%), NodeJS (CPU 7%), Go backend (CPU 8%).]

Maximize Utilization

P.S. We do not rearrange running processes; we change the scheduler so that a bad placement does not happen in the first place.
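A scheduler that avoids bad placements up front could, for example, put each new process on the node that currently carries the least load of the same kind (CPU or IO). The sketch below uses the process figures from the slide, but the placement rule, the `Node` class, and the `schedule` function are assumptions for illustration, not the scheduler presented in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    load: dict = field(default_factory=lambda: {"cpu": 0, "io": 0})
    procs: list = field(default_factory=list)


def schedule(processes, nodes):
    """Place each process on the node with the least load of the same kind."""
    for name, resource, percent in processes:
        # Ties are broken by total load, then by node order.
        target = min(
            nodes, key=lambda n: (n.load[resource], sum(n.load.values()))
        )
        target.load[resource] += percent
        target.procs.append(name)
    return nodes


# (name, dominant resource, percent) as listed on the slide.
PROCESSES = [
    ("Nginx", "cpu", 30),
    ("DB - MySQL", "io", 25),
    ("DB - Mongo", "io", 30),
    ("Apache", "cpu", 30),
    ("Backend Process", "cpu", 35),
    ("DB - Oracle", "io", 35),
    ("NodeJS", "cpu", 7),
    ("Go backend", "cpu", 8),
]

nodes = schedule(PROCESSES, [Node("Node 1"), Node("Node 2"), Node("Node 3")])
for node in nodes:
    print(node.name, node.load, node.procs)
```

Because the decision is made at placement time, no process has to be migrated later. Each node ends up with one database and a share of the CPU-bound services instead of all IO-heavy or all CPU-heavy work piling up on one machine.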

Model 1

Serial Number Prediction

S.M.A.R.T. RNN Prediction

Serviceability Management (cont.)

Model 2

Dummy VM Detection Outlier Attack Detection

Mitigate risk
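Outlier detection of the kind named on this slide can be illustrated with a modified z-score based on the median absolute deviation (MAD). This is a generic statistical technique swapped in for illustration; the talk does not say which model it uses, and the metric name and sample values below are hypothetical.

```python
import statistics


def find_outliers(values, cutoff=3.5):
    """Return the indices whose modified z-score exceeds `cutoff`.

    Uses the median and the median absolute deviation, which are not
    distorted by the very outliers we are looking for.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - median) / mad > cutoff
    ]


# Hypothetical requests per minute seen by one VM.
requests_per_minute = [120, 118, 125, 122, 119, 121, 950, 123, 117, 124]
suspicious = find_outliers(requests_per_minute)
```

Here only index 6 (the burst of 950 requests) is flagged. The same idea applied to the low end of a utilization metric would point at idle "dummy" VMs.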

Storage SDN

Zombie Tagging system

Architecture

Cloud Native Architecture

HPC (with GPU) Server

Storage SDN

Data Collect Probe & Sensor & Smart GW

Visualization

Data Process

Data Analysis & Machine Learning

DCOS / Kubernetes, Spark ML, TensorFlow

DCOS / Kubernetes

Cassandra (Storage)

Kafka (Queueing)

Go/Akka (Connector)

Spark (ETL/Streaming)

D3.js

scikit-learn, R

Interactive Dashboard

Jupyter Notebook

Zeppelin

ML Job Scheduler Chronos

MIC System Architecture

Data Agent

Kafka

Spark Streaming

Cassandra

Spark ML (Classification, Clustering)

TensorFlow (Deep Learning)

Backend Server API

Portal

TensorFlow Predict

SparkML Predict

MIC Data Flow

Applications on MIC

Machine Intelligent Cluster

IOT Gaming 5G NFV E-Commerce

Machine Intelligent Cluster Summary

• Machine cluster with intelligence
• Features
  • Self-Optimization
  • Self-Learning
  • Self-Recovery
• Green, secure, and predictive machine cluster

You are welcome to subscribe to CodeTengu (碼天狗)

http://weekly.codetengu.com/

Thank You
