Modern Approaches of Customer’s Dream Distribution Across the Cluster

Evgenij Kozhevnikov, Samara

August 4, 2015



About me

• 1+ years of production experience in BigData
  – Edmunds.com
  – BigData CC

• 3+ years of development experience in BigData
  – Hadoop
  – Spark
  – Storm
  – Akka

• 6+ years of development experience
  – Java EE, IBM WebSphere
  – Spring


Successful Business - Growing Business


Growing Business – Growing Load


Our software should be ready to grow with the business

1. Pay for what you need, not for what you plan

2. Growth should not require any changes to the application

3. Where there is one growing app, there will be more growing apps


Our software should be ready to grow with the business

• Caching Proxy
• CDN


• NoSQL
• Distributed cache



Does Edmunds need a cluster solution?

1. What trends do we have now?
2. Is the quality of the vehicle catalog good enough?
3. Is our advertising efficient?
4. What A/B testing results do we get?
5. What can we recommend to our clients?
6. Where is the car that the client needs?
7. How many leads were sent to the dealer?
8. Is the dealer successful?
9. Are our visitors real people, not robots?
10. What revenue do we have this year?
11. Are we growing?
12. Are our dealers growing?


It’s not a competitive advantage. All competitors do that.


Does Edmunds need a cluster solution?

Growing amount of data

Growing number of tasks

• Fast access to the whole data set is needed

• Historical data is as important as new data

• Hardware resources must be dynamically extensible

• Several independent applications must be able to run on the same cluster

• Each application run requires a specific amount of resources

• A convenient monitoring tool and a fault-tolerant system are needed

• Code should be readable and distributed algorithms should be maintainable
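
The requirements about sharing one cluster between independent applications, each asking for a specific amount of resources, can be illustrated with a toy allocator. This is a minimal sketch and not how YARN or Mesos actually schedule: the `Node` class, the `allocate` function, the first-fit policy, and the application names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One cluster machine with some free memory (hypothetical model)."""
    name: str
    free_mb: int
    running: list = field(default_factory=list)


def allocate(nodes, app, containers, mb_each):
    """Place `containers` containers of `mb_each` MB for `app`, first-fit.

    Returns the list of node names used, or None if the cluster cannot
    satisfy the whole request (in which case nothing is reserved).
    """
    plan = []
    free = {n.name: n.free_mb for n in nodes}  # plan on a copy first
    for _ in range(containers):
        target = next((n for n in nodes if free[n.name] >= mb_each), None)
        if target is None:
            return None
        free[target.name] -= mb_each
        plan.append(target.name)
    # Commit only once the whole request is known to fit.
    by_name = {n.name: n for n in nodes}
    for name in plan:
        by_name[name].free_mb -= mb_each
        by_name[name].running.append(app)
    return plan


cluster = [Node("node-1", 4096), Node("node-2", 4096)]
print(allocate(cluster, "ad-efficiency-report", 3, 2048))
print(allocate(cluster, "lead-counter", 2, 1024))
print(allocate(cluster, "too-big", 1, 8192))  # does not fit anywhere yet

# Extending the hardware requires no change to the applications.
cluster.append(Node("node-3", 8192))
print(allocate(cluster, "too-big", 1, 8192))
```

The request that fails on the two-node cluster succeeds once a third node is added, which is the "dynamically extended hardware" requirement in miniature.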


Hadoop-based solutions

MapReduce, YARN


MapReduce across YARN

[Diagram: a cluster of eight nodes]


[Diagram: a ResourceManager and a NameNode alongside four nodes]


[Diagram: Hadoop Client, Active ResourceManager, Standby ResourceManager, MR Application Master, NameNode, and two Data Nodes, each running an MR Executor]
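
The programming model that this architecture executes can be sketched in plain Python. The following is only an in-process illustration of the map, shuffle, and reduce steps, not Hadoop's API: the function names and the sample lead records are hypothetical, and a real job would run the mappers and reducers in parallel on the data nodes. It answers one of the questions listed earlier, how many leads were sent to each dealer.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to all values collected for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical input: one record per lead sent to a dealer.
leads = [
    {"dealer": "dealer-a", "vehicle": "sedan"},
    {"dealer": "dealer-b", "vehicle": "suv"},
    {"dealer": "dealer-a", "vehicle": "truck"},
    {"dealer": "dealer-c", "vehicle": "sedan"},
    {"dealer": "dealer-a", "vehicle": "suv"},
]


def lead_mapper(record):
    # Emit a count of one under the dealer that received the lead.
    yield record["dealer"], 1


def count_reducer(key, values):
    # Sum all counts emitted for this dealer.
    return sum(values)


leads_per_dealer = reduce_phase(shuffle(map_phase(leads, lead_mapper)), count_reducer)
print(leads_per_dealer)
```

Because the mapper looks at one record at a time and the reducer at one key at a time, the same two functions can be spread across any number of nodes without modification.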


Hadoop-based solutions

Spark, YARN


Spark across YARN

[Diagram: Spark Client, Active ResourceManager, Standby ResourceManager, MR Application Master, NameNode, and two Data Nodes, each running a Spark Executor]


Mesosphere-based solutions

Spark, Mesos


Spark across Mesos

[Diagram: Spark Client, Spark Scheduler, Active Mesos Master, Standby Mesos Master, NameNode, and two Data Nodes, each running a Spark Executor]
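
From the application's point of view, moving the same Spark job between YARN and Mesos is mostly a matter of the master URL passed to `spark-submit`. The sketch below builds that command line in Python. `--class` and `--master` are real `spark-submit` options, and `yarn`, `mesos://host:port`, and `mesos://zk://...` are real master URL forms (older Spark releases also accepted `yarn-client` and `yarn-cluster`). The helper names, host names, jar path, main class, and the `/mesos` ZooKeeper path are assumptions made for illustration.

```python
def master_url(cluster_manager, mesos_master=None, zk_quorum=None):
    """Return the Spark master URL for the chosen cluster manager."""
    if cluster_manager == "yarn":
        # The ResourceManager address comes from the Hadoop configuration
        # (HADOOP_CONF_DIR or YARN_CONF_DIR), so no host appears in the URL.
        return "yarn"
    if cluster_manager == "mesos":
        if zk_quorum:
            # Active/standby Mesos masters: the leader is found via ZooKeeper.
            # The "/mesos" znode path is an assumed, commonly used default.
            return f"mesos://zk://{zk_quorum}/mesos"
        if mesos_master:
            # Single Mesos master addressed directly.
            return f"mesos://{mesos_master}"
        raise ValueError("mesos needs mesos_master or zk_quorum")
    raise ValueError(f"unknown cluster manager: {cluster_manager}")


def spark_submit_args(app_jar, main_class, master):
    """Build the argument list for launching the application."""
    return ["spark-submit", "--class", main_class, "--master", master, app_jar]


# Hypothetical application and hosts.
jar = "leads-report.jar"
main = "com.example.LeadsReport"

print(spark_submit_args(jar, main, master_url("yarn")))
print(spark_submit_args(jar, main, master_url("mesos", mesos_master="mesos-1:5050")))
print(spark_submit_args(jar, main, master_url("mesos", zk_quorum="zk-1:2181,zk-2:2181")))
```

The application jar and main class stay the same in all three cases; only the master URL changes.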


What’s next

Myriad

• YARN on Mesos

• Efficient access to Hadoop resources

• Dynamic nature of Mesos

Kubernetes

• Resource manager for Docker-based infrastructure

• Solution from Google

Akka Cluster

• Efficient model for vertical and horizontal scaling

• Freedom to choose how work is distributed

Task-specific tools

• Apache Storm

• Hive/Pig/Cascading…

• NoSQL solutions

• Kafka/Sqoop/Flume…

• Chef/Puppet/Ansible…

• Docker/Rocket/CoreOS

• Data Science
