Apache Big Data Europa - How to make money with your own data

APACHE BIG DATA CONFERENCE How to transform data into money using Big Data technologies

Upload: jorge-lopez-malla

Post on 15-Apr-2017


Category:

Technology



TRANSCRIPT

Page 1: Apache Big Data Europa- How to make money with your own data

APACHE

BIG DATA

CONFERENCE

How to transform data into money using Big Data technologies

Page 2: Apache Big Data Europa- How to make money with your own data

After almost a decade developing Big Data projects at Paradigma, through its R&D department,

we were early adopters of Spark, which led to the creation of Stratio

THE FIRST SPARK-BASED BIG DATA

PLATFORM RELEASED

INTRO

Page 3: Apache Big Data Europa- How to make money with your own data

JORGE LOPEZ-MALLA

After working with traditional

processing methods, I started to

do some R&D Big Data projects

and I fell in love with the Big Data

world. Currently I'm doing some

awesome Big Data projects at

Stratio

MY PROFILE

SKILLS

Page 4: Apache Big Data Europa- How to make money with your own data

ALBERTO RODRÍGUEZ DE LEMA

After graduating I've been

programming for more than 10 years.

I’ve built high performance and

scalable web applications for

companies such as Indra Systems,

Prudential and Springer Verlag Ltd.

MY PROFILE

@ardlema

SKILLS

Page 5: Apache Big Data Europa- How to make money with your own data

II

GO TO SPACE

STRATIO

OPEN-SOURCE SOLUTIONS
Our enterprise solutions are based on open source technologies

PURE SPARK
The only pure Spark platform, the only global solution

ENTERPRISE SPARK
On-premise & cloud, our platform is geared towards helping companies

SPARK-BASED BD PLATFORM
The first Spark-based big data platform released

Page 6: Apache Big Data Europa- How to make money with your own data

OUR CLIENT

MIDDLE EAST TELCO COMPANY

o 9.500 mil. daily events processed

o 9.2 mil. clients

Page 7: Apache Big Data Europa- How to make money with your own data

USE CASES

Page 8: Apache Big Data Europa- How to make money with your own data

MANAGEMENT & NORMALIZATION OF DATA

SOURCES

USE CASES

1

Page 9: Apache Big Data Europa- How to make money with your own data

USE CASES

MANAGEMENT & NORMALIZATION OF DATA

SOURCES

1

Page 10: Apache Big Data Europa- How to make money with your own data

USE CASES

NETWORK COVERAGE IMPROVEMENT

2

Page 11: Apache Big Data Europa- How to make money with your own data

USE CASES

PEOPLE GATHERING

3

Page 12: Apache Big Data Europa- How to make money with your own data

USE CASES

PEOPLE GATHERING

3

Page 13: Apache Big Data Europa- How to make money with your own data

USE CASES

DATA MONETIZATION

4

Page 14: Apache Big Data Europa- How to make money with your own data

USE CASES

DATA MONETIZATION

4

Page 18: Apache Big Data Europa- How to make money with your own data

DATA MONETIZATION

4

USE CASES

Page 19: Apache Big Data Europa- How to make money with your own data

TECHNICAL CHALLENGES

Page 20: Apache Big Data Europa- How to make money with your own data

TECHNICAL PROBLEMS

1 Huge volume of data

2 Huge size of data

3 Distributed processing

4 Hard to read

5 Recognized patterns

Page 21: Apache Big Data Europa- How to make money with your own data

1 HUGE VOLUME OF DATA

SOLUTION: APACHE HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Page 22: Apache Big Data Europa- How to make money with your own data

1 HUGE VOLUME OF DATA

9500 mil. daily CSV records -> circa 16 GB

Requirements:

High availability

Concurrent file reads

Page 23: Apache Big Data Europa- How to make money with your own data

2 HUGE SIZE OF DATA

SOLUTION: APACHE PARQUET

Page 24: Apache Big Data Europa- How to make money with your own data

2 HUGE SIZE OF DATA

16.5 GB of daily event information stored as CSV text in HDFS

4.3 GB of daily event information stored as Parquet files in HDFS

STORAGE IMPROVEMENT: circa 70%

Page 25: Apache Big Data Europa- How to make money with your own data

2 HUGE SIZE OF DATA

Time to count daily CSV events -> 6.2 minutes

Time to count daily Parquet events -> 1 minute

READ TIME IMPROVEMENT: circa 80%
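
As a quick check of the two "circa" figures, the relative reduction can be computed from the numbers quoted on the slides (16.5 vs 4.3 for daily storage, 6.2 vs 1 minute for the count). The helper name `improvement` is ours, not from the talk; this is only a sketch of the arithmetic.

```python
def improvement(before, after):
    """Relative reduction, as a percentage of the original value."""
    return (before - after) / before * 100

# Figures taken from the slides.
storage = improvement(16.5, 4.3)    # daily storage, CSV vs Parquet
read_time = improvement(6.2, 1.0)   # minutes to count daily events

print(f"storage reduction:   {storage:.1f}%")
print(f"read time reduction: {read_time:.1f}%")
```

This gives roughly 74% for storage and 84% for read time, in line with the rounded "circa 70%" and "circa 80%" on the slides.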

Page 26: Apache Big Data Europa- How to make money with your own data

3 DISTRIBUTED PROCESSING

SOLUTION: APACHE SPARK

Page 27: Apache Big Data Europa- How to make money with your own data

3 DISTRIBUTED PROCESSING - REQUIREMENTS

Complex algorithms with the minimum amount of resources

Reduced processing time, so that results are obtained while they are still useful

Page 28: Apache Big Data Europa- How to make money with your own data

3 DISTRIBUTED PROCESSING - REQUIREMENTS

Sharing the cluster with legacy processes

Use of the outputs of legacy processes without any changes

Page 29: Apache Big Data Europa- How to make money with your own data

4 HARD TO READ

SOLUTION: SCALA + APACHE SPARK

Page 30: Apache Big Data Europa- How to make money with your own data

4 HARD TO READ

Reduced development time

LOCs dramatically reduced

Number of classes dramatically reduced

Page 31: Apache Big Data Europa- How to make money with your own data

Tests and application readability improvements

DSLs make our lives easier

Spark makes MapReduce jobs even simpler

4 HARD TO READ
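
To illustrate why the map/reduce shape stays short, here is a minimal plain-Python sketch of a "count events per key" job. It does not use Spark itself; in Spark the same shape is typically a `map` followed by `reduceByKey` on an RDD. The record layout (subscriber, cell, timestamp), the sample lines, and the helper `reduce_by_key` are hypothetical and not taken from the talk.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical CSV event lines: subscriber_id,cell_id,timestamp
lines = [
    "u1,cellA,10:00",
    "u2,cellA,10:01",
    "u1,cellB,10:05",
]

# Map step: emit one (cell_id, 1) pair per event.
pairs = [(line.split(",")[1], 1) for line in lines]

def reduce_by_key(func, pairs):
    """Group values by key, then fold each group with func."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}

# Reduce step: sum the ones per cell.
events_per_cell = reduce_by_key(lambda a, b: a + b, pairs)
print(events_per_cell)
```

The whole job is one map expression and one reduce call, which is the kind of reduction in lines of code the slides refer to.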

Page 32: Apache Big Data Europa- How to make money with your own data

5 RECOGNIZED PATTERNS

SOLUTION: APACHE SPARK MLLIB

Page 33: Apache Big Data Europa- How to make money with your own data

Millions of records processed in order to obtain mathematical models

Complex mathematical algorithms applied to obtain accurate weekly behaviors

5 RECOGNIZED PATTERNS
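
The slides do not say which algorithm was used. Purely as an assumption, the sketch below shows one way weekly behaviors could be grouped: k-means clustering over per-subscriber weekly activity vectors (MLlib does ship a k-means implementation, but this plain-Python version is ours). The `profiles` data, the fixed starting centroids, and the function name `kmeans` are all hypothetical.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else old  # keep the old centroid if its cluster is empty
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical weekly profiles: events per day, Monday to Sunday.
profiles = [
    (9, 8, 9, 8, 9, 1, 1),  # weekday-heavy
    (8, 9, 8, 9, 8, 2, 1),
    (1, 2, 1, 1, 2, 9, 8),  # weekend-heavy
    (2, 1, 2, 1, 1, 8, 9),
]

centroids, clusters = kmeans(profiles, centroids=[profiles[0], profiles[2]])
for centroid, members in zip(centroids, clusters):
    print(centroid, len(members))
```

With this toy data the two clusters separate weekday-heavy from weekend-heavy subscribers, and each centroid can be read as a typical weekly behavior.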

Page 34: Apache Big Data Europa- How to make money with your own data

THANK YOU

UNITED STATES

Tel: (+1) 408 5998830

EUROPE

Tel: (+34) 91 828 64 73


www.stratio.com