séminaire big data alter way - elasticsearch - octobre 2014


Page 1: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 2: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Agenda & Speakers

Page 3: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Introduction

Page 4: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Alter Way in 2 slides

Page 5: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Alter Way in 2 slides

Page 6: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Elasticsearch in 1 slide

• More than 11 million downloads

• 650,000 New Downloads per Month

• 1000s of Mission Critical Implementations

• Top Investors: Benchmark Capital, Index Ventures

• Seasoned Executive Team

– Founded by Creator of Elasticsearch

– Seasoned Executives from SpringSource

Page 7: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

The challenges of search in the Big Data era

Page 8: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Big Data in Today's Business and Technology Environment: some significant figures

• 2.7 zettabytes of data exist in the digital universe today (1 zettabyte = 1 billion terabytes).

• 235 terabytes of data had been collected by the U.S. Library of Congress as of April 2011.

• Facebook stores, accesses, and analyzes 30+ petabytes of user-generated data.

• Akamai analyzes 75 million events per day to better target advertisements.

• Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data.

• AT&T's largest database holds the titles for both the largest volume of data in a single database (312 terabytes) and the second-largest number of rows in a single database (1.9 trillion); it contains AT&T's extensive calling records.

• Hadoop:

– 94% of Hadoop users perform analytics on large volumes of data, which was not possible before;

– 88% analyze data in greater detail;

– 82% can now retain more of their data.

Page 9: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

The Rapid Growth of Unstructured Data

• YouTube users upload 48 hours of new video every minute of the day.

• 500+ new websites are created every minute of the day.

• Brands and organizations on Facebook receive 34,722 Likes every minute of the day.

• 100 terabytes of data are uploaded daily to Facebook.

• According to Twitter's own research in early 2012, it sees roughly 175 million tweets every day and has more than 465 million accounts.

• 30 billion pieces of content are shared on Facebook every month.

Data production will be 44 times greater in 2020 than it was in 2009.

Page 10: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Big Data & Real Business Issues

• More than 25% of decision-makers surveyed predict that data volumes in their companies will rise by more than 60% by the end of 2014; the average growth anticipated across all respondents is no less than 42%.

• 40% projected growth in global data generated per year vs. 5% growth in global IT spending.

• According to estimates, the volume of business data worldwide, across all companies, doubles every 1.2 years.

– Poor data can cost businesses 20%–35% of their operating revenue.

– Bad data or poor data quality costs US businesses $600 billion annually.

• More than 75% of decision-makers surveyed anticipate significant impacts on storage systems as a result of the “Big Data” phenomenon.

• We anticipate a new challenge: being able to search and analyze all that data … in real time!

Page 11: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Elasticsearch

A solution already in production with significant French implementations

Revolutionizing Data Search and Analytics

Richard Maurer – SEMEA Territory Manager

Page 12: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Purpose of Elasticsearch

• Organize data and make it easily accessible

– Through powerful search and analytics

– Easily consumable (even for non-data scientists)

– Elegantly handles extremely large data volumes

– Delivers results in real time

• Technology stack agnostic

• Used across all market verticals

Page 13: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Features of Elasticsearch

• Structured & unstructured search

• Advanced analytics capabilities

• Unmatched performance

• Real-time results

• Highly scalable

• User friendly installation and maintenance

Page 14: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Elasticsearch 1.4: a production-ready solution

• Real-time data indexing

• Distributed

• High Availability

• Schema Free

• Real Time Data Analytics

• Multi Tenancy

• Much more….

Page 15: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Unprecedented Uptake

Elasticsearch has more than 11 million downloads

… and 650,000 more each month

(Chart: cumulative downloads)

Page 16: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

French Users

Page 17: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

French Use Cases

Bouygues Telecom:

Uses Elasticsearch in its Big Data platform. Cut its web resolution time by a factor of 10.

Dailymotion:

Indexes its 20 million videos in Elasticsearch. In production for over 2 years.

Voyages SNCF:

Recently announced that ES is live in its “Usine Logicielle” (software factory).

Fotolia:

Search engine built on Elasticsearch to access 24 million images; moved over to ES.

Orange:

With over 1.2 billion docs, looking for a better solution and cost reduction.

Page 18: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Product Offerings: Support Throughout Your Project

1. Core Elasticsearch Training (2 days)

2. ELK Workshop (1 day)

3. Development and Production Support

4. Marvel, Monitoring of your ES clusters

Page 19: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

2: Support

Page 20: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Resources

• www.elasticsearch.com

• www.elasticsearch.org

• User Groups:

http://www.elasticsearch.org/community/forum/

• Contact:

Richard Maurer

Territory Manager

[email protected]

Page 21: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

MAKE SENSE OF YOUR (BIG) DATA!

David Pilato

Technical Advocate, Elasticsearch

@dadoonet

Page 22: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 23: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

StartUp

data ?

Page 24: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Page 25: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 26: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 27: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 28: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 29: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

BIG data ?

Page 30: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

BIG data ?

Page 31: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Source: http://www.csc.com/insights/flxwd/78931-big_data_just_beginning_to_explode

35,000,000,000,000,000 MB
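As a quick sanity check on the figure above, the unit conversion can be worked through in a few lines. This is a minimal sketch that assumes decimal (SI) units, which the slide does not specify; the `convert` helper and the `UNITS` table are illustrative names, not part of any library.

```python
# Decimal (SI) byte units. Assumption: the slide does not say SI vs. binary.
UNITS = {
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def convert(value, src, dst):
    """Convert a quantity between two byte units (integer result, rounded down)."""
    return value * UNITS[src] // UNITS[dst]


slide_figure_mb = 35_000_000_000_000_000  # the figure shown on the slide, in MB
zettabytes = convert(slide_figure_mb, "MB", "ZB")
print(zettabytes, "ZB")
```

Under that assumption the slide's figure corresponds to 35 zettabytes.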

Page 32: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Source: http://www.domo.com/learn/data-never-sleeps-2

Page 33: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 34: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

search = like % ?

SELECT
  doc.*, country.*
FROM
  doc, country
WHERE
  doc.country_code = country.code AND
  doc.date_doc > to_date('2011-12', 'yyyy-mm') AND
  doc.date_doc < to_date('2012-01', 'yyyy-mm') AND
  lower(country.name) = 'france' AND
  lower(doc.comment) LIKE '%product%' AND
  lower(doc.comment) LIKE '%david%';

Page 35: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Search engine ?
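A `LIKE '%…%'` predicate has to scan every row, whereas a search engine looks terms up in an inverted index. The toy sketch below illustrates that idea only; it is not how Lucene is implemented, the sample comments are invented, and it matches whole words rather than substrings.

```python
from collections import defaultdict

# Hypothetical sample comments standing in for the doc.comment column.
DOCS = {
    1: "Great product, thanks David",
    2: "David asked about pricing",
    3: "The product shipped late",
}


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().replace(",", " ").split():
            index[term].add(doc_id)
    return index


def search_all(index, *terms):
    """Return the ids of documents containing every term (an AND query)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(DOCS)
print(search_all(index, "product", "david"))
```

Each query term costs one dictionary lookup plus a set intersection, independent of how long the individual comments are.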

Page 36: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

elasticsearch ?

plug & play

REST/JSON

scalable

Apache 2 license

Lucene

elasticsearch

Page 37: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Start…

$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.tar.gz
$ tar -xf elasticsearch-1.1.1.tar.gz
$ ./elasticsearch-1.1.1/bin/elasticsearch
[INFO ][node ][Ghost Maker] {1.1.1}[5645]: initializing

Page 38: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

… and play!

$ curl -XPUT localhost:9200/sessions/session/1 -d '{
  "title" : "Elasticsearch",
  "subtitle" : "Make sense of your (BIG) data !",
  "date" : "2014-05-20T10:30:00",
  "tags" : [ "elasticsearch", "alterway", "bigdata" ],
  "speakers" : [{
    "first_name" : "David",
    "last_name" : "Pilato"
  }]
}'
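The same indexing call can be prepared from application code. The sketch below uses only the Python standard library and builds the HTTP request without sending it, so it runs without a cluster; `build_index_request` is a hypothetical helper, and the `index/type/id` URL layout simply mirrors the slide, which targets Elasticsearch 1.x.

```python
import json
import urllib.request


def build_index_request(doc, index, doc_type, doc_id, host="http://localhost:9200"):
    """Build (but do not send) the HTTP PUT that indexes one JSON document."""
    url = f"{host}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# The document from the slide.
session = {
    "title": "Elasticsearch",
    "subtitle": "Make sense of your (BIG) data !",
    "date": "2014-05-20T10:30:00",
    "tags": ["elasticsearch", "alterway", "bigdata"],
    "speakers": [{"first_name": "David", "last_name": "Pilato"}],
}

request = build_index_request(session, "sessions", "session", 1)
print(request.get_method(), request.full_url)

# Sending it requires a node listening on localhost:9200 (not done here):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```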

Page 39: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Search!

$ curl http://localhost:9200/sessions/session/_search -d '{
  "query": {
    "multi_match": {
      "query": "elasticsearch alterway david",
      "fields": [ "title^3", "tags^2", "speakers.first_name" ]
    }
  },
  "post_filter": {
    "range": {
      "date": {
        "from": "2014-05-01",
        "to": "2014-06"
      }
    }
  }
}'
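In the search request above, the `^3` and `^2` suffixes are per-field boosts: a match in `title` counts more than one in `tags`, which counts more than one in `speakers.first_name`. As a sketch, the body can be assembled from a plain mapping of field names to boosts; the two helper functions are hypothetical conveniences, not part of any client library.

```python
import json


def multi_match_query(text, boosts):
    """Build a multi_match query; `boosts` maps field name -> boost (1 means none)."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in boosts.items()
    ]
    return {"multi_match": {"query": text, "fields": fields}}


def date_range_filter(field, start, end):
    """Build a range filter using the from/to keys shown on the slide."""
    return {"range": {field: {"from": start, "to": end}}}


search_body = {
    "query": multi_match_query(
        "elasticsearch alterway david",
        {"title": 3, "tags": 2, "speakers.first_name": 1},
    ),
    "post_filter": date_range_filter("date", "2014-05-01", "2014-06"),
}
print(json.dumps(search_body, indent=2))
```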

Page 40: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Compute?

Page 41: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Compute!

$ curl http://localhost:9200/sessions/session/_search -d '{
  "query": { ... },
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "interval": "day",
        "format" : "dd/MM/yyyy"
      }
    }
  }
}'

"by_date": [
  { "key_as_string": "03/04/2014", "doc_count": 1 },
  { "key_as_string": "12/04/2014", "doc_count": 2 },
  { "key_as_string": "16/04/2014", "doc_count": 3 }
]
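To make the date_histogram aggregation above concrete, here is a local emulation in plain Python: documents are grouped by calendar day and counted, and days without documents are simply absent, as in the slide's response. The timestamps are invented so that the counts match the slide; nothing here talks to Elasticsearch.

```python
from collections import Counter
from datetime import datetime

# Hypothetical session dates, chosen to reproduce the bucket counts on the slide.
DATES = [
    "2014-04-03T10:30:00",
    "2014-04-12T09:00:00",
    "2014-04-12T14:00:00",
    "2014-04-16T09:00:00",
    "2014-04-16T11:00:00",
    "2014-04-16T15:30:00",
]


def date_histogram(timestamps, fmt="%d/%m/%Y"):
    """Count documents per calendar day, mimicking a daily date_histogram."""
    days = Counter(datetime.fromisoformat(ts).date() for ts in timestamps)
    return [
        {"key_as_string": day.strftime(fmt), "doc_count": count}
        for day, count in sorted(days.items())
    ]


by_date = date_histogram(DATES)
for bucket in by_date:
    print(bucket)
```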

Page 42: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 43: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

#mstechdays #elasticsearch

Let's make sense of …

• logs

• twitter

• github

• marketing data

• ...

• your data

• your big data

Page 44: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Let's make sense of …

{
  "name": "Pilato David",
  "dateOfBirth": "1971-12-26",
  "gender": "male",
  "children": 3,
  "marketing": {
    "fashion": 334,
    "music": 3363,
    "hifi": 2351
  },
  "address": {
    "country": "France",
    "city": "Paris",
    "location": [2.332395, 48.861871]
  }
}

Page 45: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Demo

MAKE SENSE OF YOUR (BIG) DATA!

Let's inject some marketing documents…
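The demo itself is not part of the transcript. As a rough sketch of what injecting documents can look like, the code below generates a few random documents shaped like the slide's example and serializes them in the newline-delimited format expected by the `_bulk` endpoint. The names, value ranges, and the `person` index and type are assumptions made for this example.

```python
import json
import random


def random_person(rng):
    """Generate one hypothetical marketing document shaped like the slide's example."""
    return {
        "name": rng.choice(["Pilato David", "Dupont Marie", "Martin Paul"]),
        "dateOfBirth": (
            f"{rng.randint(1950, 2000)}-{rng.randint(1, 12):02d}-{rng.randint(1, 28):02d}"
        ),
        "gender": rng.choice(["male", "female"]),
        "children": rng.randint(0, 5),
        "marketing": {
            "fashion": rng.randint(0, 5000),
            "music": rng.randint(0, 5000),
            "hifi": rng.randint(0, 5000),
        },
        "address": {
            "country": "France",
            "city": "Paris",
            "location": [2.332395, 48.861871],
        },
    }


def bulk_body(docs, index, doc_type):
    """Serialize documents as action/source line pairs for the _bulk endpoint."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk format requires a trailing newline


rng = random.Random(42)  # seeded so repeated runs generate the same documents
people = [random_person(rng) for _ in range(3)]
payload = bulk_body(people, "person", "person")
print(payload)
```

The resulting payload would be sent in a single POST to `/_bulk`, which avoids one HTTP round trip per document.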

Page 46: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

elasticsearch

kibana

logstash

Marvel

Page 47: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

@dadoonet

thanks

Page 48: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

How to integrate Elasticsearch into your information system and get the most out of it

Page 49: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

What can you do with Elasticsearch?

Page 50: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

STORE

Page 51: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

SEARCH

Page 52: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

ANALYZE

Page 53: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Are you ready to use Elasticsearch in your IT environment?

Page 54: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

What you need to run it

• Java 8 update 20 or later, or Java 7 update 55 or later

• Only Oracle’s Java and the OpenJDK are supported.
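The version rule above can be turned into a small pre-flight check. This sketch only understands the legacy `1.<major>.<minor>_<update>` version strings used by Java 7 and 8 (for example `1.8.0_25`); the function names are hypothetical, and it does not check the JVM vendor.

```python
import re

# Minimum update level per supported Java major version, taken from the slide.
MINIMUM_UPDATE = {7: 55, 8: 20}


def parse_java_version(version):
    """Parse a legacy version string such as '1.8.0_25' into (major, update)."""
    match = re.fullmatch(r"1\.(\d+)\.\d+(?:_(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized Java version string: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def is_supported(version):
    """Return True if the version satisfies the slide's minimum requirements."""
    major, update = parse_java_version(version)
    return major in MINIMUM_UPDATE and update >= MINIMUM_UPDATE[major]


for candidate in ["1.7.0_45", "1.7.0_55", "1.8.0_11", "1.8.0_25"]:
    print(candidate, is_supported(candidate))
```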

Page 55: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

GitHub projects

• Many projects

• Big activity

• Many languages

6 months!

Page 56: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Clients

Page 57: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Scripting Plugins Language

Page 58: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Why it's easy

Page 59: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Start Small, Grow Big

• One to many

• ~ Zero conf

• Cloud oriented

• Scalability DNA

• Replication

• Sharding

• Distributed

• Resilience

• Snapshot

• Restore
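Sharding and replication are what let a cluster start small and grow: each document is routed to one primary shard by hashing its routing key modulo the number of primary shards, and each shard has replica copies on other nodes. The toy model below sketches that idea under simplifying assumptions: `zlib.crc32` stands in for the real hash function, allocation is a plain round-robin, and the node names are invented.

```python
import zlib


def shard_for(routing_key, number_of_shards):
    """Pick a primary shard as hash(routing key) modulo the number of primary shards.

    zlib.crc32 is a stand-in for illustration; Elasticsearch uses its own hash.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_shards


def allocate(number_of_shards, number_of_replicas, nodes):
    """Round-robin shard copies so that copies of one shard sit on distinct nodes."""
    if number_of_replicas + 1 > len(nodes):
        # A real cluster would leave the extra replicas unassigned instead.
        raise ValueError("need at least one node per copy of a shard")
    layout = {node: [] for node in nodes}
    for shard in range(number_of_shards):
        for copy in range(number_of_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            layout[node].append((shard, role))
    return layout


layout = allocate(number_of_shards=5, number_of_replicas=1,
                  nodes=["node-1", "node-2", "node-3"])
for node, copies in layout.items():
    print(node, copies)
print("document 1 goes to shard", shard_for("1", 5))
```

Because the shard is derived from the hash modulo the shard count, the number of primary shards has to stay fixed for an index, while replicas and nodes can be added later.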

Page 60: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014


Page 61: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Where / How can you use Elasticsearch?

Page 62: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

VIA

Centralized Log Storage 1/2

Page 63: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Centralized Log Storage 2/2
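The diagrams for these two slides are not in the transcript. As an illustration of the general pattern, centralized log storage usually means parsing each raw log line into a structured event and writing it to a per-day index. The sketch below assumes a simplified line layout and invented field names; only the `logstash-YYYY.MM.dd` index naming follows Logstash's usual convention.

```python
import re
from datetime import datetime, timezone

# Simplified layout: "<ISO timestamp>Z <host> <program>: <message>".
# The layout and the field names are assumptions made for this example.
LINE = re.compile(r"(?P<ts>\S+) (?P<host>\S+) (?P<program>[^:]+): (?P<message>.*)")


def to_event(line):
    """Turn one raw log line into a JSON-ready event dictionary."""
    match = LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"unparseable log line: {line!r}")
    ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "@timestamp": ts.isoformat(),
        "host": match["host"],
        "program": match["program"],
        "message": match["message"],
    }


def daily_index(event, prefix="logstash"):
    """Name of the per-day index, following the logstash-YYYY.MM.dd convention."""
    day = datetime.fromisoformat(event["@timestamp"])
    return f"{prefix}-{day:%Y.%m.%d}"


event = to_event("2014-10-16T09:15:02Z web01 nginx: GET /index.html 200")
print(daily_index(event), event)
```

Per-day indices keep retention simple: dropping old logs is a matter of deleting whole indices.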

Page 64: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

CMS Search Engine

Page 65: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Ecommerce Enhanced Search Engine

• Faceting

• Fuzzy Search

• Speed

• Auto Completion

• Geo Search

• Log Analysis
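Two of the features listed for e-commerce search, fuzzy search and faceting, can be illustrated locally. The sketch below tolerates one typo using Levenshtein edit distance and then counts the hits per brand, which is what a facet shows next to the results. The catalogue is invented, and a real engine would use an index rather than comparing the query with every product.

```python
from collections import Counter

# Hypothetical product catalogue for illustration.
PRODUCTS = [
    {"name": "camera", "brand": "Acme"},
    {"name": "camera", "brand": "Globex"},
    {"name": "cable", "brand": "Acme"},
    {"name": "tripod", "brand": "Initech"},
]


def edit_distance(a, b):
    """Levenshtein distance, computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_search(products, term, max_edits=1):
    """Return products whose name is within `max_edits` edits of the term."""
    return [p for p in products if edit_distance(p["name"], term) <= max_edits]


def facet(products, field):
    """Count results per distinct value of `field`."""
    return Counter(p[field] for p in products)


hits = fuzzy_search(PRODUCTS, "camra")  # misspelled query
brand_facet = facet(hits, "brand")
print(hits)
print(brand_facet)
```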

Page 66: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Combining Hadoop & Elasticsearch

elasticsearch-hadoop

• REST based

• Memory and I/O efficient

• Adaptive I/O

• Map/Reduce API support

• Pig support

• Hive support

Page 67: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

What Else ?

Page 68: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

It’s up to you to decide what to build with ES

Page 69: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Analysis / Dashboards

Some Examples

Page 70: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : IRC Activity

Page 71: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : Pfsense Monitoring

Page 72: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : Windows Events

Page 73: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : Inventory

Page 74: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : Syslog

Page 75: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Kibana examples : Web Activity

Page 76: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

ES = No Limits

Page 77: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014
Page 78: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Conclusion

Page 79: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Conclusion

• It is time to revolutionize the way you get value from your data: give your applications Elasticsearch!

• The ELK stack (Elasticsearch, Logstash, Kibana) is already widely used in production!

• Get expert guidance to benefit from best practices and support at every stage of your project: design, development, production.

Page 80: Séminaire Big Data Alter Way - Elasticsearch - octobre 2014

Questions / Answers