Big Data, Hadoop, NoSQL and Graph DB
![Page 1: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/1.jpg)
This document is intended for only AVEA İletişim Hizmetleri A.Ş.("AVEA"), its dealers, employees and/or others specifically authorised. The contents of this document are confidential and any disclosure, copying, distribution and/or taking any action in reliance with the content of this document is prohibited. AVEA is not liable for the transmission of this document in any manner to any third parties that are not authorised to receive.
Big Data - Hadoop - NoSQL and Graph Database
Ramazan FIRIN
20.11.2012
![Page 2: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/2.jpg)
AGENDA
• Big Data
• Hadoop
• NoSQL
• Graph DB and Neo4j
• Possible Usage in Telco
• Demo
![Page 3: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/3.jpg)
Executive Summary
R&D / MW Development, AVEA
• Big Data is a new IT trend
• Hadoop and NoSQL can be used to process Big Data
• Possible usage areas in Telco:
– Prevent churn
– Offer customer-specific campaigns
– Acquire more customers
![Page 4: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/4.jpg)
What is Big Data?
Datasets that are too awkward to work with using traditional, hands-on database management tools.
![Page 5: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/5.jpg)
Big Data - 3V Concept
![Page 6: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/6.jpg)
Big Data Sources
1. Social network profiles - Facebook, LinkedIn, Yahoo, Google
2. Social influencers - blog comments, user forums, review sites
3. Activity-generated data - application logs, sensor data
4. Public - Wikipedia, IMDb, etc.
5. Data warehouse appliances - transactional data
6. Network and in-stream monitoring
7. Legacy documents
![Page 7: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/7.jpg)
Big Data To Smart Data
Cover of The Economist
![Page 8: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/8.jpg)
Volume
![Page 9: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/9.jpg)
New Data Sources - Internet
• 2 billion internet users by 2011
• Twitter processes 7 terabytes of data every day
• Facebook processes 10 terabytes of data every day
• 4.6 billion mobile phones
• Google processes 24 petabytes of data every day
![Page 10: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/10.jpg)
Big Data Approach
![Page 11: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/11.jpg)
Big Data Design
![Page 12: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/12.jpg)
Big Data Usage Sector
![Page 13: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/13.jpg)
Sample Usage - 360° View of the Customer
![Page 14: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/14.jpg)
Sample Usage – Customer Sentiment
![Page 15: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/15.jpg)
Sample Usage – Detect Churn Pattern
![Page 16: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/16.jpg)
Sample Usage - Health
![Page 17: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/17.jpg)
Big Data Market
![Page 18: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/18.jpg)
Big Data Solutions – Oracle Big Data Appliance
![Page 19: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/19.jpg)
Big Data Solutions – IBM Pure Data
![Page 20: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/20.jpg)
Top 10 Technology Trends 2012 from CSC
![Page 21: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/21.jpg)
Gartner: Top 10 IT Trends for 2013
![Page 22: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/22.jpg)
Gartner: 10 Critical IT Trends for the Next Five Years
• The third trend is bigger data and storage:
– By 2015, big data demand will generate 1 million jobs in the Global 1000, but only one-third of those jobs will be filled due to a shortage of talent.
– Analytics and pattern recognition are key.
– New specialized ARM-based servers are appearing for specialty analytics.
![Page 23: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/23.jpg)
HADOOP
![Page 24: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/24.jpg)
What is HADOOP?
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
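Hadoop's core programming model is MapReduce. As a rough illustration of how simple that model is, the following single-process Python sketch walks through the map, shuffle, and reduce phases of a word count. It does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data and Hadoop"]))
```

On a real cluster the map and reduce functions run in parallel on many machines, and the framework handles the shuffle, data distribution, and failure recovery.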
![Page 25: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/25.jpg)
History
![Page 26: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/26.jpg)
Hadoop Components
![Page 27: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/27.jpg)
![Page 28: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/28.jpg)
Hadoop Ecosystem
Pig - a data processing language that simplifies Hadoop programming
Hive - SQL-like queries
HBase - random read/write access to tables with billions of rows and millions of columns (NoSQL)
![Page 29: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/29.jpg)
Other Google Research
![Page 30: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/30.jpg)
NoSQL
![Page 31: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/31.jpg)
RDBMS PERFORMANCE
![Page 32: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/32.jpg)
Join is killer...
![Page 33: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/33.jpg)
What is NoSQL?
• Stands for Not Only SQL
• Non-relational
• Cheap, easy to implement
• Scalability
– Vertically - add more resources (CPU, RAM, storage) to a single node
– Horizontally - add more nodes to the cluster
• No pre-defined schema
• No join operations
• Typically not ACID; designed around the trade-offs of the CAP theorem
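Horizontal scaling usually relies on spreading keys across nodes. The sketch below shows one simple way to do that, hash partitioning, in plain Python. The node names and the helpers `node_for_key` and `distribute` are assumptions made for this example and do not correspond to any particular NoSQL product.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node by hashing the key; the same key always maps to the same node."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def distribute(keys, nodes):
    """Return a mapping node -> list of keys stored on that node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members
    keys = [f"subscriber:{i}" for i in range(10)]
    for node, stored in distribute(keys, nodes).items():
        print(node, stored)
```

With plain modulo partitioning, adding a node moves most keys to a different node. Many distributed stores use consistent hashing instead to limit that movement.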
![Page 34: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/34.jpg)
NoSQL DB Types
1. Key-Value Stores
2. Document Databases
3. Column Family Stores
4. Graph Databases
![Page 35: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/35.jpg)
Key-Value Stores
- Redis, Voldemort
![Page 36: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/36.jpg)
Document Database
- CouchDB, MongoDB
![Page 37: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/37.jpg)
Column Family Stores
- Cassandra, HBase
![Page 38: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/38.jpg)
Graph Database
- Neo4j, InfoGrid, InfiniteGraph
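To make the difference between the four NoSQL types more concrete, the following sketch models the same subscriber in each style using plain Python dictionaries. All names and values are invented for illustration; real products add their own storage formats, indexes, and query languages on top.

```python
import json

# Key-value store: an opaque value looked up by a single key.
key_value = {
    "subscriber:1001": json.dumps({"name": "Ayse", "tariff": "prepaid"}),
}

# Document database: a nested, self-describing document that can be queried by field.
document = {
    "_id": 1001,
    "name": "Ayse",
    "tariff": "prepaid",
    "lines": [{"msisdn": "555-0001", "active": True}],
}

# Column family store: row key -> column family -> columns.
column_family = {
    "1001": {
        "profile": {"name": "Ayse", "tariff": "prepaid"},
        "usage": {"2012-11": 340},
    },
}

# Graph database: nodes with properties plus typed relationships between them.
graph = {
    "nodes": {1001: {"name": "Ayse"}, 1002: {"name": "Mehmet"}},
    "relationships": [(1001, "KNOWS", 1002)],
}

if __name__ == "__main__":
    print(json.loads(key_value["subscriber:1001"])["tariff"])
    print(document["lines"][0]["msisdn"])
    print(column_family["1001"]["usage"]["2012-11"])
    print(graph["relationships"][0])
```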
![Page 39: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/39.jpg)
RDBMS Supports ACID
• Atomicity - a transaction is all or nothing
• Consistency - only valid data is written to the database
• Isolation - concurrent transactions behave as if they were executed serially
• Durability - once a transaction is committed, its data survives failures
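Atomicity is the easiest of the four properties to demonstrate. The sketch below uses Python's built-in `sqlite3` module and an in-memory database: a transfer that would overdraw an account raises an error, and the whole transaction is rolled back. The `accounts` table and the `transfer` and `balances` helpers are assumptions made for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")


def balances(conn):
    """Return the current balances as a dict name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

if __name__ == "__main__":
    transfer(conn, "alice", "bob", 30)
    try:
        transfer(conn, "alice", "bob", 500)  # would overdraw, so it is rolled back
    except ValueError as error:
        print("rolled back:", error)
    print(balances(conn))
```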
![Page 40: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/40.jpg)
NoSQL and the CAP Theorem
![Page 41: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/41.jpg)
NoSQL and the CAP Theorem
• Consistency - every client sees the same view of the data at the same time.
• Availability - every client can always read and write.
• Partition tolerance - the system keeps working even when network failures cut some nodes off from the others.
You can pick only two...
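The trade-off becomes visible once a network partition occurs. The toy model below, written for this example only, keeps two replicas of a key-value map. In the hypothetical `"CP"` mode it refuses writes during a partition (it gives up availability); in `"AP"` mode it accepts them locally and the replicas diverge (it gives up consistency). The `Cluster` class is an assumption, not the behaviour of any specific database.

```python
class Cluster:
    """Two replicas of one key-value map, running in either "CP" or "AP" mode."""

    def __init__(self, mode):
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for data in self.replicas:
                data[key] = value
        elif self.mode == "CP":
            # Stay consistent by rejecting the write: not available.
            raise RuntimeError("unavailable during partition")
        else:
            # Stay available by writing locally: replicas may now disagree.
            self.replicas[replica][key] = value

    def read(self, replica, key):
        return self.replicas[replica].get(key)


if __name__ == "__main__":
    ap = Cluster("AP")
    ap.write(0, "tariff", "prepaid")
    ap.partitioned = True
    ap.write(0, "tariff", "postpaid")
    print(ap.read(0, "tariff"), ap.read(1, "tariff"))
```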
![Page 42: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/42.jpg)
Visual Guide to NoSQL Systems
![Page 43: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/43.jpg)
NoSQL Complexity
![Page 44: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/44.jpg)
NoSQL Performance
![Page 45: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/45.jpg)
Job Trends
![Page 46: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/46.jpg)
Graph DB and Neo4j
![Page 47: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/47.jpg)
Graph DB
A graph database uses graph structures with nodes, edges, and properties to represent and store data.
![Page 48: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/48.jpg)
Graph DB Usage Area
• Recommendations
• Business Intelligence
• Social networking
• Master Data Management (MDM)
• System Management
• Time series data
• Product Catalogue
• Web Analytics
• Scientific Computing
• Indexing your slow RDBMS
![Page 49: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/49.jpg)
Relational Databases are Graphs!
![Page 50: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/50.jpg)
Neo4j
• Leading graph database
• Transaction support (ACID)
• Indexing
• Querying
• REST support
• Disk-based
• Open source
• Traversal framework
• High performance (traverses 1,000,000+ relationships per second)
• Robust (in 24/7 operation since 2003)
• Massive scalability
![Page 51: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/51.jpg)
Neo4j Data Model
Neo4j has nodes and relationships.
Nodes and relationships have properties.
Node1 - Property: name, Property: surname
Node2 - Property: name, Property: surname
Relationship between Node1 and Node2 - Relationship type: knows, Property: date of meeting
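The data model above can be sketched in a few lines of Python. This is not the Neo4j API; the `Node` and `Relationship` classes, the `neighbours` helper, the direction of the relationship, and the sample property values are all assumptions made to illustrate the idea of nodes and typed relationships that both carry properties.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


def neighbours(node, relationships, rel_type):
    """Follow outgoing relationships of one type from `node`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


# Two nodes with name and surname, joined by a "knows" relationship
# that carries the date of meeting as its own property.
node1 = Node({"name": "Ali", "surname": "Kaya"})
node2 = Node({"name": "Zeynep", "surname": "Demir"})
knows = Relationship(node1, node2, "knows", {"date_of_meeting": "2012-11-20"})

if __name__ == "__main__":
    for friend in neighbours(node1, [knows], "knows"):
        print(friend.properties["name"], friend.properties["surname"])
```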
![Page 52: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/52.jpg)
Neo4j Performance
http://www.neotechnology.com/2012/10/20-billion-relationships-imported-into-neo4j-on-ec2/
![Page 53: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/53.jpg)
Who uses Neo4j?
• Cisco: Master Data Management
• Telenor Group: customer organization structure (203 million subscribers)
• Deutsche Telekom: social football site (150 million subscribers)
![Page 54: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/54.jpg)
Cypher For Query
![Page 55: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/55.jpg)
Sample Code
![Page 56: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/56.jpg)
Spring Data Neo4j
![Page 57: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/57.jpg)
Neoclipse
![Page 58: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/58.jpg)
Product Catalog
![Page 59: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/59.jpg)
Sample OM Data Model
![Page 60: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/60.jpg)
Hardware Calculating Tool
![Page 61: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/61.jpg)
Hardware Calculating Tool Result
Calculation result for the production environment:
• 4 physical machines
• 3 nodes on every machine
• 1024 MHz CPU
• 65536 MB RAM
![Page 62: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/62.jpg)
OrientDB
• The document-graph database
• ACID support
• SQL and native queries
• Schema-less, schema-full and schema-mixed modes
• Roles + security
• Functions
• HTTP / RESTful / JSON / binary support
• Hooks
• Fetch plans
• Inheritance
• 200,000 inserts per second (6 M node traversals with cache)
![Page 63: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/63.jpg)
FluxGraph
• Temporal Graph Database
• Has checkpoints
• Compatible with Neo4j
![Page 64: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/64.jpg)
Examples for Telcos
• Call Detail Records (CDR)
• Routing
• Social graphs
• Master Data Management
• Spatial and location-based services (LBS)
• Network topology analysis
• Neo4j and Android
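As one hedged illustration of the CDR and social graph use cases, the sketch below turns call records into a weighted call graph and looks up a subscriber's contacts. The record layout `(caller, callee, seconds)` and the helpers `build_call_graph` and `contacts_of` are assumptions for this example; a production system would store the graph in a graph database rather than in Python dictionaries.

```python
from collections import defaultdict


def build_call_graph(cdrs):
    """Aggregate CDR rows into a directed graph: caller -> callee -> total seconds."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee, seconds in cdrs:
        graph[caller][callee] += seconds
    return graph


def contacts_of(graph, subscriber):
    """Everyone the subscriber has called or been called by."""
    called = set(graph.get(subscriber, {}))
    called_by = {caller for caller, callees in graph.items() if subscriber in callees}
    return called | called_by


if __name__ == "__main__":
    # Hypothetical sample records: (caller, callee, call duration in seconds).
    cdrs = [("A", "B", 60), ("A", "B", 30), ("C", "A", 10), ("B", "D", 5)]
    graph = build_call_graph(cdrs)
    print(sorted(contacts_of(graph, "A")))
```

A graph like this could, for example, feed a churn analysis that looks at the contacts of subscribers who have already left.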
![Page 65: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/65.jpg)
CDR Analysis
![Page 66: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/66.jpg)
Master Data Management
![Page 67: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/67.jpg)
Network Management
![Page 68: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/68.jpg)
Cell Network Analysis
![Page 69: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/69.jpg)
Sample Scenarios
• Customer-specific campaigns
• Prevent churn
• Get more customers
• Special offers for campaigns
![Page 70: Big data hadoop-no sql and graph db-final](https://reader031.vdocuments.mx/reader031/viewer/2022020803/54c6ef1f4a7959f3488b456d/html5/thumbnails/70.jpg)
Thanks