DATA ANALYTICS WITH DRUID
YOU SUN JEONG
WHO AM I?
Senior Software Engineer at SK Telecom
Commercial Products
Big Data Discovery Solution (~’16)
Hadoop DW (~’15)
PaaS (Cloud Foundry) (~’13)
IaaS (OpenStack) (~’13)
Mail to : [email protected]
FOOTPRINTS
2014
2015 - Hadoop DW - Realtime NW Analytics
2016 - Big Data Discovery - Streaming Processing
AGENDA
‣ History
‣ What is Druid?
‣ Druid Architecture
‣ Real-Time Ingestion Demo (15m)
‣ Cohort Analysis (15m)
HISTORY
▸ Development started at Metamarkets in 2011
▸ Released under the Apache License, Version 2.0 in early 2015
▸ 150+ contributors today
▸ https://github.com/druid-io
DATA LAKE
https://www.linkedin.com/pulse/more-analytics-than-just-fishing-data-lake-john-poppelaars
DW VS DATA LAKE
http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html
WHAT IS DRUID
Distributed, in-memory, multi-dimensional OLAP store
PROBLEMS

Raw events:
timestamp             domain      user        gender  clicked
2011-01-01T00:01:35Z  bieber.com  4312345532  Female  1
2011-01-01T00:03:03Z  bieber.com  3484920241  Female  0
2011-01-01T00:04:51Z  ultra.com   9530174728  Male    1
2011-01-01T00:05:33Z  ultra.com   4098310573  Male    1
2011-01-01T00:05:53Z  ultra.com   5832057930  Female  0
2011-01-01T00:06:17Z  ultra.com   5789283478  Female  1
2011-01-01T00:23:15Z  bieber.com  4730093842  Female  0
2011-01-01T00:38:51Z  ultra.com   3909846810  Male    1
2011-01-01T00:49:33Z  bieber.com  4930097162  Female  1
2011-01-01T00:49:53Z  ultra.com   0381837193  Female  0

Rolled up to one hour:
timestamp             impressions  clicks
2011-01-01T00:00:00Z  10           6

Raw events:
timestamp             domain      user        gender  clicked
2011-01-01T00:01:35Z  bieber.com  4312345532  Female  1
2011-01-01T00:03:03Z  bieber.com  3484920241  Female  0
2011-01-01T00:04:51Z  ultra.com   9530174728  Male    1
2011-01-01T00:05:33Z  ultra.com   4098310573  Male    1
2011-01-01T00:05:53Z  ultra.com   5832057930  Female  0
2011-01-01T00:06:17Z  ultra.com   5789283478  Female  1
2011-01-01T00:23:15Z  bieber.com  4730093842  Female  0
2011-01-01T00:38:51Z  ultra.com   9530174728  Male    1
2011-01-01T00:49:33Z  bieber.com  4930097162  Female  1
2011-01-01T00:49:53Z  ultra.com   0381837193  Female  0

Rolled up to one hour by domain and gender:
timestamp             domain      gender  impressions  clicks
2011-01-01T00:00:00Z  bieber.com  Female  4            2
2011-01-01T00:00:00Z  ultra.com   Female  3            1
2011-01-01T00:00:00Z  ultra.com   Male    3            3
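The roll-up shown in these tables can be sketched in a few lines of Python. This is only an illustration of the idea (truncate the timestamp to the hour, group by the chosen dimensions, count rows as impressions and sum `clicked` as clicks), not how Druid implements it internally. The helper names `truncate_to_hour` and `rollup` are made up for this sketch; the rows are the ten raw events from the first table.

```python
from collections import defaultdict

# Raw click events from the slide: (timestamp, domain, user, gender, clicked)
EVENTS = [
    ("2011-01-01T00:01:35Z", "bieber.com", "4312345532", "Female", 1),
    ("2011-01-01T00:03:03Z", "bieber.com", "3484920241", "Female", 0),
    ("2011-01-01T00:04:51Z", "ultra.com", "9530174728", "Male", 1),
    ("2011-01-01T00:05:33Z", "ultra.com", "4098310573", "Male", 1),
    ("2011-01-01T00:05:53Z", "ultra.com", "5832057930", "Female", 0),
    ("2011-01-01T00:06:17Z", "ultra.com", "5789283478", "Female", 1),
    ("2011-01-01T00:23:15Z", "bieber.com", "4730093842", "Female", 0),
    ("2011-01-01T00:38:51Z", "ultra.com", "3909846810", "Male", 1),
    ("2011-01-01T00:49:33Z", "bieber.com", "4930097162", "Female", 1),
    ("2011-01-01T00:49:53Z", "ultra.com", "0381837193", "Female", 0),
]

# Position of each dimension inside an event tuple.
FIELDS = {"domain": 1, "user": 2, "gender": 3}


def truncate_to_hour(ts):
    """Truncate an ISO-8601 UTC timestamp string to the start of its hour."""
    return ts[:13] + ":00:00Z"


def rollup(events, dimensions=()):
    """Aggregate events per (hour, *dimensions) into (impressions, clicks)."""
    totals = defaultdict(lambda: [0, 0])
    for event in events:
        key = (truncate_to_hour(event[0]),) + tuple(
            event[FIELDS[d]] for d in dimensions
        )
        totals[key][0] += 1          # every raw row is one impression
        totals[key][1] += event[4]   # sum of the clicked flag
    return {key: tuple(value) for key, value in sorted(totals.items())}


hourly = rollup(EVENTS)
by_domain_gender = rollup(EVENTS, dimensions=("domain", "gender"))

for key, (impressions, clicks) in by_domain_gender.items():
    print(*key, impressions, clicks)
```

Dropping the high-cardinality `user` column is what lets ten rows collapse into three; the trade-off is that per-user questions can no longer be answered from the rolled-up data.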
BIG DATA DISCOVERY
▸ Roll-up
  ▸ Summarizing over a dimension
▸ Drill-down
  ▸ Focusing (zooming in)
▸ Slicing and dicing
  ▸ Reducing dimensions (slice)
  ▸ Picking values of specific dimensions (dice)
▸ Pivoting
  ▸ Rotating the multi-dimensional cube
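The operations listed above can be illustrated on a tiny in-memory cube. The sketch below is plain Python, not a Druid API; the rows in `ROWS` are hypothetical, and the function names (`roll_up`, `slice_cube`, `dice`, `pivot`) are chosen here only to mirror the bullet points. Drill-down is shown as the inverse of roll-up: the same function called with more dimensions kept.

```python
from collections import defaultdict

# Hypothetical pre-aggregated cells of a cube with three dimensions.
ROWS = [
    {"hour": "00", "domain": "bieber.com", "gender": "Female", "clicks": 2},
    {"hour": "00", "domain": "ultra.com", "gender": "Female", "clicks": 1},
    {"hour": "00", "domain": "ultra.com", "gender": "Male", "clicks": 3},
    {"hour": "01", "domain": "bieber.com", "gender": "Female", "clicks": 1},
    {"hour": "01", "domain": "ultra.com", "gender": "Male", "clicks": 2},
]


def roll_up(rows, keep):
    """Summarize clicks over every dimension that is not listed in keep."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in keep)] += row["clicks"]
    return dict(totals)


def slice_cube(rows, dim, value):
    """Fix one dimension to a single value and drop it from the result."""
    return [
        {k: v for k, v in row.items() if k != dim}
        for row in rows
        if row[dim] == value
    ]


def dice(rows, allowed):
    """Keep rows whose dimensions fall inside the allowed value sets."""
    return [
        row for row in rows
        if all(row[d] in values for d, values in allowed.items())
    ]


def pivot(rows, row_dim, col_dim):
    """Rotate the cube into a row_dim x col_dim table of summed clicks."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_dim]][row[col_dim]] += row["clicks"]
    return {r: dict(cols) for r, cols in table.items()}


by_domain = roll_up(ROWS, ["domain"])                   # roll-up
by_domain_gender = roll_up(ROWS, ["domain", "gender"])  # drill-down
first_hour = slice_cube(ROWS, "hour", "00")             # slice
ultra_male = dice(ROWS, {"domain": {"ultra.com"}, "gender": {"Male"}})  # dice
domain_by_gender = pivot(ROWS, "domain", "gender")      # pivot
```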
OLAP CUBE
▸ Slice and Dice
IN-MEMORY
COLUMNAR STORAGE
DRUID TERMS
▸ Data
▸ Timestamp
▸ Dimension
▸ Metric
▸ Datasource
▸ Segment
▸ Granularity
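To show where these terms typically appear, here is a rough sketch of a Druid `dataSchema` written as a Python dict. The field names follow Druid's ingestion spec as commonly documented, but the exact layout differs between versions (older releases nest `timestampSpec` and `dimensionsSpec` under `parser.parseSpec`), so treat it as an assumption to check against the release in use. The datasource name `ad_events` and the column names are hypothetical, taken from the click example earlier in the deck.

```python
import json

# Hypothetical schema for the click events shown earlier in the deck.
data_schema = {
    # Datasource: the table-like unit that queries address.
    "dataSource": "ad_events",
    # Timestamp: every row needs one; it drives partitioning into segments.
    "timestampSpec": {"column": "timestamp", "format": "iso"},
    # Dimensions: columns to filter and group by.
    "dimensionsSpec": {"dimensions": ["domain", "gender"]},
    # Metrics: aggregations computed at ingestion time (roll-up).
    "metricsSpec": [
        {"type": "count", "name": "impressions"},
        {"type": "longSum", "name": "clicks", "fieldName": "clicked"},
    ],
    # Granularity: segmentGranularity sets the time chunk covered by each
    # segment; queryGranularity sets how finely timestamps are truncated.
    "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "queryGranularity": "hour",
        "rollup": True,
    },
}

print(json.dumps(data_schema, indent=2))
```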
DRUID ARCHITECTURE
[Diagram: Realtime, Broker, and Historical nodes]
ARCHITECTURE - BATCH INGESTION
[Diagram: HDFS, three Historical nodes, and a Broker node; labeled flows: Segments, Queries]
ARCHITECTURE - STREAMING INGESTION
[Diagram: Realtime node, three Historical nodes, and a Broker node; labeled flows: Streaming, Segments, Queries]
ARCHITECTURE - LAMBDA
[Diagram: Realtime node, HDFS, three Historical nodes, and a Broker node; labeled flows: Streaming, Segments, Queries]
GLUE ARCHITECTURE
[Diagram: Stream processor (Tranquility), Kafka Indexing Service, real-time task, three Historical nodes, and a Broker node; labeled flows: Streaming, Segments, Queries]
REAL WORLD ARCHITECTURE
[Diagram components: HAProxy; Broker nodes; Memcached #1 to #3; Coordinator; Overlord; MySQL; Data nodes #1 to #N; Historical nodes #1 to #N; MiddleManagers #1 to #N; ZooKeeper ZK1 to ZK3]
DRUID MONITORING
http://www.slideshare.net/CharlesAllen9/programmatic-bidding-data-streams-druid
DRUID DATASOURCE
RDRUID
https://github.com/druid-io/RDruid
PYDRUID
https://github.com/druid-io/pydruid
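Client libraries such as pydruid ultimately send a JSON query to the Broker. Rather than guess at pydruid's exact call signatures, the sketch below builds a native timeseries query as a plain dict and posts it with the standard library. The datasource `app_events`, the metric `clicks`, the interval, and the Broker address are all hypothetical; `/druid/v2` is the Broker's usual native query path, but verify it against your deployment.

```python
import json
import urllib.request


def build_timeseries_query(datasource, interval, metric):
    """Build a native Druid timeseries query that sums one metric per day."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "day",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric},
        ],
    }


def post_query(broker_url, query):
    """POST the query to a Broker and return the decoded JSON response."""
    request = urllib.request.Request(
        broker_url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical names; replace with a real datasource and metric.
query = build_timeseries_query("app_events", "2016-01-01/2016-01-08", "clicks")
print(json.dumps(query, indent=2))

# Requires a running Broker, so it is left commented out:
# results = post_query("http://localhost:8082/druid/v2", query)
```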
DEMO
▸ Jupyter Notebook (PyDruid)
▸ Mobile app user events for 1 week: 2 billion events
▸ Scenario: cohort analysis of unique users
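The demo itself is not captured in this transcript. As a rough idea of what a unique-user cohort analysis computes, the sketch below groups users by the day they were first seen and counts how many distinct users from each cohort are active on each later day. It is plain Python over a handful of made-up events, not the notebook from the talk; `cohort_table` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (user, day) events; u1 appears twice on the same day.
EVENTS = [
    ("u1", date(2016, 1, 4)),
    ("u2", date(2016, 1, 4)),
    ("u1", date(2016, 1, 5)),
    ("u1", date(2016, 1, 5)),
    ("u3", date(2016, 1, 5)),
    ("u1", date(2016, 1, 6)),
    ("u2", date(2016, 1, 6)),
    ("u3", date(2016, 1, 6)),
]


def cohort_table(events):
    """Return {(cohort_day, days_since_first_seen): unique_user_count}."""
    # A user's cohort is the first day on which they were seen.
    first_seen = {}
    for user, day in sorted(events, key=lambda event: event[1]):
        first_seen.setdefault(user, day)

    # Collect the distinct users active at each offset within each cohort.
    active = defaultdict(set)
    for user, day in events:
        cohort = first_seen[user]
        active[(cohort, (day - cohort).days)].add(user)

    return {key: len(users) for key, users in sorted(active.items())}


table = cohort_table(EVENTS)
for (cohort, offset), users in table.items():
    print(cohort.isoformat(), f"day+{offset}", users)
```

Using a set per cell is what makes the counts unique users rather than raw events. At the scale mentioned on the slide, an exact set per cell is usually replaced by an approximate distinct-count sketch.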
DEMO
MAY THE FORCE BE WITH YOU
REFERENCES
▸ Druid : http://druid.io/
▸ Popit : http://www.popit.kr/tag/druid/ (https://www.facebook.com/popitkr/)
▸ Cohort Analysis : http://www.gregreda.com/2015/08/23/cohort-analysis-with-python/
▸ Druid Meetup@Seoul : http://www.meetup.com/Druid-Seoul/
POPIT
https://www.facebook.com/popitkr/
Q&A
THANK YOU