druidmeetup@seoul 0906
DRUID REAL WORLD
DRUID MEETUP@SEOUL
YOU SUN JEONG ([email protected])
WELCOME TO DRUID WORLD
THE JOURNEY OF 9 MONTHS
DRUID OVERVIEW
REALTIME
BROKER HISTORICAL
ARCHITECTURE - BATCH INGESTION
HDFS
HISTORICAL NODE
HISTORICAL NODE
HISTORICAL NODE
BROKER NODE
Segments
Queries
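As a rough illustration of the "Queries" arrow into the broker node, the sketch below builds a Druid native timeseries query as a Python dict. Brokers accept native JSON queries over HTTP at `/druid/v2`. The data source name `sensor_events`, the metric name `a`, and the interval are hypothetical placeholders, not values from the talk.

```python
import json


def timeseries_query(data_source, interval, metric, granularity="hour"):
    """Build a native Druid timeseries query with sum/min/max over one metric."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,      # hypothetical data source name
        "granularity": granularity,
        "intervals": [interval],        # ISO-8601 interval, "start/end"
        "aggregations": [
            {"type": "doubleSum", "name": metric + "_sum", "fieldName": metric},
            {"type": "doubleMin", "name": metric + "_min", "fieldName": metric},
            {"type": "doubleMax", "name": metric + "_max", "fieldName": metric},
        ],
    }


# Hypothetical example values; serialize to the JSON body a broker would receive.
query = timeseries_query("sensor_events", "2000-01-01/2000-01-02", "a")
payload = json.dumps(query)
```

The payload would be sent as the body of a POST request with `Content-Type: application/json`; the broker fans the query out to the nodes holding the relevant segments and merges their results.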
ARCHITECTURE - STREAMING INGESTION
REALTIME NODE
HISTORICAL NODE
HISTORICAL NODE
HISTORICAL NODE
BROKER NODE
Segments
Queries
Streaming
ARCHITECTURE - LAMBDA
REALTIME NODE
HISTORICAL NODE
HISTORICAL NODE
HISTORICAL NODE
BROKER NODE
Segments
Queries
Streaming
HDFS
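The lambda idea in this diagram can be sketched as follows: streaming ingestion makes recent data queryable right away, while batch ingestion from HDFS later replaces the same time ranges with recomputed results. This is a deliberately simplified illustration, not Druid's actual segment versioning logic. The function name `merge_views` and the hourly bucket keys are hypothetical.

```python
def merge_views(batch_view, realtime_view):
    """Combine per-hour aggregates, preferring batch results where both exist.

    Both arguments map a time bucket (string) to an aggregate value.
    """
    merged = dict(realtime_view)   # start from the streaming (recent) results
    merged.update(batch_view)      # batch results overwrite overlapping buckets
    return dict(sorted(merged.items()))


# Hypothetical data: hour 01 exists in both views, hour 02 only in realtime.
batch = {"2000-01-01T00": 120, "2000-01-01T01": 95}
realtime = {"2000-01-01T01": 90, "2000-01-01T02": 40}
result = merge_views(batch, realtime)
```

Here the batch value for hour 01 wins over the streaming value, and hour 02 is served from the streaming side until a batch job covers it.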
REAL WORLD IS CRUEL
PROBLEMS
▸ For Data Scientists
Arbitrary and Interactive exploration of time series data
▸ Scalability and Performance
Ad-hoc query on trillions of events
▸ Characteristics of the data
Dynamic Columns
Numeric data with Array Type
DATA LAKE
http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html
STAT FUNCTIONS FOR DATA SCIENTISTS
▸ HISTOGRAM (MEDIAN)
▸ MEAN
▸ STDDEV
▸ RANGE
▸ AREA
▸ MIN/MAX/SUM
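To make the listed functions concrete, here is a minimal plain-Python sketch of what each one computes over a numeric array. It is not the extension's implementation. The median is approximated from an equal-width histogram, the standard deviation is assumed to be the population form, and AREA is left out because the slide does not define it. All function names are hypothetical.

```python
import statistics


def histogram(values, bins=10):
    """Return (low, bucket_width, counts) for an equal-width histogram."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls into the last bucket; width 0 means one bucket.
        idx = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return lo, width, counts


def approx_median(lo, width, counts):
    """Estimate the median by interpolating inside the bucket that crosses 50%."""
    half = sum(counts) / 2
    seen = 0
    for i, c in enumerate(counts):
        if c > 0 and seen + c >= half:
            return lo + (i + (half - seen) / c) * width
        seen += c
    return lo


def summarize(values, bins=10):
    lo, width, counts = histogram(values, bins)
    return {
        "median": approx_median(lo, width, counts),
        "mean": statistics.fmean(values),
        "stddev": statistics.pstdev(values),  # population stddev (assumption)
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
    }


stats = summarize([1.0, 2.0, 3.0, 4.0, 5.0], bins=4)
```

The histogram-based median is only an estimate: its accuracy depends on the bucket count, which is the usual trade-off when the statistic has to be merged across many segments.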
DATA EXPLODING
// column size 100 ~ 1000
KEYS = ["a", "b", "c", ...]
VALUES = [3.25, 45.443, 103.2, ...]

"a" = 3.25
"b" = 45.443
"c" = 103.2
...
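The "exploding" step above, turning parallel KEYS and VALUES arrays into one named column per key, can be sketched in a few lines. The function name `explode` is hypothetical, and the sample values are the three shown on the slide.

```python
def explode(keys, values):
    """Pair each key with the value at the same position, one column per key."""
    if len(keys) != len(values):
        raise ValueError("KEYS and VALUES must have the same length")
    return dict(zip(keys, values))


columns = explode(["a", "b", "c"], [3.25, 45.443, 103.2])
```

With 100 to 1000 keys per record, every row expands into that many columns, which is why the slide describes the data as exploding.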
REAL WORLD ARCHITECTURE
DATA NODE #1
DATA NODE #70
OVERLORD
MIDDLE MANAGER #1
COORDINATOR
MYSQL
HA PROXY
MEMCACHED #2
BROKER NODE #1
BROKER NODE #1
MEMCACHED #3
MEMCACHED #1
HISTORICAL NODE #1
HISTORICAL NODE #70
MIDDLE MANAGER #50
ZK1
ZK2
ZK3
BUT, WE DID IT
ORC INGESTION
VIRTUAL COLUMNS
JDBC FIREHOSE
STATS-EXTENSIONS
OPTIMIZATION
QUERY
NO PAIN, NO GAIN
MAY THE FORCE BE WITH YOU
Q&A
THANK YOU