Distributed stream processing on Fluentd / #fluentd
Post on 10-May-2015
Distributed message stream processing on Fluentd
2012/02/04 Fluentd meetup in Japan
NHN Japan Corp., Web Service Business Division
TAGOMORI Satoshi (@tagomoris)
Feb 4, 2012 (Saturday)
Working at NHN Japan (we are hiring!)
What we are doing about logs with fluentd
data mining
reporting: page views, unique users,
traffic amount per page,
...
super large scale
'sed | grep | wc'-like processes
What we are doing about logs with fluentd
Why fluentd? (not Storm, Kafka, or Flume?)
Ruby, Ruby, Ruby! (NOT Java!)
we are working in a lightweight-language culture
easy to try, easy to patch
Plugin model architecture
Built-in TimeSlicedOutput mechanism
What I'll talk about today
What we are trying with fluentd
How we did, and how we are doing now
What are distributed stream processing topologies like?
What is important about stream processing
Implementation details
(appendix)
Architecture in last week's presentation
[diagram]
servers → deliver (scribed): sends data to both archive servers and Fluentd workers (as stream)
archive server (scribed): large-volume RAID
Fluentd Cluster: converts logs into structured data and writes to HDFS (as stream); imports and converts past logs on demand (as batch)
Hadoop Cluster ← aggregation queries on demand ← Shib (Hadoop Hive web client)
Now
[diagram: same topology]
servers → deliver server (Fluentd): sends data to both archive servers and the Fluentd Cluster
archive server (scribed): large-volume RAID
Fluentd Cluster → HDFS → Hadoop Cluster ← Shib (Hadoop Hive web client)
Fluentd Watcher
Fluentd in production service
10 days
from 127 Web Servers
146 log streams
Scale of Fluentd processes
Scale of Fluentd processes
70,000 messages/sec
120 Mbps
(at peak time)
650 GB/day (non-blog: 100 GB)
Scale of Fluentd processes
89 fluentd instances
on
12 nodes (4-core HT)
Scale of Fluentd processes
We can't go back.
crouton by kbysmnr
log conversion
from: raw log
(Apache combined-like format)
to: structured and query-friendly log
(TAB-separated, some fields masked, many flags added)
What we are trying with fluentd
log conversion
What we are trying with fluentd
99.999.999.99 - - [03/Feb/2012:10:59:48 +0900] "GET /article/detail/6246245/ HTTP/1.1" 200 17509 "http://news.livedoor.com/topics/detail/6246245/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR
3.0.30729; Media Center PC 6.0; InfoPath.1; .NET4.0C)" "news.livedoor.com" "xxxxxxx.xx.xxxxxxx.xxx" "-" 163266
152930 news.livedoor.com /topics/detail/6242972/ GET 302 210 226 - 99.999.999.99 TQmljv9QtXkpNtCSuWVGGg Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X)
AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A406 Safari/7534.48.3 TRUE TRUE FALSE FALSE FALSE FALSE FALSE
hhmmdd vhost path method status bytes duration referer rhost userlabel agent FLAG [FLAGS]
FLAGS: status_redirection status_errors rhost_internal suffix_miscfile suffix_imagefile agent_bot
FLAG: logical OR of FLAGS
userlabel: hash of (tracking cookie / terminal id (mobile phone) / rhost+agent)
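The FLAGS columns above can be sketched as a small predicate table. This Ruby sketch is illustrative only: the field names, network ranges, and suffix/agent patterns are assumptions, not the production rules.

```ruby
# Hypothetical sketch of the FLAGS derivation described above.
# Patterns and ranges are assumptions, not the production rules.
def compute_flags(status:, rhost:, path:, agent:)
  flags = {
    status_redirection: (300..399).cover?(status),
    status_errors:      status >= 400,
    rhost_internal:     rhost.start_with?('10.', '192.168.'),
    suffix_miscfile:    path.end_with?('.css', '.js', '.ico'),
    suffix_imagefile:   path.end_with?('.png', '.gif', '.jpg'),
    agent_bot:          agent.match?(/bot|crawler|spider/i),
  }
  # FLAG = logical OR of all individual FLAGS
  flags.merge(flag: flags.values.any?)
end
```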
TimeSlicedOutput of fluentd
Traditional 'log rotation' is important, but troublesome
We want:
a 2/3 23:59:59 log goes into access.0203_23.log
a 2/4 00:00:00 log goes into access.0204_00.log
What we are trying with fluentd
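The behavior wanted above, bucketing each log line by its own timestamp rather than by flush time, is the core idea of TimeSlicedOutput. A minimal sketch of the bucketing (the chunk-naming format here is an assumption for illustration):

```ruby
# Minimal sketch of the time-slicing idea behind TimeSlicedOutput:
# each event is bucketed by its own timestamp, not by when it is flushed.
def slice_file(event_time, format = 'access.%m%d_%H.log')
  event_time.strftime(format)
end
```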
How we did, and how we are doing now
collect
archive
convert
aggregate
show
How we did it in the past (2011)
collect (scribed) → archive (scribed) → convert (Hadoop Streaming) → aggregate (Hive) → show
stream / stream / hourly, daily / on demand / on demand
store to HDFS
HIGH LATENCY: time to flush + hourly invocation + running time = 20-25 mins
How we are doing now
collect (Fluentd) → archive (scribed) → convert (Fluentd) → aggregate (Hive) → show
stream / stream / stream convert / on demand / on demand
store to HDFS (over Cloudera's Hoop)
VERY LOW LATENCY: 2-3 minutes (only the time to wait for flush)
break. (crouton by kbysmnr)
reasonable efficiency (compared with batch throughput)
easy to re-run the same conversion as a batch
no SPOF
easy to add/remove nodes
What is important about stream processing
How do we re-run a conversion as a batch when we hit trouble?
We want to use 'just one' converter program for both stream processes
and batch processes!
Stream processing and batch
out_exec_filter (fluentd built-in plugin)
1. fork and exec the 'command' program
2. write data to the child process's stdin as TAB-separated fields specified by 'in_keys' (for the tag, remove_prefix is available)
3. read data from the child process's stdout as TAB-separated fields named by 'out_keys' (for the tag, add_prefix is available)
4. set the message's timestamp from the 'time_key' value in the parsed data, using the format specified by 'time_format'
read from stdin / write to stdout
TAB-separated values as input/output
WOW!!!!!!!
difference: a 'tag' column may be needed with out_exec_filter
simple solution: if it doesn't exist, ignore it
'out_exec_filter' and 'Hadoop Streaming'
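A converter shaped like this can serve as both an out_exec_filter 'command' and a Hadoop Streaming mapper. This Ruby sketch is a placeholder (the deck's real converter is convert.sh/convert.pl; the tag pattern and "conversion" here are assumptions) showing the "ignore the tag if it isn't there" idea:

```ruby
# Hypothetical converter usable both as an out_exec_filter 'command'
# and as a Hadoop Streaming mapper. out_exec_filter prepends a 'tag'
# column; Hadoop Streaming does not, so the tag column is passed
# through only when present. The pattern and conversion are stand-ins.
TAG_PATTERN = /\A[a-z][a-z0-9_.]*\z/

def convert_line(line)
  fields = line.chomp.split("\t")
  tag = (fields.first =~ TAG_PATTERN) ? fields.shift : nil
  raw = fields.join("\t")
  converted = raw.upcase  # stand-in for the real log parsing/conversion
  [tag, converted].compact.join("\t")
end

if __FILE__ == $0
  # stream mode: read TSV lines from stdin, write TSV lines to stdout
  $stdin.each_line { |line| $stdout.puts convert_line(line) }
end
```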
reasonable efficiency (compared with batch throughput)
easy to re-run the same conversion as a batch
no SPOF
easy to add/remove nodes
What is important about stream processing
What are distributed stream processing topologies like?
[diagram: servers → deliver nodes ×2 → worker nodes → serializer nodes → HDFS (Hoop Server); deliver nodes also feed archiver + backup]
Redundancy and load balancing MUST be guaranteed everywhere.
Deliver nodes
Accept connections from web servers, copy messages, and send to:
1. archiver (and its backup)
2. convert workers (with load balancing)
3. and ...
useful for casual worker addition/removal
Worker nodes
Under load balancing, as many workers as you want
Serializer nodes
Receive converted data streams from workers, aggregate by service, and:
1. write to storage (HDFS/Hoop)
2. and ...
useful to reduce storage overhead from many concurrent write operations
Watcher nodes
watcher ×2
Watching data for real-time workload reporting and trouble notifications
1. for raw data from delivers
2. for structured data from serializers
break. (crouton by kbysmnr)
Implementation details
log agents on servers (scribeline)
deliver (copy, in_scribe, out_scribe, out_forward)
worker (in/out_forward, out_exec_filter)
serializer/hooper (in/out_forward, out_hoop)
watcher (in_forward, out_flowcounter, out_growthforecast)
log agent: scribeline
log delivery agent tool, Python 2.4, scribe/thrift
easy to set up and start/stop
works with any httpd configuration updates
works with logrotate-ed log files
automatic delivery target failover/takeback
(NEW) Cluster support (random select from server list)
https://github.com/tagomoris/scribe_line
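The "(NEW) Cluster support" bullet, a random pick from the server list with failover, can be sketched as follows. This is in Ruby for consistency with the other sketches here (scribeline itself is Python), and the server names and alive-check are assumptions:

```ruby
# Sketch of 'random select from server list' with failover:
# pick a random server; if it is down, try the rest in random order.
class DeliveryTarget
  def initialize(servers)
    @servers = servers
  end

  # alive is a block that checks whether a server accepts connections;
  # returns the first reachable server in shuffled order, or nil.
  def pick(&alive)
    @servers.shuffle.find(&alive)
  end
end
```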
From scribeline To deliver
[diagram]
servers (scribeline) —scribe→ deliver server (primary) / deliver server (secondary), each running fluentd with in_scribe
category: blog
message: RAW LOG (Apache combined + α)
From scribeline To deliver
deliver 01 (primary)
deliver 02 (secondary)
deliver 03 (primary for high-throughput nodes)
x8 fluentd per node
xNN servers
deliver node internal routing
deliver server (primary): x8 fluentd instances
deliver fluentd:
  in_scribe (add_prefix scribe, remove_newline true)
    → time: received_at / tag: scribe.blog / message: RAW LOG
  copy scribe.*:
    out_scribe (host archive.server.local, remove_prefix scribe, add_newline true)
      → category: blog / message: RAW LOG
    out_flowcounter (see later)
    roundrobin (see next)
    out_forward (see later, with out_flowcounter)
deliver node: roundrobin strategy to workers
roundrobin: x56 substore configurations (7 workers × 8 instances)
  out_forward server: worker01 port 24211, secondary server: worker02 port 24211
  out_forward server: worker01 port 24212, secondary server: worker03 port 24212
  out_forward server: worker01 port 24213, secondary server: worker04 port 24213
  out_forward server: worker01 port 24214, secondary server: worker05 port 24214
message: time: received_at / tag: scribe.blog / message: RAW LOG
From deliver To worker
deliver server: deliver fluentd — copy scribe.* → roundrobin → out_forward
worker server X: worker fluentd Xn1 — in_forward
worker server Y: worker fluentd Yn2 — in_forward
message: time: received_at / tag: scribe.blog / message: RAW LOG
worker node internal routing
worker server: x8 worker instances, x1 serializer instance
worker fluentd:
  in_forward
  out_exec_filter scribe.* (command: convert.sh / in_keys: tag,message / remove_prefix: scribe / out_keys: ....... / add_prefix: converted / time_key: timefield / time_format: %Y%m%d%H%M%S)
  input:  time: received_at / tag: scribe.blog / message: RAW LOG
  output: time: written_time / tag: converted.blog / [many data fields]
  out_forward converted.* → serializer fluentd (in_forward)
serializer fluentd:
  out_hoop converted.blog (hoop_server servername.local / username / path /on_hdfs/%Y%m%d/blog-%H.log)
  out_hoop converted.news (path /on_hdfs/%Y%m%d/news-%H.log)
  TAB-separated text data
out_exec_filter (review)
1. fork and exec the 'command' program
2. write data to the child process's stdin as TAB-separated fields specified by 'in_keys' (for the tag, remove_prefix is available)
3. read data from the child process's stdout as TAB-separated fields named by 'out_keys' (for the tag, add_prefix is available)
4. set the message's timestamp from the 'time_key' value in the parsed data, using the format specified by 'time_format'
out_exec_filter behavior details
worker fluentd: out_exec_filter scribe.* (command: convert.sh / in_keys: tag,message / remove_prefix: scribe / out_keys: ....... / add_prefix: converted / time_key: timefield / time_format: %Y%m%d%H%M%S)
input message: time: received_at / tag: scribe.blog / message: RAW LOG
forked process (convert.sh -> perl convert.pl)
  stdin:  blog RAWLOG
  stdout: blog 20120204175035 field1 field2 .....
output message: time: 2012/02/04 17:50:35 / tag: converted.blog / path:... agent:... referer:... flag1:TRUE
From serializer To HDFS (Hoop)
worker server: serializer fluentd
  in_forward → time: written_time / tag: converted.blog / [many data fields]
  out_hoop converted.blog (hoop_server servername.local / username / path /on_hdfs/%Y%m%d/blog-%H.log)
  out_hoop converted.news (path /on_hdfs/%Y%m%d/news-%H.log)
→ TAB-separated text data over HTTP → Hoop Server → Hadoop NameNode / HDFS
Overview
[diagram: servers → deliver node cluster → worker node cluster → serializers → HDFS (Hoop Server); deliver nodes also feed archiver + backup]
Paw pads. (crouton by kbysmnr)
Traffics: Bytes/sec (on deliver, 2/3-4) [graph]
Traffics: Messages/sec (on deliver, 2/3-4) [graph]
Traffic/CPU/Load/Memory: deliver nodes (2/3-4) [graph]
Traffics: workers' total network traffic [graph]
Traffic/CPU/Load/Memory: a worker (2/3-4) [graph]
Fluentd stream processing
Finally, it works fine now
Log conversion latency dramatically reduced
Many useful plugins for monitoring are waiting to be shipped
Hundreds of cool features to implement are also waiting for us!
crouton by kbysmnr
Thank you!
Appendix (crouton by kbysmnr)
input traffics: by fluent-plugin-flowcounter
deliver server (primary): x8 fluentd instances
deliver fluentd:
  in_scribe (add_prefix scribe, remove_newline true)
    → time: received_at / tag: scribe.blog / message: RAW LOG
  copy scribe.*:
    out_scribe (host archive.server.local, remove_prefix scribe, add_newline true)
      → category: blog / message: RAW LOG
    out_flowcounter
    roundrobin
    out_forward
bytes/messages counting on fluentd
1. 'out_flowcounter' counts input messages, their size (for specified fields), and their rate (/sec)
2. counting results are emitted per minute/hour/day
3. the worker fluentd sends results to the 'Watcher' node over out_forward
4. the Watcher receives the counting results and passes them to 'out_growthforecast'
'GrowthForecast' is a graph-drawing tool with a REST API for data registration, by kazeburo
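Steps 1-2 above can be sketched roughly like this. The field selection and fixed interval are simplifying assumptions; the real plugin is fluent-plugin-flowcounter:

```ruby
# Rough sketch of what out_flowcounter measures: message counts,
# byte sizes of chosen fields, and per-second rates over an interval.
class FlowCounter
  attr_reader :count, :bytes

  def initialize(interval_sec)
    @interval = interval_sec
    @count = 0
    @bytes = 0
  end

  def add(record, fields: ['message'])
    @count += 1
    @bytes += fields.sum { |f| record[f].to_s.bytesize }
  end

  # Emitted once per interval, like flowcounter's minute/hour/day results,
  # then the counters reset for the next interval.
  def flush
    result = { 'count' => @count, 'bytes' => @bytes,
               'count_rate' => (@count.to_f / @interval).round(2),
               'bytes_rate' => (@bytes.to_f / @interval).round(2) }
    @count = @bytes = 0
    result
  end
end
```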
out_forward roundrobin happens per buffer flush!
(per buffer size, or flush_interval)
For a high-throughput stream,
this unit is too large.
We need roundrobin per 'emit'.
Why not out_forward roundrobin in deliver?
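The per-'emit' rotation the slide asks for can be sketched like this. Illustration only: the real deliver config achieves it by wrapping many out_forward substores in a roundrobin output, and the forwarding itself is omitted here.

```ruby
# Sketch of per-emit round-robin: every emit advances to the next
# target, instead of switching only when a whole buffer is flushed.
class PerEmitRoundRobin
  def initialize(targets)
    @targets = targets
    @index = 0
  end

  def emit(record)
    target = @targets[@index]
    @index = (@index + 1) % @targets.size
    [target, record]  # in fluentd this would forward the record
  end
end
```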
deliver node: roundrobin strategy to workers
roundrobin: x56 substore configurations (7 workers × 8 instances)
  out_forward server: worker01 port 24211, secondary server: worker02 port 24211
  out_forward server: worker01 port 24212, secondary server: worker03 port 24212
  out_forward server: worker01 port 24213, secondary server: worker04 port 24213
  out_forward server: worker01 port 24214, secondary server: worker05 port 24214
message: time: received_at / tag: scribe.blog / message: RAW LOG
From worker To serializer: details
worker server: x8 worker instances, x1 serializer instance
worker fluentd → out_forward converted.* (server: localhost; secondary: worker1, worker2, worker3, worker4, worker5, worker6, worker7) → serializer fluentd (in_forward)
normally sends to localhost
in trouble, balances all traffic across all the other workers' serializers
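The localhost-first policy above can be sketched like this. Host names are illustrative, and the real out_forward adds buffering, heartbeats, and retries, all omitted here:

```ruby
# Sketch of localhost-first forwarding with secondary fallback:
# use the local serializer when healthy, otherwise spread traffic
# across the other workers' serializers.
class LocalFirstForwarder
  def initialize(primary, secondaries)
    @primary = primary
    @secondaries = secondaries
  end

  # Returns the target to use for the next send.
  def target(primary_alive)
    primary_alive ? @primary : @secondaries.sample
  end
end
```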
Software list:
scribed: github.com/facebook/scribe/
scribeline: github.com/tagomoris/scribe_line
fluent-plugin-scribe: github.com/fluent/fluent-plugin-scribe
Hoop: http://cloudera.github.com/hoop/docs/latest/ServerSetup.html
fluent-plugin-hoop: github.com/fluent/fluent-plugin-hoop
GrowthForecast: github.com/kazeburo/growthforecast
fluent-plugin-growthforecast: github.com/tagomoris/fluent-plugin-growthforecast
fluent-plugin-flowcounter: github.com/tagomoris/fluent-plugin-flowcounter