Distributed Logging Architecture in Container Era

LinuxCon Japan 2016, June 13, 2016

Satoshi "Moris" Tagomori (@tagomoris)

Fluentd, MessagePack-Ruby, Norikra, ...

Treasure Data, Inc.

Topics

• Microservices and logging in various industries

• Difficulties of logging with containers

• Distributed logging architecture

• Patterns of distributed logging architecture

• Case Study: Docker and Fluentd

• Why OSS are important for logging

Logging

Logging in Various Industries

• Web access logs
  • Views/visitors on media
  • Views/clicks on ads

• Commercial transactions (EC, games, ...)

• Data from devices
  • Operation logs from phone apps
  • Various sensor data

Microservices and Logging

• Monolithic service
  • a single service produces all data about a user's access

• Microservices
  • many services produce data about a user's access
  • logs must be collected from many services to know what is happening

[Figure: Users → Service (Application) → Logs, shown for a monolithic service and for microservices]

Logging and Containers

Containers: "a must" for microservices

• Dividing a service into services
  • each service requires fewer computing resources (VM -> containers)

• Making services independent from each other
  • but it is very difficult :(
  • some dependencies must be resolved even in the development environment (containers on desktop)

Redesign Logging: Why?

• No permanent storage

• No fixed physical/network address

• No fixed mapping between servers and roles

Containers: immutable & disposable

• No permanent storage

• Where to write logs?
  • file in container → gone with the container instance 😞
  • directory shared from host → hosts are shared by many services ☹

• TODO: ship logs from the container to somewhere else ASAP

Containers: unfixed addresses

• No fixed physical / network address

• Where should we go to fetch logs?
  • Service discovery (e.g., Consul) → one more component 😞
  • rsync? ssh+tail? or ...? Is it installed in the container? → one more tool to depend on ☹

• TODO: push logs from containers to wherever they are needed

Containers: instances per role

• No fixed mapping between servers and roles

• How can we parse / store these logs?
  • Central repository of log syntax → very hard to maintain 😞
  • Label logs by source address → many containers/roles on a host ☹

• TODO: label & parse logs at the source of the logs
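
Labeling and parsing at the source could be sketched as follows. This is a minimal illustration, not the method from the talk: the regular expression, the `parse_at_source` helper, and the `service.role` tag scheme are all assumptions made for the example, and real log formats vary.

```python
import re

# Hypothetical pattern for a simplified access-log line; real formats vary.
ACCESS_LOG = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_at_source(line, service, role):
    """Turn one raw line into a labeled key-value record, or None if it does not match."""
    m = ACCESS_LOG.match(line)
    if m is None:
        return None
    record = m.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    # The label is attached where the log is produced, so nothing downstream
    # has to infer the role from a (changing) container address.
    return {"tag": f"{service}.{role}", "record": record}

line = '10.0.0.5 - - [13/Jun/2016:10:00:00 +0900] "GET /items/42 HTTP/1.1" 200 512'
event = parse_at_source(line, service="shop", role="web")
print(event["tag"], event["record"]["path"], event["record"]["status"])
```

Because the tag travels with the record, no central repository of log syntax and no address-to-role mapping is needed downstream.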

Distributed Logging Architecture

Core Architecture

• Collector nodes

• Aggregator nodes

• Destination

[Figure: Collector nodes (Docker containers + agent) → Aggregator nodes → Destination (storage, database, ...)]

Collecting and Storing Data

• Parse (collector)
  • Raw logs are not good for processing
  • Convert logs to structured data (key-value pairs)

• Sort/Shuffle (aggregator)
  • Mixed logs are not good for scanning
  • Split the whole data stream into separate streams

• Store (destination)
  • Format logs (records) as the destination expects
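
The shuffle and store steps can be illustrated with a small sketch. It assumes events already arrive as structured records carrying a `tag` (a hypothetical event shape chosen for this example) and uses JSON lines as a stand-in for whatever format a real destination expects.

```python
import json
from collections import defaultdict

def shuffle_by_tag(events):
    """Aggregator step: split one mixed stream into per-tag streams."""
    streams = defaultdict(list)
    for event in events:
        streams[event["tag"]].append(event["record"])
    return dict(streams)

def format_for_destination(records):
    """Store step: format records as the destination expects (here: JSON lines)."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)

mixed = [
    {"tag": "shop.web", "record": {"path": "/items/42", "status": 200}},
    {"tag": "shop.payment", "record": {"order": 7, "amount": 1200}},
    {"tag": "shop.web", "record": {"path": "/cart", "status": 302}},
]
streams = shuffle_by_tag(mixed)
for tag, records in streams.items():
    print(tag, len(records))
```

Each per-tag stream can then be formatted and written independently, which is what makes the stored data easy to scan.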

Scaling Logging

• Network traffic

• CPU load to parse / format
  • Parse logs on each collector (distributed)
  • Format logs on aggregators (to be distributed)

• Capability
  • Make aggregators redundant

• Controlling delay

Patterns

Aggregation Patterns

[Figure: 2x2 matrix of patterns: source aggregation (NO / YES) × destination aggregation (NO / YES)]

Source Side Aggregation Patterns

[Figure: w/o source aggregation vs. w/ source aggregation; labels: collector, aggregate container, aggregator]

Without Source Aggregation

• Pros:
  • Simple configuration

• Cons:
  • Fixed aggregator (endpoint) address
  • Many network connections
  • High load on the aggregator

[Figure: collector → aggregator]

With Source Aggregation

• Pros:
  • Fewer connections
  • Lower load on the aggregator
  • Less configuration in containers (by specifying localhost)
  • Highly flexible configuration (only the aggregate containers need redeployment)

• Cons:
  • Somewhat more resources (+1 container per host)

[Figure: aggregate container]

Destination Side Aggregation Patterns

[Figure: w/o destination aggregation vs. w/ destination aggregation; labels: collector, aggregator, destination]

Without Destination Aggregation

• Pros:
  • Fewer nodes
  • Simpler configuration

• Cons:
  • Storage-side changes affect the collector side
  • Worse performance: many small write requests on storage

With Destination Aggregation

• Pros:
  • Collector-side configuration is free from storage-side changes
  • Better performance with fine tuning on the destination-side aggregator

• Cons:
  • More nodes
  • Slightly more complex configuration
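
One way a destination-side aggregator can turn many small writes into fewer large ones is by buffering. The sketch below is only an illustration of that idea; `BatchingAggregator`, `write_batch`, and the batch size of 3 are hypothetical names and values, and a real aggregator would also flush on a timer and handle write failures.

```python
class BatchingAggregator:
    """Buffers records and hands them to the storage writer in chunks."""

    def __init__(self, write_batch, batch_size=3):
        self.write_batch = write_batch   # callable that performs one storage write
        self.batch_size = batch_size
        self.buffer = []

    def emit(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.write_batch(self.buffer)
            self.buffer = []

writes = []  # stands in for the storage; each element is one write request
agg = BatchingAggregator(write_batch=writes.append, batch_size=3)
for i in range(7):
    agg.emit({"seq": i})
agg.flush()  # flush the remainder, e.g. on shutdown
print(len(writes), [len(batch) for batch in writes])
```

Seven records reach the storage as three write requests instead of seven, and the batch size is tuned in one place without touching collectors.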


Scaling Patterns

• Scaling Up Endpoints
  • HTTP/TCP load balancer
  • Huge queue + workers

• Scaling Out Endpoints
  • Round-robin clients

[Figure labels: load balancer, backend nodes, collector nodes, aggregator nodes]

Scaling Up Endpoints

• Pros:
  • Simple configuration in collector nodes

• Cons:
  • There is a limit to scaling up

[Figure labels: load balancer, backend nodes]

Scaling Out Endpoints

• Pros:
  • Unlimited scaling by adding aggregator nodes

• Cons:
  • Complex configuration
  • Requires client-side support for round-robin
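
The client-side round-robin that scaling out relies on might look like the sketch below. It is an assumption-laden illustration: `RoundRobinClient` and `fake_send` are hypothetical, the `agg-N` host names are made up, and the send function is a stand-in for a real TCP connection.

```python
import itertools

class RoundRobinClient:
    """Rotates over a known list of aggregator endpoints, skipping ones that fail."""

    def __init__(self, endpoints, send):
        self.endpoints = list(endpoints)
        self.send = send  # callable(endpoint, record); raises OSError on failure
        self._cycle = itertools.cycle(self.endpoints)

    def emit(self, record):
        # Try each endpoint at most once per record, starting from the next in rotation.
        for _ in range(len(self.endpoints)):
            endpoint = next(self._cycle)
            try:
                self.send(endpoint, record)
                return endpoint
            except OSError:
                continue
        raise RuntimeError("no aggregator endpoint is reachable")

delivered = []

def fake_send(endpoint, record):
    # Stand-in for a TCP send; "agg-2" simulates a node that is down.
    if endpoint == "agg-2:24224":
        raise OSError("connection refused")
    delivered.append((endpoint, record["seq"]))

client = RoundRobinClient(["agg-1:24224", "agg-2:24224", "agg-3:24224"], fake_send)
for i in range(4):
    client.emit({"seq": i})
print(delivered)
```

The sketch also shows the cost named in the cons: every client has to be configured with the full endpoint list and carry the rotation and failover logic itself.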

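• Scaling Up Endpoints, Without Destination Aggregation: systems in early stages

• Scaling Up Endpoints, With Destination Aggregation: collecting logs over the Internet, or using queues

• Scaling Out Endpoints, Without Destination Aggregation: impossible :( Collector nodes must know all endpoints → uncontrollable

• Scaling Out Endpoints, With Destination Aggregation: collecting logs in a datacenter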

Case Studies

Case Study: Docker+Fluentd

• Destination aggregation + scaling up
  • Fluent logger + Fluentd

• Source aggregation + scaling up
  • Docker json logger + Fluentd + Elasticsearch
  • Docker fluentd logger + Fluentd + Kafka

• Source/Destination aggregation + scaling out
  • Docker fluentd logger + Fluentd

Why Fluentd?

• Docker Fluentd logging driver
  • Docker containers can send logs to Fluentd directly, with less overhead

• Pluggable architecture
  • Various destination systems

• Small memory footprint
  • Source aggregation requires +1 container per host
  • Little additional resource usage (< 100MB)

Destination aggregation + scaling up

• Sending logs directly over TCP with Fluentd logger in application code

• The same pattern as New Relic

[Figure: application code]

Source aggregation + scaling up

• Kubernetes: Json logger + Fluentd + Elasticsearch

• Applications write logs to STDOUT

• Docker writes logs as JSON in files

• Fluentd reads logs from the files, parses the JSON objects, and writes the logs to Elasticsearch

http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/
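
The read-and-parse step of this pattern can be sketched in a few lines. Docker's json-file driver writes one JSON object per line with `log`, `stream`, and `time` keys; the `read_docker_json_lines` helper, the sample lines, and the `docker.<name>` tag scheme below are assumptions for illustration, and this is not how Fluentd itself is implemented.

```python
import json

def read_docker_json_lines(lines, container_name):
    """Yield structured events from Docker json-file log lines."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        yield {
            "tag": f"docker.{container_name}",  # hypothetical tag scheme
            "time": entry["time"],
            "stream": entry["stream"],
            "message": entry["log"].rstrip("\n"),
        }

# Sample lines in the json-file format (contents are made up for the example).
sample = [
    '{"log":"GET /items/42 200\\n","stream":"stdout","time":"2016-06-13T01:00:00.000000000Z"}',
    '{"log":"upstream timeout\\n","stream":"stderr","time":"2016-06-13T01:00:01.000000000Z"}',
]
events = list(read_docker_json_lines(sample, "web"))
for e in events:
    print(e["stream"], e["message"])
```

The application only writes to STDOUT; everything about files, parsing, and the destination stays outside the container.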

[Figure: application code → files (JSON) → Elasticsearch]

Source aggregation + scaling up

• Docker fluentd logging driver + Fluentd + Kafka


• Docker sends logs to Fluentd on localhost

• Fluentd receives logs over TCP and pushes them into Kafka

[Figure: application code → Kafka]

Source/Destination aggregation + scaling out

• Docker fluentd logging driver + Fluentd


• Docker sends logs to Fluentd on localhost

• Fluentd receives logs over TCP and sends them to aggregator Fluentd nodes with round-robin load balancing

[Figure: application code]

What's the Best?

• Writing logs from containers: there are several ways to do it
  • Docker logging driver
  • Write logs to files + read/parse them
  • Send logs from apps directly

• Keep it scalable!
  • Source aggregation: Fluentd on localhost
  • Scalable storage (Kafka, external services, ...)
    • No destination aggregation + scaling up
  • Non-scalable storage (filesystems, RDBMSs, ...)
    • Destination aggregation + scaling out

Why OSS Are Important For Logging?

Why OSS?

• Logging layer is an interface
  • transparency
  • interoperability

• Keep it scalable
  • number of nodes
  • number of types of sources/destinations

Use OSS, Make Logging Scalable
