OSDC 2015: David Norton | InfluxDB - Scalable Metrics Made Easy

@InfluxDB David Norton (@dgnorton) [email protected]

Upload: netways

Post on 15-Jul-2015

TRANSCRIPT

What is it for?

"time series"

Values and time stamps

[Chart: value plotted against time]

time                  value
2015-04-22 5:00 PM    10,0
2015-04-22 6:00 PM    20,0

CPU usage every 10 seconds...

[Chart: value over time]

Cumulative steps taken...

[Chart: step count over time]

Happy InfluxDB users...

[Chart: number of InfluxDB users over time]

Time series database

[Charts: the three examples above, followed by a generic value-over-time chart]

Types of data...

Metrics

Time series

Analytics

Events

Use Cases
- DevOps
- Real-time analytics
- Sensor Data / IoT

Data can come from both physical and logical sources

Storing the data

Can't we use a regular DB?

Order by time?

time                  value
2015-04-22 5:00 PM    10,000
2015-04-22 6:00 PM    20,000

S C A L E

Example from metrics:

10 hosts

100 measurements per host

8640 per day (once every 10s)

365 days

= 3,153,600,000 records per year

Have fun with that table...

But wait, we'll just keep the summaries!

1h averages = 8,760,000 per year
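
The two record counts above follow from simple multiplication. A minimal sketch that reproduces them, using only the figures from the slides (the constant and function names are made up for illustration):

```python
# Back-of-the-envelope record counts for the metrics example on the slides.
HOSTS = 10
MEASUREMENTS_PER_HOST = 100
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def records_per_year(interval_seconds: int) -> int:
    """Number of records stored per year at one sample per interval."""
    samples_per_day = SECONDS_PER_DAY // interval_seconds
    return HOSTS * MEASUREMENTS_PER_HOST * samples_per_day * DAYS_PER_YEAR


raw = records_per_year(10)       # one sample every 10 seconds
hourly = records_per_year(3600)  # one 1h average per measurement

print(f"{raw:,}")     # 3,153,600,000
print(f"{hourly:,}")  # 8,760,000
print(raw // hourly)  # 360
```

Keeping only hourly averages shrinks the table by a factor of 360, which is exactly the detail that gets lost.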

Lose detail and ad hoc queryability

Blurry low res version of the data

So let's use Cassandra, HBase, etc.

Too much application code and complexity

Application logic and scripts to compute summaries

Application level logic for balancing

No data locality for ad hoc queries

How to handle data retention?

And then there's more...

Web services

Libraries for web services

Data collection

Visualization

Let's summarize...
- 8 open-source time series DBs listed on Wikipedia
- 6 built on top of Cassandra, HBase, etc.
- 2 no longer maintained
- 0 built on top of SQL-type DBs

"Building an application with an analytics component today is like building a web application in 1998. You spend months building infrastructure before getting to the actual thing you want to build."

--Paul Dix | InfluxDB CEO

Analytics and monitoring should be about analyzing and interpreting data, not the infrastructure to store and process it.

InfluxDB
- time series database
- no external dependencies
- distributed & scalable
- easy to install, use, & maintain
- open source (MIT license)
- written in Go

Data model...

[Diagram: influxd stores measurements on DISK]

tags

country = 'DEU'
region = 'neast'

series = measurement + unique tag set

measurement + { country = 'DEU', region = 'west' }

OR

measurement + { country = 'DEU', region = 'east' }

Points: values in a series

measurement + { country = 'DEU', region = 'east' }

time                  value
2015-04-22 5:00 PM    10,0
2015-04-22 5:01 PM    20,0

measurement + { country = 'USA', region = 'east' }

time                  value
2015-04-22 5:00 PM    14,0
2015-04-22 5:01 PM    38,0
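
The rule "series = measurement + unique tag set" can be illustrated with a small sketch. This is not how InfluxDB stores data internally; it only shows how points with the same measurement and the same tags end up in the same series. The measurement name `cpu_load`, the helper `series_key`, and the numeric values are assumptions made for this example.

```python
from collections import defaultdict


def series_key(measurement, tags):
    """Identify a series by its measurement name plus its sorted tag set."""
    return (measurement, tuple(sorted(tags.items())))


# Hypothetical points, loosely modeled on the tables above:
# (measurement, tags, time, value)
points = [
    ("cpu_load", {"country": "DEU", "region": "east"}, "2015-04-22 17:00", 10.0),
    ("cpu_load", {"country": "DEU", "region": "east"}, "2015-04-22 17:01", 20.0),
    ("cpu_load", {"country": "USA", "region": "east"}, "2015-04-22 17:00", 14.0),
    ("cpu_load", {"country": "USA", "region": "east"}, "2015-04-22 17:01", 38.0),
]

# Group the points by series: one list of (time, value) pairs per series.
series = defaultdict(list)
for measurement, tags, time, value in points:
    series[series_key(measurement, tags)].append((time, value))

for key, values in series.items():
    print(key, values)
```

Four points with two distinct tag sets produce two series. The order in which tags are written does not matter, because the key sorts them first.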

How many series can it handle?

Data retention & replication...

InfluxDB has Retention Policies

All data is written to a retention policy

Replication is handled through Retention Policies

Continuous Queries
- downsample or aggregate on the fly
- saved in the database
- rerun periodically
- great for expensive queries
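
To make "downsample" concrete, here is a conceptual sketch of what a downsampling query computes: raw points are grouped into hourly buckets and each bucket is averaged. It is plain Python, not InfluxDB's query language or implementation; the function name `downsample_hourly` and the sample points are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def downsample_hourly(points):
    """Group (timestamp, value) points into hourly buckets and average each one."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its hour.
        bucket_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical raw points.
raw_points = [
    (datetime(2015, 4, 22, 17, 0, 0), 10.0),
    (datetime(2015, 4, 22, 17, 0, 10), 20.0),
    (datetime(2015, 4, 22, 18, 0, 0), 30.0),
]

hourly = downsample_hourly(raw_points)
for start, average in hourly.items():
    print(start, average)
```

A continuous query saves the application from running a job like this itself: the database stores the query and reruns it periodically.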

Writing data

curl -XPOST 'http://localhost:8086/write' -d '...'

{
  "database": "mydb",
  "retentionPolicy": "default",
  "points": [
    {
      "name": "cpu_load_short",
      "tags": {
        "host": "server01",
        "region": "us-west"
      },
      "timestamp": "2009-11-10T23:00:00Z",
      "fields": {
        "value": 0.64
      }
    }
  ]
}
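
The same write can be prepared from a script. The sketch below builds the POST request for the payload shown on the slide using only the Python standard library. The helper name `build_write_request` and the `Content-Type` header are assumptions, and the JSON body format is the one from this talk, which may differ in other InfluxDB versions. The request is only constructed here; sending it requires a running server.

```python
import json
import urllib.request


def build_write_request(base_url, payload):
    """Build (but do not send) a POST request to the /write endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + "/write",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


# Payload copied from the slide.
payload = {
    "database": "mydb",
    "retentionPolicy": "default",
    "points": [
        {
            "name": "cpu_load_short",
            "tags": {"host": "server01", "region": "us-west"},
            "timestamp": "2009-11-10T23:00:00Z",
            "fields": {"value": 0.64},
        }
    ],
}

request = build_write_request("http://localhost:8086", payload)
print(request.get_method(), request.full_url)

# To actually send it (needs a server listening on localhost:8086):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```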

Querying

curl -G 'http://localhost:8086/query' --data-urlencode "q=..."

SQL-ish query language

SELECT value FROM cpu_load_short WHERE region='us-west'

{
  "results": [
    {
      "series": [
        {
          "name": "cpu_load_short",
          "tags": {
            "host": "server01",
            "region": "us-west"
          },
          "columns": [ "time", "value" ],
          "values": [
            [ "2015-01-29T21:51:28.968422294Z", 0.64 ]
          ]
        }
      ]
    }
  ]
}
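
A client has to do two things with this API: URL-encode the query, as curl's --data-urlencode does, and turn the columns/values arrays of the response into rows. A minimal sketch of both steps, assuming the response has the shape shown on the slide; the helper names `build_query_url` and `rows_from_response` are made up for this example.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url, query):
    """Return the /query URL with the query string URL-encoded in 'q'."""
    return base_url + "/query?" + urlencode({"q": query})


def rows_from_response(response_text):
    """Flatten a query response into a list of {column: value} dicts."""
    rows = []
    for result in json.loads(response_text).get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                rows.append(dict(zip(series["columns"], values)))
    return rows


query = "SELECT value FROM cpu_load_short WHERE region='us-west'"
url = build_query_url("http://localhost:8086", query)
print(url)

# Response body copied from the slide.
response_text = """
{"results": [{"series": [{"name": "cpu_load_short",
  "tags": {"host": "server01", "region": "us-west"},
  "columns": ["time", "value"],
  "values": [["2015-01-29T21:51:28.968422294Z", 0.64]]}]}]}
"""
rows = rows_from_response(response_text)
print(rows)
```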

Grafana Dashboards

Let's play with InfluxDB
- install
- config
- write, discover, query

Thank You!

@InfluxDB

David Norton (@dgnorton)

[email protected]