influx db talk-20150415

Intro to InfluxDB [email protected]

Upload: richard-elling

Post on 17-Jul-2015


Page 2: Influx db talk-20150415

Features

HTTP(S) API with user access controls

Scalability

Billions of data points

Hundreds of thousands of series

Multiple nodes

Managed retention policies

Simple to install and manage — no external dependencies

Page 3: Influx db talk-20150415

Dev Features

github.com/influxdb

Written in Go

SQL-like query language

Client libraries available for your favorite dev environment

python, javascript, node.js, java, R, ruby, C#, PHP, …

HTTP: curl, httpie, wget

MIT license
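As a rough sketch of what using the HTTP API without a client library involves, the snippet below builds the URL and JSON body for a write in the 0.8-style format (a list of series objects, each with a name, columns, and points). The host, database name, credentials, and series name are placeholders, and the endpoint shape is an assumption based on the 0.8 API, so check it against the documentation for your version. Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def build_write_request(host, port, db, user, password, series, columns, points):
    """Return (url, body) for a 0.8-style write. Does not send anything."""
    # Assumed 0.8 endpoint shape: /db/<database>/series?u=<user>&p=<password>
    query = urlencode({"u": user, "p": password})
    url = "http://{}:{}/db/{}/series?{}".format(host, port, db, query)
    # One series object per series name; each point is a row matching `columns`.
    body = json.dumps([{"name": series, "columns": columns, "points": points}])
    return url, body


url, body = build_write_request(
    "localhost", 8086, "db_name", "user", "password",
    "datacenter.0.server.elvis.temperature", ["value"], [[21.5]],
)
print(url)
print(body)
```

The resulting pair could be handed to curl, httpie, or any HTTP library as a POST request.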

Page 4: Influx db talk-20150415

Ops Features

Security model separates admins from users

Active and vibrant community

Flexible data retention policies

Time-based sharding

Downsample data using different time windows

Expand storage space by adding nodes

Page 5: Influx db talk-20150415

Why We Chose InfluxDB?

Need telemetry, events, and status from systems

Information, not just numbers

100k+ metrics per system, 2-4k are interesting to measure forever

Events and configuration

Collecting more relational data, extensible, JSON works well

Requirements rule out many “metrics-oriented” time-series solutions

Feed from collectd and HTTP POST

Open source, redistribution and contribution friendly license (MIT)

Page 6: Influx db talk-20150415

Deployment Architecture

Page 7: Influx db talk-20150415

Schema

Version 0.8

Embed metadata into series name

Similar to graphite

name1.value1.name2.value2.metric

datacenter.0.server.elvis.temperature
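The 0.8 naming convention above can be sketched as a pair of helpers that build and split dotted series names. The function names are hypothetical, and the sketch assumes that no individual component contains a dot.

```python
def make_series_name(tags, metric):
    """Join (name, value) pairs and a metric into a dotted series name."""
    parts = []
    for name, value in tags:
        parts.extend([str(name), str(value)])
    parts.append(metric)
    return ".".join(parts)


def parse_series_name(series):
    """Split a dotted series name back into (tags, metric).

    Assumes the name.value pairs come first and the metric is last.
    """
    parts = series.split(".")
    metric = parts[-1]
    pairs = parts[:-1]
    if len(pairs) % 2 != 0:
        raise ValueError("expected name.value pairs before the metric")
    tags = list(zip(pairs[0::2], pairs[1::2]))
    return tags, metric


name = make_series_name([("datacenter", 0), ("server", "elvis")], "temperature")
print(name)
print(parse_series_name(name))
```

Because the metadata lives in the name, every value comes back as a string and filtering on it means matching against the series name itself.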

Version 0.9

Spoiler alert

Page 8: Influx db talk-20150415

Queries

SQL-like query language

select * from series_name

select value from series_name where time > '2015-04-15'

select value from series_name where time > now() - 1h

select value from series_name where time > now() - 1d limit 100

Regular expressions are handy

select * from /.*\.elvis\..*/ limit 10

select value from /^MyCompany\..*/ limit 1
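To see which series names the two patterns above would select, here is a small sketch using Python's `re` module as a stand-in. InfluxDB is written in Go and uses its own regular-expression engine, so this only illustrates the matching for these simple patterns. The series names in the list are made up.

```python
import re

# Hypothetical series names in the 0.8 dotted style.
series_names = [
    "datacenter.0.server.elvis.temperature",
    "datacenter.0.server.priscilla.temperature",
    "MyCompany.web.requests",
]


def matching(pattern, names):
    """Return the names that the pattern matches anywhere in the string."""
    return [n for n in names if re.search(pattern, n)]


elvis = matching(r".*\.elvis\..*", series_names)
mycompany = matching(r"^MyCompany\..*", series_names)
print(elvis)
print(mycompany)
```

The escaped dots matter: an unescaped `.` would also match any other character in that position.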

Page 9: Influx db talk-20150415

Queries do math

count, top, bottom

min, max, mean, mode, median, stddev

distinct

percentile

histogram

first, last, difference, sum, derivative

select mean(value) from series_name where time > now() - 1h

select derivative(value) from series_name where time > now() - 1h group by time(60s) order asc
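The idea behind a grouped derivative can be sketched in plain Python: bucket the points into 60-second windows and compute the rate of change across each window. This illustrates the concept only; the exact semantics of InfluxDB's `derivative` may differ, and the sample values are invented.

```python
from collections import defaultdict


def rate_per_bucket(points, bucket_seconds=60):
    """points: iterable of (unix_seconds, value).

    Returns {bucket_start: change in value per second} for every bucket
    that holds at least two points with distinct timestamps.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append((ts, value))
    rates = {}
    for start, pts in sorted(buckets.items()):
        pts.sort()
        (t0, v0), (t1, v1) = pts[0], pts[-1]
        if t1 > t0:
            rates[start] = (v1 - v0) / (t1 - t0)
    return rates


# A monotonically increasing counter sampled every 30 seconds.
samples = [(0, 100), (30, 160), (60, 200), (90, 290)]
rates = rate_per_bucket(samples)
print(rates)
```

This kind of calculation is what turns an ever-growing counter into a usable rate.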

Page 10: Influx db talk-20150415

Continuous Queries

Useful for downsampling

Choices:

downsample every time you query

downsample in advance and store the results

Restricted query: only admins can create continuous queries

Powerful with many different options and applications

select mean(value) from series_name group by time(5m) into series_name.mean.5m
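What "downsample in advance and store the results" amounts to can be sketched in plain Python: group raw points into 5-minute windows and keep one mean per window. This is an illustration of the computation, not InfluxDB's implementation, and the sample points are invented.

```python
from collections import defaultdict


def downsample_mean(points, window_seconds=300):
    """points: iterable of (unix_seconds, value).

    Returns a sorted list of (window_start, mean_value).
    """
    sums = defaultdict(lambda: [0.0, 0])  # window_start -> [total, count]
    for ts, value in points:
        start = ts - ts % window_seconds
        sums[start][0] += value
        sums[start][1] += 1
    return [(start, total / count) for start, (total, count) in sorted(sums.items())]


raw = [(0, 10.0), (120, 20.0), (299, 30.0), (300, 40.0), (450, 60.0)]
downsampled = downsample_mean(raw)
print(downsampled)
```

Storing the downsampled series trades some disk space for much cheaper queries over long time ranges.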

Page 11: Influx db talk-20150415

Python plugin

    import json
    from influxdb import InfluxDBClient

    client = InfluxDBClient('localhost', 8086, 'user', 'password', 'db_name')
    print(json.dumps(client.query('list series'), indent=4))

Output:

    [
        {
            "points": [
                [
                    0,
                    "Node.elvis.CPU_stats.0.derive.cpu_nsec_idle"
                ],
                …
            ],
            "name": "list_series_result",
            "columns": [
                "time",
                "name"
            ]
        }
    ]
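A result shaped like the one above can be reduced to a flat list of series names with a few lines of Python. The helper name is hypothetical, and the sample data simply mirrors the structure shown on the slide; no server is contacted.

```python
def series_names(result):
    """Extract the 'name' column from a 'list series' style result."""
    names = []
    for entry in result:
        # Look the column up by label instead of assuming its position.
        name_index = entry["columns"].index("name")
        names.extend(point[name_index] for point in entry["points"])
    return names


sample_result = [
    {
        "points": [[0, "Node.elvis.CPU_stats.0.derive.cpu_nsec_idle"]],
        "name": "list_series_result",
        "columns": ["time", "name"],
    }
]
print(series_names(sample_result))
```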

Page 12: Influx db talk-20150415

Managing Shards

Set up shard spaces when creating databases (!)

    {
      "spaces": [{
        "name": "detail",
        "retentionPolicy": "10d",
        "shardDuration": "2d",
        "regex": "/.*/",
        "replicationFactor": 1,
        "split": 1
      }]
    }

    object {
      array {
        object {
          string name;              // space name
          string retentionPolicy;   // minimum time to keep
          string shardDuration;     // max expected group by time()
          number replicationFactor; // number of replicas
          number split;             // shards per period
        };
      } spaces;
    };
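A minimal sketch of generating the shard-space configuration above from Python, together with a back-of-the-envelope count of how many shard periods span the retention window. The helper names are hypothetical, the duration parser handles only a single unit suffix (s, m, h, d), and the period count is a rough estimate, not a statement about how InfluxDB allocates shards.

```python
import json
import math

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a string such as '10d' or '2h' to seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def make_shard_space(name, retention, shard_duration,
                     regex="/.*/", replication=1, split=1):
    """Build one entry of the 'spaces' array using the field names from the slide."""
    return {
        "name": name,
        "retentionPolicy": retention,
        "shardDuration": shard_duration,
        "regex": regex,
        "replicationFactor": replication,
        "split": split,
    }


config = {"spaces": [make_shard_space("detail", "10d", "2d")]}
periods = math.ceil(parse_duration("10d") / parse_duration("2d"))
print(json.dumps(config, indent=2))
print(periods)
```

With a 10-day retention and 2-day shards, roughly five shard periods cover the retention window, each split into `split` shards.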

Page 13: Influx db talk-20150415

InfluxDB Version 0.8

current stable release

end of the road for 0.8 (0.8.8)

database back-ends: LevelDB (use this), RocksDB, HyperLevelDB, and LMDB

caveat: clustering is completely redesigned in 0.9

Page 14: Influx db talk-20150415

Futures

Version 0.9 in release-candidate stage (start testing now!)

Significant redesign; migration may be challenging

Tags for fast, efficient queries — see docs and begin schema planning now

Dropping multiple database backends — using BoltDB

Clustering, replication, high-availability

Streaming raft implementation

Role = broker for raft consensus

Role = data for hosting data and answering queries

Page 15: Influx db talk-20150415

www.influxdb.com

https://groups.google.com/forum/#!forum/influxdb

@InfluxDB

[email protected] #richardelling

Demo and Questions