time series storage in cassandra
DESCRIPTION
Presented at Cassandra London (April 7, 2014); The challenges of time-series storage and analytics in OpenNMS, with an introduction to Newts, a new Cassandra-based time-series data store.TRANSCRIPT
Open
Open
Network
Management
System
OpenNMS: What It Is
● Network Management System○ Discovery and Provisioning○ Service monitoring○ Data collection○ Event management and notifications
● Java, open source, GPLv3● Since 1999
Graph All The Things
RRDTool
● Round robin database● First released 1999● Time-series storage● File-based● Constant-size● Automatic, amortized aggregation
Consider
● 2 IOPs per update (read-update-write)● 1 RRD per data source (storeByGroup=false)● 100,000s of data sources, 1,000s IOPS● 1,000,000s of data sources, 10,000s IOPS● 15,000 RPM SAS drive, ~175-200 IOPS
Also
● Not everything is a graph● Inflexible● Incremental backups impractical● ...
Observation #1
We collect and write a great deal; We read (graph) relatively little.
We are optimized for reading everything, always.
Observation #2
Samples are naturally collected, and graphed together in groups.
Grouping samples that are accessed together is an easy optimization.
Project: NewtsGoals:● Stand-alone time-series data store● High-throughput● Horizontally scalable● Grouped metric storage/retrieval● Late-aggregating
Cassandra
Why:● Write-optimized● Sorted● Horizontally scalable (linear)
Gist
● Samples stored as-is.● Samples can be retrieved as-is.● Measurements are aggregations calculated
from samples (at time of query).
Sample{
“resource” : “london”,
“timestamp” : 1396289065,
“name” : “meanTemp”,
“type” : “GAUGE”,
“value” : 17.2,
“attributes” : { “units”: “celsius” }
}
SamplesCREATE TABLE newts.samples ( resource text, collected_at timestamp, metric_name text, metric_type text, value blob, attributes map<text, text>, PRIMARY KEY(resource, collected_at, metric_name));
Samples
resource | collected_at | metric_name | value
---------+---------------------+--------------+-----------
london | 2014-03-31 18:04:25 | dewPoint | 0xc01a0000
london | 2014-03-31 18:04:25 | maxTemp | 0x40280000
london | 2014-03-31 18:04:25 | maxWindGust | 0x7ff80000
london | 2014-03-31 18:04:25 | maxWindSpeed | 0x40180000
london | 2014-03-31 18:04:25 | meanTemp | 0xbfe00000
Behind the scenes...
london (2014-03-31 18:04:25, dewPoint): 0xc01a0000
(2014-03-31 18:04:25, maxTemp): 0x40280000
...
Ascending Order