Page 1
Processing Metrics, Logs & Traces
… at Scale
Otis Gospodnetić
Page 3
WHO
HQ: Brooklyn
People: Everywhere
Page 4
WHO Otis Gospodnetić
Sematext founder
Apache member
Book author
ex-Lucene/Solr dev
Page 5
WHO Services
Solr Elasticsearch* Kafka Spark HBase Cassandra...
* We’ve got serious Solr & Elasticsearch ninjas on the team!
Page 7
WHO Clients want
Performance Bottlenecks Tuning Scaling
WHY
Page 8
WHO
Before you can fix things, you need to know what to fix
WHY
Page 9
WHY We need… INSIGHT
Performance Metrics! Anomalies! Logs!
Page 10
WHY i.e. Need Tools!
Metrics monitoring
Log searching Anomaly alerting
Page 11
OSS
Use the (Open) Source, Luke
Page 12
OSS
OpenTSDB InfluxDB Ganglia Graphite Nagios ELK ...
Page 13
OSS
http://blog.sematext.com/2015/04/22/monitoring-stream-processing-tools-cassandra-kafka-and-spark/
Page 14
OSS
“I have an ELK stack that has been suffering as of late. The logstash service will continually crash, the elasticsearch cluster is hardly in the green, and it is taking a constant amount of maintenance.”
Page 16
WHAT
SPM → monitoring
Logsene → logging
On Premises
Cloud
http://sematext.com/spm http://sematext.com/logsene
Page 17
WHAT
http://blog.sematext.com/2015/04/22/monitoring-stream-processing-tools-cassandra-kafka-and-spark/
Page 21
WHAT Agent
Java Node.js
Want Traces? Embed it!
Collectd ⇒ SIGAR for OS
Flume SpilloverChannel
ES API
Page 22
WHAT Interesting finds
Variable Collectd support
Collectd ⇒ SIGAR
Apache Flume Elasticsearch Stats API
Metrics 2nd class citizen
Page 23
WHAT Transaction Tracing
Java Bytecode Instrumentation
Bottleneck finder
AppMap maker
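To make the idea of timing calls without touching application code concrete, here is a minimal sketch. It deliberately swaps in a simpler technique, a JDK dynamic proxy, instead of the bytecode instrumentation named on the slide, so it only works for interface methods. `TimingProxySketch`, `UserService`, and `traced` are hypothetical names for illustration, not part of the agent.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: wrap a service so that every call is timed and recorded. */
final class TimingProxySketch {
    /** Illustrative interface; a real agent would instrument existing classes instead. */
    interface UserService {
        String getUserName(String company);
    }

    /** Collected trace lines, one per intercepted call. */
    static final List<String> trace = new ArrayList<>();

    /** Returns a proxy that delegates to target and records how long each call took. */
    static <T> T traced(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } finally {
                long elapsed = System.nanoTime() - start;
                trace.add(method.getName() + " took " + elapsed + " ns");
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

Bytecode instrumentation gets the same effect for arbitrary methods by rewriting classes as they load, which is why no wrapper has to be written by hand.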
Page 26
WHAT Custom Pointcuts
<method signature="java.lang.String com.company.example.Service#getUserName(com.company.model.Company company)"/>
Page 28
Write-agg vs. Read-agg
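The trade-off can be sketched as follows, assuming the slide contrasts aggregating on ingest with aggregating at query time. Both classes answer the same per-minute average; `AggregationSketch`, `WriteAgg`, and `ReadAgg` are hypothetical names, and the in-memory collections stand in for a real data store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one per-minute average, computed at write time or at read time. */
final class AggregationSketch {
    /** Write-time aggregation: keep only a running sum and count per minute bucket. */
    static final class WriteAgg {
        private final Map<Long, double[]> buckets = new HashMap<>(); // minute -> {sum, count}

        void record(long epochMillis, double value) {
            double[] bucket = buckets.computeIfAbsent(epochMillis / 60_000L, k -> new double[2]);
            bucket[0] += value;
            bucket[1] += 1;
        }

        /** Cheap lookup; raw points are gone, so only pre-chosen aggregates are available. */
        double average(long minute) {
            double[] bucket = buckets.get(minute);
            return bucket == null ? Double.NaN : bucket[0] / bucket[1];
        }
    }

    /** Read-time aggregation: store every raw point and scan them on each query. */
    static final class ReadAgg {
        private final List<Long> times = new ArrayList<>();
        private final List<Double> values = new ArrayList<>();

        void record(long epochMillis, double value) {
            times.add(epochMillis);
            values.add(value);
        }

        /** Flexible but costly: the work grows with the number of stored points. */
        double average(long minute) {
            double sum = 0;
            int count = 0;
            for (int i = 0; i < times.size(); i++) {
                if (times.get(i) / 60_000L == minute) {
                    sum += values.get(i);
                    count++;
                }
            }
            return count == 0 ? Double.NaN : sum / count;
        }
    }
}
```

Write-time aggregation keeps storage and query cost flat but fixes the questions in advance; read-time aggregation keeps every question open at the price of scanning raw data.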
Page 29
Anomalies > Thresholds
Page 30
WHAT Alerts
Heartbeats
Thresholds
Anomalies
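The two simpler alert types can be sketched in a few lines. This is only an illustration under assumed semantics (a heartbeat alert fires when a source stays silent too long, a threshold alert fires when a value crosses a fixed limit); `AlertRules` and its method names are hypothetical.

```java
/** Hypothetical sketch of heartbeat and threshold alert rules. */
final class AlertRules {
    /** Heartbeat: fire when nothing has been received for longer than maxSilenceMillis. */
    static boolean heartbeatMissed(long lastSeenMillis, long nowMillis, long maxSilenceMillis) {
        return nowMillis - lastSeenMillis > maxSilenceMillis;
    }

    /** Threshold: fire when the observed value is above a fixed limit. */
    static boolean thresholdExceeded(double value, double limit) {
        return value > limit;
    }
}
```

Both rules need a hand-picked constant per metric, which is the maintenance burden that anomaly-based alerts try to avoid.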
Page 31
WHAT Anomaly Detection
Exponential
STDFromMA
KNN ...
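// true when at least 3/4 of the counted results are anomalies; the cast avoids integer division if the counters are integral
boolean result = (double) anomalyCount / (notAnomalyCount + anomalyCount) >= 3d / 4d;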
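A minimal sketch of one such detector, assuming "STDFromMA" means "standard deviations from the moving average" (the slide does not spell it out). The window size and the multiplier `k` are hypothetical parameters, and `threeQuarters` restates the 3/4 rule from the slide with an explicit cast.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: flag values far from the moving average of a sliding window. */
final class StdFromMovingAverage {
    private final Deque<Double> window = new ArrayDeque<>();
    private final int size;
    private final double k;

    StdFromMovingAverage(int size, double k) {
        this.size = size;
        this.k = k;
    }

    /**
     * Returns true if value lies more than k standard deviations from the window mean.
     * Always returns false until the window is full; the value is then added to the window.
     */
    boolean isAnomaly(double value) {
        boolean anomaly = false;
        if (window.size() == size) {
            double mean = 0;
            for (double v : window) mean += v;
            mean /= size;
            double variance = 0;
            for (double v : window) variance += (v - mean) * (v - mean);
            double std = Math.sqrt(variance / size);
            anomaly = Math.abs(value - mean) > k * std;
            window.removeFirst();
        }
        window.addLast(value);
        return anomaly;
    }

    /** The 3/4 rule from the slide; the cast keeps the ratio from truncating to 0 or 1. */
    static boolean threeQuarters(int anomalyCount, int notAnomalyCount) {
        return (double) anomalyCount / (notAnomalyCount + anomalyCount) >= 3d / 4d;
    }
}
```

Because anomalous values also enter the window, a sustained shift gradually becomes the new normal, which is one way the "slow change" issue shows up.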
Page 32
WHAT Anomaly issues
Warn early / create noise
Normal abnormalities
Slow change
Page 33
Scalable Data Stores
Page 34
http://blog.sematext.com/2015/06/09/docker-monitoring-support/
Page 36
Hot vs. Cold
HOT COLD
Page 37
Drop, don’t Delete
HOT COLD
drop
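One reading of this slide, stated here as an assumption, is that expired data should be discarded by dropping a whole time-based partition rather than deleting individual records. The sketch below models that with one in-memory partition per day; `DailyPartitions` and its methods are hypothetical names, not a real store's API.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch: events are grouped by day so old data is dropped a partition at a time. */
final class DailyPartitions {
    private final TreeMap<LocalDate, List<String>> partitions = new TreeMap<>();

    /** Appends an event to the partition for its day, creating the partition if needed. */
    void append(LocalDate day, String event) {
        partitions.computeIfAbsent(day, d -> new ArrayList<>()).add(event);
    }

    /** Drops every partition strictly older than cutoff and returns how many were dropped. */
    int dropOlderThan(LocalDate cutoff) {
        NavigableMap<LocalDate, List<String>> expired = partitions.headMap(cutoff, false);
        int dropped = expired.size();
        expired.clear(); // the view is backed by the map, so this removes whole partitions
        return dropped;
    }

    int partitionCount() {
        return partitions.size();
    }
}
```

The cost of the drop depends on the number of partitions, not the number of records inside them, which is the point of the advice.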
Page 38
Pull, don’t Push
GET QUEUE
pull
ES
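The pull model can be sketched as a consumer that takes bounded batches from a queue at its own pace, so a slow data store is never flooded by producers. This is an illustration only: `PullingIndexer` is a hypothetical name, and the `indexedBatches` list stands in for bulk requests to the data store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: the indexer pulls batches from a queue instead of being pushed to. */
final class PullingIndexer {
    private final BlockingQueue<String> queue;
    private final int batchSize;

    /** Stand-in for bulk requests sent to the data store. */
    final List<List<String>> indexedBatches = new ArrayList<>();

    PullingIndexer(BlockingQueue<String> queue, int batchSize) {
        this.queue = queue;
        this.batchSize = batchSize;
    }

    /** Pulls at most batchSize events without blocking and returns how many were taken. */
    int pullOnce() {
        List<String> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        if (!batch.isEmpty()) {
            indexedBatches.add(batch);
        }
        return batch.size();
    }
}
```

With a bounded queue in front, producers block or shed load when the queue is full, while the consumer decides when and how much to take.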
Page 39
Beware of Aggregations
Circuit Breakers
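The general idea of a memory circuit breaker can be sketched as follows: a request reserves its estimated memory before doing work, and is rejected once the total would exceed a limit. This is a simplified illustration, not Elasticsearch's implementation; `MemoryCircuitBreaker` and its methods are hypothetical names.

```java
/** Hypothetical sketch: reject work whose estimated memory would exceed a fixed budget. */
final class MemoryCircuitBreaker {
    private final long limitBytes;
    private long usedBytes;

    MemoryCircuitBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes for a request, or throws if the budget would be exceeded. */
    void reserve(long bytes) {
        if (usedBytes + bytes > limitBytes) {
            throw new IllegalStateException(
                    "circuit breaker tripped: " + (usedBytes + bytes) + " > " + limitBytes);
        }
        usedBytes += bytes;
    }

    /** Returns bytes to the budget once a request has finished. */
    void release(long bytes) {
        usedBytes = Math.max(0, usedBytes - bytes);
    }

    long used() {
        return usedBytes;
    }
}
```

Failing one expensive aggregation early is cheaper than letting it exhaust the heap and take the whole node down.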
Page 40
http://blog.sematext.com/2014/10/06/top-5-most-popular-log-shippers/
Page 41
Thank you!
@[email protected]
@sematext
http://sematext.com