Telemetry analysis at scale, Theo Schlossnagle...
DESCRIPTION
Talk by Theo Schlossnagle at HighLoad++ 2014.
TRANSCRIPT
![Page 1: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/1.jpg)
Scaling to billions
Lessons learned at Circonus
![Page 2: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/2.jpg)
Theo Schlossnagle @postwait
CEO, Circonus
![Page 3: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/3.jpg)
What is Circonus?
And why does it need to scale?
![Page 4: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/4.jpg)
Circonus
Telemetry collection
Telemetry analysis
Visualization
Alerting
SaaS… trillions of measurements
![Page 5: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/5.jpg)
![Page 6: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/6.jpg)
Architecture
• Distributed collection nodes
• Distributed aggregation nodes
• Distributed storage nodes
• Centralized message bus (for now, federate if needed)
[Figure: architecture diagram with components labeled Collection, Agg, Storage, and MQ]
![Page 7: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/7.jpg)
Distributed Collection
• It all starts with Reconnoiter: https://github.com/circonus-labs/reconnoiter
• This is, by its design, distributed.
• It is hard to collect trillions of measurements per day on a single node.
![Page 8: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/8.jpg)
Jobs are run
• They collect data and bundle it into messages (protobufs).
• A single message may have one to thousands of measurements.
• Data can be requested (pulled) by Reconnoiter.
• Data can be received (pushed) into Reconnoiter.
• Uses jlog (https://github.com/omniti-labs/jlog) to store and forward the telemetry data.
• C and Lua (LuaJIT).
![Page 9: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/9.jpg)
Aggregation
• In the beginning, we started with a single aggregator receiving all messages.
• Redundancy was simple, but scale was not.
• It is not that difficult to route billions of messages per day on a single node.
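For a sense of scale, the two "per day" figures on the collection and aggregation slides can be converted to per-second rates. This is a rough sketch that takes "trillions" and "billions" as 10^12 and 10^9 (assumed round figures, not numbers given in the talk) and uses 86,400 seconds per day:

$$\frac{10^{12}\ \text{measurements/day}}{86\,400\ \text{s/day}} \approx 1.16 \times 10^{7}\ \text{measurements/s}, \qquad \frac{10^{9}\ \text{messages/day}}{86\,400\ \text{s/day}} \approx 1.16 \times 10^{4}\ \text{messages/s}$$

Under those assumptions the gap is about three orders of magnitude, which fits the contrast between collection that has to be distributed and message routing that still fits on one node.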
![Page 10: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/10.jpg)
Storage
• We started with Postgres.
• Great database, but our time-series data was not efficiently stored in a relational database.
![Page 11: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/11.jpg)
Column-store hack on Postgres
• We altered the schema.
• Instead of one measurement per row…
– 1440 measurements in an array in a single row
– one row per day
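The row layout on this slide can be sketched in a few lines of Python. This is only an illustration, not the actual Circonus schema: it assumes one measurement slot per minute (1440 = 24 × 60), keys each row by a hypothetical `(metric, day)` pair, and uses an in-memory dict named `table` in place of Postgres.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

MINUTES_PER_DAY = 24 * 60  # 1440 slots, assuming one measurement per minute

Row = List[Optional[float]]
# Hypothetical stand-in for the table: one entry ("row") per metric per day.
table: Dict[Tuple[str, str], Row] = {}


def store(metric: str, ts: datetime, value: float) -> None:
    """Write one measurement into the array slot for its minute of the day."""
    ts = ts.astimezone(timezone.utc)  # expects a timezone-aware timestamp
    key = (metric, ts.date().isoformat())
    # Create the day's row on first write; unfilled minutes stay None.
    row = table.setdefault(key, [None] * MINUTES_PER_DAY)
    row[ts.hour * 60 + ts.minute] = value
```

Calling `store("cpu", some_timestamp, 0.5)` repeatedly over a day fills a single 1440-element row instead of creating 1440 separate rows, which is the point of the hack.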
![Page 12: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/12.jpg)
Data volume grew
• Our data volume grew and we federated streams of data across pairs of redundant Postgres databases.
• This would scale as needed.
• It carried an unreasonable management burden.
![Page 13: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/13.jpg)
Safety first…
How to switch from one database technology to another?
Cut-over always has casualties.
Solution 1: partial cut-over
Solution 2: concurrent operations
![Page 14: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/14.jpg)
We built “Snowth”
• We realized all our measurement storage operations were commutative.
• We designed a distributed, consistent-hashed telemetry storage system with no single point of failure.
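A consistent-hash ring of the kind named on this slide can be sketched as follows. This is a generic illustration, not Snowth's implementation: the `Ring` class, the SHA-256 placement, and the default of two copies are all assumptions. The ring-position labels follow the `n1-1` … `n6-4` pattern seen on the ring slides, read here as four ring positions per physical node.

```python
import bisect
import hashlib
from typing import List, Tuple


def _point(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, nodes: List[str], vnodes: int = 4) -> None:
        # Each physical node gets `vnodes` positions, labeled e.g. "n1-1".
        self.points: List[Tuple[int, str]] = sorted(
            (_point(f"{node}-{i}"), node)
            for node in nodes
            for i in range(1, vnodes + 1)
        )
        self.keys = [p for p, _ in self.points]
        self.n_nodes = len(set(nodes))

    def owners(self, metric: str, copies: int = 2) -> List[str]:
        """Walk clockwise from the metric's position, collecting distinct nodes."""
        copies = min(copies, self.n_nodes)
        i = bisect.bisect_right(self.keys, _point(metric))
        found: List[str] = []
        while len(found) < copies:
            node = self.points[i % len(self.points)][1]
            if node not in found:
                found.append(node)
            i += 1
        return found
```

With `Ring(["n1", "n2", "n3", "n4", "n5", "n6"]).owners("o1")`, the object is assigned to the first two distinct nodes clockwise of its hash. Removing a node that is not one of those owners leaves the assignment unchanged, which is what makes this placement scheme attractive when nodes come and go.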
![Page 15: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/15.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4]
![Page 16: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/16.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4, plus a marker labeled o1]
![Page 17: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/17.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4, plus a marker labeled o1]
![Page 18: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/18.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4]
![Page 19: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/19.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4]
![Page 20: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/20.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4, divided between Availability Zone 1 and Availability Zone 2]
![Page 21: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/21.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4, divided between Availability Zone 1 and Availability Zone 2, plus a marker labeled o1]
![Page 22: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/22.jpg)
[Figure: ring diagram with 24 positions labeled n1-1 through n6-4, divided between Availability Zone 1 and Availability Zone 2, plus a marker labeled o1]
![Page 23: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/23.jpg)
[Figure: two ring diagrams, each with 24 positions labeled n1-1 through n6-4, labeled Availability Zone 1 and Availability Zone 2]
![Page 24: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/24.jpg)
A real ring
Keep it simple, stupid.
We actually don’t do split AZ.
![Page 25: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/25.jpg)
Rethinking it all
![Page 26: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/26.jpg)
Time & Safety
A small, well-defined API allowed for low-maintenance, concurrent operations and ongoing development.
![Page 27: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/27.jpg)
As needs increased…
• We added computation support into the database.
– allowing simple “cheap” computation to be “moved” to the data
– allowing data to be moved to complex “expensive” computation
![Page 28: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/28.jpg)
Real-time Analysis
• Real-time (or soft real-time) requires low-latency, high-throughput systems.
• We used RabbitMQ.
• Things worked well until they broke.
![Page 29: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/29.jpg)
Message Sizes Matter
• Benchmarks suck: they never resemble your use-case.
• MQ benchmarks burned me.
• Our message sizes are between 0.2k and 6k.
• It turns out many MQ benchmarks are at 0k. *WHAT?!*
![Page 30: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/30.jpg)
RabbitMQ falls
• At around 50k messages per second at our message sizes, RabbitMQ failed.
• Worse, it did not fail gracefully.
• Another example of a product that works well… until it is worthless.
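To see why message size matters, the failure point can be converted into bandwidth. This is a back-of-the-envelope sketch that assumes "k" means kilobytes of 1000 bytes and applies the 0.2k and 6k bounds from the previous slide to the 50k messages per second figure:

$$5 \times 10^{4}\ \text{msg/s} \times 200\ \text{B} = 10\ \text{MB/s} = 80\ \text{Mbit/s}$$

$$5 \times 10^{4}\ \text{msg/s} \times 6000\ \text{B} = 300\ \text{MB/s} = 2.4\ \text{Gbit/s}$$

Under those assumptions the same message rate spans a 30-fold range of payload bandwidth, which a benchmark run with empty messages would not exercise at all.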
![Page 31: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/31.jpg)
Soul-searching led to Fq
• We looked and looked.
• Kafka led the pack, but still…
• Existing solutions didn’t meet our use-case.
• We built Fq: https://github.com/postwait/fq
![Page 32: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/32.jpg)
Fq
• Subscribers can specify queue semantics:
– public (multiple competing subscribers), private
– block (pause sender flow control), drop
– backlog
– transient, permanent
– memory, file-backed
– many millions of messages (gigabits) per second
![Page 33: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/33.jpg)
Safety first…
How to switch from one message queueing technology to another?
Cut-over always has casualties.
Solution 1: partial cut-over
Solution 2: concurrent operations
![Page 34: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/34.jpg)
Duplicity
“Concurrent” deployment of message queues means one thing:
• every consumer must support detection and handling of duplicate message delivery
AXIOM: there are 3 numbers in computing: 0, 1 and ∞.
![Page 35: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/35.jpg)
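Solve easy problems
1. Temporal relevance: old messages don’t matter to us… say > 10 seconds (dedup only the past N = 20 seconds)
2. Timestamp messages at source
M: message, f: process
If $T_M > \max(T)$: $\mathrm{DUP}_{T_M \bmod N} \leftarrow \emptyset$, $T \leftarrow T \cup \{T_M\}$
[Figure: array DUP with slots 0, 1, 2, …, N-1, indexed by $T_M \bmod N$]
If $h(M) \notin \mathrm{DUP}_{T_M \bmod N}$: $f(M)$, $\mathrm{DUP}_{T_M \bmod N} \leftarrow \mathrm{DUP}_{T_M \bmod N} \cup \{h(M)\}$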
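The dedup rule on this slide can be sketched in Python as below. This is a minimal sketch under several assumptions that are not in the talk: timestamps are whole seconds set at the source, `h` is SHA-1 over the timestamp plus payload, and messages N or more seconds older than the newest timestamp are discarded outright. The class name `DedupWindow` and its `offer` method are hypothetical.

```python
import hashlib
from typing import Callable, List, Optional, Set


class DedupWindow:
    """Remember message hashes in N per-second slots, indexed by timestamp % N."""

    def __init__(self, n_seconds: int = 20) -> None:
        self.n = n_seconds
        self.buckets: List[Set[bytes]] = [set() for _ in range(n_seconds)]
        self.max_ts: Optional[int] = None  # newest timestamp seen so far

    def offer(self, ts: int, payload: bytes,
              process: Callable[[bytes], None]) -> bool:
        """Run process(payload) unless the message is a duplicate or too old."""
        slot = ts % self.n
        if self.max_ts is None or ts > self.max_ts:
            # A new newest second: its slot still holds hashes from N seconds ago.
            self.buckets[slot].clear()
            self.max_ts = ts
        elif ts <= self.max_ts - self.n:
            # Assumption: older than the window, so it can no longer be deduplicated.
            return False
        digest = hashlib.sha1(str(ts).encode() + b":" + payload).digest()
        if digest in self.buckets[slot]:
            return False  # already delivered via the other queue
        process(payload)
        self.buckets[slot].add(digest)
        return True
```

A consumer attached to both message queues would pass every received message through `offer`, so whichever copy arrives first is processed and the second is ignored. Memory stays bounded because each slot is cleared when a newer second maps onto it.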
![Page 36: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/36.jpg)
A failure
and no one cared.
![Page 37: Анализ телеметрии при масштабировании, Theo Schlossnagle (Circonus)](https://reader035.vdocuments.mx/reader035/viewer/2022081401/557ef25bd8b42aa30a8b4b35/html5/thumbnails/37.jpg)
Thank you.