Getting 20x Performance Improvement in Data Routing
![Page 1: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/1.jpg)
SignalFx
![Page 2: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/2.jpg)
Getting to 20x Performance Improvement on our Data Routing Layer
Rajiv Kurian, Software [email protected]
![Page 3: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/3.jpg)
Agenda
1. Introduction
2. Properties of modern memory systems
3. Evolution of our data router
4. Results
5. Q&A (hopefully)
![Page 4: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/4.jpg)
What does SignalFx do?
![Page 5: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/5.jpg)
SignalFx is an advanced monitoring platform for modern applications
• High resolution:
  • Any mix of resolutions up to 1 sec
• Streaming analytics:
  • Custom analytics pipelines at any scale
  • Streaming dashboards update within seconds
• Multidimensional metrics:
  • Add dimensions to model metrics however you like
  • Use them to aggregate & filter (e.g. 99th-percentile-of-latency-by-service-by-customer) interactively on streaming data
![Page 6: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/6.jpg)
What is the data routing layer?
![Page 7: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/7.jpg)
SignalFx data router
[Figure: raw data in, processed data out. Publishers 0, 1 and 2 send data points (e.g. Time Series ID: 1212450, Payload: 0b1000100010) to Subscribers 0, 1 and 2.]
![Page 8: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/8.jpg)
SignalFx data router - subscribers
[Figure: Subscribers 0, 1 and 2 send subscriptions (e.g. Subscriber ID: 1224525566, Time Series ID: 1212450) to Publishers 0, 1 and 2.]
![Page 9: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/9.jpg)
Routing table
[Figure: routing data is looked up in a table of Key → Subscribers, e.g. Key: 128759 → Set<Subscriber>.]
![Page 10: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/10.jpg)
Properties of modern memory systems
![Page 11: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/11.jpg)
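[Figure: memory hierarchy. CORE 1 and CORE 2 each have their own L1 D and L1 I caches and their own L2; both share L3 and main memory. A value (1) is shown at main memory and at several cache levels.]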
![Page 12: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/12.jpg)
Cache Lines
• The memory subsystem makes a few bets to help us:
  • Temporal locality
  • Spatial locality
  • Prefetching
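The spatial-locality bet can be illustrated with a small sketch. It assumes 64-byte cache lines (so 8 longs per line), which the slides do not state; the class and method names are hypothetical. Both methods compute the same sum, but the strided one touches a different cache line on every read, which typically makes it slower on large arrays.

```java
// Illustrative sketch only: same work, different memory access order.
final class LocalityDemo {
    static final int LONGS_PER_LINE = 8; // assumption: 64-byte cache lines

    // Walks the array in order, so consecutive reads share a cache line.
    static long sumSequential(long[] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // Visits every element exactly once, but jumps a full cache line between reads.
    static long sumStrided(long[] a) {
        long s = 0;
        for (int offset = 0; offset < LONGS_PER_LINE; offset++) {
            for (int i = offset; i < a.length; i += LONGS_PER_LINE) {
                s += a[i];
            }
        }
        return s;
    }
}
```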
![Page 13: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/13.jpg)
[Figure: the same hierarchy (main memory, shared L3, per-core L2 and L1 for CORE 1 and CORE 2), now with two values, 1 and 2, shown at several levels.]
![Page 14: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/14.jpg)
[Figure: CORE 1 and CORE 2 with L1, L2, L3 and main memory. The contiguous run 1 2 3 4 5 6 7 8 appears as a unit at several levels; a shuffled sequence (1 4 3 6 8 7 5) and a single value 2 are also shown.]
![Page 15: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/15.jpg)
L1 CORE
![Page 16: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/16.jpg)
L2 CORE
![Page 17: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/17.jpg)
Main Memory CORE
![Page 18: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/18.jpg)
The evolution of our data routing layer
![Page 19: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/19.jpg)
Routing table
[Figure: table of Key → Subscribers, e.g. Key: 128759 → Set<Subscriber>.]
![Page 20: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/20.jpg)
Routing table v1
HashMap<Long, HashSet<Subscriber>>

| Data Key | Set<Subscriber> |
|---|---|
| 1212450 | {1228, 4412} |
| 3989 | {12244} |
| 8921224 | {3244} |
| 245819 | {3244, 12244, 1228} |

Subscriber Objects

| Subscriber ID | Host | Port |
|---|---|---|
| 1228 | …. | …. |
| 12244 | …. | …. |
| 4412 | …. | …. |
| 3244 | …. | …. |
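A minimal sketch of what the v1 table could look like in Java, based only on the type shown on the slide. The `SubscriberV1` fields mirror the ID/Host/Port columns; the class names and method names are assumptions for illustration, not the actual SignalFx code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical subscriber record; the slide only shows ID, host and port columns.
final class SubscriberV1 {
    final long id;
    final String host;
    final int port;

    SubscriberV1(long id, String host, int port) {
        this.id = id;
        this.host = host;
        this.port = port;
    }
}

// Sketch of the v1 layout: boxed Long keys mapping to a HashSet of subscriber objects.
final class RoutingTableV1 {
    private final Map<Long, Set<SubscriberV1>> table = new HashMap<>();

    // Registers a subscriber for a time series; the long key is boxed on every call.
    void subscribe(long timeSeriesId, SubscriberV1 subscriber) {
        table.computeIfAbsent(timeSeriesId, k -> new HashSet<>()).add(subscriber);
    }

    // Returns the subscribers for a time series, or an empty set if there are none.
    Set<SubscriberV1> getSubscribers(long timeSeriesId) {
        return table.getOrDefault(timeSeriesId, Collections.emptySet());
    }
}
```

Every lookup here boxes the key and then follows several references (map entry, key object, set, set entries), which is the pointer chasing the later slides try to remove.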
![Page 21: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/21.jpg)
But …
We want to support millions of subscriptions per publisher while serving more than 2 million queries per second.
![Page 22: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/22.jpg)
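HashMap<Long, HashSet<Subscriber>>
[Figure: memory layout of the map. A bucket array points to lists of entries, each holding key* and value* pointers; key* points to a boxed long and value* to a Set<Subscriber>. The hops are numbered 1 to 4, followed by "????" for what lies inside the set.]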
![Page 23: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/23.jpg)
So why did we need a better data router?
• Lookups are O(1) ….
• Cache misses
• High memory overhead
![Page 24: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/24.jpg)
Routing table v2 - bloom filters
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
False positive matches are possible, but false negatives are not. A Bloom filter therefore has a 100% recall rate.
![Page 25: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/25.jpg)
Routing table v2 - write
Subscriber bloom filter (16 bits, initially empty): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Key 127829 → Hash 1 = 3, Hash 2 = 9, Hash 3 = 12
After the write (bits 3, 9 and 12 set): 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0
![Page 26: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/26.jpg)
Routing table v2 - read hit
Subscriber bloom filter: 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0
Key 127829 → Hash 1 = 3, Hash 2 = 9, Hash 3 = 12
Probed bits (3, 9 and 12 all set, so the key may be present): 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0
![Page 27: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/27.jpg)
Routing table v2 - read miss
Subscriber bloom filter: 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0
Key 120422 → Hash 1 = 3, Hash 2 = 9, Hash 3 = 14
Probed bits (3 and 9 set, 14 not set, so the key is definitely absent): 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0
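The write, read-hit and read-miss steps above can be sketched as a small Bloom filter over a `long[]` bit array with three probes per key. The hash functions on the slides are not specified, so the mixing function and the double-hashing scheme below are assumptions, and the class name is hypothetical.

```java
// Sketch of a Bloom filter backed by a long[] bit array, three probe positions per key.
final class LongBloomFilter {
    private final long[] words;
    private final int numBits;

    LongBloomFilter(int numBits) {
        this.numBits = numBits;
        this.words = new long[(numBits + 63) / 64];
    }

    // Assumed 64-bit mixing step (in the style of the MurmurHash3 finalizer).
    private static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    // i-th probe position, derived by double hashing: h1 + i * h2 (assumed scheme).
    private int position(long key, int i) {
        long h1 = mix(key);
        long h2 = mix(key ^ 0x9e3779b97f4a7c15L) | 1L;
        return (int) Long.remainderUnsigned(h1 + i * h2, numBits);
    }

    // Write: set the three bits selected by the key's hashes.
    void add(long key) {
        for (int i = 0; i < 3; i++) {
            int p = position(key, i);
            words[p >>> 6] |= 1L << (p & 63);
        }
    }

    // Read: a single unset bit proves absence; all three set means "maybe present".
    boolean mightContain(long key) {
        for (int i = 0; i < 3; i++) {
            int p = position(key, i);
            if ((words[p >>> 6] & (1L << (p & 63))) == 0) {
                return false;
            }
        }
        return true;
    }
}
```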
![Page 28: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/28.jpg)
Typical bloom filter get lookup
[Figure: the filter is stored as an array of longs (long 0 to long 39). A key with Hash 1 = 43, Hash 2 = 168, Hash 3 = 312 is probed at three separate positions in the array (steps 1, 2, 3).]
![Page 29: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/29.jpg)
Routing table v2
[Figure: one bloom filter per subscriber (six shown), each stored as its own array of longs. A key with Hash 1 = 43, Hash 2 = 168, Hash 3 = 312 is probed at three positions (1, 2, 3) in every filter.]
Cost per lookup: num_sub * 3 cache misses
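A rough sketch of why the v2 lookup costs `num_sub * 3` probes: with one filter per subscriber, a query has to test every filter. The filter size, hash function and class name are assumptions; `java.util.BitSet` stands in for the array of longs shown on the slide.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Sketch of the v2 layout: one Bloom filter per subscriber, all probed on every lookup.
final class RoutingTableV2 {
    private static final int BITS_PER_FILTER = 1 << 16; // assumed filter size
    private final BitSet[] filters;                     // filters[s] belongs to subscriber s

    RoutingTableV2(int numSubscribers) {
        filters = new BitSet[numSubscribers];
        for (int s = 0; s < numSubscribers; s++) {
            filters[s] = new BitSet(BITS_PER_FILTER);
        }
    }

    // Assumed hash: i-th probe position for a key.
    private static int position(long key, int i) {
        long h = (key + i * 0x9e3779b97f4a7c15L) * 0xff51afd7ed558ccdL;
        h ^= h >>> 32;
        return (int) Long.remainderUnsigned(h, BITS_PER_FILTER);
    }

    void subscribe(int subscriber, long key) {
        for (int i = 0; i < 3; i++) {
            filters[subscriber].set(position(key, i));
        }
    }

    // O(num_subscribers): each filter is probed at up to three scattered positions.
    List<Integer> getSubscribers(long key) {
        List<Integer> result = new ArrayList<>();
        for (int s = 0; s < filters.length; s++) {
            BitSet f = filters[s];
            if (f.get(position(key, 0)) && f.get(position(key, 1)) && f.get(position(key, 2))) {
                result.add(s); // may be a false positive
            }
        }
        return result;
    }
}
```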
![Page 30: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/30.jpg)
Progress so far
| | GetSubscribers() | Memory |
|---|---|---|
| Naive hash map | O(1) | high |
| Bloom filter | O(num_subscribers) | low |
![Page 31: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/31.jpg)
So why did we need a better data router?
• CPU intensive
  • What did the profiler say? Data router -> 32%
• Scaled poorly
  • CPU performance got worse with the number of subscribers
![Page 32: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/32.jpg)
So how can we do better?
Specialize: we have a limited number of subscribers present at any time, fewer than 128.
![Page 33: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/33.jpg)
ID transformation
Subscriber ID: 1228, 4412, …, …, 12244, 3244
→ Subscriber ID: 0, 1, 2, …, …, 127
(via subscriber coordination and publisher assignment)
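One way such an ID transformation could work is sketched below: each global subscriber ID is given the lowest free slot in 0 to 127. The slides do not describe the coordination protocol, so this single-process allocator, its name and its methods are purely hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical allocator mapping sparse global subscriber IDs to dense IDs 0..127.
final class DenseIdAssigner {
    static final int MAX_SUBSCRIBERS = 128;
    private final Map<Long, Integer> slotByGlobalId = new HashMap<>();
    private final long[] globalIdBySlot = new long[MAX_SUBSCRIBERS];
    private final boolean[] used = new boolean[MAX_SUBSCRIBERS];

    // Returns the existing dense ID, or claims the lowest free slot.
    int assign(long globalId) {
        Integer existing = slotByGlobalId.get(globalId);
        if (existing != null) {
            return existing;
        }
        for (int slot = 0; slot < MAX_SUBSCRIBERS; slot++) {
            if (!used[slot]) {
                used[slot] = true;
                globalIdBySlot[slot] = globalId;
                slotByGlobalId.put(globalId, slot);
                return slot;
            }
        }
        throw new IllegalStateException("more than " + MAX_SUBSCRIBERS + " subscribers");
    }

    // Frees the slot so a later subscriber can reuse it.
    void release(long globalId) {
        Integer slot = slotByGlobalId.remove(globalId);
        if (slot != null) {
            used[slot] = false;
        }
    }

    // Reverse lookup from dense ID back to the global subscriber ID.
    long globalId(int slot) {
        return globalIdBySlot[slot];
    }
}
```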
![Page 34: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/34.jpg)
Routing table V3
Producer routing table: Data Key (8 bytes) → Set<Subscriber>, stored as a 16-byte bit set
Subscribe message: Subscriber ID (0 - 127) = 0, Key (64 bit) = 3890
Resulting entry: 3890 → 0000000000…..0001
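With at most 128 subscribers, the set for each key fits in two longs. A minimal sketch of V3, assuming a regular `HashMap` with a `long[2]` value per key (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of V3: a regular hash map whose values are 128-bit sets held in two longs.
final class RoutingTableV3 {
    // bits[0] covers subscribers 0-63, bits[1] covers subscribers 64-127.
    private final Map<Long, long[]> table = new HashMap<>();

    void subscribe(int subscriberId, long key) {
        long[] bits = table.computeIfAbsent(key, k -> new long[2]);
        bits[subscriberId >>> 6] |= 1L << (subscriberId & 63);
    }

    boolean isSubscribed(int subscriberId, long key) {
        long[] bits = table.get(key);
        return bits != null && (bits[subscriberId >>> 6] & (1L << (subscriberId & 63))) != 0;
    }

    // Number of subscribers for a key, counted straight from the bit set.
    int subscriberCount(long key) {
        long[] bits = table.get(key);
        return bits == null ? 0 : Long.bitCount(bits[0]) + Long.bitCount(bits[1]);
    }
}
```

The subscribe message from the slide (subscriber 0, key 3890) sets the lowest bit of the first long for key 3890.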
![Page 35: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/35.jpg)
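Routing table V3 - regular hash map
[Figure: the same hash map layout as before (bucket array, lists of key*/value* entries, boxed long keys, hops numbered 1 to 4), with each value now shown as two longs (long 1, long 2).]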
![Page 36: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/36.jpg)
Routing table V4 - single array of longs
Entry layout (three consecutive longs): Key | Value 0-63 | Value 64-127
Array: 12 longs, all Empty
![Page 37: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/37.jpg)
Routing table V4 - single array of longs
Key 0 (hash 0)
Array: 12 longs, all Empty
![Page 38: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/38.jpg)
Routing table V4 - single array of longs
Key 0 (hash 0)
Array: [Key 0 | Value 0-63 | Value 64-127], then 9 Empty longs
![Page 39: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/39.jpg)
Routing table V4 - single array of longs
Key 1 (hash 0)
Array: [Key 0 | Value 0-63 | Value 64-127], then 9 Empty longs
![Page 40: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/40.jpg)
Routing table V4 - single array of longs
Key 1 (hash 0)
Array: [Key 0 | Value 0-63 | Value 64-127] [Key 1 | Value 0-63 | Value 64-127], then 6 Empty longs
![Page 41: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/41.jpg)
Routing table V4 - single array of longs
Key 2 (hash 3)
Array: [Key 0 | Value 0-63 | Value 64-127] [Key 1 | Value 0-63 | Value 64-127], then 6 Empty longs
![Page 42: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/42.jpg)
Routing table V4 - single array of longs
Key 2 (hash 3)
Array: [Key 0 | Value 0-63 | Value 64-127] [Key 1 | Value 0-63 | Value 64-127], 3 Empty longs, [Key 3 | Value 0-63 | Value 64-127]
![Page 43: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/43.jpg)
Routing table V4 - single array of longs
Key 1 (hash 0), marker 1
Array: [Key 0 | Value 0-63 | Value 64-127] [Key 1 | Value 0-63 | Value 64-127], 3 Empty longs, [Key 3 | Value 0-63 | Value 64-127]
![Page 44: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/44.jpg)
Routing table V4 - single array of longs
Key 1 (hash 0)
Array: [Key 0 | Value 0-63 | Value 64-127] [Key 1 | Value 0-63 | Value 64-127] [Key 2 | Value 0-63 | Value 64-127]
BitSet: 0, 2, 4, 127
Subscribers Array: Subscriber 0, Subscriber 1, Subscriber 2, Subscriber 3, Subscriber 4, …, Subscriber 127
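The V4 walkthrough above resembles an open-addressed hash table stored in one `long[]`, with three longs per entry and linear probing on collision. The sketch below is written under that reading. How empty entries are marked, how keys are hashed and whether the table grows are not shown on the slides, so the sentinel key, the hash and the fixed capacity are assumptions, and all names are hypothetical.

```java
// Sketch of V4: open addressing with linear probing inside a single long[].
final class RoutingTableV4 {
    private static final long EMPTY = Long.MIN_VALUE; // assumed sentinel for an unused entry
    private static final int ENTRY_LONGS = 3;         // key, bits 0-63, bits 64-127
    private final long[] data;
    private final int mask;
    private int size;

    RoutingTableV4(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        data = new long[capacityPowerOfTwo * ENTRY_LONGS];
        mask = capacityPowerOfTwo - 1;
        for (int i = 0; i < data.length; i += ENTRY_LONGS) {
            data[i] = EMPTY;
        }
    }

    // Assumed hash: multiplicative mixing, taking the high 32 bits.
    private static int hash(long key) {
        return (int) ((key * 0x9e3779b97f4a7c15L) >>> 32);
    }

    // Offset of the entry holding key, or of the empty entry where it would be stored.
    private int find(long key) {
        int slot = hash(key) & mask;
        while (true) {
            int base = slot * ENTRY_LONGS;
            if (data[base] == key || data[base] == EMPTY) {
                return base;
            }
            slot = (slot + 1) & mask; // collision: try the neighbouring entry
        }
    }

    void subscribe(int subscriberId, long key) {
        if (key == EMPTY) {
            throw new IllegalArgumentException("key is reserved as the empty marker");
        }
        int base = find(key);
        if (data[base] == EMPTY) {
            if (size == mask) {
                throw new IllegalStateException("table full"); // one entry always stays empty
            }
            data[base] = key;
            size++;
        }
        data[base + 1 + (subscriberId >>> 6)] |= 1L << (subscriberId & 63);
    }

    // Writes the dense subscriber IDs for key into out and returns how many were written.
    int getSubscribers(long key, int[] out) {
        int base = find(key);
        if (data[base] != key) {
            return 0;
        }
        int n = 0;
        for (int w = 0; w < 2; w++) {
            long bits = data[base + 1 + w];
            while (bits != 0) {
                out[n++] = (w << 6) + Long.numberOfTrailingZeros(bits);
                bits &= bits - 1; // clear the lowest set bit
            }
        }
        return n;
    }
}
```

The returned dense IDs would then index a separate subscribers array, as in the last slide of the walkthrough. Key and bit set sit next to each other in memory, so a lookup without collisions usually touches one or two cache lines and allocates nothing.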
![Page 45: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/45.jpg)
Progress so far
| | GetSubscribers() | Memory |
|---|---|---|
| Naive hash map | O(1) | high |
| Bloom filter | O(num_subscribers) | low |
| Optimized hash map | O(1) | medium |
![Page 46: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/46.jpg)
Results (library)
![Page 47: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/47.jpg)
Microbenchmark
• Method:
  • Heap: 3G
  • Number of subscribers: 128
  • Number of time series: 1048576
  • All time series have a random number of subscribers: [1, 128]
  • 2 million random queries

| | Writes | Reads | Memory |
|---|---|---|---|
| Naive hash map | 34469 ms (42x) | 11900 ms (21x) | 2.6 GB (27x) |
| Bloom filter | 31710 ms (39x) | 54995 ms (97x) | 80 MB (0.83x) |
| Optimized hash map | 805 ms (1x) | 565 ms (1x) | 96 MB (1x) |
![Page 48: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/48.jpg)
Results (application)
![Page 49: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/49.jpg)
CPU %
![Page 50: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/50.jpg)
CPU %
6 subscribers: 45 %
![Page 51: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/51.jpg)
Garbage collection
![Page 52: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/52.jpg)
Garbage collection
6 subscribers: 63 %
![Page 53: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/53.jpg)
Closing remarks / rant
• “Write code first, optimize later”….
• Analyze your data
  • Metrics
  • Logging
![Page 54: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/54.jpg)
Thank You!
Rajiv Kurian
[email protected] @rzidane360
WE’RE [email protected]
@SignalFx - signalfx.com
![Page 55: Getting 20x Performance Improvement in Data Routing](https://reader031.vdocuments.mx/reader031/viewer/2022030216/588901bd1a28abcf5f8b6543/html5/thumbnails/55.jpg)
Q & A