Realtime Analytics with Cassandra
DESCRIPTION
My talk at NoSQL Now 2012
TRANSCRIPT
![Page 1: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/1.jpg)
Realtime Analytics with Cassandra
Acunu Analytics
Tom Wilkie, Acunu
21st August 2012
![Page 2: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/2.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 3: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/3.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 4: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/4.jpg)
Analytics
Why bother?
“Companies that can harness big data will trample data incompetents”
The Economist, May 26th 2011
![Page 5: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/5.jpg)
Analytics
time page session id duration
... ... ... ...
14:58:03.234 /index.html 248.180.3.40 175
14:58:03.409 /csi/csi/council/freedom.html 248.180.3.40 1234
14:58:03.877 /docs/access/chapter8.txt 99.1.10.178 52
14:58:03.877 /docs/access/chapter8.txt 99.1.10.178 52
14:58:03.877 /docs/access/chapter8.txt 99.1.10.178 52
14:58:03.409 /csi/csi/council/freedom.html 248.180.3.40 1234
14:58:03.877 /docs/access/chapter8.txt 99.1.10.178 52
14:58:03.409 /csi/csi/council/freedom.html 248.180.3.40 1234
14:58:03.877 /docs/access/chapter8.txt 99.1.10.178 52
![Page 6: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/6.jpg)
Analytics
Live & historical aggregates...
Trends...
Drill downs and roll ups
Combining “big” and “real-time” is hard
![Page 7: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/7.jpg)
Analytics
Solution / Con
• Scalability: $$$
• Not realtime
• Spartan query semantics => complex, DIY solutions
![Page 8: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/8.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 9: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/9.jpg)
Analytics
• Aggregate incrementally, on the fly
• Store live + historical aggregates
[Diagram: click stream, sensor data, etc. → events → Acunu Analytics → counter updates]
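As a rough illustration of aggregating incrementally, the sketch below keeps a running count and sum per bucket, so an average is available at any time without rescanning events. The class and method names are hypothetical, and the in-memory dictionaries stand in for whatever counter store is actually used.

```python
from collections import defaultdict


class IncrementalAggregates:
    """Hypothetical helper: running COUNT and SUM per bucket."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(int)

    def ingest(self, bucket, value):
        # Each event costs two counter increments; no stored events are rescanned.
        self.count[bucket] += 1
        self.total[bucket] += value

    def average(self, bucket):
        # AVG is derived from the two counters at query time.
        n = self.count.get(bucket, 0)
        return self.total[bucket] / n if n else None


agg = IncrementalAggregates()
for minute, duration in [("14:58", 175), ("14:58", 1234), ("14:59", 52)]:
    agg.ingest(minute, duration)
print(agg.count["14:58"], agg.average("14:58"))
```

Both live and historical buckets are read the same way: a bucket that is still receiving events is simply one whose counters are still changing.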
![Page 10: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/10.jpg)
Analytics
{
  time : TIME(HOUR; MIN; SEC),
  page : PATH(/),
  category : STRING,
  loadTime : LONG
}
{
  select : ["COUNT", "AVG(loadTime)"],
  where : "time, ?path",
  group : "time, ?category"
}
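The schema declares time as TIME(HOUR; MIN; SEC) and page as PATH(/). The slides do not spell out what these types do. One plausible reading, assumed here, is that each value expands into a hierarchy of buckets so that aggregates can be rolled up or drilled into. The function names below are hypothetical and are not part of any Acunu API.

```python
def time_buckets(hms):
    """Expand 'HH:MM:SS' into hour, minute and second level buckets."""
    hour, minute, second = hms.split(":")
    return [hour, f"{hour}:{minute}", f"{hour}:{minute}:{second}"]


def path_prefixes(path, sep="/"):
    """Expand a path into every prefix, from the root down to the full path."""
    parts = [p for p in path.split(sep) if p]
    prefixes = [sep]
    for i in range(1, len(parts) + 1):
        prefixes.append(sep + sep.join(parts[:i]))
    return prefixes


print(time_buckets("14:58:03"))
print(path_prefixes("/docs/access/chapter8.txt"))
```

Under this assumption, one event would update a counter at every level of each hierarchy, which is what makes a coarse query (per hour, per top-level directory) as cheap as a fine one.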
![Page 11: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/11.jpg)
Analytics
Dashboard UI
![Page 12: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/12.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 13: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/13.jpg)
Analytics
• count, grouped by ... day
• count distinct (session)
• count ... geography
• ... browser
• avg(duration)
![Page 14: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/14.jpg)
Analytics
Data Definition
time : TIME(HOUR; MIN; SEC),
cust_id : LONG,
session_id : LONG,
geography : STRING,
browser : STRING,
load_time : LONG
Query Patterns
{
  select: "COUNT",
  patterns: [
    { where : "?time", group : "?time" },
    { where : "", group : "geography" },
    { where : "", group : "browser" }
  ]
},
{
  select: ["COUNT_DISTINCT(session_id)", "AVG(load_time)"],
  where: "time",
  group: ""
}
![Page 15: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/15.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3221 :00→22 :01→19 :02→104 ...
... ...
UK all→228 user01→1 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1904 ...
∅ all→87314 UK→238 US→354 ...
{
  cust_id: user01,
  session_id: 102,
  geography: UK,
  browser: IE,
  time: 22:02,
}
![Page 16: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/16.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
{
  cust_id: user01,
  session_id: 102,
  geography: UK,
  browser: IE,
  time: 22:02,
}
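The effect of this single event on the counter table can be sketched as follows. The row and column layout mirrors the table above, but the nested-dictionary store, the `apply_event` function, and the `ROOT` constant are illustrative assumptions rather than Acunu's implementation. Only counters visible on the slide are seeded; the "..." entries are omitted.

```python
from collections import defaultdict

ROOT = "∅"  # row holding the grand totals, as in the table above

# counters[row][column] -> running count, seeded with the values shown
# before the event arrives.
counters = defaultdict(lambda: defaultdict(int))
counters["22:00"].update({"all": 3221, ":02": 104})
counters["UK"].update({"all": 228, "user01": 1})
counters["UK, 22:00"].update({"all": 1904})
counters[ROOT].update({"all": 87314, "UK": 238})


def apply_event(event):
    """Increment every counter that this event contributes to."""
    hour, minute = event["time"].split(":")
    hour_row = f"{hour}:00"
    geo = event["geography"]
    touched = [
        (hour_row, "all"),
        (hour_row, f":{minute}"),
        (geo, "all"),
        (geo, event["cust_id"]),
        (f"{geo}, {hour_row}", "all"),
        (ROOT, "all"),
        (ROOT, geo),
    ]
    for row, column in touched:
        counters[row][column] += 1
    return touched


apply_event({
    "cust_id": "user01",
    "session_id": 102,
    "geography": "UK",
    "browser": "IE",
    "time": "22:02",
})
print(dict(counters["22:00"]), dict(counters["UK"]))
```

One incoming event becomes a handful of independent increments, which is a good fit for a store with cheap counter updates.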
![Page 17: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/17.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3221 :00→22 :01→19 :02→104 ...
... ...
UK all→228 user01→1 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1904 ...
∅ all→87314 UK→238 US→354 ...
![Page 18: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/18.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
where time 21:00-22:00, count(*)
![Page 19: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/19.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
where time 21:00-22:00, count(*)
where time 22:00-23:00, group by minute
![Page 20: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/20.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
where time 21:00-22:00, count(*)
where time 22:00-23:00, group by minute
where geography=UK, group all by user
![Page 21: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/21.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
where time 21:00-22:00, count(*)
where time 22:00-23:00, group by minute
where geography=UK, group all by user
count all
![Page 22: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/22.jpg)
Analytics
21:00 all→1345 :00→45 :01→62 :02→87 ...
22:00 all→3222 :00→22 :01→19 :02→105 ...
... ...
UK all→229 user01→2 user14→12 user99→7 ...
US all→354 user01→4 user04→8 user56→17 ...
...
UK, 22:00 all→1905 ...
∅ all→87315 UK→239 US→354 ...
where time 21:00-22:00, count(*)
where time 22:00-23:00, group by minute
where geography=UK, group all by user
count all
group all by geo
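Each of the queries above can be answered by reading one row of the pre-aggregated table rather than scanning events. The sketch below assumes the table is held as a nested dictionary and uses two hypothetical helpers, `total` and `grouped`. It contains only the values visible on the slide.

```python
# Visible values from the counter table; "..." entries are omitted.
counters = {
    "21:00": {"all": 1345, ":00": 45, ":01": 62, ":02": 87},
    "22:00": {"all": 3222, ":00": 22, ":01": 19, ":02": 105},
    "UK": {"all": 229, "user01": 2, "user14": 12, "user99": 7},
    "US": {"all": 354, "user01": 4, "user04": 8, "user56": 17},
    "UK, 22:00": {"all": 1905},
    "∅": {"all": 87315, "UK": 239, "US": 354},
}


def total(row):
    """COUNT for a whole row: read the 'all' column."""
    return counters[row]["all"]


def grouped(row):
    """GROUP BY within a row: read every column except 'all'."""
    return {col: n for col, n in counters[row].items() if col != "all"}


print(total("21:00"))    # where time 21:00-22:00, count(*)
print(grouped("22:00"))  # where time 22:00-23:00, group by minute
print(grouped("UK"))     # where geography=UK, group all by user
print(total("∅"))        # count all
print(grouped("∅"))      # group all by geo
```

The cost of a query depends on the width of the row it reads, not on the number of events that were ingested.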
![Page 23: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/23.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 24: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/24.jpg)
Analytics
Approximate Analytics
Exact
Large Scale
Real-time
![Page 25: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/25.jpg)
Analytics
Count Distinct
Plan A: keep a list of all the things you’ve seen; count them at query time
Quick to update ... but at scale ...
• Takes lots of space
• Takes a long time to query
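A minimal sketch of Plan A, using a hypothetical `PlanADistinct` class, makes the trade-off concrete: updates are a constant-time append, but both storage and query time grow with the number of observations.

```python
class PlanADistinct:
    """Hypothetical exact counter: remember everything, count at query time."""

    def __init__(self):
        self.seen = []  # every observation, duplicates included

    def add(self, item):
        self.seen.append(item)  # quick to update

    def count_distinct(self):
        # Deduplicating at query time is linear in everything stored.
        return len(set(self.seen))


plan_a = PlanADistinct()
for session in ["248.180.3.40", "248.180.3.40", "99.1.10.178"]:
    plan_a.add(session)
print(plan_a.count_distinct(), len(plan_a.seen))
```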
![Page 26: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/26.jpg)
Analytics
Approximate Distinct
item  hash            leading zeroes  max so far
x     00101001110...  2               2
y     11010100111...  0               2
z     00011101011...  3               3
...
max # leading zeroes seen so far
... to see a max of M takes about 2^M items
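The idea can be sketched in a few lines. The hash choice (the first 32 bits of SHA-1) and the class name are assumptions made for illustration; any well-mixed hash would do. The estimate is deliberately crude: it is always a power of two and has high variance.

```python
import hashlib

HASH_BITS = 32  # assumed hash width for this sketch


def hash_bits(item):
    """Map an item to a HASH_BITS-wide integer using SHA-1 (an arbitrary choice)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def leading_zeroes(value, width=HASH_BITS):
    """Number of zero bits before the first one bit in a width-bit value."""
    return width - value.bit_length()


class MaxZeroesEstimator:
    def __init__(self):
        self.max_zeroes = 0  # the only state kept, however many items arrive

    def add(self, item):
        self.max_zeroes = max(self.max_zeroes, leading_zeroes(hash_bits(item)))

    def estimate(self):
        # Seeing a maximum of M leading zeroes takes about 2^M distinct items.
        return 2 ** self.max_zeroes


est = MaxZeroesEstimator()
for i in range(1000):
    est.add(f"session-{i}")
print(est.max_zeroes, est.estimate())
```

Because a repeated item hashes to the same value, duplicates never change the state, which is what makes this a distinct count rather than a plain count.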
![Page 27: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/27.jpg)
Analytics
Approximate Distinct
to reduce variance, average over m = 2^k sub-streams
item  hash            index, zeroes  max so far
x     00101001110...  0, 0           0,0,0,0
y     11010100111...  3, 1           0,0,1,0
z     00011101011...  0, 1           1,0,1,0
...
take the harmonic mean
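A well-known estimator that matches this description is HyperLogLog. The slides do not say which variant Acunu used, so the sketch below is the textbook form under stated assumptions: a 64-bit hash taken from SHA-1, the standard bias constant for m >= 128, and the usual linear-counting correction for small counts. Note that the textbook form stores zeroes + 1 in each register rather than the raw zero count shown in the table.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Textbook HyperLogLog with m = 2^k registers (illustrative, not Acunu's code)."""

    def __init__(self, k=10):
        if k < 7:
            raise ValueError("the bias constant used here assumes m >= 128")
        self.k = k
        self.m = 1 << k
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")    # 64-bit hash
        rest_bits = 64 - self.k
        index = h >> rest_bits                    # first k bits pick the sub-stream
        rest = h & ((1 << rest_bits) - 1)         # remaining bits
        rank = rest_bits - rest.bit_length() + 1  # leading zeroes + 1
        self.registers[index] = max(self.registers[index], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        # Harmonic mean of 2^register across the m sub-streams, scaled by m.
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        empty = self.registers.count(0)
        if raw <= 2.5 * self.m and empty:
            return self.m * math.log(self.m / empty)  # small-range correction
        return raw


hll = HyperLogLogSketch(k=10)
for i in range(10000):
    hll.add(f"session-{i}")
print(round(hll.estimate()))
```

With m = 1024 registers the expected relative error is roughly 1.04 / sqrt(m), about 3%, while the state stays at 1024 small integers regardless of how many items are seen.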
![Page 28: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/28.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 29: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/29.jpg)
Analytics
Was it worth it?
![Page 30: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/30.jpg)
Analytics
What’s Coming?
• Ad Hoc: same queries, but without the need to pre-define them
• Geolocation: support for location-based events and queries
• Drill down: see the events that make up any given aggregate
![Page 31: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/31.jpg)
Analytics
• Motivation / alternatives
• What is it?
• How does it work?
• Approximate Analytics
• What’s it good for?
![Page 32: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/32.jpg)
Analytics
Manufacturing
Systems Monitoring
Financial Services
Social Media Ad Analytics
Oil + Gas
![Page 33: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/33.jpg)
Analytics
“Up and running in about 4 hours”
“We found out a competitor was scraping our data”
“We keep discovering use cases we hadn’t thought of”
![Page 34: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/34.jpg)
Analytics
![Page 35: Realtime Analytics with Cassandra](https://reader034.vdocuments.mx/reader034/viewer/2022050920/54c385394a79593a698b45b2/html5/thumbnails/35.jpg)
Analytics
www.acunu.com @acunu
Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.