Future of Data Meetup: boontadata
TRANSCRIPT
![Page 1: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/1.jpg)
@benjguin
boontadata
Benjamin Guinebertière, Technical Evangelist, Microsoft France, @benjguin
![Page 2: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/2.jpg)
Introduction
![Page 3: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/3.jpg)
Agenda
- Introduction
- Big Data Architectures
- Stream Processing Challenges
- boontadata
- Conclusion
![Page 4: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/4.jpg)
Big Data Architectures
![Page 5: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/5.jpg)
Big Data Processing Engines
data stream
data lake
![Page 6: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/6.jpg)
Lambda architecture
[Diagram labels: Events; events log; hot path: near real time, DB (SQL/noSQL); cold path: storage, batch, storage, DB (SQL/noSQL); query & merge]
![Page 7: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/7.jpg)
Non lambda Architecture
[Diagram labels: Events; events log; hot path: near real time; cold path: batch; DB (SQL/noSQL); query]
![Page 8: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/8.jpg)
Kappa Architecture
https://www.oreilly.com/ideas/questioning-the-lambda-architecture, by Jay Kreps, July 2, 2014
![Page 9: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/9.jpg)
Examples of services (in Azure)
http://dev.microsoft.fr/data
![Page 10: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/10.jpg)
Stream Processing Challenges
![Page 11: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/11.jpg)
Events: Example

| Time | Object A | Object B |
|---|---|---|
| 10:00:00 | 100 | 2000 |
| 10:00:03 | | 1800 |
| 10:00:04 | 85 | |
| 10:00:07 | 40 | 2500 |
| 10:00:08 | 100 | |
| 10:00:09 | | 3000 |

Aggregations:

| Time Window | Object A (sum) | Object B (average) |
|---|---|---|
| 10:00:05 | 185 | 1900 |
| 10:00:10 | 140 | 2750 |
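The aggregation in the example can be reproduced with a few lines of Python. This is only a sketch: it assumes 5-second tumbling windows labeled by their end time, with an event at exactly 10:00:05 falling into the next window. The names `EVENTS`, `window_end` and `aggregate` are made up for the illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Events from the example: (event time, object, value).
EVENTS = [
    ("10:00:00", "A", 100), ("10:00:00", "B", 2000),
    ("10:00:03", "B", 1800), ("10:00:04", "A", 85),
    ("10:00:07", "A", 40), ("10:00:07", "B", 2500),
    ("10:00:08", "A", 100), ("10:00:09", "B", 3000),
]

WINDOW = timedelta(seconds=5)  # assumed tumbling window size


def window_end(ts: str) -> str:
    """Return the end of the 5-second tumbling window that contains ts."""
    t = datetime.strptime(ts, "%H:%M:%S")
    midnight = t.replace(hour=0, minute=0, second=0)
    index = (t - midnight) // WINDOW  # whole windows elapsed since midnight
    return (midnight + (index + 1) * WINDOW).strftime("%H:%M:%S")


def aggregate(events):
    """Sum object A and average object B per window, as in the example."""
    values = defaultdict(list)
    for ts, obj, value in events:
        values[(window_end(ts), obj)].append(value)
    return {
        key: sum(vals) if key[1] == "A" else sum(vals) / len(vals)
        for key, vals in values.items()
    }
```

`aggregate(EVENTS)` gives 185 and 1900 for the window ending at 10:00:05, and 140 and 2750 for the window ending at 10:00:10.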
![Page 12: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/12.jpg)
A distributed system
[Diagram labels: object A; object B; broker (x3); processing engine]
![Page 13: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/13.jpg)
Best Case
[Timeline from 10:00:00 to 10:00:12, one tick per second; each event is shown as object, message id, value]
A,m1,100  A,m4,85  A,m5,40  A,m7,100
B,m2,2000  B,m3,1800  B,m6,2500  B,m8,3000
![Page 14: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/14.jpg)
duplicate events
[Timeline from 10:00:00 to 10:00:12, one tick per second; each event is shown as object, message id, value]
A,m1,100  A,m4,85  A,m5,40  A,m7,100
B,m2,2000  B,m3,1800  B,m6,2500  B,m8,3000
Duplicates: B,m2,2000  A,m1,100  A,m5,40  B,m8,3000  B,m8,3000
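One common way to cope with duplicate deliveries is to key on the message id and drop anything already seen. The sketch below assumes every message carries a unique id (m1 to m8 on the slide). The arrival order in `RECEIVED` is invented, since the slide's layout only tells us which messages were replayed. A real engine would also need to bound the size of the `seen` set.

```python
def deduplicate(events):
    """Yield each event once, keyed on its message id (first copy wins)."""
    seen = set()
    for event in events:
        message_id = event[0]
        if message_id in seen:
            continue  # duplicate delivery, drop it
        seen.add(message_id)
        yield event


# Eight distinct messages plus replays of m2, m1, m5 and m8 (twice).
# The interleaving below is an assumption made for the example.
RECEIVED = [
    ("m1", "A", 100), ("m2", "B", 2000), ("m2", "B", 2000),
    ("m3", "B", 1800), ("m1", "A", 100), ("m4", "A", 85),
    ("m5", "A", 40), ("m6", "B", 2500), ("m5", "A", 40),
    ("m7", "A", 100), ("m8", "B", 3000), ("m8", "B", 3000),
    ("m8", "B", 3000),
]

unique = list(deduplicate(RECEIVED))
naive_sum_a = sum(v for _, obj, v in RECEIVED if obj == "A")
dedup_sum_a = sum(v for _, obj, v in unique if obj == "A")
```

Without deduplication the sum for object A over the whole run is inflated (465 instead of 325 here).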
![Page 15: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/15.jpg)
out of order events
[Timeline from 10:00:00 to 10:00:12, one tick per second; same events as the best case, received in a different order]
A,m1,100  A,m4,85  A,m5,40  A,m7,100
B,m2,2000  B,m3,1800  B,m6,2500  B,m8,3000
![Page 16: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/16.jpg)
late events
[Timeline from 10:00:00 to 10:00:12, one tick per second; same events as the best case, some received late]
A,m1,100  A,m4,85  A,m5,40  A,m7,100
B,m2,2000  B,m3,1800  B,m6,2500  B,m8,3000
![Page 17: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/17.jpg)
Watermark
from Apache Flink documentation
![Page 18: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/18.jpg)
Watermark
from Apache Flink documentation
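The watermark idea can be sketched in plain Python. This is a simplified model, not Flink's API: it assumes the watermark trails the highest event time seen so far by a fixed `MAX_DELAY`, that a window is emitted once the watermark reaches its end, and that anything arriving for an already emitted window counts as late. Times are seconds after 10:00:00. The events `(3, 50)` and `(12, 0)` are invented to trigger the late case and to close the last window.

```python
WINDOW = 5     # tumbling window size, in seconds
MAX_DELAY = 2  # assumed bound on out-of-orderness, in seconds


def process(arrivals):
    """Consume (event_second, value) pairs in arrival order.

    Returns (closed, late): closed maps window end -> sum of values,
    late lists the events that arrived after their window was emitted.
    """
    open_windows, closed, late = {}, {}, []
    watermark = float("-inf")
    for second, value in arrivals:
        end = (second // WINDOW + 1) * WINDOW
        if end <= watermark:
            late.append((second, value))  # window already emitted
        else:
            open_windows[end] = open_windows.get(end, 0) + value
        # The watermark trails the highest event time seen so far.
        watermark = max(watermark, second - MAX_DELAY)
        # Emit every window whose end the watermark has reached.
        for w_end in sorted(e for e in open_windows if e <= watermark):
            closed[w_end] = open_windows.pop(w_end)
    return closed, late


# Object A events, with 7 arriving after 8 (out of order but tolerated),
# then a hypothetical late event at second 3 and a closing event at 12.
ARRIVALS = [(0, 100), (4, 85), (8, 100), (7, 40), (3, 50), (12, 0)]
closed, late = process(ARRIVALS)
```

Here the out-of-order event at second 7 is still counted in its window, while the event at second 3 is reported as late because the first window was already emitted.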
![Page 19: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/19.jpg)
boontadata
![Page 20: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/20.jpg)
Storm
Flink
Samza
Spark Streaming
…
![Page 21: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/21.jpg)
boontadata
IOT simulator
broker
noSQL Database
Streaming engine #1
Streaming engine #2
Streaming engine #...
compare
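The compare step can be pictured as a diff between the aggregates the simulator expects and what each engine wrote. The sketch below is an assumption about how such a comparison could look, not the project's actual code. The function name, the engine names and the numbers for `engine_2` are invented for illustration and are not measurements.

```python
def compare(expected, by_engine):
    """Return, per engine, the windows whose aggregate differs from expected.

    Each mismatch is reported as window -> (expected value, engine value).
    """
    report = {}
    for engine, actual in by_engine.items():
        report[engine] = {
            window: (value, actual.get(window))
            for window, value in expected.items()
            if actual.get(window) != value
        }
    return report


# Expected sums for object A, taken from the example table.
EXPECTED = {"10:00:05": 185, "10:00:10": 140}

# Hypothetical engine outputs: engine_2 counted one duplicate of m1.
BY_ENGINE = {
    "engine_1": {"10:00:05": 185, "10:00:10": 140},
    "engine_2": {"10:00:05": 285, "10:00:10": 140},
}

report = compare(EXPECTED, BY_ENGINE)
```

An empty entry means the engine agrees with the expected aggregates for every window.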
![Page 22: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/22.jpg)
boontadata-streams
IOT simulator (python)
Kafka broker
Cassandra
Apache Spark Streaming
Apache Flink
Apache Kafka Streams, …
compare (python)
![Page 23: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/23.jpg)
boontadata-paas
IOT simulator (python)
IOT Hub
DocumentDb
Azure Stream Analytics
HDInsight Storm
HDInsight Spark Streaming, …
compare (python)
![Page 24: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/24.jpg)
Implement other engines: Apache Samza, Apex, …
![Page 25: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/25.jpg)
http://boontadata.io
![Page 26: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/26.jpg)
let’s run it
![Page 27: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/27.jpg)
inject
![Page 28: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/28.jpg)
Spark Streaming – processing time
![Page 29: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/29.jpg)
Flink – processing time
![Page 30: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/30.jpg)
Flink – event time
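The difference between the processing-time and event-time runs can be illustrated without any engine. In this sketch the event times are those of object A in the example, while the arrival times are invented. Only the clock used to pick the window changes between the two calls.

```python
from collections import defaultdict

WINDOW = 5  # tumbling window size, in seconds after 10:00:00

# (event_second, arrival_second, value); arrival times are assumptions.
A_EVENTS = [(0, 1, 100), (4, 6, 85), (7, 8, 40), (8, 9, 100)]


def window_sums(events, use_event_time):
    """Sum values per window, keyed by window end.

    With use_event_time=True the window is chosen from the time carried
    by the event; otherwise from the time the event reached the engine.
    """
    sums = defaultdict(int)
    for event_s, arrival_s, value in events:
        clock = event_s if use_event_time else arrival_s
        sums[(clock // WINDOW + 1) * WINDOW] += value
    return dict(sums)


by_event_time = window_sums(A_EVENTS, use_event_time=True)
by_processing_time = window_sums(A_EVENTS, use_event_time=False)
```

With these arrival times, the event emitted at second 4 reaches the engine at second 6, so processing time puts it in the second window and the sums become 100 and 225 instead of 185 and 140.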
![Page 31: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/31.jpg)
Azure Stream Analytics
pyclient container
![Page 32: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/32.jpg)
Run other tests (other seed, several objects, …)
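Varying the seed only helps if a given seed always produces the same run. The generator below is a hypothetical sketch of that idea, not the project's simulator. The function name, its parameters and the value ranges are all assumptions.

```python
import random


def simulate(seed, objects=("A", "B"), count=8, duplicate_rate=0.25):
    """Generate a reproducible stream of (id, object, second, value) events.

    Some events are sent twice and the result is shuffled, so the stream
    contains duplicates and out-of-order deliveries.
    """
    rng = random.Random(seed)  # private generator: same seed, same stream
    events = []
    for i in range(1, count + 1):
        event = (f"m{i}", rng.choice(objects),
                 rng.randint(0, 12), rng.randint(1, 100))
        events.append(event)
        if rng.random() < duplicate_rate:
            events.append(event)  # replay the same message id
    rng.shuffle(events)  # arrival order differs from emission order
    return events
```

Calling `simulate(1)` twice returns the same list, which makes a failing comparison reproducible. Passing a longer `objects` tuple covers the "several objects" case.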
![Page 34: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/34.jpg)
Conclusion
![Page 35: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/35.jpg)
Conclusion
• A number of streaming engines with subtle differences => do your own tests
• Contribute: http://boontadata.io
![Page 36: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/36.jpg)
![Page 37: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/37.jpg)
Benjamin Guinebertière, Technical Evangelist, Microsoft France
Azure, data insights, machine learning
@benjguin | http://3-4.fr
![Page 38: Future of Data Meetup : Boontadata](https://reader036.vdocuments.mx/reader036/viewer/2022070516/58cfaf221a28ab223a8b46b1/html5/thumbnails/38.jpg)
© 2014 Microsoft Corporation. All rights reserved.