spark meetup - zoomdata streaming
TRANSCRIPT
![Page 1: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/1.jpg)
Interactive Visualization of Data powered by Spark
![Page 2: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/2.jpg)
Streaming Data @ Zoomdata
Visualizations react to new data delivered
Users start, stop, pause the stream
Users select a rolling window or pin a start time to capture cumulative metrics
![Page 3: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/3.jpg)
Drivers for Streaming Data
Data Freshness Time to Analytic Business Context
![Page 4: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/4.jpg)
Challenges
● Time
● Frequency
● Retention
● Synchronization
● Order
● Updates
![Page 5: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/5.jpg)
Addressing the Problem @ Zoomdata
Historical Revised
Receive Data JMS Kafka
Manipulate Stream Single JVM in Memory Spark Streaming
Hold Data in Buffer MongoDB Pluggable
Interact with Data Custom Code Pluggable
![Page 6: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/6.jpg)
Technology Cast
● The Stream - Kafka, Kinesis, JMS
● Processing Fabric - Spark Streaming
● Landing Area - MemSQL, Solr, Kudu, Others
![Page 7: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/7.jpg)
How it looks
![Page 8: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/8.jpg)
With the rest of the app
![Page 9: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/9.jpg)
Scale Out
![Page 10: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/10.jpg)
Benefits
● Contextual Expressiveness with Streaming Data● Independent scalability (scale-up, scale-around)● Expressiveness powered by Spark -- using
Windowing (dataframe API with stream)
![Page 11: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/11.jpg)
Side Benefits
● Separation of concerns● Disaster Recovery, COOP, other Data management
concerns● Restatements● Options!
![Page 12: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/12.jpg)
Demo
● Twitter Producer● Spark Streaming● MemSQL & Solr Sinks
![Page 13: Spark meetup - Zoomdata Streaming](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587ba4161a28ab81758b5811/html5/thumbnails/13.jpg)
Future Work
● Cross Stream Synchronization & Fusion
● On-demand scale out and resource management via Mesos
● Schema Evolution
● Storage Tiering