real time analytics with kafka and sparkstreaming

Post on 28-Jul-2015

149 Views

Category:

Engineering

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1© Cloudera, Inc. All rights reserved.

Real-time analytics with Kafka and Spark StreamingAshish Singh | Software Engineer, Cloudera

2© Cloudera, Inc. All rights reserved.

It’s Real-Time timeWhy now? Complex Event Processing (CEP) is not a new concept.

3© Cloudera, Inc. All rights reserved.

It’s Real-Time time

Emergence of Real-Time

Stream Processing

Exponential growth in

continuous data streams

Open Source tools for

reliable high-throughput low latency event

queuing and processing

Tools run on “Commodity”

Hardware

Why now? Complex Event Processing (CEP) is not a new concept.

4© Cloudera, Inc. All rights reserved.

It’s happening! …Across IndustriesCredit Card& MonetaryTransactionsIdentifyfraudulent transactions as soon as they occur.

Transportation& Logistics• Real-timetrafficconditions• Trackingfleet and cargo locations and dynamic re-routing to meet SLAs

Retail• Real-timein-storeOffers and Recommendations.• Email andmarketing campaigns based on real-time social trends

Consumer Internet, Mobile & E-Commerce

Optimize userengagement basedon user’s currentbehavior. Deliver recommendations relevant “in the moment”

HealthcareContinuouslymonitor patientvital stats and proactively identifyat-risk patients.

Manufacturing• Identifyequipmentfailures and react instantly• Perform proactivemaintenance.• Identify product quality defects immediately to prevent resource wastage.

Security & Surveillance

Identifythreatsand intrusions, both digital and physical, in real-time.

Digital Advertising& MarketingOptimize and personalize digital ads based on real-time information.

5© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

6© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

7© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

FilterEnrich

TransformStats on Sliding Windows

Stream JoinsFeature Engineering Predictive Analytics

Active Model Training....

And combinations of the above

8© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

HDFS

9© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

10© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

11© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

12© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

13© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

Kafka

14© Cloudera, Inc. All rights reserved.

Canonical Stream Processing Architecture

DataSources

Kafka Flume

NoSql

HDFS

Kafka ...

15© Cloudera, Inc. All rights reserved.

Too much?

15© Cloudera, Inc. All rights reserved.

16© Cloudera, Inc. All rights reserved.

Example application to demonstrate how real time analytics can be done using Kafka and Spark Streaming

Pankh

https://github.com/SinghAsDev/pankh

17© Cloudera, Inc. All rights reserved.

Pankh – Building Pieces

DataSources

18© Cloudera, Inc. All rights reserved.

Pankh – Building Pieces

DataSources

Kafka

19© Cloudera, Inc. All rights reserved.

Pankh – Building Pieces

DataSources

Kafka

20© Cloudera, Inc. All rights reserved.

Pankh – Building Pieces

DataSources

Kafka

NoSql

21© Cloudera, Inc. All rights reserved.

Pankh – Building Pieces

DataSources

Kafka

NoSql

22© Cloudera, Inc. All rights reserved.

Demo Time

22© Cloudera, Inc. All rights reserved.

25© Cloudera, Inc. All rights reserved.

Kappa Architecture

26© Cloudera, Inc. All rights reserved.

Demo Time

26© Cloudera, Inc. All rights reserved.

27© Cloudera, Inc. All rights reserved.

Thank youAshish Singhasingh@cloudera.com@singhasdev

top related