webinar: data streaming with apache kafka & mongodb

Data Streaming with Apache Kafka & MongoDB. Andrew Morgan – MongoDB Product Marketing; David Tucker – Director, Partner Engineering and Alliances at Confluent. 8th November 2016

Upload: mongodb

Post on 07-Jan-2017


Page 1: Webinar: Data Streaming with Apache Kafka & MongoDB

Data Streaming with Apache Kafka & MongoDB

Andrew Morgan – MongoDB Product Marketing

David Tucker – Director, Partner Engineering and Alliances at Confluent

8th November 2016

Page 2: Webinar: Data Streaming with Apache Kafka & MongoDB

Agenda

Target Audience

Apache Kafka

MongoDB

Integrating MongoDB and Kafka

Kafka – What’s Next

Next Steps

Page 3: Webinar: Data Streaming with Apache Kafka & MongoDB

Target Audience

Page 9: Webinar: Data Streaming with Apache Kafka & MongoDB

Apache Kafka / Confluent Platform

Page 10: Webinar: Data Streaming with Apache Kafka & MongoDB

What does Kafka do?

Producers

Consumers

Kafka Connect

Kafka Connect

Topic

Your interfaces to the world

Connected to your systems in real time
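
To make the producer / topic / consumer vocabulary on this slide concrete, here is a minimal in-memory sketch. It is a toy model, not the Kafka client API: the `Topic` and `Consumer` classes and the `orders` data are hypothetical, and a real Kafka topic is partitioned, replicated and persisted on brokers.

```python
class Topic:
    """Toy stand-in for a Kafka topic: an append-only log of messages."""

    def __init__(self, name):
        self.name = name
        self.log = []  # messages in arrival order; the list index acts as the offset

    def append(self, message):
        """Producer side: add a message and return the offset it was given."""
        self.log.append(message)
        return len(self.log) - 1


class Consumer:
    """Reads a topic from its own offset, independently of other consumers."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0  # position of the next unread message

    def poll(self):
        """Return every message not yet seen by this consumer."""
        batch = self.topic.log[self.offset:]
        self.offset = len(self.topic.log)
        return batch


# A producer appends two messages to the (hypothetical) "orders" topic.
orders = Topic("orders")
orders.append({"id": 1})
orders.append({"id": 2})

# Two consumers read the same topic without affecting each other.
dashboard = Consumer(orders)
archiver = Consumer(orders)

first = dashboard.poll()      # the two messages produced so far
orders.append({"id": 3})
second = dashboard.poll()     # only the message produced since the last poll
everything = archiver.poll()  # the archiver starts from offset 0 and sees all three
```

The point of the sketch is that messages stay in the log after being read, so each consumer tracks its own position rather than removing messages from a queue.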

Page 11: Webinar: Data Streaming with Apache Kafka & MongoDB

What is Streaming Data

Synchronous Req/Response: 0 – 100s ms

Near Real Time: > 100s ms

Offline Batch: > 1 hour

[Diagram: Kafka as a Stream Data Platform, connected to Search, RDBMS, Apps, Monitoring, Real-time Analytics, NoSQL, Stream Processing and a Hadoop Data Lake; further labels: Impala, DWH, Hive, Spark, Map-Reduce]

Page 12: Webinar: Data Streaming with Apache Kafka & MongoDB

Confluent’s Offerings

Kafka: Core, Connect, Streams, Java Client

Confluent Platform: More Clients, REST Proxy, Schema Registry, Pre-Built Connectors

Confluent Platform Enterprise: Multi-data-center Replication, Advanced Data Balancing, Stream Monitoring, Connector Management

Page 13: Webinar: Data Streaming with Apache Kafka & MongoDB

Confluent Platform: It’s Kafka ++

Table columns: Feature, Benefit, Apache Kafka, Confluent Open Source, Confluent Enterprise

Apache Kafka: High throughput, low latency, high availability, secure distributed message system

Kafka Connect: Advanced framework for connecting external sources/destinations into Kafka

Kafka Streams: Simple library that enables streaming application development within the Kafka framework

Additional Clients: Supports non-Java clients; C, C++, Python, etc.

REST Proxy: Provides universal access to Kafka from any network connected device via HTTP

Schema Registry: Central registry for the format of Kafka data; guarantees all data is always consumable

Pre-Built Connectors: HDFS, JDBC, Elasticsearch and other connectors fully certified and fully supported by Confluent

Confluent Control Center: Enables easy connector management and stream monitoring

Data Center & Cloud: MDC Replication, auto-data balancing

Support: Enterprise class support to keep your Kafka environment running at top performance (Apache Kafka: Community; Confluent Open Source: Community; Confluent Enterprise: 24x7x365)

Apache Kafka: Free; Confluent Open Source: Free; Confluent Enterprise: Subscription
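
As an illustration of the REST Proxy row above ("access to Kafka from any network connected device via HTTP"), the sketch below builds, but does not send, an HTTP produce request. It assumes the REST Proxy v2 JSON format (`POST /topics/<topic>` with a `records` array), which may differ from the version current at the time of this webinar; the base URL, topic name and payload are hypothetical.

```python
import json


def build_produce_request(base_url, topic, values):
    """Return (url, headers, body) for a REST Proxy produce call.

    Assumes the v2 JSON embedded format; check the documentation of the
    REST Proxy version you actually run before relying on these headers.
    """
    url = f"{base_url.rstrip('/')}/topics/{topic}"
    headers = {
        "Content-Type": "application/vnd.kafka.json.v2+json",
        "Accept": "application/vnd.kafka.v2+json",
    }
    # Each message is wrapped as {"value": ...} inside a "records" list.
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return url, headers, body


# Hypothetical proxy address, topic and sensor reading.
url, headers, body = build_produce_request(
    "http://localhost:8082",
    "sensor-readings",
    [{"device": "a", "temp": 21.5}],
)
```

Any HTTP client can then POST `body` to `url` with `headers`, which is what lets devices without a native Kafka client publish data.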

Page 14: Webinar: Data Streaming with Apache Kafka & MongoDB

Common Kafka Use Cases

Data transport and integration

• Log data

• Database changes

• Sensors and device data

• Monitoring streams

• Call data records

• Stock ticker data

Real-time stream processing

• Monitoring

• Asynchronous applications

• Fraud and security

Page 15: Webinar: Data Streaming with Apache Kafka & MongoDB

Kafka Adoption in Large Enterprises

6 of the top 10 travel companies

8 of the top 10 insurance companies

7 of the top 10 global banks

9 of the top 10 telecom companies

Page 16: Webinar: Data Streaming with Apache Kafka & MongoDB

People Using Kafka Today

Financial Services

Entertainment & Media

Consumer Tech

Travel & Leisure

Enterprise Tech

Telecom

Retail

Page 17: Webinar: Data Streaming with Apache Kafka & MongoDB

MongoDB

Page 18: Webinar: Data Streaming with Apache Kafka & MongoDB

Relational

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 19: Webinar: Data Streaming with Apache Kafka & MongoDB

The World Has Changed

Data

Risk

Time

Cost

Page 20: Webinar: Data Streaming with Apache Kafka & MongoDB

NoSQL

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 21: Webinar: Data Streaming with Apache Kafka & MongoDB

Nexus Architecture

Scalability & Performance

Always On, Global Deployments

Flexibility

Expressive Query Language & Secondary Indexes

Strong Consistency

Enterprise Management & Integrations

Page 22: Webinar: Data Streaming with Apache Kafka & MongoDB

Integrating MongoDB and Kafka

Page 23: Webinar: Data Streaming with Apache Kafka & MongoDB

Where MongoDB Fits

[Diagram: one Producer writes to Topic A and another Producer writes to Topic B; each topic passes through a Filter; Merge writes to Topic C; Analyze writes to Topic D; Take Action consumes the result]
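
The Filter, Merge, Analyze and Take Action stages in this diagram can be sketched in plain Python. This is only an illustration of the data flow, not Kafka Streams or any Kafka API: the lists stand in for topics, and the event fields (`ts`, `device`, `temp`) and the 90-degree threshold are hypothetical.

```python
import heapq
from collections import Counter


def keep(stream, predicate):
    """Filter stage: pass on only the events that satisfy the predicate."""
    return (event for event in stream if predicate(event))


def merge_by_time(*streams):
    """Merge stage: combine time-ordered streams into one time-ordered stream."""
    return heapq.merge(*streams, key=lambda event: event["ts"])


def analyze(stream):
    """Analyze stage: count how many events each device contributed."""
    counts = Counter()
    for event in stream:
        counts[event["device"]] += 1
    return counts


# Stand-ins for Topic A and Topic B, each already ordered by timestamp.
topic_a = [
    {"ts": 1, "device": "a", "temp": 20},
    {"ts": 3, "device": "a", "temp": 95},
    {"ts": 5, "device": "b", "temp": 97},
]
topic_b = [
    {"ts": 2, "device": "c", "temp": 99},
    {"ts": 4, "device": "c", "temp": 30},
]


def is_hot(event):
    return event["temp"] > 90


# Filter each input, then merge into what the slide calls Topic C.
topic_c = list(merge_by_time(keep(topic_a, is_hot), keep(topic_b, is_hot)))

# Analyze Topic C to produce the content of Topic D.
topic_d = analyze(topic_c)

# Take Action: here, simply list the devices that need attention.
devices_to_check = sorted(topic_d)
```

In a real deployment each stage would be a separate consumer/producer (or a stream-processing job) reading from one topic and writing to the next, so stages can be scaled and restarted independently.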

Page 24: Webinar: Data Streaming with Apache Kafka & MongoDB

Where MongoDB Fits

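[Diagram: one Producer writes to Topic A and another Producer writes to Topic B; each topic passes through a Filter; Merge writes to Topic C; Analyze writes to Topic D; Take Action consumes the result. Additional labels on this slide: Store Results, Operational Database]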

Page 25: Webinar: Data Streaming with Apache Kafka & MongoDB

Where MongoDB Fits

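[Diagram: one Producer writes to Topic A and another Producer writes to Topic B; each topic passes through a Filter; Merge writes to Topic C; Analyze writes to Topic D; Take Action consumes the result. Additional labels on this slide: Store Results, Key Events, Operational Database]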

Page 27: Webinar: Data Streaming with Apache Kafka & MongoDB

Where MongoDB Fits

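[Diagram: one Producer writes to Topic A and another Producer writes to Topic B; each topic passes through a Filter; Merge writes to Topic C; Analyze writes to Topic D; Take Action consumes the result. Additional labels on this slide: Store Results, Key Events, Operational Database, Reference Data]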

Page 28: Webinar: Data Streaming with Apache Kafka & MongoDB

Where K-Streams Fits

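[Diagram: one Producer writes to Topic A and another Producer writes to Topic B; the Filter and Merge labels are no longer shown ahead of Topic C; Analyze writes to Topic D; Take Action consumes the result. Additional labels on this slide: Store Results, Key Events, Operational Database, Reference Data, Kafka Streams]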

Page 29: Webinar: Data Streaming with Apache Kafka & MongoDB

MongoDB As a Kafka Producer

Page 30: Webinar: Data Streaming with Apache Kafka & MongoDB

Design Pattern: Operationalized Data Lake

[Diagram labels]

Data sources: Sensors, User Data, Clickstreams, Logs

Message Queue; Kafka Streams

Data flows: Raw Data, Processed Events

Applications: Customer Data Mgmt, Mobile App, IoT App, Live Dashboards

Real-Time Access: Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations

Batch Processing, Batch Views: Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model

Distributed Processing Frameworks

Analytics: Churn Analysis, Enriched Customer Profiles, Risk Modeling, Predictive Analytics

Page 31: Webinar: Data Streaming with Apache Kafka & MongoDB

Design Pattern: Operationalized Data Lake

Callout: Configure where to land incoming data

[Diagram: same Operationalized Data Lake components as on Page 30 (data sources, Message Queue, Kafka Streams, Real-Time Access, Batch Processing, Distributed Processing Frameworks, applications and analytics)]

Page 32: Webinar: Data Streaming with Apache Kafka & MongoDB

Design Pattern: Operationalized Data Lake

Callout: Raw data processed to generate analytics models

[Diagram: same Operationalized Data Lake components as on Page 30 (data sources, Message Queue, Kafka Streams, Real-Time Access, Batch Processing, Distributed Processing Frameworks, applications and analytics)]

Page 33: Webinar: Data Streaming with Apache Kafka & MongoDB

Design Pattern: Operationalized Data Lake

Callout: MongoDB exposes analytics models to operational apps. Handles real time updates

[Diagram: same Operationalized Data Lake components as on Page 30 (data sources, Message Queue, Kafka Streams, Real-Time Access, Batch Processing, Distributed Processing Frameworks, applications and analytics)]

Page 34: Webinar: Data Streaming with Apache Kafka & MongoDB

Design Pattern: Operationalized Data Lake

Callout: Compute new models against MongoDB & HDFS

[Diagram: same Operationalized Data Lake components as on Page 30 (data sources, Message Queue, Kafka Streams, Real-Time Access, Batch Processing, Distributed Processing Frameworks, applications and analytics)]

Page 35: Webinar: Data Streaming with Apache Kafka & MongoDB
Page 38: Webinar: Data Streaming with Apache Kafka & MongoDB

Kafka – What’s Next

Page 39: Webinar: Data Streaming with Apache Kafka & MongoDB

Kafka Connectors

• Confluent-supported connectors (included in CP)

• Community-written connectors (just a sampling)

[Connector logos; only the label "JDBC" survives]

Page 40: Webinar: Data Streaming with Apache Kafka & MongoDB

Kafka Futures

• Apache Core

  • Admin API (KIP-4)

  • Exactly-once delivery semantics

  • Time-based topic indexing

• Kafka Streams

  • Exactly-once processing semantics

  • Interactive Queries: enable real-time sharing of application state with other applications

• Confluent Platform Enterprise

  • Multi-cluster views and expanded alerting added to Control Center

Page 41: Webinar: Data Streaming with Apache Kafka & MongoDB

Next Steps

Page 42: Webinar: Data Streaming with Apache Kafka & MongoDB

MongoDB Atlas

Database as a service for MongoDB

MongoDB Atlas is…

• Automated: The easiest way to build, launch, and scale apps on MongoDB

• Flexible: The only database as a service with all you need for modern applications

• Secured: Multiple levels of security available to give you peace of mind

• Scalable: Deliver massive scalability with zero downtime as you grow

• Highly available: Your deployments are fault-tolerant and self-healing by default

• High performance: The performance you need for your most demanding workloads

Page 43: Webinar: Data Streaming with Apache Kafka & MongoDB

MongoDB Atlas Features

Database as a service for MongoDB

Run for You

• Spin up a cluster in minutes

• Replicated & always-on deployments

• Fully elastic: scale out or up in a few clicks with zero downtime

• Automatic patches & simplified upgrades for the newest MongoDB features

Safe & Secure

• Authenticated & encrypted

• Continuous backup with point-in-time recovery

• Fine-grained monitoring & custom alerts

No Lock-In

• On-demand pricing model; billed by the hour

• Multi-cloud support (AWS available with others coming soon)

• Part of a suite of products & services designed for all phases of your app; migrate easily to different environments (private cloud, on-prem, etc.) when needed

Page 44: Webinar: Data Streaming with Apache Kafka & MongoDB

MongoDB Enterprise Advanced

Tooling

• MongoDB Ops Manager or MongoDB Cloud Manager Premium

• MongoDB Compass

• MongoDB Connector for BI

• Cloud Foundry Integration

Security

• Encrypted Storage Engine

• LDAP / Kerberos Integration

• DDL & DML Auditing

• FIPS 140-2 Support

Support

• 24 x 7 Support

• 1 hr SLA

• Emergency Patches

• Customer Success Program

• On-Demand Training

License

• Commercial License

Page 46: Webinar: Data Streaming with Apache Kafka & MongoDB

Old Billingsgate, London

15th November

mongodb.com/europe

Use my discount code for 20% off: andrewmorgan20