Protecting Your Data at Rest with Apache Kafka, by Confluent and Vormetric
TRANSCRIPT
1Confidential
Securing Your Streaming Data Platform: Operational Considerations for a Secure Deployment
Andrew Lance, Vormetric
David Tucker, Confluent
Agenda
• Introduction to Apache Kafka and Confluent
• Overview of Vormetric and its policy-driven security solution
• Confluent Platform deployment architecture
• Security considerations and solutions
• Q&A
About Confluent and Apache Kafka
• Founded by the creators of Apache Kafka
• Founded September 2014
• Technology developed while at LinkedIn
• 73% of active Kafka committers

Leadership
• Jay Kreps, CEO
• Neha Narkhede, CTO, VP Engineering
• Cheryl Dalrymple, CFO
• Luanne Dauber, CMO
• Todd Barnett, VP WW Sales
• Jabari Norton, VP Business Dev
Before: Many Ad Hoc Pipelines
After: Stream Data Platform with Kafka
[Diagram: Kafka, a distributed, fault-tolerant platform that stores messages and processes streams, connects sources (user tracking, operational logs, operational metrics, MySQL, Cassandra, Oracle) to destinations (Hadoop, Elastic Search, Splunk, data warehouse) serving search, security, and fraud detection applications]
What is a Stream Data Platform?
[Diagram: Kafka as the stream data platform connecting apps, Hadoop, search, NoSQL, RDBMS, monitoring, stream processing / real-time analytics, and the data warehouse]
• Synchronous req/response: 0 – 100s of ms
• Near real time: > 100s of ms
• Offline batch: > 1 hour

• Build streaming applications
• Deploy streaming applications at scale
• Monitor and manage streaming applications

Common Kafka use cases:
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Real-time monitoring
• Asynchronous applications
• Fraud and security
• Bridge to cloud
People Using Kafka Today
Industries: financial services, entertainment & media, consumer tech, travel & leisure, enterprise tech, telecom, retail
• 8 of the top 10 insurance companies & 7 of the top 10 banks in the Fortune 500
• 9 of the top 10 telcos in the Fortune 500
• 6 of the top 10 travel companies in the Fortune 500
Confluent Platform: It’s Kafka ++
Features and benefits, across Apache Kafka, Confluent Platform 3.0, and Confluent Enterprise 3.0:
• Apache Kafka: high-throughput, low-latency, highly available, secure distributed message system
• Kafka Connect: advanced framework for connecting external sources and destinations into Kafka
• Java Client: provides easy integration into Java applications
• Kafka Streams: simple library that enables streaming application development within the Kafka framework
• Additional Clients: supports non-Java clients (C, C++, Python, Go, etc.)
• REST Proxy: provides universal access to Kafka from any network-connected device via HTTP
• Schema Registry: central registry for the format of Kafka data; guarantees all data is always consumable
• Pre-Built Connectors: HDFS, JDBC, Elastic, and other connectors, fully certified and fully supported by Confluent
• Confluent Control Center: includes connector management and stream monitoring
• Support: community for Apache Kafka and Confluent Platform 3.0 (free); 24x7x365 with a Confluent Enterprise 3.0 subscription
Agenda
• Introduction to Apache Kafka and Confluent
• Overview of Vormetric and its policy-driven security solution
• Confluent Platform deployment architecture
• Security considerations and solutions
• Q&A
Vormetric Company Overview
Global customers:
• Over 1,500 customers
• 17 of the Fortune 30
Most security-conscious brands:
• Largest financial institutions
• Largest retail companies
• Major manufacturers
• Third-party business service providers
• Government agencies
Cloud service providers trust Vormetric.
Business drivers:
• Executive mandates (data breach, insider threat)
• Compliance
• SLAs
“With Vormetric, people have no idea it’s even running. Vormetric Encryption also saved us at least nine months of application rewrite effort, and its installation was one of the easiest we’ve ever experienced.” – Karl Mudra, CIO, Delta Dental of Missouri
Vormetric Data Security Platform
Capabilities delivered through the Vormetric Data Security Manager:
• Transparent Encryption
• Application Encryption
• Tokenization
• Data Masking
• Key Management
• Encryption Gateway
• KMaaS
• Security Intelligence
How Do We Encrypt? Sensitive Data Protection Technologies
• Data in motion (between devices): SSL, SSH, HTTPS, IPsec
• Data at rest (encryption, tokenization, data masking), applied at any of three layers:
  o Application/database
  o File system
  o Disk
Vormetric Transparent Encryption
Policy is used to restrict access to sensitive data by user and process information provided by the operating system.
[Diagram: users, applications, and databases make requests through the operating system; the Vormetric FS Agent sits between the OS and the file systems / volume managers, and communicates with the Data Security Manager over SSL/TLS*]
* Communication is only required at system boot.
Policy Example: Kafka

Policy summary:
1. Only the specified Kafka user, using only the verified Java process, has full read/write and automatic encrypt/decrypt access to the protected topic data.
2. Privileged admins and root accounts are allowed to manage the protected data without seeing the sensitive contents.
3. All other data requests are denied and audited.

# | Resource | User       | Process                          | Action        | Effect
1 | any      | Kafka user | Java                             | Read / Write  | Permit; Encrypt / Decrypt (audit optional)
2 | any      | root       | Whitelisted management processes | Metadata only | Permit; Audit
3 | any      | *          | *                                | *             | Deny & Audit

Policy benefits:
• Data-at-rest encryption without changing configs or application code
• Removes custodial risk of privileged root users
Vormetric Security Intelligence
Log all access and attempted access to what matters: the data.
• Reveals unauthorized access attempts to protected data
• Finds unusual access patterns
• Identifies compromised users, administrators, and applications
• Identifies attacks on data, such as APTs or malicious insiders
• Prebuilt integrations: Splunk, ArcSight, QRadar, LogRhythm
Agenda
• Introduction to Apache Kafka and Confluent
• Overview of Vormetric and its policy-driven security solution
• Confluent Platform deployment architecture
• Security considerations and solutions
• Q&A
Kafka Topics
Topic == Distributed Commit Log
• Immutable (persisted to broker storage)
• Ordered
• Sequential Offset
• Partitioned (for scalability)
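The commit-log semantics above can be sketched as a toy model (illustrative only, not how Kafka is implemented; the class and method names here are invented for the example):

```python
# Toy model of a Kafka topic: one append-only log per partition,
# where each record receives a sequential, immutable offset.
class Topic:
    def __init__(self, name, num_partitions=2):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Keyed records hash to a partition; ordering is per partition.
        p = hash(key) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset):
        # Records are immutable: reads never remove or modify data.
        return self.partitions[partition][offset]

t = Topic("user-tracking")
p, off = t.append("user-42", "page_view")
print(t.read(p, off))  # page_view
```

The model captures the four bullets above: records are never mutated, order holds within a partition, offsets are sequential, and partitioning is what lets the real system spread a topic across brokers.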
Kafka Deployment Architecture (simplified)
[Diagram: multiple producer/consumer clients connect to a cluster of broker nodes, coordinated by a three-node Zookeeper quorum]
• Zookeeper quorum manages metadata
• Broker nodes manage (and store) topic data
• Brokers and clients access ZK nodes
• Brokers communicate directly for replication (many-to-many)
• Broker and Zookeeper nodes utilize local storage
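As an illustration of how a single broker in this layout is wired together, a minimal server.properties fragment might look like the following (the broker ID, host names, and paths are placeholder values, not from the talk):

```properties
# Unique ID for this broker within the cluster
broker.id=1
# Local storage for topic/partition data (the data Vormetric will protect)
log.dirs=/var/lib/kafka/data
# The Zookeeper quorum that holds cluster metadata
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
# Brokers replicate partition data directly to one another
default.replication.factor=3
```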
Security Options
• Authentication
  o SSL certificate support for 1-way (broker-only) or 2-way (broker and client) authentication
  o SASL challenge/response support via Kerberos
  o Mix-and-match: SSL for wire-level encryption, SASL for authentication
• Authorization
  o Access Control Lists (ACLs)
  o Operations: Read, Write, Create, Describe, ClusterAction, All
  o Resources: Topic, Cluster, ConsumerGroup
  o NOTE: ACLs are stored in Zookeeper (along with all topic metadata)
• Data Encryption
  o Vormetric policy management
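As a sketch of how the ACL authorizer is switched on at the broker (property names as documented for Kafka 0.10 / Confluent Platform 3.0; the principal name is a placeholder):

```properties
# Enable the built-in ACL authorizer (ACLs are stored in Zookeeper)
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
# Principals that bypass ACL checks
super.users=User:admin
# Deny access to resources that have no ACL defined
allow.everyone.if.no.acl.found=false
```

Individual ACLs are then managed with the kafka-acls tool, e.g. `bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk1:2181 --add --allow-principal User:appuser --producer --topic payments` (hypothetical principal and topic names).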
Secure Deployments: Step by Step
• SSL configuration
  o Identify / deploy a Certificate Authority
  o Generate certificates (brokers, clients, or both)
  o Share / install certificates on brokers and/or clients
  o Set Kafka broker properties to restrict communication to SSL channels
• Kerberos configuration (SASL)
  o Identify / deploy Kerberos principal
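For the last SSL step, a broker-side configuration sketch might look like this (standard Kafka SSL property names; the paths and passwords are placeholders):

```properties
# Accept SSL connections only (no PLAINTEXT listener)
listeners=SSL://broker1:9093
security.inter.broker.protocol=SSL
# Broker identity and trust material
ssl.keystore.location=/etc/kafka/broker1.keystore.jks
ssl.keystore.password=changeit
ssl.key.password=changeit
ssl.truststore.location=/etc/kafka/truststore.jks
ssl.truststore.password=changeit
# Require client certificates for 2-way authentication
ssl.client.auth=required
```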
Secure Deployments: Step by Step (continued)
• Data encryption
  o Identify / deploy the Vormetric DSM
  o Configure cluster brokers and ZK nodes into the DSM domain
  o Create and distribute keys (could be coordinated with keys used by brokers and clients)
  o Define the encryption policy and apply it to the storage directories
  o (Test/dev best practice: exclude metadata operations from policy enforcement)
• References:
  o http://docs.confluent.io/3.0.0/kafka/security.html
  o <vormetric>
Solution Benefits
• End-to-end security management … from Kafka topic to storage layer
• Robust access controls across all layers
• Fine-grained access control
• Logical constraints on privileged users
• Alerting on in-band and out-of-band access attempts
Any questions?
Thank You