security implementation on hadoop
TRANSCRIPT
![Page 1: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/1.jpg)
© Cloudera, Inc. All rights reserved.
Security Implementation on Hadoop
Dr. Wei-Chiu Chuang | Software Engineer
![Page 2: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/2.jpg)
$ whoami
Software Engineer, Cloudera Apache Hadoop Committer/PMC
![Page 3: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/3.jpg)
Unguarded data stores are the victims
![Page 4: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/4.jpg)
Regulatory Compliance
Organizations can be fined up to 4% of annual global turnover or €20 million, whichever is greater, for breaching GDPR
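GDPR sets the ceiling for the most serious infringements at the greater of the two figures. A minimal sketch of that arithmetic, assuming turnover is given in whole euros (the function name is hypothetical):

```python
def gdpr_max_fine_eur(annual_global_turnover_eur: int) -> int:
    """Upper bound of a GDPR fine: the greater of 4% of turnover or EUR 20 million."""
    # Integer arithmetic avoids floating-point rounding on large amounts.
    four_percent = annual_global_turnover_eur * 4 // 100
    return max(four_percent, 20_000_000)


# A company with EUR 1 billion turnover is bounded by the 4% figure,
# while one with EUR 100 million turnover is bounded by the EUR 20 million figure.
print(gdpr_max_fine_eur(1_000_000_000))
print(gdpr_max_fine_eur(100_000_000))
```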
![Page 5: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/5.jpg)
Security Implementation
![Page 6: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/6.jpg)
Disclaimer
This talk serves as a general guideline for security implementation on Hadoop.
The actual implementation procedures and scope of implementation vary on a case-by-case basis and should be assessed by Cloudera’s Professional Services team or certified Cloudera SI Partners.
![Page 7: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/7.jpg)
Non-secure #0
Data Free for All
![Page 8: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/8.jpg)
Firewall
ActiveDirectory/KDC
Hadoop cluster
Cloudera Manager
Gateway node
Cloudera Navigator
Datacenter
Applications
![Page 9: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/9.jpg)
High Availability made Easy
![Page 10: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/10.jpg)
Identity Management
Simple Authentication
File group ownership
• AD integration
• SSSD or Centrify
Consideration in large enterprises.
[Diagram label: via SSSD]
![Page 11: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/11.jpg)
System Diagram #0
Firewall
ActiveDirectory
Master
Worker Worker Worker
Cloudera Manager
Master
(SSSD/Centrify)
![Page 12: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/12.jpg)
Simple authentication = no authentication
![Page 13: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/13.jpg)
Minimal Security #1
Reduce Risk Exposure
![Page 14: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/14.jpg)
Kerberos
EXAMPLE.COM
KDC
Hadoop
user
Strong Authentication
KDC
• MIT
• ActiveDirectory (more common)
realm
primary
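The "primary" and "realm" labels refer to the parts of a Kerberos principal name, which has the general form `primary[/instance]@REALM`. A minimal sketch of splitting such a name, assuming no escaped `/` or `@` characters; `parse_principal` and the host name `master1.example.com` are hypothetical:

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts (no escape handling)."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a fully qualified principal: {name!r}")
    primary, _, instance = local.partition("/")
    return Principal(primary, instance or None, realm)


# A user principal has no instance; a service principal usually carries a host name.
print(parse_principal("user@EXAMPLE.COM"))
print(parse_principal("hdfs/master1.example.com@EXAMPLE.COM"))
```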
![Page 15: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/15.jpg)
Kerberos
Considerations in large enterprises
Time synchronization
CM Kerberos Wizard
• Configure AD to create a Kerberos principal for the CM server, and to delegate to CM the ability to create/manage Kerberos principals
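Time synchronization matters because Kerberos rejects requests whose timestamps differ too much from the KDC clock. A minimal sketch of that check, assuming the common default tolerance of 300 seconds (a site may configure a different value; the function name is hypothetical):

```python
from datetime import datetime, timedelta, timezone

# 300 seconds is the usual default tolerance in MIT Kerberos and Active Directory;
# this constant is an assumption and may differ per deployment.
MAX_CLOCK_SKEW = timedelta(seconds=300)


def within_clock_skew(host_time: datetime, kdc_time: datetime,
                      tolerance: timedelta = MAX_CLOCK_SKEW) -> bool:
    """Return True if the two clocks differ by no more than the tolerance."""
    return abs(host_time - kdc_time) <= tolerance


kdc = datetime(2018, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(within_clock_skew(kdc + timedelta(minutes=2), kdc))   # small drift
print(within_clock_skew(kdc - timedelta(minutes=10), kdc))  # drift beyond the tolerance
```

In practice every cluster host and the KDC would be kept in sync with a time service such as NTP rather than checked by hand.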
![Page 16: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/16.jpg)
LDAP Authentication
* LDAP over SSL
![Page 17: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/17.jpg)
Authorization/Access Control
Access Control Lists (ACLs)
• HDFS file ACLs
• YARN job submission ACLs
• HBase ACLs
• Oozie ACLs
Sentry Managed (RBAC)
• Hive
• Impala
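Role-based access control in Sentry resolves a user to groups, groups to roles, and roles to privileges on objects. The sketch below illustrates that lookup chain only; the sample users, groups, roles, and the `is_allowed` helper are hypothetical and are not Sentry's API:

```python
from typing import Dict, Set, Tuple

# Hypothetical sample data: group membership would normally come from AD/LDAP via the OS.
USER_GROUPS: Dict[str, Set[str]] = {"alice": {"analysts"}, "bob": {"etl"}}
GROUP_ROLES: Dict[str, Set[str]] = {"analysts": {"read_sales"}, "etl": {"load_sales"}}
# Each privilege is a (database, table, action) triple.
ROLE_PRIVILEGES: Dict[str, Set[Tuple[str, str, str]]] = {
    "read_sales": {("sales", "orders", "SELECT")},
    "load_sales": {("sales", "orders", "SELECT"), ("sales", "orders", "INSERT")},
}


def is_allowed(user: str, database: str, table: str, action: str) -> bool:
    """Resolve user -> groups -> roles -> privileges and look for a match."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            if (database, table, action) in ROLE_PRIVILEGES.get(role, set()):
                return True
    return False


print(is_allowed("alice", "sales", "orders", "SELECT"))
print(is_allowed("alice", "sales", "orders", "INSERT"))
```

Privileges attach to roles rather than to individual users, so onboarding a user is a matter of group membership in the directory.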
![Page 18: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/18.jpg)
Auditing
![Page 19: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/19.jpg)
Backup/Disaster Recovery
Cloudera Backup/Disaster Recovery (BDR)
• A high-performance data replicator
• Incrementally copies data from the source cluster on a specified schedule
Supports
• Kerberos
• Data encryption
• HDFS replication to cloud
![Page 20: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/20.jpg)
Kerberized BDR Best Practice
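Production cluster: realm PROD.EXAMPLE.COM, with its own KDC
DR cluster: realm DR.EXAMPLE.COM, with its own KDC
Cloudera BDR replicates between the two clusters
Cross-realm trust between the two KDCs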
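With MIT Kerberos, a cross-realm trust is expressed as `krbtgt` principals that exist in both KDCs with matching keys (Active Directory configures trusts through its own tooling). A minimal sketch that only builds the principal names for a two-way trust between the two realms on this slide; the function name is hypothetical:

```python
from typing import List


def cross_realm_principals(realm_a: str, realm_b: str) -> List[str]:
    """Names of the krbtgt principals needed for a two-way trust between two realms."""
    return [
        f"krbtgt/{realm_b}@{realm_a}",  # lets principals from realm_a obtain tickets for realm_b
        f"krbtgt/{realm_a}@{realm_b}",  # lets principals from realm_b obtain tickets for realm_a
    ]


for name in cross_realm_principals("PROD.EXAMPLE.COM", "DR.EXAMPLE.COM"):
    print(name)
```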
![Page 21: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/21.jpg)
Firewall
System Diagram #1
ActiveDirectory/ KDC
Master
Worker Worker Worker
Cloudera Manager
Kerberos
Master
(SSSD/Centrify)
DR
![Page 22: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/22.jpg)
More Security #2
Managed, Secure, Protected
![Page 23: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/23.jpg)
Data In-Transit Encryption
RPC encryption
Data transport encryption
• Supports AES-CTR, up to 256-bit key length
HTTP TLS/SSL encryption
• No self-signed certificates in production
Master
Worker Worker Worker
Master
Application
RPC encryption
Transport encryption
TLS/SSL
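The three layers above map onto standard Hadoop configuration properties. The sketch below renders them in Hadoop's XML configuration format; the property names are the upstream Hadoop ones, while the helper name and the chosen values are assumptions for illustration. On a Cloudera Manager cluster these settings would normally be applied through CM rather than by editing files.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# RPC encryption: "privacy" adds encryption on top of authentication and integrity.
CORE_SITE: Dict[str, str] = {"hadoop.rpc.protection": "privacy"}

# Data transport encryption with AES-CTR and a 256-bit key, plus HTTPS for web endpoints.
HDFS_SITE: Dict[str, str] = {
    "dfs.encrypt.data.transfer": "true",
    "dfs.encrypt.data.transfer.cipher.suites": "AES/CTR/NoPadding",
    "dfs.encrypt.data.transfer.cipher.key.bitlength": "256",
    "dfs.http.policy": "HTTPS_ONLY",
}


def to_hadoop_xml(props: Dict[str, str]) -> str:
    """Render properties as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


print(to_hadoop_xml(CORE_SITE))
print(to_hadoop_xml(HDFS_SITE))
```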
![Page 24: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/24.jpg)
Data At-Rest Encryption
Transparent encryption
Supports any Hadoop application
Encryption Zone
$ hadoop key create mykey
$ hadoop fs -mkdir /zone
$ hdfs crypto -createZone -keyName mykey -path /zone
/
/tmp/zone
foo bar
Encryption zone
![Page 25: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/25.jpg)
Key Management Server Deployment (non-prod)
HDFS NameNode
Client
Java Keystore
KMS
Keystore file
Separation of duties
• Encryption Zone Key (EZK) is stored in the KMS server
• HDFS superuser cannot decrypt files
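This separation of duties rests on envelope encryption: in HDFS transparent encryption each file gets its own data encryption key (DEK), which HDFS stores only in encrypted form (EDEK) with the file metadata, while the EZK stays in the KMS. The sketch below illustrates that flow with a toy XOR keystream standing in for AES; `ToyKMS` and `toy_stream_xor` are hypothetical names, and the cipher is not secure.

```python
import hashlib
import secrets


def toy_stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for AES-CTR: XOR data with a SHA-256 based keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at len(data), so any surplus keystream is ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKMS:
    """Holds the encryption zone key (EZK); the key never leaves this object."""

    def __init__(self) -> None:
        self._ezk = secrets.token_bytes(32)

    def generate_edek(self) -> bytes:
        """Create a fresh DEK and return it wrapped (encrypted) with the EZK."""
        dek = secrets.token_bytes(32)
        return toy_stream_xor(self._ezk, dek)

    def decrypt_edek(self, edek: bytes) -> bytes:
        """Unwrap an EDEK; a real KMS would check the caller's authorization here."""
        return toy_stream_xor(self._ezk, edek)


kms = ToyKMS()
edek = kms.generate_edek()                    # the NameNode keeps only this wrapped key
dek = kms.decrypt_edek(edek)                  # an authorized client asks the KMS to unwrap it
ciphertext = toy_stream_xor(dek, b"contents of foo")  # what the DataNodes store
print(toy_stream_xor(dek, ciphertext))
```

Because the HDFS side only ever holds the EDEK and ciphertext, an HDFS administrator without KMS access cannot recover the file contents.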
![Page 26: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/26.jpg)
Key Management Server/Key Trustee Server Deployment
HDFS NameNode
Client
Key Trustee KMS
Key Trustee KMS
Firewall
Key Trustee Server
(Active)
Key Trustee Server
(Passive)
synchronization
(or more)
![Page 27: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/27.jpg)
KMS+KTS+HSM Deployment
HDFS NameNode
Client HSM KMS
HSM KMS
Firewall
Key Trustee Server
(Active)
Key Trustee Server
(Passive)
synchronization
Key HSM
(or more)
Key HSM
HSM
HSM
![Page 28: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/28.jpg)
Encryption Performance
![Page 29: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/29.jpg)
Troubleshooting: Encryption Performance Anomaly
• Configuration
• AES-NI Hardware acceleration
• OpenSSL library
• Entropy
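Two of these checks can be scripted on a Linux host: whether the CPU advertises the `aes` flag (AES-NI) and how much entropy the kernel reports. A minimal sketch, assuming the standard Linux `/proc` paths; the helper names are hypothetical, and on other platforms the files are simply treated as absent:

```python
from pathlib import Path


def has_aes_ni(cpuinfo_text: str) -> bool:
    """True if any 'flags' line in /proc/cpuinfo-style text lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def read_if_present(path: str) -> str:
    """Return the file contents, or an empty string if the file does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


print("AES-NI flag present:", has_aes_ni(read_if_present("/proc/cpuinfo")))
entropy = read_if_present("/proc/sys/kernel/random/entropy_avail").strip()
print("Available entropy:", entropy or "unknown")
```

For the OpenSSL item, `hadoop checknative` reports whether Hadoop's native library was able to load OpenSSL.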
![Page 30: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/30.jpg)
Fine Grained Access Control with Apache Sentry
![Page 31: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/31.jpg)
Firewall
System Diagram #2
ActiveDirectory/ KDC
Master
Worker Worker Worker
Cloudera Manager
Kerberos
Master
KMS KMS
Firewall
Key Trustee Key Trustee
(SSSD/Centrify)
![Page 32: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/32.jpg)
Most Security #3
Secure Data Vault
![Page 33: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/33.jpg)
Data Redaction
Personally Identifiable Information (PII)
• PCI-DSS, HIPAA
Best practice
Passwords
• Store in credential files, not in configuration
Logs, queries
• Cloudera Manager
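Log and query redaction is typically driven by search-and-replace rules based on regular expressions. A minimal sketch of the idea; the two rules and the `redact` helper are hypothetical examples, not the rules shipped with any product:

```python
import re

# Hypothetical example rules; real deployments define their own patterns.
REDACTION_RULES = [
    # 16-digit card-like numbers, optionally grouped by dashes or spaces.
    (re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"), "XXXX-XXXX-XXXX-XXXX"),
    # Values following "password=" (case-insensitive); the key itself is kept.
    (re.compile(r"(password\s*=\s*)\S+", re.IGNORECASE), r"\1********"),
]


def redact(line: str) -> str:
    """Apply every redaction rule to a single log or query line."""
    for pattern, replacement in REDACTION_RULES:
        line = pattern.sub(replacement, line)
    return line


print(redact("SELECT * FROM cards WHERE pan = '4111-1111-1111-1111'"))
print(redact("connecting with password=hunter2 to the database"))
```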
![Page 34: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/34.jpg)
Full Encryption
Encrypt Data Spills
• MapReduce
• Impala
• Hive
• Flume
OS-level encryption
• Navigator Encrypt
![Page 35: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/35.jpg)
Security Vulnerabilities
![Page 36: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/36.jpg)
Vulnerability Response and Process
Vulnerability reports
Upstream
Internal
External
Fix Publish
CVE
Cloudera TSB
![Page 37: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/37.jpg)
Cloudera Certified Technology
![Page 38: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/38.jpg)
Cloudera Certified Technology Partners
Data Sources | Data Ingest | Process, Refine & Prep | Data Discovery | Advanced Analytics
Connected Machines/Data sources
Other Data Sources
![Page 39: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/39.jpg)
Certification ensures that a product integrates with a secure cluster
Authentication
• Authenticate via Kerberos or LDAP
Authorization
• Handle Apache Sentry with Hive, Impala, Search, HDFS
Encryption
• Support HDFS transport encryption, at-rest encryption; support SSL/TLS connection encryption
![Page 40: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/40.jpg)
Cloudera SDX
![Page 41: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/41.jpg)
Cloudera Enterprise
The modern platform for machine learning and analytics optimized for the cloud
EXTENSIBLE SERVICES
CORE SERVICES
DATA ENGINEERING
OPERATIONAL DATABASE
ANALYTIC DATABASE
DATA SCIENCE
DATA CATALOG
INGEST & REPLICATION
SECURITY
GOVERNANCE
WORKLOAD MANAGEMENT
STORAGE SERVICES: S3, ADLS, HDFS, KUDU
![Page 42: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/42.jpg)
• Unified security – protects sensitive data with consistent controls, even for transient and recurring workloads
• Consistent governance – enables secure self-service access to all relevant data and increases compliance
• Easy workload management – increases user productivity and boosts job predictability
• Flexible ingest and replication – aggregates a single copy of all data, provides disaster recovery, and eases migration
• Shared catalog – defines and preserves structure and business context of data for new applications and partner solutions
Open platform services
Built for multi-function analytics | Optimized for cloud
![Page 43: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/43.jpg)
Successful use cases
![Page 44: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/44.jpg)
Cloudera Overview & Financial Services Focus
2000+ Strong Partner Ecosystem
1600+ Employees Globally
Strong Focus & Momentum in Financial Services
19 of the 30 G-SIBs Run on Cloudera
3 of the Fortune 500 Top 5 Insurers Run on Cloudera
5 of the Top 6 Asset Management Firms Run on Cloudera
200+ Financial Services Customers
![Page 45: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/45.jpg)
Building a Fantastic Customer Experience
• Improved customer experience
• 80 percent reduction in operating costs through a wide range of customer service and operational improvements
• Decrease in cost to service customers while increasing revenue through better service
CUSTOMER 360
FINANCIAL SERVICES
» PREDICTIVE ANALYTICS
» 360 CUSTOMER VIEW
» OPERATIONAL ANALYTICS
![Page 46: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/46.jpg)
Large healthcare provider enables practitioners to recommend at-home actions to prevent hospital visits
• Flexible, automatic data classification for diverse medical ontologies
• Self-service data discovery for real-time, data-driven decisions
![Page 48: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/48.jpg)
More information on Hadoop Security
![Page 49: Security implementation on hadoop](https://reader033.vdocuments.mx/reader033/viewer/2022051710/5a6488ca7f8b9a36568b4cfb/html5/thumbnails/49.jpg)
Books authored by Clouderans