Cloudera GoDataFest Security and Governance

Cloudera Security & Governance
Wim Villano, Sales Engineer, Cloudera

Upload: godatadriven

Post on 21-Jan-2018


TRANSCRIPT

Page 1: Cloudera GoDataFest Security and Governance

© Cloudera, Inc. All rights reserved.

Cloudera Security & Governance

Wim Villano, Sales Engineer, Cloudera

Page 2: Cloudera GoDataFest Security and Governance


Comprehensive, Compliance-Ready Security
Authentication, Authorization, Audit, and Compliance

Perimeter: Guarding access to the cluster itself
Technical Concepts: Authentication, Network isolation
Cloudera Manager

Access: Defining what users and applications can do with data
Technical Concepts: Permissions, Authorization
Apache Sentry

Visibility: Reporting on where data came from and how it’s being used
Technical Concepts: Auditing, Lineage
Cloudera Navigator

Data: Protecting data in the cluster from unauthorized visibility
Technical Concepts: Encryption, Tokenization, Data masking
Navigator Encrypt & Key Trustee | Partners

Page 3: Cloudera GoDataFest Security and Governance


Perimeter Security – Isolation, Authentication

Preserve user choice of the right Hadoop service (e.g. Impala, Spark)

Conform to centrally managed authentication policies

Implement with existing standard systems: Active Directory (LDAP) and Kerberos

Perimeter: Guarding access to the cluster itself
Technical Concepts: Authentication, Network isolation
Cloudera Manager

Page 4: Cloudera GoDataFest Security and Governance


Active Directory and Kerberos

Active Directory
• Manages Users, Groups, and Services
• Provides username / password authentication
• Group membership determines Service access

Kerberos
• Trusted and standard third-party
• Authenticated users receive “Tickets”
• “Tickets” gain access to Services

User [ssmith], Password [*****]
1. User authenticates to AD
2. Authenticated user gets Kerberos Ticket
3. Ticket grants access to Services, e.g. Impala
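
The login flow above can be sketched as a toy model. This is not the Kerberos protocol (there is no KDC, no ticket-granting ticket, and no encrypted exchange) and it does not talk to Active Directory. The directory contents, the `Ticket` class, the service-to-group mapping, and every function name are hypothetical, chosen only to show that the password is presented once and that services afterwards check a signed, time-limited ticket.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass


def _hash(password: str, salt: bytes) -> bytes:
    # Salted password hash, so the toy directory never stores a plain password.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical stand-in for Active Directory: users, credentials, groups.
_SALT = secrets.token_bytes(16)
DIRECTORY = {
    "ssmith": {"salt": _SALT, "hash": _hash("s3cret", _SALT), "groups": {"analysts"}},
}

# Hypothetical mapping: which groups may use which service.
SERVICE_GROUPS = {"impala": {"analysts"}, "hbase": {"admins"}}

# Secret shared by the ticket issuer and the services (assumption of this toy).
SIGNING_KEY = secrets.token_bytes(32)


@dataclass(frozen=True)
class Ticket:
    user: str
    groups: frozenset
    expires_at: float
    signature: bytes


def _sign(user: str, groups: frozenset, expires_at: float) -> bytes:
    msg = f"{user}|{','.join(sorted(groups))}|{expires_at}".encode()
    return hmac.new(SIGNING_KEY, msg, hashlib.sha256).digest()


def authenticate(user, password, lifetime_s=3600.0, now=None):
    """Steps 1 and 2: check the credentials, then issue a ticket (or None)."""
    entry = DIRECTORY.get(user)
    if entry is None:
        return None
    if not hmac.compare_digest(entry["hash"], _hash(password, entry["salt"])):
        return None
    now = time.time() if now is None else now
    groups = frozenset(entry["groups"])
    expires_at = now + lifetime_s
    return Ticket(user, groups, expires_at, _sign(user, groups, expires_at))


def access_service(ticket, service, now=None):
    """Step 3: a service accepts a valid, unexpired ticket from an allowed group."""
    now = time.time() if now is None else now
    expected = _sign(ticket.user, ticket.groups, ticket.expires_at)
    if not hmac.compare_digest(ticket.signature, expected):
        return False  # forged or altered ticket
    if now >= ticket.expires_at:
        return False  # expired ticket
    return bool(ticket.groups & SERVICE_GROUPS.get(service, set()))
```

In this sketch the service never sees the password. It only verifies the ticket signature, the expiry time, and the group membership carried in the ticket.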

Page 5: Cloudera GoDataFest Security and Governance


Access Security Requirements

Provide users access to data needed to do their job

Centrally manage access policies

Leverage a role-based access control model built on AD

Access: Defining what users and applications can do with data
InfoSec Concept: Authorization
Apache Sentry

Page 6: Cloudera GoDataFest Security and Governance


Authorization

• (Linux) POSIX: Directory, File
• (Linux) ACL: Management of services/resources
• Apache Sentry: RBAC within services (Impala, Hive, Search, Kafka)
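
As an illustration of the first layer in this list, POSIX permissions are mode bits for owner, group, and others on each file or directory. The sketch below uses Python's standard `stat` module. The helper `can_read` and the owner, group, and user names are hypothetical, and the check is deliberately simplified: it ignores ACL entries and superuser privileges.

```python
import stat


def can_read(mode: int, file_owner: str, file_group: str,
             user: str, user_groups: set) -> bool:
    """Simplified POSIX read check: owner bits, else group bits, else other bits."""
    if user == file_owner:
        return bool(mode & stat.S_IRUSR)
    if file_group in user_groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# A regular file with mode 640: owner read/write, group read, others nothing.
mode = stat.S_IFREG | 0o640
print(stat.filemode(mode))  # -rw-r-----
```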

Page 7: Cloudera GoDataFest Security and Governance


RBAC and Centralized Authorization

Manage data access by role, instead of by individual user
• Customer Support Rep has read access to US Customers
• Broker Analyst has read access to US Transactions
• Relationships between users and roles are established via groups

An RBAC policy is then uniformly enforced for all Hadoop services
• Provides unified authorization controls
• As opposed to tools for managing numerous, service-specific policies
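
The user, group, role, and privilege chain described on this slide can be sketched as three lookup tables. This is a minimal illustration of role-based access control in general, not Sentry's implementation. The user names, group names, and resource identifiers are hypothetical, modelled on the two example roles above.

```python
from typing import NamedTuple


class Privilege(NamedTuple):
    action: str
    resource: str


# Users are never granted privileges directly; they only belong to groups.
USER_GROUPS = {
    "alice": {"support_reps"},
    "bob": {"broker_analysts"},
}

# Groups are mapped to roles.
GROUP_ROLES = {
    "support_reps": {"customer_support_rep"},
    "broker_analysts": {"broker_analyst"},
}

# Roles carry the privileges.
ROLE_PRIVILEGES = {
    "customer_support_rep": {Privilege("read", "us_customers")},
    "broker_analyst": {Privilege("read", "us_transactions")},
}


def privileges_for(user: str) -> set:
    """Collect every privilege reachable via user -> groups -> roles."""
    granted = set()
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            granted |= ROLE_PRIVILEGES.get(role, set())
    return granted


def is_allowed(user: str, action: str, resource: str) -> bool:
    return Privilege(action, resource) in privileges_for(user)
```

Changing what a whole team may do then means editing one role, and moving a person between teams means changing group membership only.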

Page 8: Cloudera GoDataFest Security and Governance


Unified Authorization with Apache Sentry

Sentry provides unified authorization via:
• Fine-grained RBAC for Impala, Hive, Search and Kafka
• Impala/Hive permissions synced in HDFS for all other components (Spark, MapReduce, etc.)

Goal: Unified authorization for all Hadoop services and applications

User: Sam Smith
Group: Fraud Analysts
Sentry Role: Fraud Analyst Role
Sentry Perm.: Read Access to ALL Transaction Data
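
In Hive and Impala, a chain like the Fraud Analyst example is typically expressed with Sentry's SQL role statements (`CREATE ROLE`, `GRANT ROLE ... TO GROUP`, `GRANT SELECT ON TABLE ... TO ROLE`). The sketch below only builds those statement strings and does not execute anything. The function name and the role, group, and table names are hypothetical, and treating "ALL Transaction Data" as a single table called `transactions` is an assumption made for illustration.

```python
import re

# Accept only plain identifiers so that names cannot smuggle in extra SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def read_grant_statements(role: str, group: str, table: str) -> list:
    """Build the statements that give a group read access to one table via a role."""
    role, group, table = _ident(role), _ident(group), _ident(table)
    return [
        f"CREATE ROLE {role};",
        f"GRANT ROLE {role} TO GROUP {group};",
        f"GRANT SELECT ON TABLE {table} TO ROLE {role};",
    ]


for statement in read_grant_statements("fraud_analyst", "fraud_analysts", "transactions"):
    print(statement)
```

Sam Smith does not appear in any statement: his access follows from membership in the group, which is managed in the directory rather than in Sentry.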

Page 9: Cloudera GoDataFest Security and Governance


Visual Policy Management

Page 10: Cloudera GoDataFest Security and Governance


Cloudera Manager Roles - Separation of Duties

• Auditor
• Read-Only
• Limited Operator
• Operator
• Configurator
• Cluster Administrator
• BDR Administrator
• Navigator Administrator
• User Administrator
• Key Administrator
• Full Administrator

Page 11: Cloudera GoDataFest Security and Governance


Cloudera Manager Role Permissions

Page 12: Cloudera GoDataFest Security and Governance


Data Security Requirements

Perform analytics on regulated data

Encrypt data, conform to key management policies, protect from root

Integrate with existing HSM as part of key management infrastructure

Data: Protecting data in the cluster from unauthorized visibility
InfoSec Concept: Compliance
Navigator Encrypt & Key Trustee

Page 13: Cloudera GoDataFest Security and Governance


Compliance-Ready Encryption & Key Management

Cloudera’s Solution:
• ALL data encrypted: HDFS, HBase, metadata, log files, ingest paths
• Enterprise Key Management via Navigator Key Trustee
• Configuration support via Cloudera Manager
• Audit integration to Cloudera Navigator
• Optional root-of-trust integration with HSMs

[Diagram: Manager, Navigator, Impala, Hive, HDFS, HBase, Sentry, Navigator Key Trustee, Log Files, Metadata Store, Ingest Paths. Legend: Encrypted Data, Encryption Key]

Page 14: Cloudera GoDataFest Security and Governance


Encryption Firewall

CM Agents

End Points

Page 15: Cloudera GoDataFest Security and Governance


Visibility Security Requirements

Understand where report data came from and discover more data like it

Comply with policies for audit, data classification, and lineage

Centralize the audit repository; perform discovery; automate lineage

Visibility: Reporting on where data came from and how it’s being used
InfoSec Concept: Audit
Cloudera Navigator
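
"Where report data came from" is a lineage question: given a data set, walk back through everything it was derived from. The sketch below shows that idea as a breadth-first traversal over a hand-written dependency map. It is not Cloudera Navigator's API or data model. The `DERIVED_FROM` map, the data set names, and the `upstream` function are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each data set maps to the data sets it was derived from.
DERIVED_FROM = {
    "fraud_report": ["scored_transactions"],
    "scored_transactions": ["transactions_clean", "customer_profiles"],
    "transactions_clean": ["raw_transactions"],
    "customer_profiles": ["raw_customers"],
}


def upstream(dataset: str) -> list:
    """Return every ancestor of `dataset`, nearest sources first."""
    seen = set()
    order = []
    queue = deque(DERIVED_FROM.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # a source reachable by two paths is reported once
        seen.add(current)
        order.append(current)
        queue.extend(DERIVED_FROM.get(current, []))
    return order


print(upstream("fraud_report"))
```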

Page 16: Cloudera GoDataFest Security and Governance


Governance is the Foundation of Data Management

Compliance: Track, understand and protect access to data
• Am I prepared for an audit?
• Who’s accessing sensitive data?
• What are they doing with the data?
• Is sensitive data governed and protected?

Stewardship: Manage and organize data assets at Hadoop scale
• How can I efficiently manage data lifecycle, from ingest to purge?
• How can I efficiently organize and classify all my data?
• How can I efficiently make data available to my end users?

End User Productivity: Effortlessly find and trust the data that matters most
• How can I find and explore data sets on my own?
• Can I trust what I find?
• How do I use what I find?
• How do I find and use related data sets?

Administration: Boost user productivity and cluster performance
• Is my data optimized to support current access patterns?
• How can I optimize for future workloads?
• How can I migrate workloads to Hadoop risk-free?

Hadoop Governance Foundation
Centralized audits, Unified metadata catalog, Comprehensive lineage, Data policies

Page 17: Cloudera GoDataFest Security and Governance


Thank you
Wim Villano

Page 18: Cloudera GoDataFest Security and Governance


Reference Architecture