Don’t Let Security Be The ‘Elephant in the Room’: Enterprise Security for Big Data

Mitch Ferguson, VP Business Development, Hortonworks
Jeremy Stieglitz, VP Business Development, Voltage Security

Upload: hortonworks

Post on 10-May-2015

Category: Technology

DESCRIPTION

Enterprise security for big data

TRANSCRIPT

Page 1: Don't Let Security Be The 'Elephant in the Room

Don’t Let Security Be The ‘Elephant in the Room’

Enterprise Security for Big Data

Mitch Ferguson, VP Business Development, Hortonworks

Jeremy Stieglitz, VP Business Development, Voltage Security

Page 2: Don't Let Security Be The 'Elephant in the Room

© Hortonworks Inc. 2013

Hortonworks

Community Driven Enterprise Apache Hadoop

June 2013

Page 3: Don't Let Security Be The 'Elephant in the Room

A Brief History of Apache Hadoop

• Focus on INNOVATION. 2005: Yahoo! creates team under E14 to work on Hadoop
• Focus on OPERATIONS. 2008: Yahoo team extends focus to operations to support multiple projects & growing clusters
• STABILITY. 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo

[Timeline, 2004 to 2013: Apache Project Established; Yahoo! begins to operate at scale; Enterprise Hadoop; Hortonworks Data Platform]

Page 4: Don't Let Security Be The 'Elephant in the Room

Hortonworks Snapshot

• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform

• We engineer, test & certify HDP for enterprise usage

• We employ the core architects, builders and operators of Apache Hadoop

• We drive innovation within Apache Software Foundation projects

• We are uniquely positioned to deliver the highest quality of Hadoop support

• We enable the ecosystem to work better with Hadoop

Develop Distribute Support

We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution

Endorsed by Strategic Partners

Headquarters: Palo Alto, CA
Employees: 200+ and growing
Investors: Benchmark, Index, Yahoo

Page 5: Don't Let Security Be The 'Elephant in the Room

Enabling Hadoop as Enterprise Big Data Platform

Hortonworks Data Platform

Pillars: OPERATIONS, ECOSYSTEM, DEVELOPER

• Enterprise Ready & Easy to Use
• Data Platform Services & Open APIs
• Enable Ecosystem at Each Layer

• Applications, Business Tools, Development Tools, Open APIs and access, Data Movement & Integration, Data Management Systems, Systems Management
• Installation & Configuration, Administration, Monitoring, High Availability, Replication, Multi-tenancy, ..
• Metadata, Indexing, Search, Security, Management, Data Extract & Load, APIs

Page 6: Don't Let Security Be The 'Elephant in the Room

Hortonworks Partner Eco-System 140+

Page 7: Don't Let Security Be The 'Elephant in the Room

Apache Community Leadership

Apache Software Foundation Guiding Principles
• Release early & often
• Transparency, respect, meritocracy

Key Roles held by Hortonworkers
• PMC Members
  – Managing community projects
  – Mentoring new incubator projects
  – Over 20 Hortonworkers managing community
• Committers
  – Authoring, reviewing & editing code
  – Over 50 Hortonworkers across projects
• Release Managers
  – Testing & releasing projects
  – Hortonworkers across key projects like Hadoop, Hive, Pig, HCatalog, Ambari, HBase

[Diagram: Design & Develop, Test & Patch, Release cycle across Apache Hadoop, Apache Pig, Apache HCatalog, Apache HBase, Apache Hive, Apache Ambari and other Apache projects]

“We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog.”
- Jeff Kelly: Wikibon

Page 8: Don't Let Security Be The 'Elephant in the Room

Leadership that Starts at the Core

• Driving next generation Hadoop
  – YARN, MapReduce2, HDFS2, High Availability, Disaster Recovery
• 420k+ lines authored since 2006
  – More than twice nearest contributor
• Deeply integrating w/ecosystem
  – Enabling new deployment platforms (ex. Windows & Azure, Linux & VMware HA)
  – Creating deeply engineered solutions (ex. Teradata big data appliance)
• All Apache, NO holdbacks
  – 100% of code contributed to Apache

Page 9: Don't Let Security Be The 'Elephant in the Room

Hortonworks Process for Enterprise Hadoop

Upstream Community Projects
Downstream Enterprise Product

• Virtuous cycle when development & fixed issues done upstream & stable project releases flow downstream
• No Lock-in: Integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects

[Diagram: upstream Apache projects (Hadoop, Pig, Hive, HBase, HCatalog, Ambari, others) with a Design & Develop, Test & Patch, Release cycle; Hortonworks Data Platform with Design & Develop, Integrate & Test, Package & Certify, Distribute; arrows labeled Stable Project Releases and Fixed Issues]

Page 10: Don't Let Security Be The 'Elephant in the Room

Enhancing the Core of Apache Hadoop

Deliver high-scale storage & processing with enterprise-ready platform services

Unique Focus Areas:
• Bigger, faster, more flexible: Continued focus on speed & scale and enabling near-real-time apps
• Tested & certified at scale: Run ~1300 system tests on large Yahoo clusters for every release
• Enterprise-ready services: High availability, disaster recovery, snapshots, security, …

Hortonworkers are the architects, operators, and builders of core Hadoop

[Diagram: HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness)]

Page 11: Don't Let Security Be The 'Elephant in the Room

Data Services for Full Data Lifecycle

Provide data services to store, process & access data in many ways

Unique Focus Areas:
• Apache HCatalog: Metadata services for consistent table access to Hadoop data
• Apache Hive: Explore & process Hadoop data via SQL & ODBC-compliant BI tools

Hortonworks enables Hadoop data to be accessed via existing tools & systems

[Diagram: DATA SERVICES (Store, Process and Access Data); HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness)]

Page 12: Don't Let Security Be The 'Elephant in the Room

Operational Services for Ease of Use

Include complete operational services for productive operations & management

Unique Focus Area:
• Apache Ambari: Provision, manage & monitor a cluster; complete REST APIs to integrate with existing operational tools; job & task visualizer to diagnose issues

Only Hortonworks provides a complete open source Hadoop management tool

[Diagram: OPERATIONAL SERVICES (Manage & Operate at Scale); DATA SERVICES (Store, Process and Access Data); HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness)]

Page 13: Don't Let Security Be The 'Elephant in the Room

Deployable Across a Range of Options

Only Hortonworks allows you to deploy seamlessly across any deployment option

• Linux & Windows
• Azure, Rackspace & other clouds
• Virtual platforms
• Big data appliances

[Diagram: HORTONWORKS DATA PLATFORM (HDP) with OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), deployable on OS, Cloud, VM, Appliance]

Page 14: Don't Let Security Be The 'Elephant in the Room

HDP: Enterprise Hadoop Distribution

Hortonworks Data Platform (HDP)
Enterprise Hadoop

• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP) with OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), deployable on OS, Cloud, VM, Appliance]

Page 15: Don't Let Security Be The 'Elephant in the Room

HDP: Enterprise Hadoop Distribution

Hortonworks Data Platform (HDP)
Enterprise Hadoop

• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP), deployable on OS, Cloud, VM, Appliance]
• OPERATIONAL SERVICES: OOZIE, AMBARI
• DATA SERVICES: HIVE & HCATALOG, PIG, HBASE
• LOAD & EXTRACT: SQOOP, FLUME, NFS, WebHDFS
• HADOOP CORE: HDFS, MAP REDUCE
• PLATFORM SERVICES (Enterprise Readiness): High Availability, Disaster Recovery, Security and Snapshots

Page 16: Don't Let Security Be The 'Elephant in the Room

HDP: Enterprise Hadoop Distribution 2.0

Hortonworks Data Platform (HDP)
Enterprise Hadoop

• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP), deployable on OS/VM, Cloud, Appliance]
• OPERATIONAL SERVICES: OOZIE, AMBARI, FALCON*
• DATA SERVICES: HIVE & HCATALOG, PIG, HBASE
• LOAD & EXTRACT: SQOOP, FLUME, NFS, WebHDFS, KNOX*
• HADOOP CORE: HDFS, YARN*, MAP REDUCE, TEZ*, OTHER
• PLATFORM SERVICES (Enterprise Readiness): High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots

*included in HDP 2.0

Page 17: Don't Let Security Be The 'Elephant in the Room

Apache Knox Gateway

[Diagram labels]
• Clients: Browser, REST Client, JDBC Client
• Identity: Enterprise Identity Provider, Enterprise/Cloud SSO Provider
• DMZ: Firewall, Knox Gateway Cluster (GW, GW, GW), Firewall
• Secure Hadoop Cluster: Masters (JT, NN, WebHCat, Oozie, YARN, HBase, Hive), Slaves (DN, TT), AA
• Also shown: Ambari Server, HUE

Page 18: Don't Let Security Be The 'Elephant in the Room

Hadoop Common Patterns of Use

Big Data: Transactions, Interactions, Observations

Business Cases

HORTONWORKS DATA PLATFORM
• Refine (Batch)
• Explore (Interactive)
• Enrich (Online)

“Right-time” Access to Data

Page 19: Don't Let Security Be The 'Elephant in the Room

Operational Data Refinery

Transform & refine ALL sources of data
Also known as Data Reservoir or Catch Basin

1. Capture
2. Process
3. Distribute & Retain

[Diagram]
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
• DATA SYSTEMS: HORTONWORKS DATA PLATFORM (Refine, Explore, Enrich); TRADITIONAL REPOS (RDBMS, EDW, MPP)
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications

Page 20: Don't Let Security Be The 'Elephant in the Room

Big Data Exploration & Visualization

Leverage “data lake” to perform iterative investigation for value

1. Capture
2. Process
3. Explore & Visualize

[Diagram]
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
• DATA SYSTEMS: HORTONWORKS DATA PLATFORM (Refine, Explore, Enrich); TRADITIONAL REPOS (RDBMS, EDW, MPP)
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications

Page 21: Don't Let Security Be The 'Elephant in the Room

Application Enrichment

Create intelligent applications
Collect data, create analytical models and deliver to online apps

1. Capture
2. Process & Compute
3. Deliver Model

[Diagram]
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
• DATA SYSTEMS: HORTONWORKS DATA PLATFORM (Refine, Explore, Enrich); TRADITIONAL REPOS (RDBMS, EDW, MPP); NOSQL
• APPLICATIONS: Custom Applications, Enterprise Applications

Page 22: Don't Let Security Be The 'Elephant in the Room

Don’t Let Security Be The ‘Elephant in the Room’

Enterprise Security for Big Data

Jeremy Stieglitz

Page 23: Don't Let Security Be The 'Elephant in the Room

Extracting Value from Data: Big Data Now Includes Sensitive Data

• Marketing – analyze purchase patterns
• Social media – find best customer segments
• Financial systems – model trading data
• Banking and insurance – 360° customer view
• Security – identify credit card fraud
• Healthcare – advance disease prevention

How do you liberate the value in data – without increasing risk?

Copyright 2013 Voltage Security

Page 24: Don't Let Security Be The 'Elephant in the Room

Hidden Risks in Big Data Adoption

Big Data
• Enables deeper data analysis
• More value from old data
• New risks if data is not protected

Data Concentration Risks
– Financial Positions
– Market Position
– Changes to big picture
– Corporate Compliance risk

Cloud Adoption Risks
– Sensitive data in untrusted systems.
– Data in storage, in use, transmitted to cloud.

Data Sharing Risks
– Compliance challenges with 3rd party risk
– Data in and out of the enterprise

Breach Risks
– Internal users
– External shares
– Backups, Hadoop stores, data feeds.

Page 25: Don't Let Security Be The 'Elephant in the Room

Data Security Approaches: IT Infrastructure Security

Security Coverage by layer:
• Application
• Access: Authentication and Access Control
• Network: SSL/TLS
• Database: Transparent Database Encryption (TDE)
• OS/Storage: Full disk encryption

[Diagram also marks three Security Gaps]

Page 26: Don't Let Security Be The 'Elephant in the Room

Data Security Approaches: IT Infrastructure Security

Security Coverage by layer:
• Application
• Access: Authentication and Access Control
• Network: SSL/TLS
• Database: Transparent Database Encryption (TDE)
• OS/Storage: Full disk encryption

[Diagram also marks three Security Gaps]

• More keys
• More secure
• Less computation
• Application aware

• Fewer keys
• Less secure
• More computation
• Transparent

“Check box” encryption, available from cloud providers

Page 27: Don't Let Security Be The 'Elephant in the Room

Traditional IT Security vs. Data-centric security

Traditional IT Infrastructure Security (Security Coverage by layer):
• Data/Application
• Access: Authentication and Access Control
• Network: SSL/TLS/Firewalls
• Database: Transparent Database Encryption (TDE), triggers
• OS / Storage: Full disk encryption

[Diagram also marks four Security Gaps (Data in the Clear)]

Data-Centric Security
• Top down: Application-layer data protection provides seamless end-to-end data security
• Encrypt once, persistently protect from point of capture: in storage, in transit, in use
• If attacked, data has no value

Page 28: Don't Let Security Be The 'Elephant in the Room

Requirements for Big Data Security

• Lock data in place
• More keys to manage
• Horizontal support to wherever your data travels

Page 29: Don't Let Security Be The 'Elephant in the Room

Data – structure, value, and meaning

Take a simple Tax ID. It’s more than just a number.

• It has a format and structure
• It has value in being unique
• Its parts have value – e.g. last 4 digits
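The point about structure can be illustrated with a minimal Python sketch. It assumes the 3-2-4 hyphenated layout of the deck's sample value `934-72-2356`; the function name `mask_tax_id` and the `XXX-XX-` mask are hypothetical and are not taken from any Voltage or Hortonworks product.

```python
import re

# Assumed layout: three digits, two digits, four digits, separated by hyphens.
TAX_ID_PATTERN = re.compile(r"^(\d{3})-(\d{2})-(\d{4})$")


def mask_tax_id(tax_id: str) -> str:
    """Hide the first five digits but keep the format and the last four digits."""
    match = TAX_ID_PATTERN.match(tax_id)
    if match is None:
        raise ValueError(f"not a 3-2-4 Tax ID: {tax_id!r}")
    return f"XXX-XX-{match.group(3)}"


masked = mask_tax_id("934-72-2356")
print(masked)  # XXX-XX-2356
```

The masked value still fits the original field and keeps the last four digits usable, but masking is one-way: unlike encryption, the original cannot be recovered from it.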


Page 30: Don't Let Security Be The 'Elephant in the Room

Traditional Encryption Practically Eliminates Value in the Data

• Destroys the original value – makes data secure, but incompatible
• Changes format of data – requires schema changes
• Changes size of field – increases storage
• Always requires application and data flow changes: “Ripping up the Roads”
• Destroys any special encoding or checksums (Luhn checksum in credit cards, driver’s license checksums for certain states)

Tax ID: 934-72-2356
AES-CBC
Encrypted Tax ID: uE28W&=209gX32F*52
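For readers unfamiliar with the Luhn checksum mentioned above, here is a minimal Python sketch of the standard check. The function name `luhn_is_valid` is illustrative, and `4111 1111 1111 1111` is a widely used test card number rather than a value from this deck.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits pass the Luhn check; spaces and hyphens are ignored."""
    stripped = number.replace(" ", "").replace("-", "")
    if not (stripped.isascii() and stripped.isdigit()):
        return False  # anything that is not purely numeric cannot carry a Luhn check digit
    total = 0
    # Walk right to left, doubling every second digit and folding results above 9.
    for index, char in enumerate(reversed(stripped)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))  # True
print(luhn_is_valid("uE28W&=209gX32F*52"))   # False
```

The second call shows the slide's complaint in miniature: conventional ciphertext is no longer a numeric string, so a downstream system that validates the check digit rejects it.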


Page 31: Don't Let Security Be The 'Elephant in the Room

Voltage Format-Preserving Encryption™ (FPE)

• Standard, proven mode of AES (NIST FFX mode – ask NIST)
• Encrypt at capture. Data stays protected at all times
• Fit into existing systems, protocols, schemas – any data
• Enable operation on encrypted data – retains the value of the original data
• Protect live data in applications & databases, business process or transactions
• Create de-identified data for test, cloud apps, outsourcers
• Can preserve validation checksums

              Credit Card                  Tax ID
Original      7412 3456 7890 0000          934-72-2356
Regular AES   8juYE%Uks&dDFa2345^WFLERG    Ija&3k24kQarotugDF2390^32
FPE           7412 3423 3526 0000          298-24-2356
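The format-preserving idea can be sketched with a toy Feistel network over decimal digits. This is only an illustration: it is not the NIST FFX mode named on the slide and not Voltage's implementation. It swaps in HMAC-SHA256 as the round function, it is not suitable for protecting real data, and all names (`protect`, `unprotect`, `ROUNDS`, the demo key) are hypothetical.

```python
import hashlib
import hmac

ROUNDS = 10  # even, so the two halves end up back in their original positions


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed pseudo-random number in [0, modulus) for one Feistel round."""
    message = f"{round_no}:{half}".encode("ascii")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _feistel(key: bytes, digits: str, decrypt: bool) -> str:
    if len(digits) < 2:
        raise ValueError("need at least two digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    rounds = reversed(range(ROUNDS)) if decrypt else range(ROUNDS)
    for i in rounds:
        width = u if i % 2 == 0 else v
        modulus = 10 ** width
        if decrypt:
            # Undo one round: subtract the round value from the right half.
            a, b = str((int(b) - _round_value(key, i, a, modulus)) % modulus).zfill(width), a
        else:
            # One round: add the round value to the left half, then swap halves.
            a, b = b, str((int(a) + _round_value(key, i, b, modulus)) % modulus).zfill(width)
    return a + b


def _apply(key: bytes, value: str, decrypt: bool) -> str:
    """Transform only the digits; separators such as '-' or ' ' stay in place."""
    digits = "".join(ch for ch in value if ch in "0123456789")
    out = iter(_feistel(key, digits, decrypt))
    return "".join(next(out) if ch in "0123456789" else ch for ch in value)


def protect(key: bytes, value: str) -> str:
    return _apply(key, value, decrypt=False)


def unprotect(key: bytes, value: str) -> str:
    return _apply(key, value, decrypt=True)


key = b"demo-only-key"  # hypothetical; never hard-code real keys
token = protect(key, "934-72-2356")
print(token)                  # keeps the 3-2-4 digit layout
print(unprotect(key, token))  # round-trips to the original value
```

Because the output has the same length, character class and separators as the input, it fits an existing column or schema unchanged, which is the property the table above contrasts with regular AES output.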


Page 32: Don't Let Security Be The 'Elephant in the Room

Stateless Key Management

Keys when you need them, not when you don’t.
• Keys derived on the fly
• Simple - lower risk, lower cost
• Scale to millions of users
• Keys don’t stay resident
• Standards Based
• FPE/AES Symmetric keys
• Structured and unstructured data

Identity Based Encryption IEEE 1363.3
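The idea of deriving keys on the fly instead of storing them can be sketched as follows. This is a deliberately simplified stand-in using a plain HMAC-based derivation; it is not Identity Based Encryption as specified in IEEE 1363.3 and not Voltage's key management. The function name, the master secret and the identity strings are all assumptions made for illustration.

```python
import hashlib
import hmac


def derive_key(master_secret: bytes, identity: str, length: int = 32) -> bytes:
    """Recompute a per-identity symmetric key on demand instead of storing it."""
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32 bytes")
    digest = hmac.new(master_secret, identity.encode("utf-8"), hashlib.sha256).digest()
    return digest[:length]


master = b"demo-master-secret"  # hypothetical; a real deployment would protect this value
key_a = derive_key(master, "analyst@example.com")
key_b = derive_key(master, "analyst@example.com")
print(key_a == key_b)  # True: the same identity always yields the same key
```

Since every key can be recomputed from the master secret and an identity string, there is no per-user key database to store, replicate or back up, which is the sense in which such a scheme is stateless.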


Page 33: Don't Let Security Be The 'Elephant in the Room

High-performance Data Security

Voltage SecureData™ for Hadoop

Hadoop ecosystem: ETL tools, HIVE, MapReduce jobs, other query and analysis tools

Page 34: Don't Let Security Be The 'Elephant in the Room

Three Insertion Points into Hortonworks Data Platform (HDP)

#1. Upon Ingest: APIs, CL, Batch tools for ETL, SQOOP, Streaming, etc.

Page 35: Don't Let Security Be The 'Elephant in the Room

Three Insertion Points into Hortonworks Data Platform (HDP)

#2. Executed as Map Job
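A map-side protection step can be sketched in the style of a Hadoop Streaming mapper, which reads records as lines on standard input and writes records to standard output. The sketch below is an assumption-laden illustration, not Voltage SecureData: `protect_field` is a keyed, non-reversible pseudonym standing in for real format-preserving encryption, and `SECRET`, `SENSITIVE_COLUMN` and the tab-separated record layout are hypothetical.

```python
import hashlib
import hmac
import io

SECRET = b"demo-only-secret"  # hypothetical; a real job would not hard-code a key
SENSITIVE_COLUMN = 1          # hypothetical zero-based index of the field to protect


def protect_field(value: str) -> str:
    """Stand-in protection: a keyed, non-reversible pseudonym (not FPE)."""
    return hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def map_line(line: str) -> str:
    """Protect one tab-separated record and leave the other fields untouched."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) > SENSITIVE_COLUMN:
        fields[SENSITIVE_COLUMN] = protect_field(fields[SENSITIVE_COLUMN])
    return "\t".join(fields)


def run(stream_in, stream_out) -> None:
    """Mapper loop; under Hadoop Streaming this would be run(sys.stdin, sys.stdout)."""
    for line in stream_in:
        stream_out.write(map_line(line) + "\n")


# Local demonstration with in-memory streams instead of a cluster.
source = io.StringIO("alice\t934-72-2356\tNY\nbob\t298-24-2356\tCA\n")
sink = io.StringIO()
run(source, sink)
print(sink.getvalue(), end="")
```

Running the protection inside the map phase spreads the work across the cluster, so it scales with the data instead of depending on a single ingest host.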


Page 36: Don't Let Security Be The 'Elephant in the Room

Three Insertion Points into Hortonworks Data Platform (HDP)

#3. UDFs for PIG, Hive, etc.

Page 37: Don't Let Security Be The 'Elephant in the Room

Benefits of Voltage SecureData

• Solves complex global compliance issues
• Ensures data stays protected wherever it goes
• Enables accurate analytics on encrypted data
• Optimizes performance
• Flexibly adapts to the fast-growing Hadoop ecosystem
• Delivers maximum return on information – without increased risk

Page 38: Don't Let Security Be The 'Elephant in the Room

Use Case: Fortune 50 Healthcare Products and Services Company

• Challenge
  – Sell new information-based services to medical suppliers & drug companies
  – Big Data team tasked with securing 1000 node Hadoop cluster for HIPAA, HITECH
• Solution
  – Data de-identified in ETL move before entering Hadoop
  – Ability to decrypt analytic results when needed, through multiple tools
• Benefits
  – Ability to monetize existing medical data, and fine-tune manufacturing and marketing

Page 39: Don't Let Security Be The 'Elephant in the Room

Use Case: Banking
Top Worldwide Financial Institution

• Challenge
  – Credit risk and consumer fraud groups
  – PCI compliance is #1 driver
  – ETL offload use case with Hadoop alongside DW
• Solution
  – Integrate with Sqoop on ingestion, and Hive and Pig on the applications / query side to protect 20 types of data
  – Fraud analysts work with SST tokenized credit card numbers and only de-tokenize as needed
• Benefits
  – Enable fraud and risk analytics directly in Hadoop on protected data
  – Use Hadoop processing with security and compliance for faster time to insight

Page 40: Don't Let Security Be The 'Elephant in the Room

Contacts

- Hortonworks
  - http://hortonworks.com/
  - http://hortonworks.com/partners/certified-technology-program/
  - USA: (855) 8-HORTON (1 for sales)
  - Intl: (408) 916-4121 (1 for sales)

- Voltage Security
  - http://www.voltage.com/
  - http://www.voltage.com/partners/technology-partners/hortonworks/
  - Tel: +1 (408) 886-3200
  - [email protected]

Page 41: Don't Let Security Be The 'Elephant in the Room

THANK YOU