data masking using enterprise manager
Data Masking using Enterprise Manager – Managing Sensitive Information in Non-Production Environments
Ofir Manor
Senior Technology Specialist, Oracle [email protected]
Agenda
• Introduction
• Data Masking Overview
• Data Masking Examples
• Related EM technology
Securing Production Environment
• In recent years, increasing attention has been given to securing the production environment:
  • Regulatory requirements (you know the list…)
  • Internet access everywhere (customers, partners)
  • Increasing threats
  • Increasing awareness of inside and outside threats
• Oracle Database has a lot of functionality for this. For example:
  • Authentication – ASO (Advanced Security Option)
  • Network traffic encryption – ASO
  • Data-at-rest encryption – ASO’s Transparent Data Encryption, Oracle Secure Backup
  • Access control – privileges, roles, VPD, Label Security…
  • Auditing – regular audit, Fine-Grained Audit, Oracle Audit Vault
  • Limiting “super users” – Oracle Database Vault
What About Other Environments?
• Important systems -> many environments
  • Pre-prod, test, dev, training
  • Usually more than one of each type
• Sensitive information all over the place
• QA / dev can usually do anything in these environments.
• DBAs / sys admins can usually do anything in these environments
• Sometimes partners have full access to these environments (consultants, outsourcing dev / testing / monitoring etc)
• Are these environments audited?
• Do you practice careful access control?
What Can Be Done?
There are two options:
1. Invest heavily in securing all your database environments
  • Adds IT administrative overhead – auditing, privilege management etc.
  • Annoys QA / dev – “Not fun”
  • Will always be a lower priority
  • Might be neglected, worked around etc. over time
2. Make sure no sensitive data reaches these environments
  • Mask the data while provisioning these environments
  • Sensitive data cannot leak if it’s not there
  • An elegant, compliant solution
What is data masking?
What
• The act of anonymizing customer, financial, or company confidential data to create new, legible data which retains the data's properties, such as its width, type, and format.
Why
• To protect confidential data in test environments when the data is used by developers or offshore vendors
• When customer data is shared with 3rd parties without revealing personally identifiable information

Before masking:
LAST_NAME    SSN          SALARY
AGUILAR      203-33-3234  40,000
BENSON       323-22-2943  60,000
D’SOUZA      989-22-2403  80,000
FIORANO      093-44-3823  45,000

After masking:
LAST_NAME    SSN          SALARY
ANSKEKSL     111-23-1111  40,000
BKJHHEIEDK   111-34-1345  60,000
KDDEHLHESA   111-97-2749  80,000
FPENZXIEK    111-49-3849  45,000
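
To make "retains width, type, and format" concrete, here is a minimal Python sketch. It is not how the Data Masking Pack works internally; the function name `mask_preserving_format` is made up for illustration. It replaces every digit with a random digit and every letter with a random letter of the same case, and leaves separators alone.

```python
import random
import string

def mask_preserving_format(value, rng):
    """Replace letters and digits with random ones; keep everything else."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # separators such as '-' or an apostrophe stay put
    return "".join(out)

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
masked_ssn = mask_preserving_format("203-33-3234", rng)
masked_name = mask_preserving_format("D'SOUZA", rng)
print(masked_ssn, masked_name)
```

The masked values keep the original length and punctuation, so application code that validates the column format should still accept them.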
Major features
• Data mask format library
• Define once; execute multiple times
• View sample data before masking
• Automatic database referential integrity when masking primary keys
  • Implicit – database enforced
  • Explicit – application enforced
• Installed as part of Oracle Enterprise Manager (Grid Control) 10g Release 4 (10.2.0.4)
[Diagram: Enterprise Manager Data Masking Pack. Production is cloned to Staging, the mask is applied there, and Staging is cloned to Test.]
Format Libraries
• Mask Primitives
  • Random Number
  • Random String
  • Random Date within range
  • Shuffle
  • Substring of original value
  • Table Column
  • User Defined Function
• National Identifiers
  • Social Security Numbers
  • Credit Card Numbers
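
A rough Python sketch of how a few of these primitives might behave and be combined into a format. All function names here are hypothetical and are not part of Enterprise Manager. The `masked_ssn` format assumes the fixed "111-" prefix seen in the masked sample data earlier in the deck.

```python
import random
import string
from datetime import date, timedelta

def random_digits(rng, length):
    """Random Number primitive: a string of `length` random digits."""
    return "".join(rng.choice(string.digits) for _ in range(length))

def random_string(rng, min_len, max_len):
    """Random String primitive: uppercase letters, length within a range."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))

def random_date(rng, start, end):
    """Random Date primitive: a date between start and end, inclusive."""
    span = (end - start).days
    return start + timedelta(days=rng.randint(0, span))

def substring(value, start, length):
    """Substring primitive: keep part of the original value."""
    return value[start:start + length]

def masked_ssn(rng):
    # A format as a sequence of primitives: fixed prefix, then random digits.
    return "111-" + random_digits(rng, 2) + "-" + random_digits(rng, 4)

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
ssn = masked_ssn(rng)
name = random_string(rng, 6, 10)
hired = random_date(rng, date(2000, 1, 1), date(2005, 12, 31))
print(ssn, name, hired, substring("4929123456781234", 0, 4))
```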
Example – Create a New Format
User-defined mask formats
Email notification testing
Masking Definitions
• Associates formats with database
  • Maps formats to table columns being masked
  • Defines dependent columns
• Associated Database target
  • Automatically identifies foreign key relationships
  • Can specify undeclared constraints as related columns
• Import-from or export-to XML
• “Create like” to apply to similar databases
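
The idea behind dependent columns can be sketched in a few lines of Python. This is an illustration only, with made-up table and column names (`customers`, `orders`, `cust_id`): the masked key is generated once for the parent column, and the same mapping is applied to every related column, whether the relationship is declared in the database or known only to the application.

```python
import random

def build_key_map(parent_keys, rng):
    """Map each original key to a distinct masked key."""
    keys = list(parent_keys)
    masked = rng.sample(range(100000, 1000000), len(keys))  # unique 6-digit values
    return dict(zip(keys, masked))

def apply_key_map(rows, column, key_map):
    """Return new rows with `column` replaced through the shared mapping."""
    return [{**row, column: key_map[row[column]]} for row in rows]

customers = [{"cust_id": 1, "name": "AGUILAR"}, {"cust_id": 2, "name": "BENSON"}]
orders = [{"order_id": 10, "cust_id": 1}, {"order_id": 11, "cust_id": 2},
          {"order_id": 12, "cust_id": 1}]

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
key_map = build_key_map([c["cust_id"] for c in customers], rng)
masked_customers = apply_key_map(customers, "cust_id", key_map)
masked_orders = apply_key_map(orders, "cust_id", key_map)
print(masked_customers, masked_orders)
```

Because both tables go through the same `key_map`, every masked order still points at an existing masked customer.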
Referential Integrity Enforcement
Database-enforced
Application-enforced
Pre-Masking Validation
• Ensure uniqueness can be maintained
• Ensure formats match column data types
• Check space availability
• Warn about check constraints
• Check presence of default partitions
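
A few of these checks are easy to picture in code. The Python sketch below is a simplified assumption of what such validation could look like, not the product's actual logic; the dictionary keys (`type`, `width`, `unique`, `distinct_values`, `random_digits`) are invented for the example.

```python
def validate_format(column, fmt):
    """Return a list of problems found before masking; empty means OK."""
    problems = []
    if column["type"] != fmt["type"]:
        problems.append("format type does not match column type")
    if fmt["width"] > column["width"]:
        problems.append("masked value may overflow the column")
    # n random digits can yield at most 10**n distinct values
    capacity = 10 ** fmt["random_digits"]
    if column["unique"] and capacity < column["distinct_values"]:
        problems.append("format cannot generate enough unique values")
    return problems

ssn_column = {"type": "string", "width": 11, "unique": True,
              "distinct_values": 2_000_000}
prefixed_format = {"type": "string", "width": 11, "random_digits": 6}  # 111-NN-NNNN
full_format = {"type": "string", "width": 11, "random_digits": 9}      # NNN-NN-NNNN

print(validate_format(ssn_column, prefixed_format))
print(validate_format(ssn_column, full_format))
```

With a fixed "111-" prefix only six digits vary, so a unique column holding two million values cannot be masked without collisions; the nine-digit format passes.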
Masking Workflow
[Diagram: two swimlanes, Security Admin and DBA, across the Prod, Staging, and Test databases. Steps shown: Identify Sensitive Information, Identify Data Formats, Format Library, Masking Definition, Review Mask Definition, Clone Prod to Staging, Execute Mask, Clone Staging to Test.]
Performance
• Optimizations
  • SQL parallelism for tables > 1 million rows
  • Statistics collection before & after masking
  • CTAS statement with NOLOGGING
• Test results
  • Case 1
    • 60GB database
    • 100 tables, 215 columns
    • 20 mins
  • Case 2
    • 6 column, 100 million row table
    • Random Number
    • 1.3 hours
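
For a rough sense of scale, the two test results can be turned into throughput figures. This is plain arithmetic on the numbers above; the Case 1 figure divides the whole database size by the elapsed time, which is only a coarse proxy since not every table was masked.

```python
# Case 2: 100 million rows masked in 1.3 hours
rows = 100_000_000
hours = 1.3
rows_per_second = rows / (hours * 3600)

# Case 1: 60 GB database processed in 20 minutes
db_gb = 60
minutes = 20
gb_per_minute = db_gb / minutes

print(f"Case 2: about {rows_per_second:,.0f} rows per second")
print(f"Case 1: about {gb_per_minute:.0f} GB per minute")
```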
Data Masking Pack feature details
• Data Masking primitives
  • Random numbers
  • Random digits
  • Random strings
  • Random date
  • User defined function (PL/SQL)
  • Exportable and importable format definition (XML-based)
• Masking algorithms
  • Unique value generation
  • Shuffle
  • Constant
• Mask definition
  • Association of masking formats with application schema
  • Related application columns without defined constraints in data dictionary
  • Exportable and importable XML mask definitions
  • “Create Like” to apply mask definition to other databases
• Validation
  • Mask validation with data type
  • Data overflow validation
  • Multiple parent FKs, circular dependency, constraints
  • Automatic exclusion of CLOB, BLOB, NCLOB, LONG, LONG RAW, XML column types
  • Imported mask definition validated against database schema
  • Space availability check
• Efficiency
  • One bulk operation per table regardless of number of masked columns
  • CTAS to recreate masked table
  • Leverage database features, e.g. parallelism, no logging
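
The "one bulk operation per table" idea can be illustrated with a small Python sketch that uses SQLite as a stand-in for Oracle. It is an assumption about the general pattern, not the SQL that Enterprise Manager generates: a mapping table holds the masked values for every masked column, and a single CREATE TABLE ... AS SELECT rebuilds the table in one pass. The names `emp`, `emp_map`, and `emp_id` are hypothetical, and Oracle-specific options such as NOLOGGING and parallelism have no SQLite equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE emp (emp_id INTEGER, last_name TEXT, ssn TEXT, salary INTEGER)")
cur.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (1, "AGUILAR", "203-33-3234", 40000),
    (2, "BENSON", "323-22-2943", 60000),
])

# Mapping table: one row per original row, holding every masked column value.
cur.execute("CREATE TABLE emp_map (emp_id INTEGER, new_last_name TEXT, new_ssn TEXT)")
cur.executemany("INSERT INTO emp_map VALUES (?, ?, ?)", [
    (1, "ANSKEKSL", "111-23-1111"),
    (2, "BKJHHEIEDK", "111-34-1345"),
])

# One CTAS rebuilds the table with both masked columns in a single pass.
cur.execute("""
    CREATE TABLE emp_masked AS
    SELECT e.emp_id, m.new_last_name AS last_name, m.new_ssn AS ssn, e.salary
    FROM emp e JOIN emp_map m ON m.emp_id = e.emp_id
""")

# Swap the masked copy in place of the original and drop the mapping.
cur.execute("DROP TABLE emp")
cur.execute("DROP TABLE emp_map")
cur.execute("ALTER TABLE emp_masked RENAME TO emp")
conn.commit()

rows = cur.execute("SELECT last_name, ssn, salary FROM emp ORDER BY emp_id").fetchall()
print(rows)
```

Dropping the original table and the mapping table afterwards matters: until they are gone, the sensitive values are still present in the database.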
Handling First / Last Name
• Using Shuffle
  • Useful if first name and last name are in different columns
  • Preserves real values and real data distribution
  • Bigger data sets minimize leak risk
• Using Random Strings
  • Really random
  • Not real names, different data distribution
• Using a table based lookup
  • Example – fakenamegenerator.com
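
The shuffle and table-lookup options can be sketched in Python as follows. The helper names and the replacement name list are invented for the example; a real lookup table would be loaded from a prepared list of fake names.

```python
import random

def shuffle_column(rows, column, rng):
    """Shuffle: redistribute the real values of one column across rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: v} for row, v in zip(rows, values)]

def lookup_column(rows, column, replacements, rng):
    """Table lookup: replace each value with one drawn from a fake-name list."""
    return [{**row, column: rng.choice(replacements)} for row in rows]

people = [
    {"first_name": "DANA", "last_name": "AGUILAR"},
    {"first_name": "MARK", "last_name": "BENSON"},
    {"first_name": "RITA", "last_name": "FIORANO"},
    {"first_name": "OMAR", "last_name": "LEVI"},
]
fake_last_names = ["SMITH", "JONES", "BROWN"]  # hypothetical lookup table

rng = random.Random(3)  # fixed seed only so the sketch is repeatable
shuffled = shuffle_column(people, "first_name", rng)
looked_up = lookup_column(people, "last_name", fake_last_names, rng)
print(shuffled)
print(looked_up)
```

Shuffling keeps the real set of values, so some rows may end up unchanged in a small table; that is why bigger data sets reduce the leak risk.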
Israeli ID Number
• Israeli ID Number uses a check digit
  • IsraCard, Mastercard etc. also use some kind of check digit
• The check digit protects against:
  • One digit error
  • Two adjacent digits swapped
• The algorithm is well documented
  • Easy to write a function to do it
Israeli ID Number Algorithm
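
The slide content for the algorithm was not captured in this transcript. As a stand-in, here is a minimal Python sketch assuming the commonly documented scheme: a Luhn-style checksum over nine digits, with weights 1, 2, 1, 2, … from the left, the digits of each product summed, and a total divisible by 10. The function names are made up; in the Data Masking Pack this logic would live in a user defined PL/SQL function instead.

```python
import random

def luhn_style_sum(digits):
    """Weight digits 1,2,1,2,... from the left; add the digits of each product."""
    total = 0
    for i, d in enumerate(digits):
        product = d * (1 if i % 2 == 0 else 2)
        total += product if product < 10 else product - 9
    return total

def check_digit(first_eight):
    """Ninth digit that makes the weighted sum divisible by 10."""
    digits = [int(c) for c in first_eight]
    return (10 - luhn_style_sum(digits) % 10) % 10

def is_valid(id_number):
    return (len(id_number) == 9 and id_number.isdigit()
            and luhn_style_sum([int(c) for c in id_number]) % 10 == 0)

def random_israeli_id(rng):
    """Masked value: eight random digits plus a matching check digit."""
    first_eight = "".join(str(rng.randint(0, 9)) for _ in range(8))
    return first_eight + str(check_digit(first_eight))

rng = random.Random(5)  # fixed seed only so the sketch is repeatable
masked_id = random_israeli_id(rng)
print(masked_id, is_valid(masked_id))
```

Generating the check digit along with the random digits means the masked IDs still pass the application's own validation.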
Israeli ID Number - Format
Q & A