DBTA Data Summit: Eliminating the Data Constraint in Application Development
TRANSCRIPT
Eliminating the data constraint in Application Development
Kyle Hailey, Technical Evangelist at Delphix
Technology Disruption
“Software is eating the world.” - Marc Andreessen
Increasing Commoditization
Competitive Pressures
In this presentation:
• Problem: Data Constraint
• Solution: Virtual Data
• Use Cases: Development, Security, Cloud
The Phoenix Project
What is the constraint in IT?
Flow of Features
Product Management
Development
QA
Integration testing
Deployment
Testing
Customer
Flow of Features
[Diagram: the same flow from Product Management to Customer, annotated with numbered constraints. Labels: Development Environments, QA & Testing Environments, Product Management Features, Code Architecture, Code Speed, Data]
Development Pipeline for QA
[Diagram labels: SQL, Build, Deploy, Environment, Database]
Legacy Data Movement: Slow & expensive
PROD, DEV, Test, UAT
DBA
Sys Admin
Storage Admin
Slow environment builds: delays
Development Pipeline for QA
[Chart: physical data over a 24-hour timeline (hours, 0 to 24) with three reset-and-test cycles. Wait time for refresh: > 80%. Testing: < 20%.]
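The split on this chart can be checked with simple arithmetic. The sketch below is only an illustration: the 6.5-hour refresh and 1.5-hour test run are hypothetical durations chosen to fit three cycles into 24 hours; the slide itself states only the > 80% and < 20% shares.

```python
def time_split(refresh_hours, test_hours, cycles):
    """Return (refresh_share, test_share) of the total elapsed time."""
    total = cycles * (refresh_hours + test_hours)
    refresh_share = cycles * refresh_hours / total
    return refresh_share, 1.0 - refresh_share

# Hypothetical durations, not figures from the slide.
refresh_share, test_share = time_split(refresh_hours=6.5, test_hours=1.5, cycles=3)
print(f"refresh {refresh_share:.0%}, testing {test_share:.0%}")
# prints: refresh 81%, testing 19%
```

Shortening the refresh step is the only way to raise the testing share without lengthening the day, which is the argument the following slides build on.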
Data Management not Agile
• 20% SDLC time lost waiting for data
• 60% dev/QA time consumed by data tasks
Conclusion:
Data management does not scale to Agile
- Infosys
Data is the Constraint
Application Development Constraints
1. Not enough resources
2. Bad test data leading to bugs
3. Slow environment builds
1. Not Enough Resources: shared bottlenecks
Frustration Waiting
1. Not Enough Resources: bugs because of old data
Old Unrepresentative Data
1. Not enough resources: limited environments
2. Bad data leads to bugs: subsets
Production
2. Bad data leads to bugs: Production Wall
2. Bad data leads to bugs: late stage bugs
[Chart: number of bugs found by stage (Dev, QA, UAT, Production)]
2. Bad data leads to bugs: late stage bugs
[Chart: cost to correct by stage (Dev, Testing, UAT, Production). Source: Software Engineering Economics, Barry Boehm (1981)]
3. Slow environment builds: delays
Developer asks for DB
Get access
Manager approves
DBA: request system, set up DB
System Admin: request storage, set up machine
Storage Admin: allocate storage (take snapshot)
Companies unaware
Could I have a copy of the production DB?
Developer, Tester, or Analyst
Boss, Storage Admin, DBA
In this presentation:
• Data Constraint
• Solution
• Use Cases
Development UAT QA
99% of blocks are identical
Solution
Development QA UAT
Thin Clone
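A thin clone can be pictured as a copy-on-write view over one shared image: clones read shared blocks and store only the blocks they change. The sketch below is a toy in-memory model under that assumption, not Delphix's implementation; the class names `SourceSnapshot` and `ThinClone` are hypothetical.

```python
class SourceSnapshot:
    """Read-only, point-in-time image of the source, stored once."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> contents


class ThinClone:
    """Clone that shares snapshot blocks and stores only its own changes."""

    def __init__(self, snapshot):
        self.snapshot = snapshot
        self.changed = {}  # private copies of modified blocks

    def read(self, block_no):
        # Prefer the clone's private copy, else fall back to the shared block.
        if block_no in self.changed:
            return self.changed[block_no]
        return self.snapshot.blocks[block_no]

    def write(self, block_no, data):
        # Copy-on-write: the shared snapshot is never modified.
        self.changed[block_no] = data

    def reset(self):
        # Discard private changes to return to the snapshot's state.
        self.changed.clear()


snapshot = SourceSnapshot({n: f"prod-{n}" for n in range(1000)})
dev, qa, uat = (ThinClone(snapshot) for _ in range(3))

for n in range(10):  # dev changes 1% of its blocks
    dev.write(n, f"dev-{n}")

stored = len(snapshot.blocks) + sum(len(c.changed) for c in (dev, qa, uat))
print(stored)  # 1010 blocks stored, versus 3000 for three full copies
```

Because a reset only discards the private block map, returning a clone to its starting point costs almost nothing in this model.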
Three Technologies
Production
Development
Storage
Provision
Synchronize (copy)
Clone (snapshot)
Install Delphix on Intel hardware
• Data
• Binaries
• Application Stacks
• EBS
• SAP
• Flat files
Allocate Any Storage to Delphix
Any Storage
Pure Storage + Delphix
Better Performance for 1/10 the cost
© 2015 Delphix. All Rights Reserved. Private & Confidential.
One time backup of source database
Production
3 TB, 1 TB
Three Physical Copies
Three Virtual Copies
PROD DEV DEV Test Test UAT
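The storage difference between three physical and three virtual copies can be estimated with a short calculation. This is a rough sketch under stated assumptions: the 3 TB source size comes from the backup slide, reading the 1 TB figure as the single shared compressed copy is an interpretation, and the 1% change rate per copy is assumed from the "99% of blocks are identical" slide.

```python
SOURCE_TB = 3.0          # size of the production database (from the backup slide)
SHARED_COPY_TB = 1.0     # assumed: the 1 TB figure is the single shared, compressed copy
CHANGED_FRACTION = 0.01  # assumed: each virtual copy changes about 1% of its blocks


def physical_storage_tb(copies):
    """Every physical copy is a full-size duplicate of the source."""
    return copies * SOURCE_TB


def virtual_storage_tb(copies):
    """One shared copy plus the blocks each virtual copy has changed."""
    return SHARED_COPY_TB + copies * CHANGED_FRACTION * SOURCE_TB


print(physical_storage_tb(3))           # 9.0
print(round(virtual_storage_tb(3), 2))  # 1.09
```

The change rate is the sensitive input: the estimate grows linearly with it, so heavily modified copies narrow the gap.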
Data as a Service : fast, elastic, secure
Self Service
• Problem in the Industry
• Solution
• Use Cases
Use Cases
1. Development
2. Security
3. Cloud Migration
Development: Virtual Data
Development
Virtual Data: Parallelize
gif by Steve Karam
Virtual Data: Full size
Production
Virtual Data: Self Service
Environments: increase the limit
Physical Data: late stage bugs
[Chart: bugs discovered by stage (Dev, Testing, UAT, Production) with legacy physical data; vertical axis 0 to 500]
Physical Data: find bugs fast
[Chart: cost to correct by stage (Dev, Testing, UAT, Production)]
Virtual Data: Fast Refresh
[Chart: refresh-and-test cycles over a 24-hour timeline (hours, 0 to 24), virtual data compared with physical data]
Bookmark, Reset
• 99% Less Downtime
• Data Federation
• Version Control
• Bookmark and Branch
• Quickly Refresh
• Sync across data sources
Virtual Data: Version Control
Dev 2.1, Dev 2.2
Production Time Flow
Live Archive data for years
• Archive EBS R11 before upgrade to R12
• Sarbanes-Oxley
• Dodd-Frank
• Financial Stress tests
Production
Use Cases
1. Development & QA
2. Security
3. Cloud Migration
Traditional Protection: Network & Perimeter
Endpoints
Perimeter Defense
Protect the Interior
Encryption
Network Intrusion Detection
Endpoint Defense
“Organizations should use data Masking to protect sensitive data at rest and in transit from insiders' and outsiders' attacks.”
- Gartner Magic Quadrant for Data Masking Technology
Insider Threats Are Costly
Average Annualized Cyber Crime Cost Weighted by Attack Frequency
Attack type                      Cost
Botnets                          $1,075
Viruses, worms, trojans          $1,900
Malware                          $7,378
Stolen devices                   $33,565
Malicious code                   $81,500
Phishing & social engineering    $85,959
Web-based attacks                $96,424
Denial of services               $126,545
Malicious insiders               $144,542
Consolidated view, n = 252 separate companies
2015 Global Cost of Cyber Crime Study, Ponemon Institute
• Ease of Use
• Instant data, no copying
• Consistent across data centers and database vendors
Costs more
Quality is lower
Hard to mask consistently
Moving data from prod to non-prod takes a long time
Delphix Virtual Data Masking
• Automates discovery
• Provides different masking algorithms for different data types
• Mask once, clone many with thin cloning
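Consistent masking with a different algorithm per data type can be illustrated with keyed hashing: the same input always maps to the same masked value, so joins still line up across databases. This is a generic sketch using Python's standard `hmac` module, not Delphix's masking algorithms; `SECRET_KEY`, `MASKERS`, and the column names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"example-key"  # hypothetical; a real key would come from a secrets store


def _digest(value):
    """Keyed SHA-256 digest of the value, as a hex string."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email):
    """Replace the address with a stable pseudonym at a reserved example domain."""
    return f"user_{_digest(email)[:10]}@example.com"


def mask_digits(number):
    """Replace each digit with a keyed digit, keeping punctuation and length."""
    stream = _digest(number)
    out, i = [], 0
    for ch in number:
        if ch.isdigit():
            out.append(str(int(stream[i % len(stream)], 16) % 10))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


# One masking algorithm per data type (column names are illustrative).
MASKERS = {"email": mask_email, "ssn": mask_digits}


def mask_row(row):
    """Mask the columns that have a registered algorithm; pass others through."""
    return {col: MASKERS[col](val) if col in MASKERS else val
            for col, val in row.items()}


row = {"customer_id": 7, "email": "alice@corp.test", "ssn": "123-45-6789"}
masked = mask_row(row)
print(masked)
```

Masking the shared copy once and then provisioning thin clones from it means every downstream environment sees the same masked values without repeating the masking run.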
Mask Data
[Diagram labels: 6 hours; Clone 18 hours; Clone 15 min; Mask 4 hours]
Security problem
Production, Dev, QA, UAT, Reporting, Backup
Security management improvement
Production, Dev, QA, UAT, Reporting, Sandbox
Security Solution
Production, Dev, QA, UAT, Reporting, Sandbox
Use Cases
1. Development & QA
2. Security
3. Cloud Migration
Migration to Cloud
Three Clones = Moving 3 x the Source
Migration to Cloud with Delphix
Three Clones = Moving 1/3 of Source Size
Cloud Optimizations
ON PREMISE / PRIVATE CLOUD
Replication
Encrypted
Compressed
Masked
Summary
1. Development & QA: Dev throughput increase by 2x
2. Secure: Mask once, clone many
3. Cloud Enablement: Compressed, encrypted replication; active/active replication
Virtual Data Quotes
• Projects “12 months to 6 months.” - New York Life
• Insurance product “about 50 days ... to about 23 days” - Presbyterian Health
• “Can't imagine working without it” - State of California
Thank you!
• Kyle Hailey - Technical Evangelist (Oracle Ace Director, Oaktable)
– [email protected]
– kylehailey.com
– slideshare.net/khailey
– @virtdata