MLW 2014 - Data Governance for Regulated Industries
DESCRIPTION
Securely and cost-effectively managing petabytes of data from siloed systems is both a threat and an opportunity for banks, healthcare providers, and other organizations in highly regulated industries. Advances in technology and the changing economics of storage and compute have made it possible to use this data for more far-reaching and sophisticated analysis. At the same time, sweeping changes to privacy and transparency laws have heightened the importance of data governance. In this session we will examine best practices for using MarkLogic as part of a regulated data environment, including retention, provenance, privacy, and security. Drawing on production projects affected by Dodd-Frank, Basel III, and FATCA, we will illustrate architecture and governance policies across real-time operational and long-tail historical data.
TRANSCRIPT
© COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
Data Governance for Regulated Industries
Amir Halfon, CTO Financial Services
Jim Clark, Senior Director, Product Management
MarkLogic World, June 2014
Hello, my name is Jim Clark
Focused on Big Data for the last 8 years
Enterprise database and start-ups in NoSQL and Big Data
Focus on security, bitemporal, and Hadoop
Hello, my name is Amir Halfon
CTO for Financial Services at MarkLogic for 2 years
Previously at Oracle, Sun Microsystems
Many years in the financial services sector
Agenda
Data governance considerations
Legacy approaches: How we got here
Case studies: Solutions
Q&A
Data Governance Considerations
Security
Privacy
Provenance
Retention
Continuity
Compliance
Why is this difficult? And risky?
And expensive? And behind schedule?
Last Generation
OLTP
Warehouse
Data Marts
Archives
“Unstructured”
Video
Audio
Signals, Logs, Streams
Social
Documents, Messages
Metadata
Search
Reference Data
Can anything be done?
Enterprise NoSQL
Flexible data model, comprehensive indexes
o Documents: Hierarchy, text, values, tags (schema “on-read”)
o Scalars: Aggregates and range filters, including geospatial
o Triples: Linked facts and inferencing
o Permissions: Users, roles, compartments, and privileges
o Queries: Reverse indexes for alerting, matching
In-memory writes, lock-free reads
Ad hoc dimensions, real-time transformation
Strict consistency throughout
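The "reverse indexes for alerting, matching" item can be illustrated with a small sketch: queries are stored ahead of time, and each incoming document is checked against them. This is not MarkLogic's API. The class names, query names, and sample trade fields below are assumptions made for illustration, and the sketch scans the stored queries linearly, whereas a real reverse index would index the queries themselves.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StoredQuery:
    """A saved query that is evaluated against each incoming document."""
    name: str
    predicate: Callable[[dict], bool]


class ReverseQueryRegistry:
    """Toy alerting registry (hypothetical): reports which stored queries a document satisfies."""

    def __init__(self) -> None:
        self._queries: list[StoredQuery] = []

    def register(self, query: StoredQuery) -> None:
        self._queries.append(query)

    def matches(self, document: dict) -> list[str]:
        # Linear scan for clarity; a production reverse index avoids this.
        return [q.name for q in self._queries if q.predicate(document)]


alerts = ReverseQueryRegistry()
alerts.register(StoredQuery(
    "large-fx-trade",
    lambda d: d.get("instrument") == "FX" and d.get("notional", 0) > 1_000_000,
))
alerts.register(StoredQuery(
    "missing-counterparty",
    lambda d: not d.get("counterparty"),
))

# A hypothetical trade document arriving off a message bus.
trade = {"instrument": "FX", "notional": 5_000_000, "counterparty": ""}
matched = alerts.matches(trade)
print(matched)
```

The point of the design is that the document, not the query, is the moving part: new alerts are added by registering another stored query, without reprocessing data already loaded.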
Case Studies
Financial Services Case Studies
Records Retention and Investigations
Trade Operational Data Store
Regulatory Compliance
Customer On-Boarding
Case Study: Records Retention and Investigations
Accurately respond to litigation
Hold, review, produce data across current, legacy systems
Repatriate and reconcile distributed data
Demonstrate fidelity and audit trail
Reduce infrastructure and maintenance costs
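The "hold" and "audit trail" requirements above can be sketched as a purge check that a legal hold always overrides, with every decision recorded. This is a minimal illustration, not a description of the system in the case study. The record fields, the retention period, and the sample dates are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: str
    created: date
    retention_days: int       # hypothetical policy value
    legal_hold: bool = False  # set when the record is subject to litigation


def may_purge(record: Record, today: date) -> bool:
    """A record is purgeable only if its retention has expired and no hold applies."""
    if record.legal_hold:
        return False
    return today >= record.created + timedelta(days=record.retention_days)


def review(records: list[Record], today: date) -> list[tuple[str, str]]:
    """Return an audit trail of (record_id, decision) pairs."""
    return [
        (r.record_id, "purge" if may_purge(r, today) else "retain")
        for r in records
    ]


today = date(2014, 6, 1)
records = [
    Record("A", date(2005, 1, 1), retention_days=2555),
    Record("B", date(2005, 1, 1), retention_days=2555, legal_hold=True),
    Record("C", date(2013, 1, 1), retention_days=2555),
]
audit = review(records, today)
print(audit)
```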
Old Generation Records Retention and Investigations
Oracle
Mainframe
Sybase
87 total systems
New Generation Records Retention and Investigations
Oracle
MarkLogic
Mainframe
Sybase
87 total systems
Shared Storage
NAS
HDFS
Ingest
Offline
Query
Replication
MarkLogic
100TB
40TB
Case Study: Operational Trade Data Store
Comply with regulations requiring operational insights
Quickly operationalize business innovation
Support risk management requirements
Reduce costs per trade, trade processing exceptions, and infrastructure and maintenance costs
Old Generation Operations and Analytics
Multiple relational data stores for different instrument types (Derivatives, Rates, FX, …etc.)
ETL
Expensive, error-prone post-trade processing (Matching, Clearing, Settlement, …etc.)
Limited, fragmented analytics and reporting capabilities
Long, costly development cycles
New Generation Operations, Compliance and Analytics
Executed Trades
MarkLogic: Single ODS for all instrument types persisted as-is off a message bus
Single source of truth
Post Trade Processing: Matching, Clearing, Settlement, …etc.
Exceptions Management
Simplified workflow architecture
Surveillance, Risk & Compliance
Historical Analysis
HDFS: Tiered Storage using Hadoop
Case Study: On-Boarding Compliance
Thousands of rules, 1–2M accounts, 30–40M documents
Encoding, adjusting, and matching rules must scale
Impossible to pre-define dimensions, relationships
Vet new accounts and “show your work”
Real-time decision-making
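One way to read "vet new accounts and show your work" is that every rule evaluation is kept alongside the final decision, so the outcome can be explained later. The sketch below assumes that reading. The rule IDs, rule descriptions, account fields, and the restricted-country placeholder "XX" are hypothetical and not taken from the case study.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    passes: Callable[[dict], bool]


# Hypothetical on-boarding rules; a real deployment would hold thousands.
RULES = [
    Rule("KYC-1", "Identity document on file",
         lambda a: bool(a.get("id_document"))),
    Rule("KYC-2", "Country is not on the restricted list",
         lambda a: a.get("country") not in {"XX"}),
]


def vet_account(account: dict, rules: list[Rule] = RULES):
    """Return (approved, findings), where findings records every rule outcome."""
    findings = [(r.rule_id, r.description, r.passes(account)) for r in rules]
    approved = all(ok for _, _, ok in findings)
    return approved, findings


approved, findings = vet_account({"id_document": "passport.pdf", "country": "XX"})
print(approved)
for rule_id, description, ok in findings:
    print(rule_id, description, "PASS" if ok else "FAIL")
```

Keeping the full findings list, rather than stopping at the first failure, is what makes the decision auditable.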
Old Generation On-Boarding Compliance
Documents
Policies
Regulations
New Generation On-Boarding Compliance
Documents
MarkLogic
Onboarding Workflow
Policies
Regulations
Case Study: Dodd-Frank Compliance
Trace lineage of order lifecycle for OTC derivatives
Search, link supporting communications, documents
Strict reporting and retention rules, response times
Existing policies, point solutions don’t scale
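Tracing the lineage of an order lifecycle and linking supporting communications can be sketched as a walk over subject-predicate-object facts. This is a minimal illustration in plain Python, not MarkLogic's triple store or query language. The identifiers and the predicate names `derivedFrom` and `supports` are assumptions.

```python
from collections import defaultdict

# (subject, predicate, object) facts; all identifiers are hypothetical.
TRIPLES = [
    ("confirmation-9", "derivedFrom", "execution-7"),
    ("execution-7", "derivedFrom", "order-3"),
    ("order-3", "derivedFrom", "quote-1"),
    ("email-42", "supports", "order-3"),
]


def build_index(triples, predicate):
    """Map each subject to the objects it links to via the given predicate."""
    index = defaultdict(list)
    for s, p, o in triples:
        if p == predicate:
            index[s].append(o)
    return index


def trace_lineage(start, triples):
    """Follow derivedFrom links from start back to the earliest known event."""
    parents = build_index(triples, "derivedFrom")
    lineage, stack, seen = [], [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against cycles in the data
        seen.add(node)
        lineage.append(node)
        stack.extend(parents.get(node, []))
    return lineage


def supporting_items(node_ids, triples):
    """Find communications or documents linked to any event in the lineage."""
    wanted = set(node_ids)
    return sorted(s for s, p, o in triples if p == "supports" and o in wanted)


lineage = trace_lineage("confirmation-9", TRIPLES)
evidence = supporting_items(lineage, TRIPLES)
print(lineage)
print(evidence)
```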
Old Generation Regulatory Compliance
Reference Data
Reporting
Categorization
Linking
Trade Records
Operations
New Generation Regulatory Compliance
MarkLogic
Operations
Reporting
Surveillance
Ad hoc analysis
Categorization
Enrichment
Linking
Reference Data
Trade Records
Metadata
Enrichment and Linking
Management Dashboard
What now?
New Generation Data Governance
Security
Privacy
Provenance
Retention
Continuity
Compliance
Take-Aways
New and more data is both an opportunity and a threat
Last generation of data management is not sufficient
More copies, representations, transformations increase risk
ETL and up-front modeling reduce agility
Index once and reuse across workloads, lifecycle
MARKLOGIC
SECURE: Minimize duplication, costly ETL, reduce risk
REAL-TIME: Interactive search, delivery & analytics
OPERATIONAL: Run enterprise, mission-critical applications