©2017 Cambridge Semantics Inc. All rights reserved. Company Confidential. Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" of Data Management. March 2nd, 2017. Marty Loughlin, Vice President, Cambridge Semantics, 500 Boylston St., Suite 1700, Boston, MA. www.cambridgesemantics.com [email protected] (o) 617.855.9565

Upload: cambridge-semantics

Post on 11-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" of Data Management

©2017 Cambridge Semantics Inc. All rights reserved. Company Confidential

Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" of Data Management

March 2nd, 2017

Marty Loughlin, Vice President, Cambridge Semantics, 500 Boylston St., Suite 1700, Boston, MA. [email protected] (o) 617.855.9565

Page 2

Introduction to Cambridge Semantics (CSI)

Agenda

• Introduction: Marty Loughlin, Vice President, Cambridge Semantics

• Financial Industry Data Challenges & Solution Overview: Carl Reed, Adviser, Cambridge Semantics

• Regulatory Perspective & FIBO Update: Mike Atkin, Managing Director, Enterprise Data Management Council

• State Street - FIBO Interest Rate Swap Demo: Arthur Keen, Managing Director, Cambridge Semantics

• Q&A

Page 3

The Anzo Smart Data Lake

Smart Data Discovery, Analytics & Management

Company:
• Founded in 2007 by senior team from IBM’s Advanced Internet Technology Group
• Privately Funded
• Select customers:

Software:
• Market leading Anzo software suite is built on open Semantic Web standards
• 3rd generation of Anzo in production

Introduction to Cambridge Semantics (CSI)

MIT Innovation Showcase

Business Intelligence / Analytics Solutions

Page 4

Financial Industry Data Challenges & Solution Overview

Carl Reed, Adviser, Cambridge Semantics

Page 5

The World Most of Us Grew Up In

• Process Driven Architecture
• Vertically Aligned Implementations

Trading, Settl & Clear, Risk, Operations, Order Mgmt, Compliance, Treasury, Reg Reporting, Ref Data, ...

Enterprise Data Governance, Architecture & Execution

Regulatory: BCBS239, CCAR, MiFID II, CATS, .....

Big Data: Market, Client, Operational, Risk & Rep

Operating Margins

Cyber Security

Data Center Mgmt

Disruption

Tension

Carl Reed February 24th 2017

Can We Turn Tension and Disruption into Opportunity?

Page 6

Three Key Ingredients

Organization Structure

Technology Architecture

Common “Lingua Franca”

Enterprise Data

GOVERNS

SPECIFIES

IMPLEMENTS

Page 7

Data Engineering Data Science

Knowledge Engineering (Ontology)

Enterprise Data

External Data

Ontologies

Domain Expertise (Business SMEs)

Harmonized Data Expertise

Business Intelligence Requirements

New Intelligence

Scope

Semantic Mappings

Knowledge Graphs

Data Governance

Internal

External

1: Data Oriented Roles and Activities

C Suite Accountability, Responsibility, Authority


Page 8

2.1: A Semantically Driven Enterprise Data Architecture

• Business & Technology Governance
• Information Marts/Warehouses
• Source Meta Data; Concepts; Relationships; Domains
• Scale Out Compute; Semantic Enrichment; Semantic Transforms; Identity Resolution
• Scale Out Storage; Indexing; Integrated Data Sets; Raw Data Sets
• Data Engineering
• Business Intelligence & Data Analytics
• Client/Customer; Market; Operational; Risk/Reputational
• Ontology; Execution; Persistence
• Data Sourcing; Distribution; Refinement
• Data Sources: Structured; Unstructured; Visual; Physical; Communication
• Acquisition Modes
• Search; Source Registry; Business Glossary; Access Control; Lineage
• Relational; NoSQL; Graph; TSDB; Archive; BRM; Other

Page 9

Carl Reed January 25th 2017


2.2: That Can be Implemented and Executed at Scale

• FTP/CSV, Apache Kafka, Sqoop, Storm
• Cloudera
• Koverse
• Cambridge Semantics ANZO
• GQE
• RedOwl
• Digital Reasoning
• TopBraid
• Allegro

Page 10

2.3: That Can Accommodate the Existing as well as Execute the New

The New Big Data Ecosystem

Legacy Enterprise Data Problems

Incrementally solving legacy data problems using new Big Data technology & techniques

Add sources to data registry and distribute via hub supporting legacy client semantics for existing clients and enforcing enterprise semantics for new clients.

Migrate Over Time
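The registry-and-hub idea on this slide can be illustrated with a minimal Python sketch. It is not how Anzo or any product named in the deck works; the class names (`Source`, `Hub`), the client names, and every field name are hypothetical. The sketch assumes each registered source declares a mapping to enterprise terms, and that existing clients may register a legacy view while new clients receive enterprise terms only.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A registered source with a mapping from its field names to enterprise terms."""
    name: str
    to_enterprise: dict[str, str]


@dataclass
class Hub:
    """Serves enterprise terms by default, or legacy names for registered legacy clients."""
    registry: dict[str, Source] = field(default_factory=dict)
    legacy_views: dict[str, dict[str, str]] = field(default_factory=dict)

    def register_source(self, source: Source) -> None:
        self.registry[source.name] = source

    def register_legacy_client(self, client: str, enterprise_to_legacy: dict[str, str]) -> None:
        self.legacy_views[client] = enterprise_to_legacy

    def harmonize(self, source_name: str, record: dict) -> dict:
        """Rename source fields to enterprise terms; refuse fields with no mapping."""
        mapping = self.registry[source_name].to_enterprise
        unmapped = set(record) - set(mapping)
        if unmapped:
            raise KeyError(f"unmapped fields from {source_name}: {sorted(unmapped)}")
        return {mapping[key]: value for key, value in record.items()}

    def distribute(self, client: str, source_name: str, record: dict) -> dict:
        enterprise = self.harmonize(source_name, record)
        view = self.legacy_views.get(client)
        if view is None:
            return enterprise  # new clients get enterprise semantics only
        # Existing clients keep their legacy names where a translation is registered.
        return {view.get(key, key): value for key, value in enterprise.items()}


# Hypothetical source, clients, and record, for illustration only.
hub = Hub()
hub.register_source(Source("trading_system", {"cpty": "counterparty", "ntl": "notionalAmount"}))
hub.register_legacy_client("risk_report_v1", {"counterparty": "CPTY_NAME"})

raw = {"cpty": "Bank A", "ntl": 1_000_000}
new_view = hub.distribute("new_dashboard", "trading_system", raw)
legacy_view = hub.distribute("risk_report_v1", "trading_system", raw)
```

Under these assumptions, "Migrate Over Time" amounts to removing entries from `legacy_views` as clients move to the enterprise vocabulary.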

Page 11

Regulatory Perspective & FIBO Update

Mike Atkin, Managing Director, Enterprise Data Management Council

Page 12

Data Management in Perspective

Beachhead for Data Management Established

Data Management Implementation Based on Best Practice

Unified View of Data Meaning (primary data objective)

Consistent Measurement of Data Management Progress

Data Management Operational Playbook

Inference Processing for Analytical Adaptability

Page 13

Why Harmonized (common language) Data Matters

Page 14

Why Harmonized (common language) Data Matters

• Degree of interconnectedness
• Transitive relationship
• State contingent cash flow
• Collateral flow
• Degree of centricity
• Funding durability
• Leverage & liquidity
• Guarantee & transmission of risk
• Degree of diversification

Instruments
• Identification
• Classification
• Description (rates, dates, features, schemes, provisions)
• Value (i.e. price, date, time)
• Calculate (volatility, correlation, duration, tax)
• Maintain (corporate actions)

Entities
• Entity type (legal persons, formal organizations, corporations, partnerships, affiliates, trusts, functional, etc.)
• Ownership structures
• Controlling relationships

Obligations
• Issuance process
• Trade and execution
• Guarantee
• Allocate and administer
• Clear and settle
• Transfer

Holdings
• Firm portfolio (individual entity risk)
• Corporate structure (organizational risk)
• Industry wide (systemic risk)

Page 15

BCBS 239 in Context

2008 Crisis: Inability to model contagion (who finances whom, who is linked to whom, what are the obligations of complex financial instruments)

Senior Banking Supervisors Group: Observations on Developments in Risk Appetite Frameworks and IT Infrastructure (intractable relationship between data and risk management and definition of control environment)

BCBS 239: Principles of Risk Data Aggregation and Reporting (governance, content infrastructure and data quality as mandatory objectives)
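The contagion question on this slide (who finances whom, who is linked to whom) is a transitive-reachability question over a graph of relationships. The following Python sketch shows one way to answer it once the links are held in a common form. The institutions and funding links are invented for illustration, and the function name `exposed_to` is an assumption, not something from the presentation.

```python
from collections import defaultdict, deque

# Hypothetical funding links as (lender, borrower) pairs.
FUNDING = [
    ("Bank A", "Fund X"),
    ("Bank B", "Bank A"),
    ("Insurer C", "Bank B"),
    ("Bank D", "Fund Y"),
]


def exposed_to(failed, funding):
    """Return every institution that directly or indirectly finances `failed`."""
    lenders_of = defaultdict(set)
    for lender, borrower in funding:
        lenders_of[borrower].add(lender)

    seen, queue = set(), deque([failed])
    while queue:  # breadth-first walk up the funding chain
        node = queue.popleft()
        for lender in lenders_of[node]:
            if lender not in seen:
                seen.add(lender)
                queue.append(lender)
    seen.discard(failed)  # a funding cycle should not list the failed firm itself
    return seen


chain = exposed_to("Fund X", FUNDING)
```

In this toy data, a failure of "Fund X" reaches "Bank A" directly and "Bank B" and "Insurer C" through the chain, while "Bank D" is unaffected. The walk only works if every source identifies the same institution the same way, which is the harmonization point the slides make.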

Page 16

EDMC Regulatory Areas

Regulatory Actions

Fundamental Review of the Trading Book (FRTB)

Dodd-Frank: Title I (systemic risk) and Title VII (derivatives)

European Market Infrastructure Regulation (EMIR)

BCBS 239: Principles of Risk Data Aggregation & Reporting

Comprehensive Capital Analysis and Review (CCAR) and Basel III

General Data Protection Regulation (GDPR)

Investment Book of Records (IBOR)

Bank Integrated Reporting Dictionary (BIRD)

Financial Data Standardization Project (EC)

Regulatory Fitness and Performance Program (REFIT)

Common Data Template for Systemically Important Banks (FSB)

Data Gaps Initiative (FSB), Common Reporting (COREP) Template and Inventory of Data Reporting Requirements (DRR)

Markets in Financial Instruments Directive (MiFID II)

Capital Requirements Regulation & Directive (CRR/CRD IV)

Alternative Investment Fund Managers Directive (AIFMD)

Directive on Undertakings for Collective Investments in Transferable Securities (UCITS)

Solvency II (EIOPA)

Regulatory Agencies
• Office of the Comptroller of the Currency (OCC)
• Federal Reserve Board (FRB)
• Federal Deposit Insurance Corporation (FDIC)
• Securities and Exchange Commission (SEC)
• Commodity Futures Trading Commission (CFTC)
• CPMI-IOSCO Harmonization Group
• House Financial Services Committee (Financial CHOICE Act)
• Senate Banking Committee consolidated audit
• Financial Stability Oversight Council (FSOC) and Office of Financial Research (OFR)
• Consumer Financial Protection Bureau (CFPB)
• White House: National Economic Council (NEC)
• White House: Office of Science and Technology Policy (OSTP)
• National Institute of Standards and Technology (NIST)
• European Central Bank (ECB)
• Financial Stability Board (FSB)
• Basel Committee on Banking Supervision
• European System of Financial Supervision (ESFS)
• European Banking Authority (EBA)
• European Securities and Markets Authority (ESMA)
• European Commission (EC): Directorate General for Financial Stability, Financial Services and Capital Markets Union (DG FISMA)
• European Reporting Framework (ERF)
• European Systemic Risk Board (ESRB)
• European Insurance and Occupational Pensions Authority (EIOPA)
• Single Resolution Board (SRB)

Page 17

Data Management Principles

Page 18

Principles of Data Management

Content Infrastructure, Data Quality, Governance, Integration

1. Executive Air Cover with Visible Support
2. Line of Business Alignment with Commitment
3. Enterprise Wide Ontology stored as Metadata
4. Reverse Engineering of Business Processes
5. Authority via Mandatory Policy
6. Resources for Sustainability

Organizational Goals, Data Content Goals, Operational Goals

STRATEGY
• Data Strategy
• Cultural Alignment
• Stakeholder Commitment

FORMALITY
• CDO/ODM
• Policy Compliance
• RACI (accountability)

INFRASTRUCTURE
• Data Domains and Mapping
• Identifiers and X-reference
• Conceptual Model/Unified View of Meaning
• Business Definitions
• Physical Data Models
• Metadata Repository

DQ/CONTROL
• Reverse Engineering
• Data Lifecycle
• Business Requirements to Data Requirements
• Fit-for-Purpose Quality

COLLABORATION
• Coordinate with IT
• Align with Control Functions
• Data Flow Forensics
• Technical Integration

GOVERNANCE
• Funding
• Roadmaps and Project Plans
• Metrics and Reporting
• Communication
• Education and Training

Page 19

Financial Industry Business Ontology (FIBO)

FIBO is a business conceptual model that precisely describes financial instruments, pricing, legal entities and financial processes (what they are and how they work)

FIBO facilitates data harmonization across disparate repositories based on legal meaning and contractual obligation

FIBO provides structural validation to ensure completeness, consistency and allowable values

FIBO feeds analytical processes with trusted data and powers smart contracts

FIBO is expressed in the W3C standard (RDF/OWL) for flexible and scenario-based/inference analysis

FIBO is built on state-of-the-art collaboration technology and supported by documented and tested governance
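To make the RDF/OWL and inference point concrete, here is a small Python sketch that stores statements as subject-predicate-object triples and derives class membership through a transitive `rdfs:subClassOf` chain. `rdf:type` and `rdfs:subClassOf` are standard RDF/RDFS terms; every `ex:` name is an illustrative placeholder and not an actual FIBO identifier. A real deployment would use an RDF store and reasoner rather than hand-written loops.

```python
# Statements as (subject, predicate, object) triples.
# All ex: names are illustrative placeholders, not actual FIBO IRIs.
TRIPLES = {
    ("ex:InterestRateSwap", "rdfs:subClassOf", "ex:Swap"),
    ("ex:Swap", "rdfs:subClassOf", "ex:DerivativeInstrument"),
    ("ex:DerivativeInstrument", "rdfs:subClassOf", "ex:FinancialInstrument"),
    ("ex:CorporateBond", "rdfs:subClassOf", "ex:FinancialInstrument"),
    ("ex:trade-001", "rdf:type", "ex:InterestRateSwap"),
    ("ex:trade-002", "rdf:type", "ex:CorporateBond"),
}


def superclasses(cls, triples):
    """All classes reachable from `cls` by following rdfs:subClassOf transitively."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "rdfs:subClassOf" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def instances_of(cls, triples):
    """Individuals typed as `cls` directly or through any subclass of it."""
    result = set()
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and (obj == cls or cls in superclasses(obj, triples)):
            result.add(subject)
    return result


derivatives = instances_of("ex:DerivativeInstrument", TRIPLES)
instruments = instances_of("ex:FinancialInstrument", TRIPLES)
```

Neither trade is stated to be a derivative, yet `ex:trade-001` is returned for `ex:DerivativeInstrument` because the class hierarchy implies it. That is the kind of inference a shared ontology makes available without changing the source data.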

Page 20

Infrastructure for linking users into the “Build, Test, Deploy, Maintain” process is fully operational

(generate diagrams from OWL and incorporate changes from diagrams to OWL)

FIBO – Collaboration Process is OPERATIONAL

Page 21

Unified repository linking all FIBO domain ontologies has been delivered

(published on spec.edmcouncil.org/fibo)

Automated testing and generation of machine executable FIBO

FIBO Master and FIBO Release are OPERATIONAL

Page 22

FIBO Model Validation Pathway

Tools are now in place to expedite SME verification of domain models

Foundational Elements (core components needed to express financial concepts)
• FIBO-Foundations
• Business Entities
• Financial/Business Concepts
• Indices/Indicators

FIBO Content Teams (organized and validated)
• Equities
• Corporate Bonds
• Interest Rate Swaps
• Loan Concepts

Model Validation (member SME activity ready for rollout and implementation)
• Derivatives
• Debt (beyond corporate bonds)
• Mortgages
• Funds
• Rights/Warrants
• Pricing
• Financial Processes (corporate actions, issuance, securitization)

DELIVERED

Organized and Regular Meetings

Operational Rollout 2017

Continual Enhancement

Page 23

Regulation W (business rules) – Completed

State Street (unified meaning and classification) – Completed

CFTC (navigation across multiple counterparties) – 2Q17

25 Member Use Cases (EDW Conference) – April 2017

FIBO Training & Certification – Planned 2018

FIBO Applications Event – Planned 2018

FIBO Pilots and POCs to Demonstrate Potential

Page 24

FIBO Contributors

Page 25

State Street - FIBO Interest Rate Swap DemoArthur Keen, Managing Director, Cambridge Semantics

Page 26

Business Objectives

• Purpose: Demonstrate Real World Capability
  - The practicality of using FIBO to harmonize diverse derivative and entity data
  - The usefulness of FIBO for comprehensive reporting and analytics, both traditional and innovative

• PoC approach: Apply FIBO to operational “In the wild” data
  - Implement using a state-of-the-art semantics platform
  - Rapid implementation, no coding required

• Project Participants:
  - State Street: Business requirements and operational data
  - EDM Council: FIBO model and recommended reports/analytics
  - Cambridge Semantics: Operational platform and implementation services
  - Dun & Bradstreet: Business Entity and Corporate Hierarchy data
  - Wells Fargo: FIBO consultation

Page 27

State Street Bank/D&B/EDM Council FIBO PoC Solution Architecture

Internal Data Sources: FrontArena Data (Derivatives Data)

External Data Sources: Dun & Bradstreet Data (Entity & Corp. Hierarchy Data)

Map & Load (QA)

Link & Query (Classification, analytics)

Reports & Analytics
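The map, link, and query flow in this architecture can be sketched in a few lines of Python. The PoC itself was built on a semantics platform with no coding, so this is only an illustration of the idea. All field names, predicates (`hasCounterparty`, `hasNotional`, `hasParent`, `hasName`), identifiers, and values are invented, and do not reflect FrontArena, Dun & Bradstreet, or FIBO formats.

```python
# Hypothetical extracts standing in for derivatives data and corporate-hierarchy data.
derivative_trades = [
    {"trade_id": "T1", "cpty_id": "E2", "notional": 10_000_000},
    {"trade_id": "T2", "cpty_id": "E3", "notional": 5_000_000},
    {"trade_id": "T3", "cpty_id": "E4", "notional": 2_000_000},
]
entity_hierarchy = [
    {"entity_id": "E1", "name": "Parent Holdings", "parent_id": None},
    {"entity_id": "E2", "name": "Sub One", "parent_id": "E1"},
    {"entity_id": "E3", "name": "Sub Two", "parent_id": "E2"},
    {"entity_id": "E4", "name": "Standalone Co", "parent_id": None},
]


def map_and_load(trades, entities):
    """Map both sources into one set of triples that share entity identifiers."""
    triples = set()
    for trade in trades:
        triples.add((trade["trade_id"], "hasCounterparty", trade["cpty_id"]))
        triples.add((trade["trade_id"], "hasNotional", trade["notional"]))
    for entity in entities:
        triples.add((entity["entity_id"], "hasName", entity["name"]))
        if entity["parent_id"] is not None:
            triples.add((entity["entity_id"], "hasParent", entity["parent_id"]))
    return triples


def ultimate_parent(entity, triples):
    """Follow hasParent links to the top of the hierarchy (guarding against cycles)."""
    parent_of = {s: o for s, p, o in triples if p == "hasParent"}
    seen = set()
    while entity in parent_of and entity not in seen:
        seen.add(entity)
        entity = parent_of[entity]
    return entity


def notional_by_ultimate_parent(triples):
    """Link trades to the hierarchy and total notional per ultimate parent."""
    counterparty = {s: o for s, p, o in triples if p == "hasCounterparty"}
    notional = {s: o for s, p, o in triples if p == "hasNotional"}
    totals = {}
    for trade, entity in counterparty.items():
        top = ultimate_parent(entity, triples)
        totals[top] = totals.get(top, 0) + notional[trade]
    return totals


graph = map_and_load(derivative_trades, entity_hierarchy)
report = notional_by_ultimate_parent(graph)
```

Here the two subsidiaries roll up to "E1", so their trades are reported together, while "E4" stands alone. The report depends on the internal and external data sharing one entity identifier after mapping.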

© 2016 State Street Corporation. All rights reserved. Information Classification: Limited Access