denodo data virtualization platform architecture: data discovery and data governance (session 4 from...

Five In-depth Technology and Architecture Sessions on Data Virtualization Session 4: Data Discovery & Governance

Upload: denodo

Post on 29-Jul-2015


Page 1: Denodo Data Virtualization Platform architecture: Data Discovery and Data Governance (session 4 from Architect to Architect webinar series)

Five In-depth Technology and Architecture Sessions on Data Virtualization

Session 4: Data Discovery & Governance

Page 2

Today’s Speaker

Anastacio Molano

Head of Solutions and Business Development

Page 3

Architect-to-Architect Series

■ Series of five webinars over 3 months

■ Deeper look into Denodo Platform

■ Architectural Overview

■ Performance

■ Scalability

■ Data Discovery and Governance (today’s session)

■ Security

Page 4

Denodo Express

■ Denodo Express

■ Free to Download

■ Fully functioning Data Virtualization Platform

■ Single user, supports common data sources

■ Many of the same capabilities as the Denodo Platform

■ Performance, Data Discovery, Governance, internal Security, Publishing, …

Page 5

Data Discovery & Governance – Architecture Modules

Page 6

Data Discovery and Governance

■ Metadata Repository

■ Embedded Apache Derby database

■ Contains view and data source configuration data

■ Relationships between views – dependencies, etc.

■ Searchable – Catalog searches, etc.

■ Data Lineage

■ Trace how data changes between source and consumer

■ Change Impact Analysis

■ What is the impact of a change in a data source schema?

Page 7

Data Discovery and Governance

■ RESTful Web Services

■ Supports Global Search functionality

■ Index and search data sources

■ ‘Google’-like search

■ Linked Data Services

Page 8

Metadata Introspection

■ Denodo Platform gathers metadata from data sources

■ Automatically or via configuration

■ Maps native data types to ‘Denodo types’

■ Inspects indexes in the sources

■ Analyzes source query capabilities and abstracts them into common model

■ Stores all metadata and configuration data in Metadata repository

■ Uses built-in Apache Derby database

■ Small size: only stores metadata; actual data is retrieved in real time from sources or cache
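To make the type-mapping step concrete, here is a minimal Python sketch of how native source types might be normalized into a common type model during introspection. The mapping table, the type names on both sides, and the `introspect` function are hypothetical illustrations, not Denodo's actual type catalog or API.

```python
# Hypothetical mapping from native source types to a common type model.
# The type names on both sides are illustrative, not Denodo's actual catalog.
NATIVE_TO_COMMON = {
    "VARCHAR": "text",
    "CHAR": "text",
    "INTEGER": "int",
    "BIGINT": "long",
    "DECIMAL": "decimal",
    "TIMESTAMP": "date",
}


def introspect(source_columns):
    """Map (column_name, native_type) pairs to a common-model schema."""
    schema = {}
    for name, native_type in source_columns:
        # Strip length/precision, e.g. "varchar(50)" -> "VARCHAR"
        base = native_type.split("(")[0].strip().upper()
        # Assumption: unknown native types fall back to text
        schema[name] = NATIVE_TO_COMMON.get(base, "text")
    return schema


# Only the schema (metadata) is stored; no rows are copied from the source.
address_schema = introspect(
    [("id", "INTEGER"), ("street", "varchar(50)"), ("created", "TIMESTAMP")]
)
```

Only the resulting schema needs to be persisted, which is why a metadata repository of this kind can stay small.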

Page 9

Metadata Catalog

■ Two ways to inspect catalog

■ Graphically in Denodo Admin Tool

■ Search and browse contents of Metadata Repository

■ Filter by element type, name, date created, etc.

■ Drill down to view schema for individual elements

■ Programmatically

■ SQL Query using ‘list’ and ‘desc’ commands

■ e.g. ‘list views’ or ‘desc view address’

■ Stored procedures for complex catalog queries
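The programmatic route could be scripted roughly as follows. This is a sketch only: it assumes the ‘list’ and ‘desc’ commands quoted above can be sent through a generic Python DB-API style cursor, which the slides do not confirm. `StubCursor` and its canned rows are hypothetical stand-ins so the example runs without a server.

```python
def describe_all_views(cursor, view_names):
    """Run 'desc view <name>' for each view and collect the returned rows."""
    results = {}
    for name in view_names:
        cursor.execute(f"desc view {name}")
        results[name] = cursor.fetchall()
    return results


class StubCursor:
    """Hypothetical stand-in for a real cursor: records commands, returns canned rows."""

    def __init__(self):
        self.executed = []

    def execute(self, command):
        self.executed.append(command)

    def fetchall(self):
        # Canned, made-up schema rows for illustration only
        return [("street", "text"), ("city", "text")]


cursor = StubCursor()
cursor.execute("list views")  # catalog listing command from the slide
schemas = describe_all_views(cursor, ["address"])
```

With a real connection, the same function would take the actual cursor object in place of the stub.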

Page 10

Data Lineage

■ Graphical view for showing data lineage for any field in any virtual view

■ Trace source of any field

■ Includes any functions applied to field contents

■ Trace source of calculated fields

■ View calculations used to create new fields
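The idea behind field-level lineage can be sketched as a walk over dependency metadata from a derived field back to its sources. The `LINEAGE` table, the view and field names, and the `CONCAT` expression below are invented for illustration; the platform presents this graphically and its internal representation is not described in the slides.

```python
# Hypothetical lineage metadata:
# (view, field) -> (expression or None, list of upstream (view, field) pairs)
LINEAGE = {
    ("customer_summary", "full_name"): (
        "CONCAT(first_name, ' ', last_name)",
        [("customer", "first_name"), ("customer", "last_name")],
    ),
    ("customer", "first_name"): (None, [("crm_db.customers", "fname")]),
    ("customer", "last_name"): (None, [("crm_db.customers", "lname")]),
}


def trace_lineage(view, field, depth=0):
    """Return indented lines describing where a field comes from."""
    expression, upstream = LINEAGE.get((view, field), (None, []))
    label = f"{view}.{field}"
    if expression:
        # Show the function or calculation applied to produce this field
        label += f" = {expression}"
    lines = ["  " * depth + label]
    for up_view, up_field in upstream:
        lines.extend(trace_lineage(up_view, up_field, depth + 1))
    return lines


lineage_lines = trace_lineage("customer_summary", "full_name")
print("\n".join(lineage_lines))
```

Each level of indentation is one step closer to the physical source, and calculated fields carry the expression that produced them.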

Page 11

Used By Tool

■ Graphical view for showing where a view is used

■ ‘Big picture’ view of usage

■ Useful tool for seeing impact of changes on whole system

Page 12

Change Impact Analysis

■ Denodo Platform can perform impact analysis to show impact of data source changes

■ Highlights changes to data source

■ Shows other views impacted by the change

■ i.e. derived views using the changed base view

■ Select which views you want to propagate changes to…
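Conceptually, finding the views impacted by a change is a downstream traversal of the view dependency graph. The following Python sketch assumes a hypothetical `BUILT_ON` table with invented view names; it shows the general technique (breadth-first search over inverted dependencies), not the platform's own implementation.

```python
from collections import deque

# Hypothetical dependency metadata: view -> views it is built directly on
BUILT_ON = {
    "customer": ["bv_customers"],
    "order": ["bv_orders"],
    "customer_orders": ["customer", "order"],
    "sales_report": ["customer_orders"],
}


def impacted_views(changed_view):
    """Return all views that depend, directly or indirectly, on changed_view."""
    # Invert the graph: view -> views that use it
    used_by = {}
    for view, sources in BUILT_ON.items():
        for source in sources:
            used_by.setdefault(source, []).append(view)

    impacted = []
    seen = {changed_view}
    queue = deque([changed_view])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                impacted.append(dependent)
                queue.append(dependent)
    return impacted


affected = impacted_views("bv_customers")
```

The returned list is the candidate set from which an architect would choose the views that should receive the propagated change.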

Page 13

Data Discovery – Global Search

■ Denodo Platform also supports data discovery for non-Admin Tool users

■ Not all users have access to Admin Tool

■ Browser-based ‘global search’ provides simple search mechanism

■ Keyword-based searches for intuitive discovery process

■ Search metadata and data to find what you want

■ Browse schema or actual data

■ Traverse relationships between entities

Page 14

Data Discovery – Under the Covers

■ Global Search uses Denodo Aracne’s indexer to index data source contents

■ Based on Lucene indexer

■ Indexes stored in Denodo Platform

■ Searches run against indexes

■ Faster retrieval of ‘hits’ without overhead of full scans on sources

■ Indexing can be scheduled

■ Denodo Scheduler runs indexing jobs

■ e.g. overnight when minimal impact
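Why searching an index avoids full scans on the sources can be illustrated with a tiny inverted index. This plain-Python sketch is a stand-in for the idea only; it is neither Lucene nor the Aracne indexer, and the sample rows and identifiers are invented.

```python
import re
from collections import defaultdict


def build_index(rows):
    """Build an inverted index: keyword -> set of row ids containing it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(row_id)
    return index


def search(index, query):
    """Return ids of rows containing every keyword in the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


# Invented sample content; in practice the index would be built by a scheduled job.
rows = {
    "customer:1": "Acme Corp, 12 Main Street, Boston",
    "customer:2": "Globex, 40 Main Street, Chicago",
    "order:7": "Acme Corp order for 40 widgets",
}
index = build_index(rows)
```

Building the index is the expensive step, which is why it suits an off-peak scheduled job; each search afterwards is a handful of dictionary lookups and set intersections and never touches the sources.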

Page 15

Data Discovery & Governance - Summary

■ Data discovery and governance are pervasive in Denodo Platform

■ Users can inspect catalog of virtualization objects through catalog search to find data combinations for reuse

■ Data lineage helps users to understand where data has come from and how it has changed from the source

■ Impact analysis helps architects understand the consequences of changes in the data source schemas

■ Propagate changes selectively with a single click

■ Global Search gives a full view of the Data Virtualization project

■ Both data and metadata at the same time

■ Start from whole data sets, then drill down to individual data rows

■ Point and click to traverse associations between entities and to access related data

Page 16

Q&A

Page 17

Data Virtualization – Next Steps

Move forward at your own pace

Download Denodo Express: the fastest way to Data Virtualization

Denodo Community: Documents, Videos, Tutorials, and more.

Attend Architect-to-Architect Series

■ Performance

■ Scalability

■ Data Discovery and Governance

■ Security

Move forward with one of our Data Virtualization experts

Phone: (+1) 877-556-2531 (NA)

Phone: (+44) (0)20 7869 8053 (EMEA)

Email: [email protected] | www.denodo.com

Page 18

Five In-depth Technology and Architecture Sessions on Data Virtualization

Thank You!

Next Session: Session 5

Denodo Platform: Security