Data Virtualization, Data Federation & IaaS with JBoss Teiid

DATA VIRTUALIZATION & INFORMATION AS A SERVICE (IAAS) By Anil Allewar, Senior Solutions Architect - Synerzip

Upload: anilallewar

Post on 18-Nov-2014


DESCRIPTION

Enterprises have always grappled with the problem of information silos that needed to be merged using multiple data warehouses (DWs) and business intelligence (BI) tools so that they could mine this disparate data for business decisions and strategy. Traditionally, this data integration was done with ETL, by consolidating multiple DBMSs into a single data storage facility. Data virtualization enables abstraction, transformation, federation, and delivery of data taken from a variety of heterogeneous data sources as if it were a single virtual data source, without the need to physically copy the data for integration. It allows consuming applications or users to access data from these various sources via a request to a single access point and delivers information as a service (IaaS). In this presentation, we will explore what data virtualization is and how it differs from the traditional data integration architecture. We’ll also validate the data virtualization and federation concepts by working through an example (see videos at the GitHub repo) that federates data across two heterogeneous data sources, MySQL and MongoDB, using the JBoss Teiid data virtualization platform.

TRANSCRIPT

Page 1: Data virtualization, Data Federation & IaaS with Jboss Teiid

DATA VIRTUALIZATION & INFORMATION AS A SERVICE (IAAS)

By Anil Allewar

Senior Solutions Architect - Synerzip

Page 2: Data virtualization, Data Federation & IaaS with Jboss Teiid

About Me

Anil Allewar

Senior Solutions Architect @ Synerzip

Technology Evangelist & speaker

Core interests: JEE, EAI, EII

Page 3: Data virtualization, Data Federation & IaaS with Jboss Teiid

Agenda

• Use cases

• What does it mean?

• Architecture explained

• Implementation Frameworks

• Demo

• Questions?

Page 4: Data virtualization, Data Federation & IaaS with Jboss Teiid

Why does it make sense?

Page 5: Data virtualization, Data Federation & IaaS with Jboss Teiid

Use Cases

[Diagram labels: Data Warehouse, Data Mart, ETL (x3), Financial Data, OLTP Data, 3rd Party Data, Web Service 1, Web Service 2, Legacy Data, Custom Program, Excel files]

Page 6: Data virtualization, Data Federation & IaaS with Jboss Teiid

Traditional Data Integration

[Diagram labels: Source System (x2), ETL (x2), Enterprise Information System, Business Applications]

Page 7: Data virtualization, Data Federation & IaaS with Jboss Teiid

Problems with ETL

• More than 1 copy of data for staging

• Intermediate data => Errors

• Lead time to add new source

• Domain knowledge for mapping

• Batch process => No real-time data

Page 8: Data virtualization, Data Federation & IaaS with Jboss Teiid

Problems with DBMS consolidation

• Alternate approach => Single EIS (say RDBMS)

• Extensive changes to existing apps

• Might not satisfy everyone’s requirements

Page 10: Data virtualization, Data Federation & IaaS with Jboss Teiid

Data Virtualization & Federation

• Single API to access data

• Only metadata stored at the virtualization layer

• Real-time access without copying/moving data

• Federate data across heterogeneous/homogeneous sources
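
The idea of federating at request time can be illustrated with a small Python sketch. It is not Teiid code: an in-memory SQLite table stands in for a relational source, a list of dictionaries stands in for a document store such as MongoDB, and every name used here (customers, order_documents, federated_order_totals) is hypothetical.

```python
import sqlite3

# Relational source (stand-in for an RDBMS): an in-memory SQLite table.
relational = sqlite3.connect(":memory:")
relational.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
relational.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi")],
)

# Document source (stand-in for a document store): a list of dict "documents".
order_documents = [
    {"customer_id": 1, "item": "laptop", "amount": 900},
    {"customer_id": 1, "item": "mouse", "amount": 20},
    {"customer_id": 2, "item": "phone", "amount": 500},
]


def federated_order_totals():
    """Single access point: combine both sources when the request arrives.

    Nothing is copied into a staging store, so each call reflects the
    current contents of both sources.
    """
    totals = {}
    for doc in order_documents:
        key = doc["customer_id"]
        totals[key] = totals.get(key, 0) + doc["amount"]
    rows = relational.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return [(name, totals.get(customer_id, 0)) for customer_id, name in rows]


print(federated_order_totals())  # [('Asha', 920), ('Ravi', 500)]
```

Because the join happens on every call, a document added to order_documents shows up in the next result without any batch reload, which is the contrast with the ETL approach described earlier.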

Page 11: Data virtualization, Data Federation & IaaS with Jboss Teiid

Data Virtualization

Page 13: Data virtualization, Data Federation & IaaS with Jboss Teiid

Architecture

[Diagram labels: User Application, Common Access API, Runtime & Query Engine, Virtual Database, Translator 1, Translator 2, Connector 1, Connector 2]

Page 16: Data virtualization, Data Federation & IaaS with Jboss Teiid

Selected Platform – JBoss Teiid

• Open source

• Supports a number of relational/NoSQL/ERP/CRM data stores

• JEE standards

• Add custom EIS support using JEE components

• Active & responsive community

• Synerzip contribution: defect discovery, root cause analysis, feature verification

Page 17: Data virtualization, Data Federation & IaaS with Jboss Teiid

Teiid Components

• Virtual Database: container for components used to integrate data from multiple data sources

  - Source Models: structure and characteristics of physical data sources

  - View Models: structure and characteristics of abstract structures you want to expose to your applications

• Teiid Designer: Eclipse-based UI to dynamically discover data source objects and apply data federation

  - Generate a virtual database from 1 or more sources
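
The relationship between source models and a view model can be sketched in Python with SQLite. This is only an analogy, not how a Teiid VDB is built: two attached in-memory databases play the role of physical sources, and a temporary view plays the role of a view model. The names sales_src, crm_src, and customer_spend are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two attached in-memory databases stand in for two physical data sources.
conn.execute("ATTACH DATABASE ':memory:' AS sales_src")
conn.execute("ATTACH DATABASE ':memory:' AS crm_src")

# "Source models": the structure of the data as it exists in each source.
conn.execute("CREATE TABLE sales_src.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm_src.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales_src.orders VALUES (?, ?)",
    [(1, 50.0), (1, 25.0), (2, 10.0)],
)
conn.executemany(
    "INSERT INTO crm_src.customers VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi")],
)

# "View model": an abstract structure exposed to applications. Only the
# definition is stored; rows are computed from the sources at query time.
# (SQLite needs a TEMP view here because the view spans attached databases.)
conn.execute(
    """
    CREATE TEMP VIEW customer_spend AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM crm_src.customers AS c
    JOIN sales_src.orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    """
)

rows = conn.execute("SELECT name, total FROM customer_spend ORDER BY name").fetchall()
print(rows)  # [('Asha', 75.0), ('Ravi', 10.0)]
```

The application queries customer_spend without knowing that the rows come from two separate stores, which mirrors the role the slide assigns to view models.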

Page 18: Data virtualization, Data Federation & IaaS with Jboss Teiid

Teiid Components

• Translator: provides an abstraction layer between the Teiid query engine and the source system

  - Converts Teiid SQL commands to source-specific execution commands

  - Converts result data from the source system to a Teiid-specific format

• Resource Adapter: provides connectivity to the physical data source

  - Integration provided through the Java EE Connector Architecture (JCA) API
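
The division of labour between a translator and a resource adapter can be sketched as follows. This is a simplified Python illustration under stated assumptions, not Teiid's actual translator or JCA API: EqualsQuery, translate_to_sql, translate_to_filter, RelationalAdapter, and DocumentAdapter are hypothetical names, and the "document store" is just a dictionary of lists.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class EqualsQuery:
    """Engine-neutral request: selected fields of rows where one field equals a value."""
    table: str
    select: tuple
    where_column: str
    where_value: object


def translate_to_sql(query):
    """Translator role: render the request as parameterized SQL.

    Identifiers are interpolated directly, which is acceptable only in a sketch.
    """
    columns = ", ".join(query.select)
    sql = f"SELECT {columns} FROM {query.table} WHERE {query.where_column} = ?"
    return sql, (query.where_value,)


def translate_to_filter(query):
    """Translator role: render the request as a collection name plus a filter document."""
    return query.table, {query.where_column: query.where_value}


def documents_to_rows(query, documents):
    """Translator role: convert source results into the engine's common row format."""
    return [tuple(doc[field] for field in query.select) for doc in documents]


class RelationalAdapter:
    """Resource-adapter role: owns connectivity to the relational source."""

    def __init__(self, conn):
        self.conn = conn

    def execute(self, sql, params):
        return self.conn.execute(sql, params).fetchall()


class DocumentAdapter:
    """Resource-adapter role: owns connectivity to the document source."""

    def __init__(self, collections):
        self.collections = collections

    def execute(self, collection, filter_doc):
        return [
            doc
            for doc in self.collections[collection]
            if all(doc.get(key) == value for key, value in filter_doc.items())
        ]


query = EqualsQuery("products", ("sku", "category"), "category", "books")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, category TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)", [("A1", "books"), ("B2", "toys")])
sql_rows = RelationalAdapter(conn).execute(*translate_to_sql(query))

collections = {"products": [{"sku": "C3", "category": "books"}, {"sku": "D4", "category": "games"}]}
matched = DocumentAdapter(collections).execute(*translate_to_filter(query))
doc_rows = documents_to_rows(query, matched)

print(sql_rows + doc_rows)  # [('A1', 'books'), ('C3', 'books')]
```

The same request is expressed once and executed against both sources; only the translator knows the source dialect, and only the adapter knows how to reach the source.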

Page 19: Data virtualization, Data Federation & IaaS with Jboss Teiid

Teiid – Supported EIS

• Amazon SimpleDB

• Apache Accumulo

• Apache Solr

• Cassandra

• File

• Google Spreadsheet

• JPA

• LDAP

• Excel (as file)

• Salesforce

• JDBC: MS Access, DB2, Derby, excel-odbc, Greenplum, H2, Hive (for accessing Hadoop), Oracle, Teradata and most RDBMS

• MongoDB

• Object

• OData

• OLAP

• Web Services

• SAP Netweaver Gateway

Page 20: Data virtualization, Data Federation & IaaS with Jboss Teiid

Performance Characteristics

• Access same data using Oracle and Teiid drivers

• Retrieval times comparable when accessing tables having no Blobs

[Chart: "No. of rows vs. time: no Blobs"; x-axis: no. of rows; y-axis: time in ms (0 to 25,000); series: Oracle-JDBC, Teiid-JDBC]

Page 21: Data virtualization, Data Federation & IaaS with Jboss Teiid

Performance Characteristics

• Teiid slower when accessing Blob data

• Can be tuned

[Chart: "No. of rows vs. time: Blobs"; x-axis: no. of rows; y-axis: time in ms (0 to 30,000); series: Oracle-JDBC, Teiid-JDBC; unlabeled values recovered from the chart: 0, 0, 2, 42, 21,804, 32,531, 185,454]

Page 23: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo

[Diagram labels: JDBC Client, JDBC API, Teiid Runtime & Query Engine, Federated VDB, MySQL Translator, MongoDB Translator, RDBMS Resource Adapter, MongoDB Resource Adapter, MySQL]

Page 24: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Steps

Pre-requisites

• MySQL server 5.5+ installed

• MongoDB 2.4.x+ installed

Steps

• Load the MySQL and MongoDB databases with sample data

• Set up the environment: JBoss, Eclipse

• Create a Teiid project in Eclipse using Teiid Designer

• Import source model using JDBC

• Create the virtual model and federate data from the source model

• Create a virtual database (VDB) and deploy to JBoss

• Access data using JDBC client or through browser using OData
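
For the final step, a JDBC client needs a connection URL for the deployed VDB. The helper below is a small Python sketch of how such a URL is commonly assembled, assuming the Teiid URL form jdbc:teiid:<vdb>@mm://<host>:<port> and a default JDBC port of 31000; both should be checked against the Teiid version in use. The VDB name FederatedVDB and the view name CustomerOrders are hypothetical and are not taken from the demo repository.

```python
def teiid_jdbc_url(vdb_name, host="localhost", port=31000, **properties):
    """Build a URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>[;key=value...].

    The URL layout and default port are assumptions to verify against the
    documentation of the Teiid release being used.
    """
    url = f"jdbc:teiid:{vdb_name}@mm://{host}:{port}"
    for key, value in properties.items():
        url += f";{key}={value}"
    return url


# Hypothetical names: replace with the VDB and view actually deployed to JBoss.
url = teiid_jdbc_url("FederatedVDB")
query = "SELECT * FROM CustomerOrders"

print(url)  # jdbc:teiid:FederatedVDB@mm://localhost:31000
```

A Java client would pass this URL to its JDBC driver manager and run the query as it would against any single database; the federation across MySQL and MongoDB stays hidden behind the VDB.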

Page 25: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Scenario

[Diagram label: Federated Data]

Page 26: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Connection Profile

Page 27: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Source Model

Page 28: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Source Model Generation

Page 29: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Map Source To View

Page 30: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Association

Page 31: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Data Federation

Page 32: Data virtualization, Data Federation & IaaS with Jboss Teiid

Demo – Source Code

Source code: https://github.com/anilallewar/JBoss-Teiid

Contains:

• Configuration files

• Instructions

• “How-to” videos

• VDBs, source models and view models

Page 33: Data virtualization, Data Federation & IaaS with Jboss Teiid

Conclusion

Data virtualization and federation is a rapidly emerging technology that solves traditional BI/ETL problems.

It shortens time to market, distributes data across the enterprise as a service, and provides real-time access to enterprise data.