red hat jboss data virtualization
Post on 20-Jan-2017
www.dlt.com
Red Hat JBoss Data Virtualization
July, 2016
Rick Stewart, Middleware SA
Herndon, VA
7/19/16 DLT Solutions LLC – Proprietary & Confidential 2
[Diagram: “Kiss”, “Whitesnake”, “Poison”, “Bad Company”, Data Warehouse]
“Bad Company”
[Diagram: “Kiss”, “Whitesnake”, “Poison”, Data Warehouse, Data Virtualization Server]
What does Data Virtualization software do?
DATA CONSUMERS: BI Reports, Applications
Data Virtualization Software (Virtual Consolidated Data Source):
•Consume
•Compose
•Connect
DATA SOURCES: SAP, Salesforce.com, Oracle DW, XML, CSV & Excel files
Siloed & Complex -> Virtualize, Abstract, Federate -> Easy, Real-time Information Access
Data Challenges Getting Bigger
Data consumers: BI Reports, Operational Reports, Enterprise Applications, Cloud Native Applications, Mobile Applications
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
•Integration Complexity
•Consumption & Creation
•Siloed
How to Integrate?
Improve Access to Your Data
Data consumers: BI Reports, Operational Reports, Enterprise Applications, Cloud Native Applications, Mobile Applications
Data Virtualization Server:
•Broad & Streamlined
•Adaptable & Secure
•Federated & Meaningful
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
Simplify Access to Your Data
[Diagram: Data Virtualization Server between data consumers and data sources]
Endpoints shown: streaming databases, social media data, production application, big data stores, website, ESB, analytics & reporting, unstructured data, mobile app, data warehouse & data marts, internal portal dashboard, external data, private data, production databases, applications
Protocols and formats shown:
•ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, OData, SQL
•JMS, SQL, JDBC, OData, Hive, RSS, Excel, JSON, REST, SOAP
•JMS message, SQL statement, SOAP message
Turn Siloed Data into Actionable Information
Data Consumers: BI Reports & Analytics, Mobile Applications, Applications & Portals, ESB, ETL
JBoss Data Virtualization:
•Consume: Standard based Data Provisioning (JDBC, ODBC, SOAP, REST, OData)
•Compose: Unified Virtual Database / Common Data Model, Data Transformations
•Connect: Native Data Connectivity
Supporting capabilities: Design Tools, Dashboard, Optimization, Caching, Security, Metadata
Data Sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
Siloed & Complex -> Virtualize, Transform, Federate -> Easy, Real-time Information Access
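The Connect / Compose / Consume layering can be illustrated with a small sketch. This is not JBoss Data Virtualization code: it uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for siloed sources and a temporary view standing in for the unified virtual database. All table, column, and function names are hypothetical.

```python
import sqlite3


def build_virtual_layer() -> sqlite3.Connection:
    # Connect: one session attached to two separate stand-in "source" databases.
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    conn.executemany(
        "INSERT INTO warehouse.orders VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    # Compose: a view that joins both sources into one business-friendly shape.
    # SQLite only lets TEMP views reference tables in other attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM crm.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        """
    )
    return conn


def customer_totals(conn: sqlite3.Connection):
    # Consume: the caller queries the view and never names the sources.
    return conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    ).fetchall()
```

A consumer calls `customer_totals(build_virtual_layer())` and sees one result set; where each column physically lives stays hidden behind the view.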
Supported Data Sources
Enterprise RDBMS:
•Oracle
•IBM DB2
•Microsoft SQL Server
•Sybase ASE
•MySQL
•MariaDB
•PostgreSQL
•Ingres
Enterprise EDW:
•Teradata
•Netezza
•Greenplum
Search:
•Apache SOLR
Hadoop:
•Apache
•HortonWorks
•Cloudera
•More coming…
Office Productivity:
•Microsoft Excel
•Microsoft Access
•Google Spreadsheets
Specialty Data Sources:
•ModeShape Repository
•Mondrian
•MetaMatrix
•LDAP
•Apache POI for Excel
NoSQL:
•JBoss Data Grid
•MongoDB
•Cassandra
•More coming…
Enterprise & Cloud Applications:
•Salesforce.com
•SAP
Technology Connectors:
•Flat Files, XML Files, XML over HTTP
•SOAP Web Services
•REST Web Services
•OData Services
Data As A Service
•Contextual view of disparate source data
•Single point of access
•Standard based interfaces
•Shareable integration and transformation logic
•Reusable data services
But you cannot achieve this by writing more application code…
Data sources: Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
JBoss Data Virtualization
Data consumers: BI Dashboard & Reports, Analytical Applications, ESB/SOA Integration, BPM Applications, Mobile Applications
Access styles: SQL Statement, SOAP Message, REST Message
•REST Request -> JSON Result
•SQL Request -> SQL Result
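The idea of a single point of access with shareable logic can be sketched as one data service function whose result is handed to a SQL-style consumer as rows and to a REST-style consumer as JSON. This is only an illustration using Python's sqlite3 and json modules; the table, columns, and function names are hypothetical and none of this is the product's API.

```python
import json
import sqlite3


def make_source() -> sqlite3.Connection:
    # Stand-in for a back-end source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER, region TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [(1, "EAST", 10.0), (2, "WEST", 20.0), (3, "EAST", 5.0)],
    )
    return conn


def accounts_by_region(conn: sqlite3.Connection, region: str):
    """Shared data service: the access logic is written once, here."""
    cur = conn.execute(
        "SELECT id, balance FROM accounts WHERE region = ? ORDER BY id", (region,)
    )
    columns = [description[0] for description in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def sql_result(conn: sqlite3.Connection, region: str):
    # SQL-style consumer: plain row tuples.
    return [tuple(row.values()) for row in accounts_by_region(conn, region)]


def rest_result(conn: sqlite3.Connection, region: str) -> str:
    # REST-style consumer: the same rows serialized as JSON.
    return json.dumps(accounts_by_region(conn, region))
```

Both consumers reuse `accounts_by_region`, so a change to the filtering or column selection is made in one place rather than in each application.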
Logical Architecture
Data Consumers
Data Sources
Teiid Data Virtualization Designer
Tooling VirtualDB Engine Server
Users create data models based on metadata:
•Imported from data sources
•Supplied via DDL
•Provided by Engine
•Specified by user
Models are packaged in a Virtual Database (VDB)
Physical Models representing actual data sources
Logical Models
Build XML Document models from XML Schemas
Map XML Document models to other data models
Enable data access via XML
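As a rough illustration of mapping relational data into an XML document, the sketch below renders rows as nested XML elements with Python's standard xml.etree.ElementTree module. In the product this mapping is modeled in the design tooling rather than hand-coded; the function name, tag names, and sample rows here are hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(root_tag, row_tag, columns, rows) -> str:
    """Render tabular rows as an XML document, one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(item, column).text = str(value)
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


xml_doc = rows_to_xml(
    "customers", "customer", ["id", "name"], [(1, "Acme"), (2, "Globex")]
)
```

A consumer that expects XML can then parse `xml_doc` without knowing the rows came from a relational source.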
Virtual Databases (VDBs) are deployment archives, similar to .WAR files.
VDBs contain:
•Source metadata and models
•View metadata and models
•System metadata
•Connection information, which is bound to sources at deployment time
VDBs are deployed to the query engine
VDB Internals:
•Source Models
•Connector Binding Properties
•View Models
•Manifest Info
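The VDB contents and the deployment-time binding of connection information can be pictured with a few plain Python dataclasses. This is a conceptual sketch only, not the real VDB archive format; every class, field, and connection name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceModel:
    name: str
    connection_ref: str  # logical name only; resolved when the VDB is deployed


@dataclass
class ViewModel:
    name: str
    definition: str  # SQL written against the source models


@dataclass
class VirtualDatabase:
    name: str
    sources: List[SourceModel] = field(default_factory=list)
    views: List[ViewModel] = field(default_factory=list)

    def deploy(self, available_connections: Dict[str, str]) -> Dict[str, str]:
        """Bind each source model to a concrete connection, or fail loudly."""
        bindings = {}
        for source in self.sources:
            if source.connection_ref not in available_connections:
                raise LookupError(f"no connection for {source.connection_ref!r}")
            bindings[source.name] = available_connections[source.connection_ref]
        return bindings


vdb = VirtualDatabase(
    name="Sales",
    sources=[SourceModel("Orders", "oracle-ds"), SourceModel("Accounts", "mssql-ds")],
    views=[ViewModel("OrderSummary", "SELECT ... FROM Orders JOIN Accounts ...")],
)
```

Because the models hold only logical references, the same `vdb` could be deployed against test or production connections by passing a different mapping to `deploy`.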
JBoss Data Virtualization can offer finer-grained security control:
•Authentication: Kerberos, LDAP, WS-UsernameToken, HTTP Basic, SAML
•Authorization: Virtual data views, Role based access control
•Administration: Centralized management of Virtual DB privileges
•Audit: Centralized audit logging and dashboard
•Protection: Row and column masking, SSL encryption (ODBC and JDBC)
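Row and column masking driven by roles can be sketched in a few lines. The policy tables, role names, and masking rule below are hypothetical and chosen only for illustration; in the product such rules are defined centrally on the virtual database rather than in application code.

```python
from typing import Callable, Dict, List, Set

# Hypothetical policy: mask the ssn column unless the user holds an unmasked role.
COLUMN_MASKS: Dict[str, Callable[[str], str]] = {
    "ssn": lambda value: "***-**-" + value[-4:],
}
UNMASKED_ROLES: Set[str] = {"auditor"}
# Hypothetical row filters: analysts only see rows from the EAST region.
ROW_FILTERS: Dict[str, Callable[[dict], bool]] = {
    "analyst": lambda row: row["region"] == "EAST",
}


def apply_policy(rows: List[dict], roles: Set[str]) -> List[dict]:
    filters = [ROW_FILTERS[role] for role in roles if role in ROW_FILTERS]
    mask_columns = not (roles & UNMASKED_ROLES)
    visible = []
    for row in rows:
        # Row-level rule: with filtered roles, at least one filter must accept the row.
        if filters and not any(accepts(row) for accepts in filters):
            continue
        out = dict(row)  # copy so the source rows are never modified
        if mask_columns:
            for column, masker in COLUMN_MASKS.items():
                if column in out:
                    out[column] = masker(out[column])
        visible.append(out)
    return visible
```

Applying the policy in one shared layer means every consumer, whatever its protocol, sees the same filtered and masked data.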
[Diagram: Data Consumer Apps reach the Query Engine through the JDBC API; the VDB's Connector Binding (1) and Connector Binding (2) use connectors C1 and C2 to reach an Oracle DB and a SQL Server DB]
The Query Engine is the core data virtualization functionality: a federating relational query engine with a rule- and cost-based optimizer, advanced query planner, caching, and hint processing.
The Query Engine hosts VDBs, binds to data sources, and performs query execution and results processing.
The Teiid Query engine is hosted in JBoss EAP and uses key container-provided services:
•Transaction manager
•JAAS security framework
•Container managed data sources
•EAP management infrastructure
•EAP deployment
The Server exposes views/services to consumers and manages connections and connection pools for data sources.
[Diagram: JBoss EAP hosting the JDV Runtime Engine]
•JBoss EAP: Applications, Security (JAAS), Transaction Manager, Profile Service
•JDV Runtime Engine: BufferMgr, Threading, Local Caches, etc.; VDBs
•Transports: ODBC Socket Transport, JDBC Socket Transport, Admin Socket Transport
•Clients: ODBC, JDBC, Admin / AdminShell, JON
•Data access: Translators, JCA, data sources (DS), Embedded DS
•Data source definitions: xxx-ds.xml, yyy-ds.xml, zzz-ds.xml
CACHING & MATERIALIZATION
Multiple levels of caching to meet performance requirements and manage load on source systems:
Materialized Views
–External or Internal materialized views
–Ability to override use of materialized views
Result set Caching
–Applied to results returned from user queries and virtual procedure calls
–Configurable time to live and max. number of entries
Code Table Caching
–Suited for integrating reference data with transaction/operational data, e.g. Country Code, State Code etc.
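Result set caching with a configurable time to live and maximum number of entries can be sketched as below. This is a generic illustration, not the engine's implementation: the class and parameter names are hypothetical, and the least-recently-used eviction is an assumption made only to have some rule when the cache is full.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class ResultSetCache:
    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(key)  # mark as recently used
            return entry[1]
        # Miss or expired: run the query against the source and store the result.
        value = loader()
        self._entries[key] = (now, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value
```

Keyed on the query text plus its parameters, such a cache answers repeated requests without touching the source system until the entry expires.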
QUERY PERFORMANCE OPTIMIZATION
Access Patterns – criteria requirements on pushdown queries
Pushdown – decompose user query into source queries
–Projection minimization to remove unused select items
–Decompose aggregates over joins/unions
–Generating SQL matching Teiid system functions
Dependent Joins (can use hints) – feed equi-join values from one side of the join to the other
Partition aware aggregation and joins
Copy Criteria – uses criteria transitivity to minimize join tuples.
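The dependent join idea, feeding equi-join values from one side of the join to the other, can be sketched with two independent sources. This is a simplified illustration using two separate sqlite3 in-memory databases, not the engine's planner; the table names and data are hypothetical.

```python
import sqlite3


def make_sources():
    # Two unrelated connections stand in for two physical data sources.
    small = sqlite3.connect(":memory:")
    small.execute("CREATE TABLE vip (customer_id INTEGER)")
    small.executemany("INSERT INTO vip VALUES (?)", [(2,), (4,)])
    large = sqlite3.connect(":memory:")
    large.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    large.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(i % 5, float(i)) for i in range(20)],
    )
    return small, large


def dependent_join(small: sqlite3.Connection, large: sqlite3.Connection):
    # Step 1: read the distinct join keys from the smaller side.
    keys = [row[0] for row in small.execute("SELECT DISTINCT customer_id FROM vip")]
    if not keys:
        return []
    # Step 2: push those keys into the query sent to the larger source,
    # so only matching rows are transferred instead of the whole table.
    placeholders = ", ".join("?" for _ in keys)
    sql = (
        "SELECT customer_id, amount FROM orders "
        f"WHERE customer_id IN ({placeholders}) "
        "ORDER BY customer_id, amount"
    )
    return large.execute(sql, keys).fetchall()
```

The design trade-off is one extra round trip to the small source in exchange for a much smaller transfer from the large one, which is why it pays off mainly when one side of the join is far smaller than the other.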
Business Dashboard
Bring It All Together
[Diagram: Capture & Process, then Integrate & Analyze]
•Inputs: Structured Data, Streaming Data, Semi-Structured Data
•Messaging and Event Processing: JBoss A-MQ and JBoss BRMS
•Data Integration: JBoss Data Virtualization
•In-memory Cache: JBoss Data Grid
•Storage: Hadoop, Red Hat Storage
•Consumers: BI Analytics (historical, operational, predictive), Composite Applications
Questions?
Thank You!
JBoss Data Virtualization – Use Cases
Self-Service Business Intelligence
The virtual, reusable data model provides a business-friendly representation of data. Users can interact with their data without knowing the complexities of the underlying database or where the data is stored, and multiple BI tools can acquire data from a centralized data layer. Gain better insights from Big Data by using JBoss Data Virtualization to integrate it with existing information sources.
360° Unified View
Deliver a complete view of master & transactional data in real-time. The virtual data layer serves as a unified, enterprise-wide view of business information that improves users’ ability to understand and leverage enterprise data.
Agile SOA Data Services
A data virtualization layer delivers the missing data services layer to SOA applications. JBoss Data Virtualization increases agility and loose coupling through virtual data stores, without the need to touch underlying sources. It also enables the creation of data services that encapsulate the data access logic, allowing multiple business services to acquire data from a centralized data layer.
Regulatory Compliance
The data virtualization layer delivers data firewall functionality. JBoss Data Virtualization improves data quality via centralized access control, a robust security infrastructure, and a reduction in physical copies of data, thus reducing risk. Furthermore, the metadata repository catalogs enterprise data locations and the relationships between the data in various data stores, enabling transparency and visibility.
[Diagram: JBoss Data Virtualization in front of four databases: A, B, C, D]
Benchmark setup:
•Leveraged a TPC-H-like schema, data, and queries
•Used 4 different commercial enterprise RDBMSs
•Each database held 1 TB of data, representing:
•150 million customers, with over
•600 million order records, and
•6 billion order line items.
•Total: 4 TB of data
Findings:
•No measurable JDV query overhead vs. direct queries
•Queries to federated data from four data sources ran 61.7 percent faster vs. baseline
•Scaling the query workload by 2x resulted in <10% impact on response time
Download Benchmark Study @ http://www.redhat.com/en/resources/jboss-data-virtualization-query-performance-benchmark-study