denodo platform 7 · • tibco ems semantic repositories • semantic repositories in triple stores...

7
1 Copyright © 2018, Denodo Technologies With the advent of big data and the proliferation of multiple information channels, organizations must store, discover, access and share massive volumes of traditional and new data sources. Data virtualization is a modern approach to data integration that transcends the limitations of traditional techniques. Denodo is the leader in data virtualization, providing agile, high performance data integration and data abstraction across the broadest range of enterprise, cloud, big data and unstructured sources as real-time data services at half the cost of traditional approaches. The new release of the Denodo Platform 7.0 redefines data management for next generation helping companies to move from a data-driven business where users are drowning in data to a truly insights-driven business. Denodo data virtualization simplifies the access to the information for business users and applications, hiding the underlying complexity and bridging the gap between nowadays IT infrastructure and information consumers, exposing the data in a business friendly manner. Denodo Platform 7.0 brings significant enhancements that will make even easier for business users to discover and find the information they need, with unprecedented scalability even in the most demanding big data and analytics environments, and tools to manage big deployments in multi-location architectures that can seamlessly combine both on-prem and multi-cloud deployments. Datasheet Denodo Platform 7.0

Upload: buinhi

Post on 14-May-2018

256 views

Category:

Documents


8 download

TRANSCRIPT

Page 1: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

1Copyright © 2018, Denodo Technologies

With the advent of big data and the proliferation of multiple information channels, organizations must store, discover, access and share massive volumes of traditional and new data sources. Data virtualization is a modern approach to data integration that transcends the limitations of traditional techniques. Denodo is the leader in data virtualization, providing agile, high performance data integration and data abstraction across the broadest range of enterprise, cloud, big data and unstructured sources as real-time data services at half the cost of traditional approaches.

The new release of the Denodo Platform 7.0 redefines data management for next generation helping companies to move from a data-driven business where users are drowning in data to a truly insights-driven business. Denodo data virtualization simplifies the access to the information for business users and applications, hiding the underlying complexity and bridging the gap between nowadays IT infrastructure and information consumers, exposing the data in a business friendly manner.

Denodo Platform 7.0 brings significant enhancements that will make even easier for business users to discover and find the information they need, with unprecedented scalability even in the most demanding big data and analytics environments, and tools to manage big deployments in multi-location architectures that can seamlessly combine both on-prem and multi-cloud deployments.

Datasheet

Denodo Platform 7.0

Page 2: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

2Copyright © 2018, Denodo Technologies

What Are the Key Features of Denodo Platform 7.0?

Denodo Platform 7.0 delivers breakthrough performance for big data, logical data warehouses and operational scenarios with a combined use of Parallel In-Memory Fabric and the Denodo Dynamic Query Optimizer.

Data virtualization provides a data delivery layer for business users and applications comprised of business views and data services that the users can get access to in a self-service manner. Denodo Platform 7.0 simplifies the discovery and exploration process for business users through a new Catalog of data assets, where users can easily navigate through business categories and tags, search to find the information they need and build their own queries that can be shared with other users and also promoted for global consumption.

For managing big multi-location deployments that can involved several clusters in different locations both on-prem and in the cloud, Denodo Platform 7.0 includes the new Denodo Solution Manager that facilitates DevOps in such a complex scenarios for automated lifecycle management (promotion of new metadata between environments, management of updates, etc.)

Denodo Platform 7.0 offers a multi-location architecture for multi-cloud, hybrid and edge scenarios without sacrificing performance, security and governance. It has been specifically designed to provide location transparency minimizing expensive data movement between different locations.

Fig. 1 Denodo Platform 7.0 Components

Drag-and-Drop for Data-Oriented Users

Data Adapters

Data Federation

Data Abstraction

Data Transformation

Publish Data Services

Business-Friendy UI

Catalog of Information Assets and Data Services

Data Discovery

Data Exploration

Customization of Datasets

Sharing & Collaboration

Promotion for IT Review for Global

Use

Dynamic Query Optimization

Parallel In-Memory Fabric (query

processing and caching)

Data Caching

Other optimization techniques such as

advanced query rewrites

Authentication & Authorization

Data & Metadata Security

Data Masking

Auditing

Data Lineage

Automatic Source Refresh

Change Control

Policies Execution

Integration with External

Governance, Catalog Tools

Location Transparency

Centralized Metadata

Management across Multiple Locations

Multi-Layered Architecture

(Maximize local processing and

optimize network tra�c)

Cloud Marketplace and Docker O�erings

Web-UI Console for Centralized Deployment Management

Monitoring & Diagnostics

Resource Manager

High-Availability

Scheduler

Data Modeling

Management& Monitoring

Multi-Cloud,Hybrid, Edge

Security &Governance

PerformanceOptimization

DataCatalog

Technical, Operational and Business Metadata

Consume data using any tool in any preferable format

Page 3: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

3Copyright © 2018, Denodo Technologies

The unique differentiators of Denodo Platform 7.0 are the following ones:

• Big data query acceleration through native integration with Parallel query processing engines. Denodo Platform 7.0 includes native integration with MPP engines based on popular parallel processing technologies such as Spark, Hive, Impala and Presto that can execute Denodo data processing tasks in parallel. This brings the Denodo Platform to the next level in terms of scalability in Big data and analytical scenarios. In addition Denodo Platform 7.0 supports MPP caching for faster access to previously processed data.

• Dynamic Query Optimizer with advanced query rewriting techniques that minimize network traffic and post-processing in the virtual layer for scalable architecture that it is being used in combination with MPP parallel processing and MPP caching for query acceleration.

• Dynamic Data Catalog that combines the features of a traditional catalog (classification and tagging, data lineage, descriptions, etc.) with keyword-based search, discovery, and business-friendly data exploration and preparation.

• Modern data services layer that implements the latest standards (e.g., OAuth 2.0, SAML, OpenAPI, OData 4) for easy cloud interoperability.

• Automated lifecycle management that greatly simplifies Dev/Ops tasks: the new Denodo Solution Manager facilitates managing complex multi-cluster deployments with centralized management of updates, license keys, data source parameters, monitoring, logs, and easy promotion of new metadata to different environments in multi-location architectures with different clusters from a centralized Web-UI.

• Availability on major cloud platforms such as AWS and Azure Marketplaces with support for seamless scalability (e.g. automatic auto-scaling of resources in both AWS and Azure) and containerization platforms (Docker).

Fig. 2 Denodo Platform 7.0 Architecture

DOC

W

DISPARATE DATA SOURCES

DATA CONSUMERS

SQL Queries(JDBC, ODBC, ADO.NET)

Web Services(SOAP, REST, OData, Open API)

Data Catalog Secure Delivery(SSL/TLS, SAML, OAuth)

DATA VIRTUALIZATION

Executive Engine& Optimizer

MetadataRepository

Solution Manager

Monitoring & Auditing

Corporate Security

Caching

MPP Processing

Page 4: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

4Copyright © 2018, Denodo Technologies

I Highly Economical Integrate data reliably at a fraction of the time, cost, infrastructure, and rigidity of traditional integration approaches such as ETL.

I Faster Path to Value Deliver contextual and reliable information faster for actionable insights. Agile enterprises are more likely to be industry leaders

I Location-Agnostic Deploy in any location - on-premises, cloud, and edge - without sacrificing scale or governance.

I Business-Friendly Abstract the complexity of modern data ecosystem (myriad of sources, multiple formats, distributed, heterogeneous, diverse) from business. Expose data in the format and naming conventions for every type of user and application at almost no cost. Rapidly adjust to changes in requirements.

I Enterprise Grade Support multiple lines of business, multiple use cases, multiple persona, and thousands of users.

I A Digital Marketplace Enable a digital marketplace that empowers a community of analysts to find and use information assets quickly, which is essential in this age of self-service analytics.

I Rationalize and Integrate Multiple Enterprise BI Platforms Integrate, leverage, and reuse components from multiple business intelligence platforms.

DATA VIRTUALIZATION LAYER

DATA VIRTUALIZATION LAYER

Business Benefits

Fig. 3 Data virtualization in the enterprise

Page 5: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

5Copyright © 2018, Denodo Technologies

DATA SOURCES

Relational Databases• Generic (JDBC)• IBM DB2 (JDBC): 8, 9, 10, 11, 12

for LUW, 9,10 for z/OS• Multi Layered Denodo

deployments (JDBC): 5.5, 6.0, 7.0• Apache Derby (JDBC): 10• Informix (JDBC): 7, 12• MS SQL*Server (JDBC, ODBC):

2000, 2005, 2008, 2008R2, 2012, 2014, 2016

• MySQL (JDBC): 4, 5• Oracle (JDBC): 8i, 9i, 10g, 11g, 12c• Oracle E-Business Suite (JDBC): 12• PostgreSQL (JDBC): 8, 9• Sybase Adaptive Server

Enterprise (JDBC): 12, 15• MS Access (ODBC)

In-Memory Databases• SAP HANA (JDBC): 1• Oracle TimesTen (JDBC): 11g• Oracle 12c In-Memory

Parallel databases and appliances• GreenPlum (JDBC): 4.2• HP Vertica (JDBC): 7, 8• Oracle Exadata (JDBC): X5-2• ParAccel 8.0.2 (by using

ParAccel 2.5.0.0 JDBC3g with SSL driver)

• Netezza (JDBC): 4.6, 5.0, 6.0, 7.0• SybaseIQ (JDBC) 12.x, 15.x• Teradata (JDBC): 12, 13, 14, 15

Cloud Data Warehouse• Amazon Redshift (JDBC)• Amazon Athena (JDBC)• Amazon Aurora (JDBC)• Snowflake (JDBC)• Amazon DynamoDB

Big Data/NoSQL• Apache Hive (JDBC): 0.12,

1.1.0, 1.1.0 for Cloudera, 1.2.1 for Hortonworks, 2.0.0

• Impala (JDBC): 2.3• Spark SQL (JDBC): 1.5, 1.6• Google BigQuery (JDBC)• Presto (JDBC)

Web Automation• Denodo’s ITPilot automates

extraction from web pages

Indexes and unstructured content• CMS, file systems, pdf, word,

text, email servers, knowledge bases, indexes

• Elastic Search

Multi-Dimensional Sources• SAP BW (BAPI/XMLA): 3.x• SAP BI 7.x (BAPI): 7.x• Mondrian (XMLA): 3.x• MS SQL Server Analysis Services

200x• Essbase (XMLA): 9, 11

Web Services• SOAP • REST (XML, RSS, ATOM, JSON)

Flat and Binary Files• CSV, pipe-delimited, Regular

expression-parsed• MS Excel xls 97-2003• MS Excel xlsx 2007 or later• MS Access • XML• JSON• All files can be locally accessible

or in remote filesystems, through FTP/ SFTP/FTPS, and in clear, zipped and/or encrypted format.

Active Directory as source or leveraging security• LDAP v3 • Microsoft Active Directory 2003,

2008

Cloud, Saas,Web Sources with Simplified OAuth Security• Amazon• Google• Facebook• LinkedIn• MS Azure Data Lake• MS Sharepoint (by using the

OData connector)• MS Dynamics (by using the

OData connector)• Marketo

• Salesforce• Twitter via APIs with simplified

OAuth integration (1.0, 1.0a and 2.0)• Workday

MS Queues as data source and Delivery• MQSeries• SonicMQ• ActiveMQ• Tibco EMS

Semantic Repositories• Semantic repositories in Triple

Stores / RDF accessed through SPARQL endpoints.

Packaged Applications• SAP ERP/ECC (BAPIS and tables) • Oracle E-Business Suite 12• Siebel• SAS (SAS JDBC Driver): 7 and

higher

Denodo SDK for Custom Connectors• CouchDB• Lotus Domino• MongoDB

Mainframe• IMS

• IBM IMS native drivers: 8, 9• IMS Universal Drivers: 11

Hierarchical databases• Adabas(SOA Gateway and

Denodo’s SOAP connector): 5, 6

The following other data sources have been successfully tested with Denodo using JDBC and ODBC drivers, WS/SOAP and WS/REST, and DenodoConnect adapters (not exhaustive list):• Apache Solr• Kafka Messages• SAS Files• Hadoop HBase• Hadoop HCatalog• Hadoop HDFS• Files in S3• IBM BigInsights• Pivotal HAWQ

Denodo Platform 7.0 Capabilities Sheet

Page 6: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

6Copyright © 2018, Denodo Technologies

PUBLISHING OPTIONS

• SQL Based access via JDBC, ODBC and ADO.NET

• Web Services • SOAP• REST• OData• Open API (a.k.a Swagger)

• Oauth, OAuth 2.0 (JWT)• SAML• SSL• WS-Security• JMS listeners for message

queues• Denodo Scheduler for batch

process and lite ETL

DATA CATALOG

• Web-UI for data discovery and exploration for business users

• Business Categories and Business Tags

• Full search capabilities on metadata and actual data

• Query wizards for customizing datasets

• Export to CSV, Excel and Tableau Data Extracts

• Collaboration features (save and share)

• Export to a shared sandbox for IT review before final publication for global use

PERFORMANCE OPTIMIZATIONS

• Massive Parallel Processing (MPP) integration for Query Acceleration and Caching

• Full and partial aggregation pushdown

• Data movement optimization• Cost Based Optimization (data

statistics, data source indexes, data source execution model and parameters, network transfer rates)

• Pushdown of selections/projections/joins/groupby operations

• Multiple join strategies• Simplifying partitioned unions

(Partition pruning)• and many more

• Multi-mode caching: full, partial, incremental or total refresh, event-based or scheduled, configured at the view level, incremental queries for SaaS sources

CACHE AND DATA MOVEMENT OPTIONS

• Amazon Redshift• IBM DB2 ( 8, 9, 10, 11, 12 for LUW,

9,10 for z/OS)• Hive 2.0.0• Impala • Presto • MS SQL Server (2000, 2005,

2008, 2008R2, 2012, 2014, 2016)• MySQL (4 and 5)• Netezza (6 and 7)• Oracle (8i, 9i, 10g, 11g, 12c, 12c

in-memory)• Oracle TimesTen 11g• Snowflake• SAP HANA• Spark 1.5, 1.6 • Teradata 12, 13, 14 and 15• Vertica 7• Configurable “generic” adapter

for other databases with JDBC drivers

MASSIVE PARALLEL PROCESSING OPTIONS

• Impala • Presto • Spark 1.5, 1.6 • Hive

DATA GOVERNANCE

• Data source refresh, change impact analysis, dependency tree, full data lineage

• Denodo Governance Bridge: integration with IBM Information Governance Catalog

SECURITY

Data in Motion – secure channels• Using SSL/TLS• Client-to-Denodo and Denodo-

to-source• Available for all protocols (JDBC,

ODBC, ADO.NET and WS)

Data at Rest - secure storage• Cache: third party database.

Can leverage its own encryption mechanism

• Swapping to disk: serialized temporarily stored in a configurable folder that can be encrypted by the OS

Encryption/Decryption• Support for custom decryption

for files and web services• Transparent integration with

RDBMs encryption

User and Role Based including integration with AD/LDAP• Row and Column level

authorization• Masking• Custom policies for specific

security constraints and integration with external policy servers

Authentication• Native and LDAP/Active

Directory based• Support for Kerberos and

Windows SSO• Base64• Kerberos• NTLM• Oauth, OAuth 2.0 (JWT)• SAML• SSL• WS-Security• Pass-through session credentials• Leverage existing source

privileges

DATA MODELING

• Integrated Development Studio for data modeling: drag&drop zvisual studio, publish data services with a few clicks.

• Bottom-Up and Top-Down (through Interface Views)

• Denodo Model Bridge: Integration with third-party modeling tool• ER/Studio Data Architect• ERwin Data Modeler• IBM InfoSphere Data

Architect• SAP PowerDesigner

Page 7: Denodo Platform 7 · • Tibco EMS Semantic Repositories • Semantic repositories in Triple Stores / RDF accessed through SPARQL endpoints. Packaged Applications • SAP ERP/ECC

DATA QUALITY

• Library of transformation, filter and matching functions and quality rules for validating, cleansing, enriching, standardizing, matching and merging data

• Extensible through Custom Functions

• Integration with external DQ tools

MONITORING

• Denodo Monitoring and Diagnosing tool, Denodo Query Monitor, Denodo Monitor Reports: all monitoring information is delivered via the Denodo tools for realtime and historical analysis. Historical dashboards can be created using the monitor information

• Monitoring is also available via SNMP and JMX standards. Therefore interoperate with most leading Systems Management packages (e.g., HP OpenView, Nagios, Zenoss, Osmius, IBM Tivoli and Microsoft WinRM)

OPERATIONS

• Multi-User Development with Version Control integration• Subversion• Microsoft TFS• Git

• Resource Manager to limit and allocate resources to each session, role or user in a way that optimizes resources utilization for each application • Change resources priority• Enforce limited timeouts or

limits on number of rows• Add daily quotas per minute/

day/month: e.g. only 50 queries per day

• Solutions Manager to automate operations and promotions tasks• Centralized management

and distribution of updates to clients

• Centralized management of license keys

• Defining promotion revisions and their dependencies and deploy them to a production cluster with zero downtime

• Centralized management of data source properties and logs

• REST API for automation of tasks from DevOps tools (e.g. jenkins)

DEPLOYMENT PATTERNS

• On-premises, private cloud, public cloud• Basic single server

configuration• HA cluster with load

balancing (Active-Passive and Active-Active)

• Shared or distributed local cache

• Geographically distributed server environments

• Multiple Denodo instances peer-to-peer or multi-layered

• Containerization support through Docker

• Public cloud• Denodo Platform for AWS• Denodo Platform for Azure• Auto-scaling support both in

AWS and Azure

USER INTERFACES

• Integrated Development Studio: Drag-and-drop and low-code developer studio geared to data-oriented developers such as data engineers, power users, and citizen integrators; publish data services with a few clicks.

• Data Catalog UI: Easy-to-use

interface for business-oriented users such as data stewards, data analysts, and citizen analysts.

• Holistic Solution Manager UI: Centralized UI for administrators to manage deployments and promotions..

• Monitoring and Diagnosis Tool: centralized UI for monitoring, auditing and troubleshoot for data engineers and administrators.

OPERATING SYSTEMS

• Microsoft Windows (32-bit and 64-bit platforms) Windows Server 2016, Windows Server 2012, Windows Server 2008, Windows 10, Windows 8.1 and Windows 7

• Linux (32-bit and 64-bit platforms) Ubuntu 12.04 LTS and 14.04 LTS, CentOS 6 and 7, Red Hat Enterprise Linux (RHEL) 6 and 7, Oracle Linux 6 and 7

• UNIX (64-bit platforms), Sun Solaris

• Any Java 1.8 or greater compatible OS

MINIMUM HARDWARE REQUIREMENTS

• Processor : Intel Xeon quad-core or similar. High-load scenarios or cases with complex calculations may require 8 cores or more.

• Physical memory (RAM) : 16 gigabytes of memory so the Denodo server can allocate a runtime heap space up to 8 gigabytes.

• Disk space : Minimum: 5 gigabytes, Recommended: 100 gigabytes. Denodo only needs around 1 GB of disk space. If the cache is installed on the same server, more disk space will be required.

Visit www.denodo.com | Email [email protected] | Discover community.denodo.com