sap & mapr solution brief 2015


Upload: vishwas-tengse

Post on 16-Aug-2015


SAP Technical Brief

Data Warehousing

SAP IQ

Supercharge SAP® IQ with the MapR Data Platform for Amazing Analytic Power

© 2015 SAP SE or an SAP affiliate company. All rights reserved.

Objectives

Making the Most of Big Data

To seize the invaluable opportunities of Big Data – and meet the IT challenges it presents – companies like yours need to stay on the cutting edge of database technology and support mission-critical business intelligence, analytics, and data warehousing.

With software from SAP and MapR, you can implement a column-store database and data platform that offer advanced analytics with superior scale-out. Using the multiplex grid option of SAP® IQ database software and the MapR data platform to store your data files, you can realize significant performance gains compared to traditional file systems.

Pairing SAP IQ database software and the underlying storage technology from MapR enables server clusters in SAP IQ to exhibit near-linear scalability in storage input/output throughput as more server nodes in SAP IQ are added to the cluster. Working together, SAP and MapR conducted tests to demonstrate the high-speed performance of the software. Details of the test results and system architecture configurations are presented throughout this paper.


High-speed performance and powerful analytics

SAP IQ provides excellent data compression, fully parallel data loading, fast ad hoc queries, a rich dialect of structured query language (SQL), built-in full text search, a wide variety of database access protocols, and an extensibility framework for user-defined functions. Sophisticated indexing technology and a powerful optimizer distribute queries across a multiplex grid for massively parallel operation (see the figure on the next page).

The MapR data platform includes a file system called MapR-FS, a more powerful version of the Apache Hadoop distributed file system (HDFS). Unlike Apache HDFS, which is a layer that runs on other systems, MapR-FS is a true distributed file system that manages direct disk access for Apache Hadoop and other software with demanding input/output requirements.

MapR is compliant with portable operating system interface (POSIX) criteria and provides an industry-standard network file system (NFS). With MapR you can perform random reads and writes and simultaneously read and write to a file. You get automatic and transparent data compression and integrated multitenant functionality. You can stream data directly to Apache Hadoop clusters and use thousands of existing tools and applications. MapR works well with non-Java programming languages and eliminates the need for most proprietary or specialized Hadoop connectors.


Figure: Multiplex using SAP® IQ. Labels: full mesh interconnect; shared storage fabric; shared interconnect; shared CPU, memory; shared storage.


Comparing MapR functionality to the competition

The MapR data platform and its included file system, MapR-FS, are compatible with Apache HDFS and serve the same role in Hadoop while providing substantial additional functionality. With the MapR data platform and MapR-FS, you get:

• Full random read/write access
• True NFS access
• Consistent snapshots
• Volumes
• Data placement control
• Enterprise-grade high availability and disaster recovery

With MapR-FS, as with a standard file system, you can read and write data to any part of an existing file. But Apache HDFS, originally designed to index Web pages, only allows appending to existing files; it does not support overwriting data at an arbitrary position in a file.
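As a rough illustration of what "read and write data to any part of an existing file" means in practice, the following Python sketch overwrites a few bytes in the middle of a file using ordinary POSIX calls. It is not taken from the SAP or MapR test suite; the helper names are hypothetical, and a temporary local file stands in for a file on an NFS-mounted MapR-FS path.

```python
import os
import tempfile


def overwrite_in_place(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes at an arbitrary offset without rewriting the whole file."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, data, offset)  # positional write; file length is unchanged here
    finally:
        os.close(fd)


def read_at(path: str, offset: int, length: int) -> bytes:
    """Read a byte range starting at an arbitrary offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


# Demo on a temporary local file (assumption: a stand-in for a file on a
# POSIX-compliant mount such as an NFS-mounted distributed file system).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")
    with open(demo_path, "wb") as f:
        f.write(b"A" * 16)
    overwrite_in_place(demo_path, 4, b"XYZ")
    demo_result = read_at(demo_path, 0, 16)
```

On an append-only file system, the equivalent change would require writing a new copy of the file rather than a positional write of this kind.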


You can mount MapR-FS using NFS for fast read/write access to data, and the software is scalable to handle extremely large volumes of data. Apache HDFS, on the other hand, requires a staging area to load data that precludes its usefulness as a large-scale platform.

MapR-FS helps protect your data with consistent snapshots that instantly capture an exact view of data at a specific point in time, so you can recover data accidentally deleted or corrupted by user or application error. Apache HDFS snapshots are not consistent, and they often include data written to open files well after the time the snapshot was actually taken.


You can implement policies such as quotas, security, and disaster recovery configurations because the MapR-FS file system supports logical partitioning of the distributed disk space with volumes. Apache HDFS has no such partitioning construct for enabling granular policies.

Data placement control functionality in MapR-FS lets you isolate separate data sets by putting data on specific servers in the cluster. Apache HDFS does not allow you to specify data placement in this manner.

MapR-FS automatically supports high availability with no manual configuration required. Disaster recovery functionality includes scheduled mirroring that sends block-level differentials to a remote replica site. Getting high availability with Apache HDFS, however, requires a specific, complex configuration that is prone to error and failure, and there is no true remote replication or mirroring capability. See the figure on the next page.
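The idea of sending only block-level differentials to a replica can be sketched in a few lines of Python. This is a conceptual illustration only, not a description of how MapR implements mirroring; the function names and the tiny block size are assumptions chosen to keep the example readable.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny for the demo; real systems use far larger blocks


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block so that blocks can be compared cheaply."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks that differ or are new."""
    old_digests = block_digests(old, block_size)
    diff = {}
    for idx, digest in enumerate(block_digests(new, block_size)):
        if idx >= len(old_digests) or old_digests[idx] != digest:
            diff[idx] = new[idx * block_size:(idx + 1) * block_size]
    return diff


def apply_diff(replica: bytes, diff: dict, new_length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Bring a replica up to date by patching only the changed blocks."""
    buf = bytearray(replica[:new_length].ljust(new_length, b"\0"))
    for idx, block in diff.items():
        start = idx * block_size
        buf[start:start + len(block)] = block
    return bytes(buf)


primary_old = b"aaaabbbbcccc"
primary_new = b"aaaaBBBBccccdd"
diff = changed_blocks(primary_old, primary_new)
replica_new = apply_diff(primary_old, diff, len(primary_new))
```

In this example only two of the four blocks (the modified second block and the newly appended fourth block) would need to travel to the remote site.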

Our test results show that you can get Big Data scale-out performance with SAP IQ and MapR.


Figure: MapR-FS and MapR distribution, including Apache Hadoop


Test parameters and architecture configuration

Working together, SAP and MapR conducted tests to demonstrate the performance of SAP IQ running on the MapR data platform. The test system configuration is shown in the figure on the next page.

For the test, MapR-FS was set up on eight hosts, each with the following configuration:

• Forty-eight HGST 4 TB 7200 RPM SAS hard disk drives
• 4U SuperMicro server
• Two Intel Xeon Ivy Bridge E5-2603V2 1.8 GHz CPUs, each with 10 MB cache and four cores (eight cores total)
• Sixteen 16 GB DDR3-1600 RAM modules (256 GB total)
• One Mellanox 40GbE two-port adapter

A multiplex cluster using SAP IQ was set up on four hosts with the same configuration.
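The per-host figures above can be rolled up into cluster totals with some simple arithmetic. The Python sketch below does this for the eight MapR-FS hosts and the four SAP IQ hosts; the class and variable names are illustrative, and the capacity shown is raw disk capacity, before any replication or file system overhead, so usable capacity would be lower.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Per-host hardware as listed in the brief."""
    drives: int = 48         # hard disk drives per host
    drive_tb: int = 4        # capacity per drive, in TB
    cpus: int = 2            # CPU sockets per host
    cores_per_cpu: int = 4
    dimms: int = 16          # memory modules per host
    dimm_gb: int = 16        # capacity per module, in GB

    @property
    def raw_tb(self) -> int:
        return self.drives * self.drive_tb

    @property
    def cores(self) -> int:
        return self.cpus * self.cores_per_cpu

    @property
    def ram_gb(self) -> int:
        return self.dimms * self.dimm_gb


MAPR_HOSTS = 8  # hosts running MapR-FS
IQ_HOSTS = 4    # hosts in the SAP IQ multiplex

host = HostSpec()
mapr_raw_tb = MAPR_HOSTS * host.raw_tb   # raw disk capacity across the MapR-FS hosts
iq_cores = IQ_HOSTS * host.cores         # CPU cores across the SAP IQ hosts
iq_ram_gb = IQ_HOSTS * host.ram_gb       # memory across the SAP IQ hosts

print(f"Per host: {host.raw_tb} TB raw, {host.cores} cores, {host.ram_gb} GB RAM")
print(f"MapR-FS cluster: {mapr_raw_tb} TB raw")
print(f"SAP IQ multiplex: {iq_cores} cores, {iq_ram_gb} GB RAM")
```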

MapR is proven at petabyte scale for click-stream analysis, retail processes, marketing personalization, and mobile telecommunications analytics.


Figure: Test system configuration. Four servers for SAP IQ, connected by a full mesh interconnect as a multiplex using SAP® IQ, on top of the MapR file system.


The performance tests incorporated a variety of random and sequential block access writes and reads, executed from the database servers for SAP IQ to data files residing on the MapR-FS file system. Tests were run on a three- and four-node multiplex grid to measure scalability.

The first scale-out performance test was done with a combined workload of 80% write and 20% read activity, because this workload is similar to a typical application profile for SAP IQ. The test used a block size of 256 KB, as SAP IQ executes input/output operations with large block sizes.
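The test harness itself is not described in this paper. Purely to illustrate the shape of such a workload, the Python sketch below issues random 256 KB block operations against a file, choosing a write about 80% of the time and a read otherwise. Every name in it is hypothetical, it runs against a small temporary local file, and because of operating system caching its throughput number says nothing about real storage performance.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 256 * 1024  # 256 KB, matching the block size named in the brief


def run_mixed_io(path, file_blocks, operations, write_fraction=0.8, seed=0):
    """Issue random block-aligned reads and writes; return (counts, MB/sec)."""
    rng = random.Random(seed)
    payload = os.urandom(BLOCK_SIZE)
    counts = {"read": 0, "write": 0}
    fd = os.open(path, os.O_RDWR)
    start = time.perf_counter()
    try:
        for _ in range(operations):
            offset = rng.randrange(file_blocks) * BLOCK_SIZE
            if rng.random() < write_fraction:
                os.pwrite(fd, payload, offset)
                counts["write"] += 1
            else:
                os.pread(fd, BLOCK_SIZE, offset)
                counts["read"] += 1
        os.fsync(fd)  # flush buffered writes before stopping the clock
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    megabytes = operations * BLOCK_SIZE / 1_000_000
    return counts, megabytes / elapsed


with tempfile.TemporaryDirectory() as tmp:
    data_file = os.path.join(tmp, "workload.dat")
    blocks = 16
    with open(data_file, "wb") as f:
        f.truncate(blocks * BLOCK_SIZE)  # preallocate a 4 MB file
    io_counts, mb_per_sec = run_mixed_io(data_file, blocks, operations=50)

print(io_counts, f"{mb_per_sec:.0f} MB/sec (cached, illustrative only)")
```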

The three-node multiplex averaged 551 MB per second (MB/sec) for each database server for SAP IQ. With an additional server node, each host was still able to sustain performance of over 500 MB/sec (see the figure on the next page). This is a significant improvement over observed performance with a fibre channel array in which the maximum throughput is split across hosts.
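The relationship between the per-server and combined figures is plain arithmetic, sketched below in Python. The 551 MB/sec per-server average comes from the text above, and the 2,013 MB/sec combined four-node figure is the value reported in this brief's chart for the same test; the function names are illustrative.

```python
def combined_throughput(per_node_mb_s: float, nodes: int) -> float:
    """Aggregate throughput when each node sustains the same average rate."""
    return per_node_mb_s * nodes


def per_node_throughput(combined_mb_s: float, nodes: int) -> float:
    """Average throughput per node, given the aggregate."""
    return combined_mb_s / nodes


three_node_combined = combined_throughput(551, 3)   # 1653 MB/sec
four_node_per_node = per_node_throughput(2013, 4)   # 503.25 MB/sec

print(f"3 nodes combined: {three_node_combined:,.0f} MB/sec")
print(f"4 nodes per node: {four_node_per_node:.2f} MB/sec")
```

The four-node per-server average of roughly 503 MB/sec is consistent with the statement that each host still sustained over 500 MB/sec.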


Test results show scalability



Figure: Combined speed of input/output (80% write/20% read) from all nodes on SAP IQ to MapR-FS. 3 nodes on SAP® IQ: 1,653 MB/sec; 4 nodes on SAP IQ: 2,013 MB/sec.


Another scale-out performance test was done with 100% read activity, as is typically seen on a data warehouse query, with a 256 KB block size. As with the 80% write workload, the 100% read performance scaled nearly linearly when another host was added. The average read input/output for three nodes was 1,583 MB/sec per node. With another node added to the multiplex, the average read input/output per node was still above 1,500 MB/sec, for a combined 6,064 MB/sec across the four nodes (see the figure on the next page).
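One way to quantify how close these results come to linear scaling is to compare the measured speedup with the ideal speedup from adding nodes. The Python sketch below applies that calculation to the combined figures reported in this brief's charts (4,750 and 6,064 MB/sec for the read test; 1,653 and 2,013 MB/sec for the 80% write test). The efficiency metric and function name are an illustrative choice, not something defined by SAP or MapR.

```python
def scaling_efficiency(base_nodes: int, base_mb_s: float,
                       scaled_nodes: int, scaled_mb_s: float) -> float:
    """Measured speedup divided by ideal (linear) speedup; 1.0 is perfectly linear."""
    measured_speedup = scaled_mb_s / base_mb_s
    ideal_speedup = scaled_nodes / base_nodes
    return measured_speedup / ideal_speedup


read_efficiency = scaling_efficiency(3, 4750, 4, 6064)   # roughly 0.96
mixed_efficiency = scaling_efficiency(3, 1653, 4, 2013)  # roughly 0.91

print(f"100% read:  {read_efficiency:.1%} of linear")
print(f"80/20 mix:  {mixed_efficiency:.1%} of linear")
```

By this measure, going from three to four nodes retained a little over 95% of ideal scaling for reads and a little over 91% for the mixed workload.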


Test results show read performance scales linearly

Using SAP IQ and MapR can bring substantial cost savings over traditional storage systems.


Figure: Combined speed of input/output (100% read) from all nodes on SAP IQ to MapR-FS. 3 nodes on SAP® IQ: 4,750 MB/sec; 4 nodes on SAP IQ: 6,064 MB/sec.


Your IT and business benefits

SAP IQ and the MapR data platform (along with the included MapR-FS file system) deliver a flexible, high-powered solution that can grow along with your business.

With SAP IQ, you can perform concurrent queries against large amounts of data and turn intelligence into insights to support better decision making across the enterprise. The software’s open architecture, application services layer, excellent performance characteristics, and low administrative overhead give it the flexibility, efficiency, and power to meet your needs.

MapR gives you Big Data scale-out capacity on commodity hardware for a database cluster (using the multiplex grid option of SAP IQ). At the same time, you can gain significant cost savings over traditional network attached storage or storage area network systems. What’s more, the MapR data platform has built-in high availability, data protection, and disaster recovery capabilities, making it an ideal industrial-strength storage solution.


MapR and SAP IQ offer you an economical and scalable storage solution with high availability, data protection, and disaster recovery capabilities.


Summary

Software from SAP and MapR gives you a cutting-edge analytics solution. When you use a multiplex grid based on the SAP® IQ software running on top of a MapR data platform to store your data files, you will see near-linear scalability in storage input/output throughput as more server nodes are added for SAP IQ. The upshot is high performance and cost savings compared to typical file systems – made clear by the performance test results in this paper.

Objectives

• Rise to the opportunities and challenges of Big Data

• Support mission-critical business intelligence, analytics, and data warehousing

• Enable high-performance input/output throughput and scalability

Solution

• Disk-backed, column-store database with superior functionality for data compression, fast data loading, and ad hoc queries with SAP IQ

• Industry-standard distributed file system from MapR

Benefits

• Gain business insights with sophisticated analytics

• Get Big Data scale-out capacity and high throughput performance

• Take advantage of built-in high availability, data protection, and disaster recovery

• Realize significant cost savings over traditional network attached storage systems

Learn more

To find out more, call your SAP representative today or visit us online at http://scn.sap.com/community/developer-center/analytic-server.

www.sap.com

Studio SAP 37907enUS (15/05)


© 2015 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.

National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.