
White Paper Open Apps on NetApp Optimizing Distributed Open Application Solutions with NetApp E-Series Storage

By Brian Garrett, VP ESG Lab; Mark Peters, Practice Director & Senior Analyst; and Nik Rouda, Senior Analyst

June 2015

This ESG White Paper was commissioned by NetApp and is distributed under license from ESG. © 2015 by The Enterprise Strategy Group, Inc. All Rights Reserved.


Contents

Executive Summary

Background
  IT Priorities
  Open Application Frameworks
  Business Considerations
  Technical Considerations
  Architectural Considerations

Storage Options for Open Application Frameworks

Optimizing an Open Framework with NetApp E-Series Storage
  Technical Overview
  Technical Benefits
  Architectural Benefits
  Business Benefits

The Bigger Truth

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.


Executive Summary

Forward-looking IT professionals are experimenting with open application frameworks for big data and software-defined data centers (e.g., Hadoop and OpenStack) with a goal of leveraging the affordability and flexibility of open source software running on industry-standard servers. When optimizing the storage architecture for open application frameworks, system engineers and storage administrators must ensure that new solutions have the technical capabilities needed to meet the information needs of line-of-business applications.

This report examines the compelling data protection and reliability benefits of switching from internal storage to commercial direct-attached storage systems when moving open application frameworks from proof of concept to production. No longer does a single disk failure cripple a node and immediately impact overall infrastructure performance. Because data is externally protected, additional performance and efficiency gains can be realized by reducing the amount of data replication, lightening the load on compute and network resources, and reducing the amount of storage required just for data protection. Recovery from disk failures is dramatically improved through centralized storage system management, replacing manual processes with automated systems, which reduces operating costs. The resulting improvements in total cost of ownership (TCO) and reductions in risk to the organization more than offset the higher initial purchase price of commercial external direct-attached storage systems.

Background

A growing number of organizations are experimenting with distributed open source application frameworks as a foundation for their big data and software-defined data center initiatives. Indeed, as previously conducted ESG research revealed, other than information security, software-defined data center (e.g., OpenStack and Ceph) and data analytics (e.g., NoSQL and Hadoop) initiatives are at the top of the “CIO whiteboard” initiatives and technology meta-trends list (see Figure 1).1

Figure 1. “CIO Whiteboard” Initiatives

Which of these initiatives will be the most important (i.e., ranked number 1) for your organization over the course of 2015? (Percent of respondents, N=601)

Information security: 43%
Data center modernization (i.e., software-defined data center): 16%
Data analytics: 12%
Use of public cloud for applications and infrastructure: 12%
Mobility: 9%
Reinventing application development processes for a mobile and cloud world: 9%

Source: Enterprise Strategy Group, 2015.

Taking advantage of large clusters of inexpensive commodity servers holds the potential for significantly driving down IT costs. These commodity servers are becoming the underlying architecture for data center modernization initiatives and open source application frameworks. OpenStack has become the de facto standard for public cloud service providers, and, increasingly, for enterprises architecting their own private cloud infrastructures with a goal of becoming internal service providers delivering the benefits of the cloud with locally controlled data compliance and security.

1 Source: ESG Research Report, 2015 IT Spending Intentions Survey, February 2015.

The promise of big gains from big data analytics is driving the rapid adoption of open source frameworks such as Hadoop and NoSQL. In fact, ESG research indicates that nearly half (45%) of IT organizations plan on deploying a new business intelligence/analytics solution in the next 12-18 months, and a growing number of organizations are reaping the benefits of big data solutions that have been deployed already.2

Figure 2. Plans to Deploy a New BI/Analytics Solution in the next 12-18 months

Does your organization have plans to deploy a new BI/analytics solution in the next 12-18 months? (Percent of respondents, N=370)

Yes: 45%
No: 45%
Don't know: 9%

Source: Enterprise Strategy Group, 2015.

IT Priorities

Within many organizations, the initial impetus for open source application frameworks comes from individuals and departments pushing for a proof of concept (POC) or small installation to solve one specific and immediate problem. Small environments tend to grow organically as more people become aware of the availability and value of the solution, all without a formal budget or strategic planning.

As IT evaluates its initial proof of concept and looks to transition to large-scale production systems, it must make crucial decisions: defining the final architecture, selecting vendors, and determining operating costs in order to maximize return on investment. ESG recently asked research respondents what they consider to be the most important criteria when selecting a technology vendor/solution. Unsurprisingly, total cost of ownership (TCO) (43%) and overall price (42%) are the most often cited criteria (see Figure 3). Many of the other most-cited criteria, including vendor service and support (35%), ease of initial implementation (26%), ease of ongoing management (24%), industry-specific expertise (23%), and reputation/brand of vendor (22%), are drivers of TCO in one way or another.3

2 Source: ESG Brief, 2015 ‘Big Data’ Spending Trends, April 2015.

3 Source: ESG Research Report, 2015 IT Spending Intentions Survey, February 2015.

Figure 3. IT Purchasing Criteria

In general, what would you consider to be the most important criteria to your organization when it comes to selecting a technology vendor/solution? (Percent of respondents, N=601, five responses accepted)

Total cost of ownership (TCO) inclusive of capital costs, operational costs, productive benefits, etc.: 43%
Price: 42%
Product features/functionality: 37%
Vendor service and support: 35%
Existing relationship with vendor: 26%
Ease of initial implementation: 26%
Ease of ongoing management: 24%
Industry-specific expertise: 23%
Reputation/brand of vendor: 22%

Source: Enterprise Strategy Group, 2015.

Open source solutions leverage the availability and simplicity of low-cost commodity servers with internal storage to accelerate the deployment of proof of concept solutions. This lets the IT operations and development teams focus on what’s really important to them: getting up to speed quickly on the new technology without impacting existing operations and development efforts.

While the care and feeding of internal storage within these open source frameworks is manageable for small-scale proofs of concept, moving into large-scale production increases the complexity, the risk, and the costs associated with meeting the information needs of the business. Single disk failures cripple performance, and managing disk failures and replacements is decentralized and complex. Service, support, ease of implementation, and management (all of which are factors driving TCO) are also hard to ensure with industry-standard servers using internal storage.

Open Application Frameworks

Open application frameworks are designed with a goal of helping organizations rapidly derive value from data and make efficient use of IT resources. The big dogs of the open application frameworks are Hadoop and NoSQL for big data applications, and OpenStack and Ceph for private clouds and infrastructure-as-a-service. All of these frameworks share the same fundamental software-defined architecture (clusters of commodity servers with internal storage) with a goal of providing a cost-effective IT infrastructure platform with extremely high levels of scalability.

Hadoop

The Apache Hadoop framework enables distributed storage and distributed processing of very large data sets using a cluster of commodity servers with internal disk drives. MapReduce is the distributed processing component.

MapReduce uses a divide and conquer strategy, where the Map step divides, filters, and sorts the data, and the Reduce step processes the data, returning a summary result. MapReduce runs in parallel throughout the cluster.
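The Map and Reduce steps described above can be illustrated with a minimal, single-process sketch in Python. It is not Hadoop code: the function names are hypothetical, the word-count task is chosen only for illustration, and the parallel execution across cluster nodes that Hadoop provides is replaced here by an ordinary loop.

```python
from collections import defaultdict
from itertools import chain


def map_step(chunk):
    """Map: divide one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group mapped values by key, as a framework would between the two steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: collapse all values for one key into a summary result."""
    return key, sum(values)


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different node.
    mapped = chain.from_iterable(map_step(chunk) for chunk in chunks)
    return dict(reduce_step(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["open source storage", "open frameworks", "storage for open apps"])
print(counts)
```

Because each call to `map_step` depends only on its own chunk, the chunks can be processed independently, which is the property that lets the real framework spread the work across the cluster.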

Data services are provided by the Hadoop Distributed File System (HDFS), which is able to store files that are larger than the largest disk drive, node, or server. Data is split into smaller chunks and distributed throughout the cluster. Because commodity servers with internal storage are inherently unreliable, HDFS uses data replication to provide a form of fault tolerance. Multiple identical copies of each chunk of data are stored on separate server and disk nodes.
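A rough sketch of the chunk-and-replicate idea follows. It assumes a simple round-robin placement, which is a simplification chosen for illustration and not the placement policy HDFS actually uses; the node names, chunk size, and function names are hypothetical.

```python
def split_into_chunks(data, chunk_size):
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks, nodes, replication=3):
    """Assign each chunk to `replication` distinct nodes, round-robin.

    Round-robin is an assumption made for this sketch only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for chunk_id in range(num_chunks):
        placement[chunk_id] = [
            nodes[(chunk_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


chunks = split_into_chunks(b"x" * 1000, chunk_size=256)
placement = place_replicas(len(chunks), ["node-a", "node-b", "node-c", "node-d"])
for chunk_id, replicas in placement.items():
    print(chunk_id, replicas)
```

With three copies of every chunk, each unit of user data consumes three units of raw disk capacity, which is the overhead the later sections of this paper aim to reduce.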

NoSQL

NoSQL covers a wide swath of different database technologies that were developed to overcome the limitations of 40-year-old relational database technology for use with modern web and clustered applications—namely the rise in the volume of data stored, the frequency with which this data is accessed, and the performance and processing needs.

To achieve these goals, NoSQL databases throw out the RDBMS concept of the tabular relationship. In its place, the more than 150 NoSQL vendors have implemented a variety of new data relationship models, including column or wide column store, document store, key-value or tuple store, graph, multimodel, object, XML, multidimensional, and multivalue. As the use of NoSQL increases, many vendors are starting to support SQL or SQL-like interfaces, simplifying programming and broadening the user base. Thus, NoSQL is now often interpreted as “Not only SQL.” Like Hadoop, NoSQL uses server-based clustering and network-based data replication for availability and reliability.

OpenStack

OpenStack is a framework providing infrastructure-as-a-service (IaaS)—otherwise known as the cloud. OpenStack provides pools of infrastructure resources—compute (usually virtual machines), networking (firewalls, load balancers, VLANs, etc.), storage (block, file, and object storage), and software (application servers, databases, etc.). It makes horizontal scaling easy, providing tools for users to spin up new resources as required. Within a single OpenStack environment, users can build pools for virtual desktops (VDI), big data analytics (Hadoop), databases (RDBMS, and NoSQL), and applications (e-mail, web, etc.).

Fundamentally, all of the many moving parts of the OpenStack framework run on industry-standard servers with internal or external direct-attached storage. Compute resources are provided by Nova, which is the fabric controller and is used to deploy and manage large numbers of virtual machines and other resources. OpenStack includes two storage components, Swift and Cinder. Swift provides object and file storage, and, like HDFS, data is replicated across multiple drives and nodes for data protection. Cinder provides block storage, and, in addition to using internal storage, can use external software-based storage platforms (e.g., Ceph) as well as commercial block-based storage systems.

Ceph

Ceph is an open source software-defined storage platform that provides object, block, and file storage services. Ceph is yet another distributed open source framework system that was designed to run on a cluster of commodity servers with internal or external direct-attached storage. Data objects are split into multiple chunks and replicated over the network for scale-out parallelism and the ability to recover from a drive or server failure.

Business Considerations

The primary consumers of open application framework environments are line-of-business owners, data scientists, and analysts, who demand a quick response to the information needs of the business. They also require high availability and scalability to support more applications, analytics, and users as their needs grow. Growing in importance, and often overlooked, is the need for increased levels of information security, because exposing even small amounts of the simplest data can result in significant costs in the form of bad press, loss of business, breach mediation, and lawsuits.


The CIOs and their teams are tasked with delivering value to the business through efficient infrastructure and operations. IT organizations must meet a demanding set of requirements for functionality, performance, reliability, and availability as specified in service level agreements (SLAs). Ultimately, all of these business requirements must be met at a cost lower than ever before.

Technical Considerations

When optimizing the storage architecture for open application frameworks, system engineers and storage administrators must ensure that new solutions have the technical capabilities needed to meet the information needs of line-of-business applications. First and foremost are performance and scalability. For the most part, the new architecture must provide high aggregate throughput. In some cases, such as some NoSQL databases, performance requirements include supporting the processing of large numbers of small transactions rather than high throughput. Another factor affecting overall performance is storage latency. Ultimately, the final architecture must support more applications, workloads, and users on the same storage infrastructure.

Equally important, the system must provide reliability, with tools to manage and automate recovery, reducing the load on IT staff, and increasing performance with fewer nodes and clusters impacted by long-running rebuilds. Any storage solution must be efficient, decreasing the load on compute and network resources, and requiring less raw disk capacity to store the same amount of user data.

Architectural Considerations

To provide the biggest benefits to their organizations, system architects must ensure that the production environment is very stable. Degradation of system performance or downtime due to internal storage failure is much more than a nuisance; it can cost the company through missed opportunities and loss of business.

From the system architect’s perspective, white box servers with internal storage are viable low-cost options for proofs of concept, but quickly become untenable when transitioning to large-scale production environments. Production systems depend on the increased reliability, serviceability, and manageability that come with branded commercial solutions. Operations staff lean heavily on branded service and support organizations that bring with them many years of experience, enabling rapid resolutions to both common and esoteric problems.

Storage Options for Open Application Frameworks

When first deploying open application frameworks for a proof of concept, it makes sense to leverage low-cost, readily available commodity servers using internal storage. The goal is to reduce both the POC cost and the time it takes to acquire, install, and provision hardware, enabling developers and users to get up and running as quickly as possible. As system familiarity increases, the proof of concept can be expanded by quickly adding more servers to the environment (see Figure 4).


Figure 4. Optimizing Open Source Clusters with Purpose-built Direct-attached Storage

Once the decision has been made to deploy a full production open source framework, implementers can expand their system architecture horizons, looking to optimize the environment for the long term. One area of open framework architecture that is ripe for optimization is storage. While internal disks are quick and easy to deploy, lowering initial investment, they represent a significant management headache at scale. Unfortunately, disks fail much more frequently than other system components. Although open frameworks are designed to tolerate disk failures, there are few tools available to manage storage. And while the repair process is automatic, the extra network traffic and CPU overhead needed to replicate data to another platform injects random variability and overhead into the delivery of business value.

Some organizations have considered shared storage—both traditional SAN and scale-out NAS—as potential improvements to the underlying clustered server with internal storage architecture. However, open frameworks build sharing into their software stacks. Scale-out NAS and SAN solutions are encumbered with expensive network and storage processing resources whose sole purpose is to facilitate sharing—resources that aren’t needed in an open framework. For all of these reasons, shared storage solutions are suboptimal for open application frameworks.

Open application frameworks can be optimized by leveraging the many benefits of commercial direct-attached storage systems. Purpose-built direct-attached storage can be used to meet the production-level requirements for distributed open application frameworks while still maintaining an open source paradigm. Internal storage can be replaced with direct-attached external storage systems, using the same SAS interface from the servers. This is completely transparent to the open application framework, and requires no modification to the software, and no other changes to the system architecture.

Data protection and reliability are increased through the use of advanced data protection schemes including Dynamic Disk Pooling (DDP), erasure coding, and advanced hardware-assisted RAID algorithms. No longer does a single disk failure cripple a node and immediately impact overall infrastructure performance. Because data is externally protected, additional performance and efficiency gains can be realized by reducing the amount of data replication, lightening the load on compute and network resources, and reducing the amount of storage required just for data protection. Recovery from disk failures is drastically improved through centralized storage system management consoles, replacing manual processes with automated systems, which reduces operating costs. The resulting improvements in TCO and reductions in risk to the organization more than offset the higher initial purchase price of commercial external direct-attached storage systems.


Optimizing an Open Framework with NetApp E-Series Storage

NetApp E-Series direct-attached storage systems are ideally suited to meet the challenges associated with moving an open application framework from POC to production. NetApp E-Series storage systems provide an ideal balance of features, performance, reliability, and manageability, and are a direct drop-in replacement for internal drives in an open source framework architecture.

Technical Overview

NetApp E-Series storage systems are designed to provide terabytes to petabytes of robust, highly available, scalable modular storage. Supporting up to 2.3PB of capacity, 12GB/sec of throughput, and 192TB of flash, the E-Series product family provides a cost-effective range of performance and capacity options that are ideally suited to meet the production needs of open application frameworks.

Figure 5. NetApp E-Series Storage

Technical Benefits

Using NetApp E-Series to optimize open source application frameworks is easy. Simply replace the internal disk drives with an externally attached NetApp E-Series storage array, using a SAS interface. For reliability and redundancy, multiple SAS cables can be connected between servers and the E-Series systems. The open application framework sees the E-Series storage as a disk drive, so no changes are required to the open source software. This also opens up the opportunity to purchase performance-optimized thin servers to reduce power, space, and cooling needs. It also provides the flexibility of scaling storage and compute independently, which makes it easier to take advantage of the clustered scale-out architectures of open applications while cost-effectively meeting the growing needs of the business.

Performance

By coalescing the performance of multiple disk drives and implementing adaptive read and write caching, the NetApp E-Series provides significant performance improvements over single disks installed internally. Up to 12GB/sec of throughput can be achieved with one NetApp E-Series system. For those application frameworks that demand high transactional performance, the E-Series can sustain up to 650,000 IOPS. Data protection is implemented in hardware for quick rebuilds with no application server involvement. Up to eight SAS host interfaces provide up to 48Gb/s of quad-lane SAS connectivity to open application framework servers, per interface, and multiple servers can be attached to the same E-Series array.

This ability to connect multiple servers to the same pool of storage through multiple SAS interfaces is a powerful new concept compared with traditional direct-attached disk arrays. It provides the cost effectiveness of a shared pool of storage resources yet appears to an open application as a bunch of direct-attached disk volumes. NetApp refers to this capability, along with the extended features provided by the E-Series, as shared Enterprise DAS. Shared Enterprise DAS delivers the power of an enterprise-class SAN array with the cost effectiveness and flexibility of the direct-attached storage paradigm that’s preferred by open application framework developers.

ESG Lab previously demonstrated the performance gains that can be achieved with NetApp E-Series compared with an internal drive in an open application framework. As shown in Figure 6, ESG Lab testing of NetApp E-Series with network-free hardware RAID-5 (6+1) and a Hadoop replication count of two, increased storage capacity utilization by 22% compared with a Hadoop cluster with internal drives and a default replication count of three. In other words, network-free hardware RAID leverages the hardware-optimized data protection that’s built into an external direct-attached disk array to eliminate the overhead associated with using the network to maintain redundant copies of data in a cluster of servers with internal storage. Network-free hardware RAID also increased the performance and scalability of the cluster due to a 33% reduction in the amount of mirrored data flowing over the network.4
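The two percentages above are consistent with simple overhead arithmetic, sketched below. The interpretation is an assumption made for this sketch, not a description of ESG Lab's methodology: it reads the 22% figure as the reduction in raw capacity needed per unit of user data, and the 33% figure as the reduction in the number of framework-level copies (three down to two). The function name is hypothetical.

```python
def raw_per_usable(replication, data_drives=1, parity_drives=0):
    """Raw capacity consumed per unit of user data.

    replication   -- number of copies kept by the framework (e.g., Hadoop)
    data_drives   -- data drives per RAID stripe (1 means no RAID)
    parity_drives -- parity drives per RAID stripe
    """
    raid_overhead = (data_drives + parity_drives) / data_drives
    return replication * raid_overhead


# Internal drives, default replication count of three, no RAID.
internal = raw_per_usable(replication=3)

# External array, RAID-5 (6+1), replication count of two.
external = raw_per_usable(replication=2, data_drives=6, parity_drives=1)

capacity_saving = 1 - external / internal   # fraction of raw capacity saved
copy_reduction = 1 - 2 / 3                  # fraction fewer framework copies

print(f"raw per usable: internal={internal:.2f}, external={external:.2f}")
print(f"capacity saving: {capacity_saving:.0%}")   # prints 22%
print(f"copy reduction:  {copy_reduction:.0%}")    # prints 33%
```

Under these assumptions, three raw units per usable unit drop to roughly 2.33, which is where a saving of about 22% comes from.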

Figure 6. Increasing Open Application Performance with NetApp E-Series

Manageability

E-Series storage can be managed with NetApp SANtricity Web Services, a RESTful application programming interface (API), which is the preferred management method for forward-looking developers in open application framework communities. Representational State Transfer (REST) is a set of software architecture guidelines and best practices for managing web-scale services. The stateless architecture of the RESTful client-server programming model was designed to simplify the management of large numbers of components and the interactions between those components. NetApp Web Service 1.2 provides the RESTful APIs that are ideally suited to meet the web-scale management needs of open application frameworks including Hadoop and OpenStack.

4 Source: ESG Lab Validation Report, NetApp Open Solution for Hadoop, May 2012.

With RESTful API support providing the hooks for high-level policy-based management and deployment, SANtricity Storage Manager, the NetApp E-Series management console, provides a centralized management GUI for monitoring and managing low-level hardware events including drive failures and replacements. SANtricity Storage Manager manages multiple E-Series systems, which makes it easy for storage managers to quickly diagnose and resolve issues. All management tasks can be performed while the storage remains online with complete read and write access, enabling administrators to make configuration changes, perform maintenance, or expand storage capacity without impacting availability. Manageability is also enhanced with predictive drive failure and advanced performance monitoring.
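To make the stateless model concrete, the sketch below builds (but does not send) a single HTTP request that carries everything the server needs: the resource identifier in the URL and the credentials in a header. The base URL, resource path, system identifier, and credentials are all hypothetical placeholders chosen for illustration; they are not taken from SANtricity Web Services documentation.

```python
import base64
import urllib.request


def build_status_request(base_url, system_id, user, password):
    """Build a self-contained GET request; nothing is sent over the network."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # Hypothetical resource path, for illustration only.
    url = f"{base_url}/storage-systems/{system_id}"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="GET",
    )


req = build_status_request(
    "https://proxy.example.com/api", "array-01", "monitor", "secret"
)
print(req.get_method(), req.full_url)
```

Because no session state is kept between calls, the same request can be issued by any script or orchestration tool against any number of arrays, which is what makes this style a natural fit for policy-based automation.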

Recoverability

With open application frameworks, when a drive fails, a node is compromised and overall system performance is degraded. The system compensates by making additional copies of data on other nodes that were resident on the failed drive, requiring both compute and networking resources, which are then no longer available to fulfill the primary functions of the architecture. In extreme cases, such as the default Hadoop configuration, disk failures can cause the node to be blacklisted. The Hadoop administrator must take the node offline, service and replace the failed drive, and then redeploy. This process can take several hours to complete.
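The re-replication work triggered by a failure can be sketched as follows. This is a toy model under stated assumptions: a whole node is treated as lost (as when it is blacklisted), the target copy count is three, and the block and node names are hypothetical.

```python
def under_replicated(placement, failed_location, target=3):
    """Return {block_id: copies_to_recreate} after losing one location."""
    todo = {}
    for block_id, locations in placement.items():
        survivors = [loc for loc in locations if loc != failed_location]
        missing = target - len(survivors)
        if missing > 0:
            todo[block_id] = missing
    return todo


placement = {
    "blk-1": ["node-a", "node-b", "node-c"],
    "blk-2": ["node-b", "node-c", "node-d"],
    "blk-3": ["node-a", "node-c", "node-d"],
}

todo = under_replicated(placement, failed_location="node-a")
print(todo)
```

Every entry in the result is a block that must be read from a surviving node and copied across the network to another one, consuming compute and network resources until the target copy count is restored.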

During ESG Lab testing of NetApp E-Series with Hadoop, recovery after an internal drive failure took more than twice as long (225%) as recovery after a failure of a hardware RAID-protected NetApp E-Series drive.5 The internal drive failure took longer to recover from because the data node was blacklisted and jobs had to be restarted on surviving nodes. The data node was not blacklisted during simulated drive failures of the NetApp E-Series because a drive failure within a RAID-protected NetApp E-Series array is totally transparent to open source cluster management software.

The recoverability of applications running on open application frameworks can also be improved with Dynamic Disk Pooling (DDP), a powerful patent-pending feature supported on all NetApp E-Series models. DDP is a data protection scheme that was designed to provide greater simplicity, flexibility, and availability than traditional RAID. DDP distributes data, parity information, and spare capacity across a pool of drives, defining which drives are used for segment placement to ensure data is fully protected. DDP was designed to maintain high performance while recovering from a drive failure and to rebuild more quickly. ESG Lab testing confirmed that DDP performs virtually the same as traditional RAID during normal operation and provides much faster rebuilds.6
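
To make the idea of spreading data, parity, and spare capacity across a pool more concrete, the sketch below places each stripe on a pseudo-randomly chosen subset of drives and then counts how many surviving drives hold pieces of the stripes affected by a single failure. This is a toy model for illustration only. It is not NetApp's DDP placement algorithm, and the pool size, stripe width, stripe count, and function names are all assumptions.

```python
import random


def place_stripes(pool_size, stripe_width, stripe_count, seed=0):
    """Assign each stripe to `stripe_width` distinct drives from the pool.

    Toy placement: a seeded pseudo-random choice per stripe (an assumption,
    not the vendor's algorithm).
    """
    if stripe_width > pool_size:
        raise ValueError("stripe_width cannot exceed pool_size")
    rng = random.Random(seed)
    drives = range(pool_size)
    return [rng.sample(drives, stripe_width) for _ in range(stripe_count)]


def rebuild_sources(layout, failed_drive):
    """Return the surviving drives that hold pieces of affected stripes."""
    sources = set()
    for stripe in layout:
        if failed_drive in stripe:
            sources.update(d for d in stripe if d != failed_drive)
    return sources


# Hypothetical pool: 24 drives, stripes 10 drives wide, 1,000 stripes.
layout = place_stripes(pool_size=24, stripe_width=10, stripe_count=1000)
sources = rebuild_sources(layout, failed_drive=0)
print(f"Surviving drives contributing to the rebuild: {len(sources)} of 23")
```

In this toy model, the data needed to rebuild the failed drive is read from across the surviving pool rather than from a single fixed RAID group, which is the intuition behind the faster rebuilds described above.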

Online Upgrades

When using internal drives, every expansion of storage capacity is paired with an additional server, imposing cost, time, space, power, cooling, and management penalties. The entire node must be configured, integrated into the environment, and brought online before the storage can be used. By contrast, the NetApp E-Series is highly expandable, with a single system supporting up to 384 drives. Drives can be added to the direct-attached storage array as more capacity is required, without the need to provision an additional server, which reduces the cost of scaling storage. New storage is immediately available for use, simplifying and speeding up the scaling of the entire environment, independent of compute needs.
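
The difference between the two expansion models can be sketched with simple arithmetic. The example below compares how many servers must be added to house a given number of new drives internally with how many storage systems are needed to hold the same drives. The 384-drive limit comes from the text above. The function names, the 120-drive expansion, and the 12 drive bays per server are hypothetical values chosen for illustration.

```python
import math

MAX_DRIVES_PER_SYSTEM = 384  # per-system drive limit stated in the text


def servers_needed_internal(drives_needed, drives_per_server):
    """Servers that must be added when capacity comes only from internal drives."""
    if drives_per_server <= 0:
        raise ValueError("drives_per_server must be positive")
    return math.ceil(drives_needed / drives_per_server)


def storage_systems_needed(drives_needed, max_drives=MAX_DRIVES_PER_SYSTEM):
    """Storage systems needed to hold the same number of drives."""
    if max_drives <= 0:
        raise ValueError("max_drives must be positive")
    return math.ceil(drives_needed / max_drives)


# Hypothetical expansion: 120 more drives, 12 internal drive bays per server.
print("Servers to add (internal drives):", servers_needed_internal(120, 12))
print("Storage systems required:", storage_systems_needed(120))
```

With these assumed inputs, the internal-drive approach requires ten additional servers, while the same drives fit within a single storage system. If an existing system has free drive slots, no new enclosure is needed at all.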

Encryption at Rest

NetApp E-Series storage systems support advanced data security features including media erasure and full hardware disk encryption. SANtricity not only configures such features, but it also manages encryption keys for each disk in the entire pool of NetApp storage systems. This also helps address the CIO concerns about data security.

5 Source: Ibid.
6 60% faster during ESG Lab testing. Source: ESG Lab Validation Report, NetApp E-Series, May 2014.

Architectural Benefits

Architecting the optimum open source application environment is simplified by using NetApp E-Series direct-attached storage systems in place of internal disk drives. System architecture is identical, using the same SAS interconnections between servers and storage. Software views the E-Series direct-attached storage systems as internal storage, so no software or configuration changes are required. This plug-and-play compatibility enables system architects to leverage the proven reliability, availability, serviceability, and manageability of NetApp E-Series to deliver the benefits of shared Enterprise DAS without compromising the architectural spirit and intent of an open application framework.

Business Benefits

Business stakeholders, from the boardroom to line-of-business owners, data scientists, and analysts, demand effective IT solutions to their problems. Business requirements are specified in SLAs, defining functionality, performance, reliability, availability, and security. NetApp E-Series direct-attached storage has proven reliability, availability, and security, and it increases the performance of open application frameworks. Expert service and support teams back E-Series’ proven reliability, availability, serviceability, and manageability. Optimizing open source application environments with NetApp E-Series storage empowers CIOs and their teams to meet and exceed SLAs while they reduce the costs and risks associated with moving from POC to production.

The Bigger Truth

Big data and the software-defined data center are not just fad-of-the-day buzzwords. Early adopters are deriving significant, real value from these new technologies. As the visibility of successes increases, more organizations are starting their own proof of concept programs, most often with open source application frameworks. Thus, at this point, data analytics and data center modernization are just behind information security as the most important CIO whiteboard initiatives.

Once features and functionality have been validated, IT organizations must figure out how to transition proof of concept experiments into large-scale production environments that meet the needs of many different business stakeholders. ESG research reveals that TCO is the criterion surveyed organizations most often cite as important when selecting a technology/vendor solution, followed closely by initial acquisition costs.7 Many other important factors affect system architecture and purchasing decisions as well, including improving information security and risk reduction.

Open source application frameworks for big data, such as Hadoop and NoSQL, and frameworks to implement the software-defined data center, such as OpenStack and Ceph, rely on a common architecture of clusters of servers with internal storage. This architecture leverages the economies of scale of commodity IT hardware to provide complex functionality with low initial acquisition costs and rapid deployment. These frameworks use software to cluster industry-standard servers with internal storage into a unified environment. Clustered systems are very easy to scale, and provide reliability and resiliency through redundancy.

It is a sad fact that disk drives are fragile mechanical devices that fail more often than most other IT components. Open application frameworks accommodate disk failures by replicating multiple copies of data over the network between nodes in the cluster. While the use of internal drives is great for a proof of concept, it introduces performance overhead, costs, and risk when moving to wide-scale production.

The ESG analysis and hands-on testing presented in this report quantify the tangible benefits that you and your organization can achieve with a distributed open application framework (e.g., Hadoop, NoSQL, OpenStack, and Ceph) that leverages purpose-built direct-attached NetApp E-Series storage as you move from proof of concept to production. If you’re a line-of-business manager who is looking to accelerate insight and profitability for the business, an IT architect looking to reap the benefits of an open framework with better performance and less risk, or an IT administrator who is looking to save time and money, ESG recommends that you consider NetApp E-Series storage for your next distributed open application project.

7 Source: ESG Research Report, 2015 IT Spending Intentions Survey, February 2015.

20 Asylum Street | Milford, MA 01757 | Tel: 508.482.0188 Fax: 508.482.0218 | www.esg-global.com