Rackspace Analytical Compute Grid (ACG)

Post on 22-May-2015

Category:

Technology

DESCRIPTION

Rackspace’s Enterprise Business Intelligence (EBI) group was looking for a cost-effective way to support the reporting and information needs of its internal users, who include business and operations personnel. The group also wanted to scale out new infrastructure to meet increasing business demands, house growing volumes of data, and customize data collection, while moving away from its legacy data warehouse solution. To do this, Rackspace built the Analytical Compute Grid (ACG) using Hadoop, Cassandra, and PostgreSQL on an OpenStack cloud. Read more about it in this presentation.

TRANSCRIPT

April 12, 2023

Analytical Compute Grid (ACG)

Elastic “Big Data” Infrastructure

by Natasha Gajic

Big Data on Open Cloud

RACKSPACE® HOSTING | WWW.RACKSPACE.COM

Rackspace’s EBI Environment

Current Environment

• Windows and Linux operating systems
• Oracle and Microsoft database solutions
• Microsoft and Oracle replication technology
• SSIS
• Informatica
• Dedicated servers
• Rapid data set growth

“Big Data” Problem

• Cost of purchasing additional licenses
• Time required to set up new hardware
• Increased demand for DBA resources
• System performance
• System scalability
• Capacity

Analytical Compute Grid (ACG) Features

• Host an ever-growing set of data
• Quick data collection and retrieval
• Rapid scalability
• Ease of maintenance
• Provide a standard data access API
• Ability to provide a variety of storage types:
  • Columnar
  • Relational
  • HDFS
• Enable users to select the optimal storage type for the information collected
• Leverage Rackspace® Private Cloud powered by OpenStack® and open source technology

Analytical Compute Grid (ACG) Quality Attributes

ACG on Rackspace® Private Cloud powered by OpenStack®

High Level Architecture

Image

Database Engine Selection

• Columnar – Cassandra
• Relational – PostgreSQL
• HDFS – Hadoop

Node

Controller

API

Indexing Structure

What is ACG Indexing Structure?

• System entry point

• Set of pointers ultimately addressing database entities

Where is Indexing Structure Located?

• It is part of ACG, so it resides on the Open Cloud
• The ACG Controller manages the Indexing Structure

What Does the ACG Indexing Structure Enable?

• Splitting of large data sets across many instances
• Query parallelization
• Controlled data store size
• Optimal data store configuration
• Uniform access to data residing in various storage types
• System scalability, as it expands horizontally and vertically to address an ever-growing data set
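One way to picture how an indexing structure could enable the data splitting and query parallelization listed above is sketched below. This is not ACG's actual implementation, which the slides do not show. The names `IndexEntry`, `IndexingStructure`, and `parallel_query`, and the in-memory `FAKE_STORES` dictionary standing in for per-node data stores, are all hypothetical. The index maps month buckets to node identifiers, and a query that touches several buckets fans out to the matching nodes in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical stand-in for per-node data stores: node id -> (bucket, value) rows.
FAKE_STORES = {
    "node-a": [("2013-01", 5), ("2013-01", 7)],
    "node-b": [("2013-02", 3)],
    "node-c": [("2013-03", 4), ("2013-03", 6)],
}


@dataclass(frozen=True)
class IndexEntry:
    bucket: str  # partition key, here a year-month string (an assumption)
    node: str    # pointer to the node that holds that bucket


class IndexingStructure:
    """Entry point that resolves partition keys to nodes (illustrative only)."""

    def __init__(self, entries):
        self._by_bucket = {e.bucket: e.node for e in entries}

    def nodes_for(self, buckets):
        # Only nodes that actually hold a requested bucket are contacted.
        return {b: self._by_bucket[b] for b in buckets if b in self._by_bucket}


def scan_node(node, bucket):
    # Stand-in for a real query against one node's data store.
    return [value for key, value in FAKE_STORES[node] if key == bucket]


def parallel_query(index, buckets):
    """Fan the scan out to every node the index points at, then merge."""
    targets = index.nodes_for(buckets)
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = [pool.submit(scan_node, node, bucket)
                   for bucket, node in targets.items()]
        results = []
        for future in futures:
            results.extend(future.result())
    return sorted(results)


index = IndexingStructure([
    IndexEntry("2013-01", "node-a"),
    IndexEntry("2013-02", "node-b"),
    IndexEntry("2013-03", "node-c"),
])
```

Because each bucket lives on one node, the size of every individual data store stays bounded, and adding a node only requires adding index entries that point at it.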

Quality Attributes

Quality Attributes - Performance

Rackspace® Private Cloud powered by OpenStack®

• Creates an ACG node in 30 seconds
• Creates ACG nodes concurrently
• Re-sizes ACG nodes by adding CPUs

ACG

• Indexing structure and controlled data set size allow for:
  • Quick data distribution
  • Query parallelization

Quality Attributes – Availability

Rackspace® Private Cloud powered by OpenStack®

Rapidly replace failed ACG nodes

ACG

Deploys data store native availability mechanisms (replication, data distribution…)

Quality Attributes – Maintainability

Rackspace® Private Cloud powered by OpenStack®

• Adding ACG nodes expands:
  • Storage capacity
  • CPU power
  • Memory
• No DBA or system administrator activity required

ACG

• Controlled data set size enables:
  • Optimal and stable data store configuration
  • Reduced demand for managing data store objects
  • Stable query execution plans

Quality Attributes – Flexibility

ACG

• Variety of storage types:
  • Columnar – Cassandra: time series data
  • Relational – PostgreSQL: relational data
  • HDFS – Hadoop: unstructured data
• Ability to select the optimal storage type for each individual use case
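The pairing of data kinds with storage types on this slide can be written down as a small lookup. The sketch below is illustrative only: the `DataKind` enum, the `ENGINE_FOR` table, and the `select_engine` helper are hypothetical names, and the slides do not describe how ACG itself makes or records this choice.

```python
from enum import Enum


class DataKind(Enum):
    TIME_SERIES = "time series"
    RELATIONAL = "relational"
    UNSTRUCTURED = "unstructured"


# Mapping taken from the slide: data kind -> (storage type, engine).
ENGINE_FOR = {
    DataKind.TIME_SERIES: ("columnar", "Cassandra"),
    DataKind.RELATIONAL: ("relational", "PostgreSQL"),
    DataKind.UNSTRUCTURED: ("HDFS", "Hadoop"),
}


def select_engine(kind: DataKind) -> str:
    """Return a human-readable engine choice for one kind of data."""
    storage_type, engine = ENGINE_FOR[kind]
    return f"{engine} ({storage_type})"
```

For example, `select_engine(DataKind.TIME_SERIES)` yields the columnar option, which matches the choice made in the use case later in the deck.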

Quality Attributes – Usability

ACG

• Standard interfaces:
  • SQL language
  • JDBC API
  • ODBC
• ACG Management Console
• ACG Monitoring Console
• Loader utility implementing:
  • Bulk Loader
  • Insert Loader
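The slide names a Bulk Loader and an Insert Loader without describing them. A common reading, assumed here, is that an insert loader writes one row per statement while a bulk loader writes rows in batches. The sketch below illustrates that distinction only; it uses Python's built-in `sqlite3` module as a stand-in for an ACG data store, and the `metrics` table, function names, and batch size are hypothetical.

```python
import sqlite3

INSERT_SQL = "INSERT INTO metrics (device, ts, value) VALUES (?, ?, ?)"


def insert_loader(conn, rows):
    """Load rows one statement at a time, committing after each row."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def bulk_loader(conn, rows, batch_size=1000):
    """Load rows in batches, committing once per batch."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(INSERT_SQL, batch)
        conn.commit()


# sqlite3 in-memory database used purely as a stand-in data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (device TEXT, ts INTEGER, value REAL)")

rows = [(f"dev-{i % 3}", i, float(i)) for i in range(2500)]
insert_loader(conn, rows[:5])   # small trickle of individual rows
bulk_loader(conn, rows[5:])     # the remaining rows in batches
loaded = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
```

Row-at-a-time loading suits small, continuous feeds, while batching reduces per-statement overhead for large initial loads.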

Current State

ACG Controller

• ACG Manager
• Rule Engine
• Node Manager
• ACG Management Console
• ACG Monitoring

Columnar Implementation

• Data Store Controller
• JDBC extended to work with supercolumn
• Loader integrated with Informatica

Relational Implementation

• Data Store Controller
• JDBC driver extended with distributed query rewrite
• Loader integrated with Informatica
• ODBC (In Progress)
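The slides do not say how the extended JDBC driver rewrites queries. As a general illustration of why a rewrite is needed when data is split across nodes, consider an average: averaging per-node averages gives the wrong answer when nodes hold different row counts, so the query is typically rewritten to fetch a partial sum and count from each node and combine them. The sketch below assumes that approach; it uses in-memory `sqlite3` databases as stand-ins for PostgreSQL nodes, and the `uptime` table and all function names are hypothetical.

```python
import sqlite3


def make_shard(values):
    """Create an in-memory database standing in for one relational node."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uptime (pct REAL)")
    conn.executemany("INSERT INTO uptime (pct) VALUES (?)",
                     [(v,) for v in values])
    return conn


# The caller's logical query is: SELECT AVG(pct) FROM uptime
# Rewritten per-node query: fetch the partial sum and count instead.
SHARD_QUERY = "SELECT COALESCE(SUM(pct), 0.0), COUNT(pct) FROM uptime"


def distributed_avg(shards):
    """Combine per-node partial sums and counts into one global average."""
    total, count = 0.0, 0
    for conn in shards:
        shard_sum, shard_count = conn.execute(SHARD_QUERY).fetchone()
        total += shard_sum
        count += shard_count
    return total / count if count else None


shards = [make_shard([100.0, 90.0, 80.0]), make_shard([50.0])]
result = distributed_avg(shards)
```

With these sample values the combined average is 80.0, whereas naively averaging the two per-node averages (90.0 and 50.0) would give 70.0.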

HDFS Implementation

•Will start soon

Rackspace Use Case

• Subject:
  • Complex availability calculation sourcing 3 months of monitoring data and creating 1 billion records in the initial calculation

• Environment 1
  • Data warehouse on a Microsoft SQL Server database
  • SSIS data loading
  • SQL Server with 24 CPUs and 250 GB RAM was dedicated to the initial calculation
  • A SQL Server stored procedure performed the calculation
  • Source and result are stored in a traditional data warehouse structure

• Environment 2
  • ACG running two Cassandra clusters of 4 nodes each
  • Informatica with Cassandra bulk loader
  • Each ACG node has 2 CPUs and 8 GB RAM
  • Java program running on an instance with 4 CPUs and 8 GB RAM
  • Source and result are stored in a columnar structure suitable for time series data

Rackspace Use Case - Result

• Calculation duration
  • Microsoft SQL Server: 5 days
  • ACG: 3.5 hours
• Storage size
  • Microsoft SQL Server: 500 GB
  • ACG: 20 GB
• Complexity of the calculation
  • A columnar data store is optimal for time series data. Sourcing from the columnar data store resulted in a relatively simple Java calculation process compared to the SQL Server stored procedure.
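The figures reported for the two environments can be turned into ratios with a few lines of arithmetic. The sketch below uses only numbers from the slides, with two assumptions: "5 days" is treated as five full 24-hour days of elapsed time, and the ACG totals count only the eight Cassandra nodes and the Java instance (the Informatica host is not sized in the slides, so it is left out).

```python
HOURS_PER_DAY = 24

# Calculation duration (assumes "5 days" means 5 x 24 elapsed hours).
sql_server_hours = 5 * HOURS_PER_DAY
acg_hours = 3.5
speedup = sql_server_hours / acg_hours

# Storage size in GB, as reported on the slide.
sql_server_storage_gb = 500
acg_storage_gb = 20
storage_ratio = sql_server_storage_gb / acg_storage_gb

# Environment 2 totals: two 4-node clusters plus the Java instance.
acg_nodes = 2 * 4
acg_cpus = acg_nodes * 2 + 4      # 2 CPUs per node + 4 for the Java instance
acg_ram_gb = acg_nodes * 8 + 8    # 8 GB per node + 8 GB for the Java instance

print(f"speedup: {speedup:.1f}x")
print(f"storage reduction: {storage_ratio:.0f}x")
print(f"ACG CPUs: {acg_cpus} vs 24, RAM: {acg_ram_gb} GB vs 250 GB")
```

Under these assumptions the ACG run is roughly 34 times shorter and uses 25 times less storage, on 20 CPUs and 72 GB of RAM in total compared with the 24 CPUs and 250 GB dedicated to SQL Server.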

Rackspace Use Case - Conclusion

• Selecting the optimal data store for the use case resulted in:
  • Substantial performance improvement
  • Reduced storage demand
  • Simplified processes
  • Ability to process terabytes of data per day, close to real time and on demand
• Improved trending and reporting:
  • Enhanced support capabilities
  • Improved Rackspace customer experience
• Significant cost reduction

RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218

US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM

RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM

ACG UI
