Big Data Goes Airborne

Upload: syncsort

Post on 21-Jun-2015

Category: Software


DESCRIPTION

Learn about the only solution to instantly provision a full-featured ETL environment running on AWS for less than your Sunday newspaper!

TRANSCRIPT

Page 1: Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster & Amazon Web Services

Big Data Goes Airborne

Page 2

Big Data Goes Airborne

Jorge A. Lopez, Director Product Marketing, Syncsort

Chris Keyser, Partner Solution Architect, Amazon Web Services

Page 3

Agenda

1. The Cloud as a Data Platform

2. Addressing Data Processing Challenges with Ironcluster & AWS

3. DEMO

4. Closing Comments + Q&A

Page 4

Why Are Customers Adopting Cloud and AWS?

1. Cost savings through economies of scale

2. Trade capital expense for variable expense

3. Don’t have to guess on capacity

4. Agility, Speed to market & Flexibility

5. Global in minutes

6. Security and Compliance

Page 5

AWS Global Infrastructure

10 Regions

26 Availability Zones

51 Edge Locations

Page 6

The Good News Is that Cloud Isn’t an ‘All or Nothing’ Choice

On-Premises Resources

Cloud Resources

Integration

Corporate Data Centers

Page 7

Integrating Your On-Premises, AWS and SaaS Infrastructure

Applications on premise

App Migration/Archiving

Hybrid Data Warehouse / BI

Active Directory

Network Configuration

Corporate Data Centers

Users & Access Rules (IAM)

Your Private Network (VPC)

Your On-Premises Data Center

AWS Direct Connect

Your Cloud Data Center

Applications on AWS

Data Warehouse/BI

Managed Databases

Page 8

AWS Provides Broad and Deep Services

Global Infrastructure: Regions, Availability Zones, Content Delivery POPs

Compute: EC2, Auto Scaling, Elastic Load Balancer

Storage: S3, EBS, Glacier, Storage Gateway, Import/Export

Databases: RDS (MySQL, PostgreSQL, Oracle, SQL Server), DynamoDB, ElastiCache

Networking: VPC, Direct Connect, Route 53

Analytics: EMR, Redshift, Data Pipeline, Kinesis

Application Services: SWF, SNS, SQS, CloudSearch, SES, AppStream, CloudFront

WorkSpaces

Management & Administration: IAM, CloudWatch, CloudTrail, Cloud HSM, Management Console, APIs and SDKs, Command Line Interface

Containers & Deployment: Elastic Beanstalk (for Java, Node.js, Python, Ruby, PHP and .Net), OpsWorks, CloudFormation

Ecosystem: Technology Partners, Consulting Partners, AWS Marketplace

Support, Certification, Training, Professional Services

Page 9

Amazon EC2 - Broad Selection of Compute Instance Families

M3 (General purpose): m3.2xlarge, 8 vCPU, 30 GB RAM, 160 GB SSD

C3 (Compute optimized): c3.8xlarge, 32 vCPU, 60 GB RAM, 720 GB SSD

R3 (Memory optimized): r3.8xlarge, 32 vCPU, 244 GB RAM, 720 GB SSD

I2 (Storage and IO optimized): i2.8xlarge, 32 vCPU, 244 GB RAM, 6.4 TB SSD

HS1 (Storage and IO optimized): hs1.8xlarge, 16 vCPU, 117 GB RAM, 48 TB HDD

G2 (GPU enabled): g2.2xlarge, 8 vCPU, 15 GB RAM, 1536 CUDA cores, 4 GB Video RAM

Page 10

AWS as a Data Platform

Instance Storage: EC2, EBS

SQL Stores: Redshift, RDS

Hadoop: EMR

NoSQL: DynamoDB

Stream: Kinesis

Search: CloudSearch

Storage Services: S3, CloudFront, Glacier

DBA

Data

Velocity: Millisecond, Second, Minute, Hour, Day

Variety: Structured, Unstructured, Text, Binary

Volume: Gigabytes, Terabytes, Petabytes

Page 11

Master instance group

Task instance group

Core instance group

HDFS HDFS

Amazon S3

Amazon Redshift

Amazon DynamoDB

Amazon EMR - Hadoop Tuned for AWS

Page 12

Amazon Redshift - Petabyte Scale Data Warehouse

Leader Node
– SQL endpoint
– Stores metadata
– Coordinates query execution

Compute Nodes
– Local, columnar storage
– Execute queries in parallel
– Backup and restore via S3
– Parallel load from S3, EMR, or DynamoDB

HW optimized for data processing
– DW1: 2TB – 1.6PB Magnetic
– DW2: 160GB – 256TB SSD

[Diagram: SQL Clients/BI Tools connect over JDBC/ODBC to the Leader Node; the Leader Node and the Compute Nodes are linked by 10 GigE (HPC); Ingestion, Backup and Restore run against Amazon S3 / DynamoDB / SSH. Each node is labeled 128GB RAM, 16TB disk, 16 cores.]
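As a rough illustration of the capacity figures on this slide, the sketch below estimates how many compute nodes a given data volume would need, assuming the 16TB of disk per compute node shown in the diagram is fully usable. The function name and the plain ceiling division are assumptions made for illustration only; real sizing would also depend on compression, replication and free-space headroom, none of which the slide quantifies.

```python
import math

# Disk per compute node as shown on the slide (16TB). Treating all of it as
# usable capacity is a simplifying assumption, not a documented sizing rule.
NODE_DISK_TB = 16


def compute_nodes_needed(data_tb: float, node_disk_tb: float = NODE_DISK_TB) -> int:
    """Return the minimum number of nodes whose combined disk can hold data_tb."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    return math.ceil(data_tb / node_disk_tb)


if __name__ == "__main__":
    # 1600 TB is the 1.6PB upper bound quoted for DW1 on the slide.
    for volume_tb in (2, 100, 1600):
        print(f"{volume_tb} TB -> {compute_nodes_needed(volume_tb)} node(s)")
```

Under this assumption, the 1.6PB upper bound quoted for DW1 corresponds to 100 nodes of 16TB each.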

Page 13

The Data Processing Challenge

Page 14

Innovative Cloud Solutions

Ironcluster ETL, Amazon EC2 Edition

COLLECT, PROCESS & DISTRIBUTE DATA AT DISRUPTIVE SCALE & COST

• Blazingly FAST, infinitely SCALABLE
• EASY to use graphical user interface
• Self-tuning engine for SMART data integration
• The capacity you need, when YOU need it
• Instantly provision with single-click access

Ironcluster Hadoop ETL for Amazon EMR

Now FREE in the AWS Marketplace!

Only pure-play ETL app available on the AWS Marketplace

Page 15

Ironcluster – Enterprise-grade ETL in 3 Easy Steps

1. Go to AWS Marketplace & Select Your Ironcluster Instance

2. Spin up Ironcluster & Start Developing

3. Done? Spin Down Ironcluster

Page 16

Got Big Data? – Enter Hadoop with Ironcluster Hadoop ETL

Get Your Hadoop Cluster
• Procure
• Setup
• Configure
• Deploy

Now… How do I get productive quickly?
• Many use cases (Where do I start?)
• Disparate tools (or BYOL)
• Lots of manual coding
• Expensive, hard-to-find skills

Outcomes: High Costs + Slow Results

Page 17

Got Big Data? – Enter Hadoop with Ironcluster Hadoop ETL

Get Your Hadoop Cluster
• Procure
• Setup
• Configure
• Deploy

Now… How do I get productive quickly?
• Many use cases (Where do I start?)
• Disparate tools (or BYOL)
• Lots of manual coding
• Expensive, hard-to-find skills

Outcomes: High Costs + Slow Results

Vs.

Get Your Hadoop Cluster

Now … Get right to work!

Fully Productive in Days + No Brainer Cost

Page 18

Syncsort Ironcluster: Hadoop ETL for Amazon EMR

Blazingly Fast, Easy to Use Hadoop ETL on Amazon EMR

• Develop MapReduce ETL jobs graphically
• Create sophisticated data flows in no time, with a library of Use Case Accelerators
• Avoid the coding nightmare without compromising on performance
• Develop once, reuse many times
• Leverage all your data, including Amazon Redshift & S3 sources/targets
• Scale infinitely with a disruptively low, “no brainer” price

It’s FREE!!

Page 19

It’s All About Discovering New Insights

An End-to-End Approach to Data Processing & Visualization

Create data extracts in seconds with just a click in Ironcluster!

Access your data from virtually any source including Social, Redshift, S3, XML, and more

Vast Variety of Data Sources

Process w/ Ironcluster in AWS
• Fastest & lightweight run-time ETL engine
• Deploy with or without Hadoop
• Comprehensive library of transformations

Ironcluster Tableau Connector

TDEs at blazing speed
• Directly create TDE files or objects to load Tableau
• Cut latency
• No pre-requisite software to install

Visualize w/ Tableau
• Combined power of Hadoop & AWS
• Faster queries
• All enterprise data
• Advanced analytics

Page 20

Lower Your Cost & Optimize Cloud Computing on Any AWS Platform

Redshift: Transform data, then load to Redshift for reporting and advanced analytics

S3: Stream log data from S3, aggregate for insight into web user behavior, stream back to S3

RDS: Translate data from MySQL, Oracle, Microsoft SQL Server, or PostgreSQL

DynamoDB: Join large data volumes & load to DynamoDB for mobile, gaming and ad apps
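The S3 use case above (aggregate log data for insight into web user behavior) can be illustrated with a small sketch. This is not Ironcluster code and it does not read from S3; it assumes a hypothetical log format of `timestamp user_id url` per line and simply counts page views per user with the Python standard library.

```python
from collections import Counter
from typing import Iterable


def page_views_per_user(log_lines: Iterable[str]) -> Counter:
    """Count page views per user from lines in the assumed 'timestamp user_id url' format."""
    views = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _timestamp, user_id, _url = parts
        views[user_id] += 1
    return views


if __name__ == "__main__":
    # Hypothetical sample data standing in for log lines streamed from S3.
    sample = [
        "2014-06-01T10:00:00 alice /home",
        "2014-06-01T10:00:05 bob /pricing",
        "2014-06-01T10:01:00 alice /products",
        "malformed line with too many fields",
    ]
    for user, count in page_views_per_user(sample).most_common():
        print(user, count)
```

In a real pipeline the aggregated counts would be written back to S3 or loaded into a store such as Redshift; that step is left out here because the slides do not describe it.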

[Graphic: Throughput, Speed & Efficiency; 75% Processing Time & Cost*]

*Users of the new Ironcluster ETL for EC2 can experience up to a 75% reduction in processing time and total cost of ownership when compared to legacy ETL approaches and tools. Based on Syncsort benchmarking and POCs.

Page 21

The Possibilities Are Endless

Sort & aggregate massive data volumes generated by mobile devices to improve customer satisfaction

Develop & run complex market risk models on big datasets with Ironcluster in Amazon EMR

Leverage Use Case Accelerators to quickly deploy click-stream and web log analysis applications in AWS

Pre-process PB of data from sensors and research new algorithms to support quality assurance

Page 22

Visit Us @ The Amazon Web Services Marketplace

Try Ironcluster ETL FREE for 30 Days! www.syncsort.com/IronclusterEC2

Got Big Data? Get Ironcluster Hadoop ETL for Amazon EMR FREE! www.syncsort.com/IronclusterEMR

Watch this Webcast On-Demand - Including a Product Demonstration! http://bit.ly/1zYh9er