How to Automate Offloading ETL Processes to Hadoop


Upload: driven-inc

Post on 11-Apr-2017

TRANSCRIPT

Page 1: How to Automate Offloading ETL Processes to Hadoop

Confidential

OPERATIONAL EXCELLENCE FOR BIG DATA APPS

Page 2: How to Automate Offloading ETL Processes to Hadoop

TRUSTED by over 10,000 companies as their big data app platform

BACKED by top Silicon Valley investors True Ventures, Rembrandt VP, Bain Capital

FOUNDED in 2008, with headquarters in San Francisco

Page 3: How to Automate Offloading ETL Processes to Hadoop

PERFORMANCE MANAGEMENT FOR BIG DATA APPLICATIONS

MONITOR your big data apps

COLLABORATE to resolve issues faster

MANAGE big data apps more effectively

Page 4: How to Automate Offloading ETL Processes to Hadoop

WE ARE THE DEVELOPERS BEHIND CASCADING

Java, Scala (Scalding), SQL

SIMPLE: Ensure best practices at any scale thanks to easy-to-learn design principles

FLEXIBLE: Leverage existing Java, Scala, and SQL skills and easily adapt to new systems

RELIABLE: Always get optimal performance and reliability for big data applications

Page 5: How to Automate Offloading ETL Processes to Hadoop

MIGRATING TO HADOOP FOR ETL AT ENTERPRISE SCALE

• Use Hadoop for ETL / ELT
• Ensure quality and manageability of our ETL / ELT applications
• Translate existing ETL work to Hadoop
• GUI ETL tool for developers that don’t know Java, Scala, SQL

Cascading

Driven

?

?

Page 6: How to Automate Offloading ETL Processes to Hadoop

TODAY’S SPEAKERS

Shahab Kamal, Vice President at BitWise Inc.

Shahab is responsible for strategy, growth and client relations. Shahab works with client executives on IT Strategy for Business Intelligence, Big Data, Data Warehousing and Enterprise Applications. Shahab has worked at Ford Motors, Aon Hewitt and Tribune Company on their PeopleSoft ERP implementation and support. His expertise has been around retrofitting data from legacy applications without loss of data integrity.

Mark Castillo, Driven, Inc.

Mark is a Solutions Architect with 15+ years of software engineering background. He has worked in the finance, security, healthcare, streaming music, marketing, and social networking industries. His technical knowledge and skills are focused on distributed systems, data processing, networking, Linux appliances and Big Data.

Page 7: How to Automate Offloading ETL Processes to Hadoop

Data Migration – Seamless Transition to Hadoop

Shahab Kamal & Mark Castillo

Page 8: How to Automate Offloading ETL Processes to Hadoop

About Bitwise

Founded: in 1996 with HQ in Chicago, IL

Located: in offices in India & Australia

ISO 9001:2008 & ISO 27001:2005 Certified

Backed: by Fortune 500 customers

Proprietary Technology: suite of Accelerators that reduce the expense, time and complexity of large-scale data projects.

Page 9: How to Automate Offloading ETL Processes to Hadoop

[Diagram labels: Data Lake; Stage, Transform, Archive; Data Mart; Reporting, Data Mining; Reporting, Mining, Analytics; Exploratory Discovery; Search; Analytics]

Page 10: How to Automate Offloading ETL Processes to Hadoop

Bitwise Migration Solution Approach

Phases: Inventory, Deep Dive, Migration Design, Migration, Validation

1. Assessment Automation: ~60% Effort Saving
2. Migration Automation: ~70% Effort Saving
3. Test Automation: ~30% Effort Saving

Page 11: How to Automate Offloading ETL Processes to Hadoop

Big Data Processing Platform

COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other Custom

CASCADING: Enterprise Data Application

BitWise Big Data Processing Platform:
• ELT Development (Development)
• ETL Migration (Migration Engine)
• QualiDI (Testing)
• Data Quality Framework (Checks & Balances)

Page 12: How to Automate Offloading ETL Processes to Hadoop

Case Study

[Diagram labels: Recovery Application Data Sources; RDBMS; Analytics; Reporting; Automated ETL Migration; ETL Testing; Data Quality Monitoring; ETL Application (Developer UI, XML Custom Code, Execution Service, Cascading Framework)]

On Execution: Generate Cascading Flow, Launch MapReduce Jobs

Page 13: How to Automate Offloading ETL Processes to Hadoop

Bitwise ELT Tool Architecture

ETL Migration, QualiDI, Data Quality Framework

Development Environment: Developer UI, XML Custom Code, Execution Service, Cascading Framework
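
The architecture above lists a Developer UI that produces XML, an Execution Service, and the Cascading framework. As a rough illustration only, and not Bitwise's actual format or code, the sketch below shows how an execution service might turn an XML job definition into an ordered chain of steps. The XML tags, attribute names, and function names are all hypothetical, and plain Python lists stand in for Cascading flows and Hadoop jobs.

```python
import xml.etree.ElementTree as ET

# Hypothetical job definition, as a GUI might emit it. Tag and attribute
# names are invented for this sketch.
JOB_XML = """
<job name="recovery_load">
  <source id="accounts"/>
  <filter field="status" equals="open"/>
  <project fields="id,balance"/>
  <sink id="mart"/>
</job>
"""


def build_steps(xml_text):
    """Translate the XML job definition into an ordered list of row-level steps."""
    steps = []
    for node in ET.fromstring(xml_text):
        if node.tag == "filter":
            field, value = node.get("field"), node.get("equals")
            # Keep only rows whose field matches the configured value.
            steps.append(lambda rows, f=field, v=value: [r for r in rows if r[f] == v])
        elif node.tag == "project":
            keep = node.get("fields").split(",")
            # Keep only the listed columns.
            steps.append(lambda rows, k=keep: [{c: r[c] for c in k} for r in rows])
        # <source> and <sink> are ignored here; a real service would bind
        # them to storage locations.
    return steps


def run_job(xml_text, rows):
    """Apply each generated step in order, standing in for a launched flow."""
    for step in build_steps(xml_text):
        rows = step(rows)
    return rows


rows = [
    {"id": 1, "status": "open", "balance": 250},
    {"id": 2, "status": "closed", "balance": 0},
    {"id": 3, "status": "open", "balance": 75},
]
result = run_job(JOB_XML, rows)
print(result)
```

Keeping the job definition declarative means the same XML could, in principle, be planned onto different computation fabrics without changing what the developer drew in the UI.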

Page 14: How to Automate Offloading ETL Processes to Hadoop

Key Features

EASY: Increases ETL developer productivity on Hadoop by up to 50%

EFFECTIVE: Ports majority of existing ETL processes to Hadoop with little to no changes

ECONOMICAL: Optimizes ETL performance by choosing the right computation fabric

OPERATIONAL VISIBILITY: Views ETL processes in real-time for service level management

Page 15: How to Automate Offloading ETL Processes to Hadoop

Benefits of Bitwise Migration Solution

SAVES TIME: Up to 60% Reduction during Assessment Phase with Dark Data Discovery Framework

ECONOMICAL: Up to 70% Touch-Free Migration

INCREASES PRODUCTIVITY: Up to 40% Increase in Developer Productivity

QUICKER VALIDATION: Up to 30% Effort Savings in Data Validation

SAVES EFFORT: Up to 75%–90% Effort Saved for Test Compliance Reports

Page 16: How to Automate Offloading ETL Processes to Hadoop

AxesUI

Page 17: How to Automate Offloading ETL Processes to Hadoop

AxesUI

Page 18: How to Automate Offloading ETL Processes to Hadoop

AxesUI

Page 19: How to Automate Offloading ETL Processes to Hadoop

AxesUI

Page 20: How to Automate Offloading ETL Processes to Hadoop

AxesUI

Page 21: How to Automate Offloading ETL Processes to Hadoop

Accelero Demo & UI

Page 22: How to Automate Offloading ETL Processes to Hadoop

Concurrent – Cascading and Driven

COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other Custom

CASCADING: Enterprise Data Application

Page 23: How to Automate Offloading ETL Processes to Hadoop

Bitwise helped a large Fortune 500 company save millions of dollars and an estimated 30-50% of ETL development time by using the Bitwise proprietary ETL migration accelerator to offload from a costly legacy platform to Hadoop. It began when the client expressed interest in moving to Hadoop/Big Data by migrating their existing Recovery Ab Initio ETLs. Bitwise came up with a phased approach to prove, validate and convert the existing ETLs.

Taking the partnership further, Bitwise proposed, as a next step, a GUI for the ELT tool to act as an Eclipse-based developer IDE.

Stage 1: Proofing the Technology Stack

Stage 2: Validation of the Bitwise Hadoop ELT Stack

Stage 3: ETL Migration using Accelero Conversion Engine

Stage 4: Partnership in Accelero Development Engine

Stage 5: Partnership in Accelero GUI Development

Page 24: How to Automate Offloading ETL Processes to Hadoop

Bitwise has been working with a Fortune 500 company to move data from a data lake to Hadoop and to identify risks that need to be addressed. The primary focus is on developing templates and a framework for data ingestion and for testing after the data is transferred, and on building reports on the offloaded data.

Previously, Bitwise helped the client with Data Integration migration by using Accelero, the Bitwise proprietary Data Integration migration accelerator, to offload from a costly legacy platform to Hadoop, saving 30-50% of the time in ETL migration.

Stage 1: Data Ingestion into Hadoop – Proof of Concept

Stage 2: Taking the entire Proof of Concept ahead – data lake moving to Hive

Stage 3: Build optimized reports running off the offloaded data on Hadoop

Stage 4: Conversion of proprietary ETL to Accelero ELT using Cascading and Driven

Page 25: How to Automate Offloading ETL Processes to Hadoop

Thank You

Page 26: How to Automate Offloading ETL Processes to Hadoop

ADDITIONAL RESOURCES

• Bitwise website: http://www.bitwiseglobal.com/
• Driven website: http://www.driven.io/
• Speakers contact information:
- Bob Taylor: [email protected]
- Shahab Kamal: [email protected]
- Mark Castillo: [email protected]