How to Automate Offloading ETL Processes to Hadoop
TRANSCRIPT
Confidential
OPERATIONAL EXCELLENCE FOR BIG DATA APPS
TRUSTED by over 10,000 companies as their big data app platform
BACKED by top Silicon Valley investors True Ventures, Rembrandt VP, Bain Capital
FOUNDED in 2008, with headquarters in San Francisco
PERFORMANCE MANAGEMENT FOR BIG DATA APPLICATIONS
MONITOR your big data apps
COLLABORATE to resolve issues faster
MANAGE big data apps more effectively
WE ARE THE DEVELOPERS BEHIND CASCADING
Java, Scala (Scalding), SQL
SIMPLE: Ensure best practices at any scale thanks to easy-to-learn design principles
FLEXIBLE: Leverage existing Java, Scala, and SQL skills and easily adapt to new systems
RELIABLE: Always get optimal performance and reliability for big data applications
MIGRATING TO HADOOP FOR ETL AT ENTERPRISE SCALE
• Use Hadoop for ETL / ELT
• Ensure quality and manageability of our ETL / ELT applications
• Translate existing ETL work to Hadoop
• GUI ETL tool for developers that don’t know Java, Scala, SQL
Cascading
Driven
?
?
TODAY’S SPEAKERS
Shahab Kamal, Vice President at BitWise Inc.
Shahab is responsible for strategy, growth and client relations. Shahab works with client executives on IT Strategy for Business Intelligence, Big Data, Data Warehousing and Enterprise Applications. Shahab has worked at Ford Motors, Aon Hewitt and Tribune Company on their PeopleSoft ERP implementation and support. His expertise has been around retrofitting data from legacy applications without loss of data integrity.
Mark Castillo, Driven, Inc.
Mark is a Solutions Architect with 15+ years of software engineering background. He has worked in the finance, security, healthcare, streaming music, marketing, and social networking industries. His technical knowledge and skills are focused on distributed systems, data processing, networking, Linux appliances and Big Data.
Data Migration – Seamless Transition to Hadoop
Shahab Kamal & Mark Castillo
About Bitwise
Founded in 1996 with HQ in Chicago, IL
Located in offices in India & Australia
ISO 9001:2008 & ISO 27001:2005 Certified
Backed by Fortune 500 customers
Proprietary Technology: suite of Accelerators that reduce the expense, time and complexity of large-scale data projects.
Diagram labels: Data Lake (Stage, Transform, Archive); Data Mart; Reporting, Data Mining, Analytics, Exploratory Discovery, Search
Bitwise Migration Solution Approach
1. Assessment Automation (~60% Effort Saving): Inventory, Deep Dive, Migration Design
2. Migration Automation (~70% Effort Saving): Migration
3. Test Automation (~30% Effort Saving): Validation
Big Data Processing Platform
Enterprise Data Application
CASCADING
COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other, Custom
BitWise Big Data Processing Platform
ELT Development: Development
ETL Migration: Migration Engine
QualiDI: Testing
Data Quality Framework: Checks & Balances
Case Study
Diagram labels: Recovery Application; Data Sources; RDBMS; ETL Application; Automated ETL Migration; Developer UI; XML; Custom Code; Execution Service; Cascading Framework; Data Quality Monitoring; ETL Testing; Analytics; Reporting
On Execution: Generate Cascading Flow, Launch MapReduce Jobs
Bitwise ELT Tool Architecture
Diagram labels: ETL Migration; QualiDI; Data Quality Framework
Development Environment: Developer UI, XML, Custom Code, Execution Service, Cascading Framework
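The architecture slides suggest that a job designed in the Developer UI is saved as XML and turned into a Cascading flow by the Execution Service at run time. The slides do not show the XML format or the service itself, so the sketch below is only an illustration of that idea in plain Python: the element names, attribute names, and the `plan_flow` helper are all hypothetical, and no Cascading API is used. It reads a job definition and orders the steps so that each one runs after its input, which is the kind of planning a flow generator would need before launching jobs.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical job definition, as a developer UI might save it.
# Element and attribute names are invented for this illustration.
JOB_XML = """
<job name="recovery_load">
  <step id="write_out" type="sink" input="filter_active" path="/data/out/active"/>
  <step id="filter_active" type="filter" input="read_accounts"/>
  <step id="read_accounts" type="source" path="/data/in/accounts"/>
</job>
"""


def plan_flow(job_xml):
    """Return step ids ordered so that every step follows its input."""
    root = ET.fromstring(job_xml.strip())
    deps = {}
    for step in root.findall("step"):
        upstream = step.get("input")  # None for source steps
        deps[step.get("id")] = {upstream} if upstream else set()
    # static_order() raises graphlib.CycleError for a circular definition.
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(plan_flow(JOB_XML))
    # prints ['read_accounts', 'filter_active', 'write_out']
```

In a real tool each ordered step would then be mapped to a source, operation, or sink in the target framework; that mapping is left out here because the slides do not describe it.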
Key Features
EASY: Increases ETL developer productivity on Hadoop by up to 50%
EFFECTIVE: Ports majority of existing ETL processes to Hadoop with little to no changes
ECONOMICAL: Optimizes ETL performance by choosing the right computation fabric
OPERATIONAL VISIBILITY: Views ETL processes in real-time for service level management
Benefits of Bitwise Migration Solution
SAVES TIME: Up to 60% Reduction during Assessment Phase with Dark Data Discovery Framework
ECONOMICAL: Up to 70% Touch-Free Migration
INCREASES PRODUCTIVITY: Up to 40% Increase in Developer Productivity
QUICKER VALIDATION: Up to 30% Effort Savings in Data Validation
SAVES EFFORT: Up to 75%–90% Effort Saved for Test Compliance Reports
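The slides do not say how data validation is automated after a migration. One common approach, shown here purely as an assumed illustration and not as the QualiDI method, is to compare a row count and an order-independent checksum between the legacy output and the migrated output. The function names `table_fingerprint` and `validate` are hypothetical.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum); the checksum ignores row order."""
    count = 0
    checksum = 0
    for row in rows:
        # Join fields with a separator unlikely to appear in the data.
        line = "\x1f".join(str(value) for value in row)
        digest = hashlib.sha256(line.encode("utf-8")).digest()
        # Summing (rather than XOR-ing) keeps duplicate rows from cancelling out.
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % 2**64
        count += 1
    return count, checksum


def validate(legacy_rows, migrated_rows):
    """Compare two row sets by count and by order-independent checksum."""
    legacy_count, legacy_sum = table_fingerprint(legacy_rows)
    migrated_count, migrated_sum = table_fingerprint(migrated_rows)
    return {
        "row_count_match": legacy_count == migrated_count,
        "checksum_match": legacy_sum == migrated_sum,
    }


if __name__ == "__main__":
    legacy = [(1, "alice", 10.0), (2, "bob", 20.5)]
    migrated = [(2, "bob", 20.5), (1, "alice", 10.0)]  # same rows, new order
    print(validate(legacy, migrated))
```

A check like this only flags that two outputs differ; locating the differing rows would need a keyed comparison on top of it.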
AxesUI
Accelero Demo & UI
Concurrent – Cascading and Driven
Enterprise Data Application
CASCADING
COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other, Custom
Bitwise helped a large Fortune 500 company save millions of dollars and an estimated 30-50% time in ETL development through utilization of the Bitwise proprietary ETL migration accelerator, offloading from a costly legacy platform to Hadoop. It began when the client expressed their interest in moving to Hadoop/Big Data by migrating their existing Recovery Ab Initio ETLs. Bitwise came up with a phased approach to Prove, Validate and Convert the existing ETLs.
Taking the partnership further, Bitwise proposed, as a next step, a GUI for the ELT tool to act as a developer IDE based on Eclipse.
Stage 1: Proofing the Technology Stack
Stage 2: Validation of the Bitwise Hadoop ELT Stack
Stage 3: ETL Migration using Accelero Conversion Engine
Stage 4: Partnership in Accelero Development Engine
Stage 5: Partnership in Accelero GUI Development
Bitwise has been working with a Fortune 500 company to move data from a data lake to Hadoop and identify risks that need to be addressed. The primary focus is on developing templates and a framework for Data Ingestion and Testing after the data is transferred, and on building reports on the offloaded data.
Previously, Bitwise helped the client with Data Integration migration through utilization of Accelero, the Bitwise proprietary Data Integration migration accelerator, offloading from a costly legacy platform to Hadoop and saving 30-50% time in ETL migration.
Stage 1: Data Ingestion into Hadoop – Proof of Concept
Stage 2: Taking the entire Proof of Concept ahead – data lake moving to Hive
Stage 3: Build optimized reports running off the offloaded data on Hadoop
Stage 4: Conversion of proprietary ETL to Accelero ELT using Cascading and Driven
Thank You
ADDITIONAL RESOURCES
• Bitwise website: http://www.bitwiseglobal.com/
• Driven website: http://www.driven.io/
• Speakers contact information:
- Bob Taylor: [email protected]
- Shahab Kamal: [email protected]
- Mark Castillo: [email protected]