Hadoop in a Relational Data Warehouse, Expedia

HADOOP IN A RELATIONAL DATA WAREHOUSE Data and Analytics/Enterprise DW, Expedia June 2013 Arek Kaczmarek

Upload: innovation-enterprise

Post on 22-Apr-2015

DESCRIPTION

Big Data Toronto 2013

TRANSCRIPT

  • 1. HADOOP IN A RELATIONAL DATA WAREHOUSE Data and Analytics/Enterprise DW, Expedia June 2013 Arek Kaczmarek
  • 2. Background: Expedia; Site; Competitors; DW; Legacy EDW; DNA; Hadoop at Expedia; Original purpose; Early expectations
  • 3. A case study: Project objective; Datasets; Competitive shopping comparisons; Properties; Bookings; Clickstream demand; Forecast
  • 4. DW architecture: what's different? Normalized vs. denormalized tables; Does it matter?; Performance; Ingestion speed; Analytical flexibility
  • 5. DEV work: do you need different skills? Data files: csv, tsv, txt or xml, which work best? Hive: HQL; UDFs for analytic functions: do you need them? Optimization: reuse your knowledge? Architecture (temp tables, partitions); HQL (set parameters); Load_tags: partitioning, appending, syncing
  • 6. RDBMSes and Hadoop: what's their relationship? Syncing from DB2; Exporting into HBase; Importing from SQLServer; Exporting into SQLServer; Exporting into DB2
  • 7. Place of Hadoop in a Relational Data Warehouse? Conflicting; Mutually exclusive; Coexisting; Complementing
  • 8. What's the new Data Warehouse for data and analytics? Complementing: Polyglot Persistence
  • 9. Questions? [email protected]