Hadoop in a Relational Data Warehouse, Expedia
DESCRIPTION
Big Data Toronto 2013
TRANSCRIPT
- 1. HADOOP IN A RELATIONAL DATA WAREHOUSE. Data and Analytics/Enterprise DW, Expedia. June 2013. Arek Kaczmarek
- 2. Background. Expedia: Site, Competitors. DW: Legacy EDW, DNA. Hadoop at Expedia: Original Purpose, Early expectations
- 3. A case study. Project objective. Datasets: Competitive shopping comparisons; Properties; Bookings; Clickstream demand; Forecast
- 4. DW architecture: what's different? Normalized vs. denormalized tables. Does it matter? Performance; Ingestion speed; Analytical flexibility
- 5. DEV work: do you need different skills? Data files (csv, tsv, txt or xml): which work best? Hive: HQL; UDFs for analytic functions: do you need them? Optimization: reuse your knowledge? Architecture (temp tables, partitions); HQL (set parameters). Load_tags: partitioning, appending, syncing
- 6. RDBMSes and Hadoop: what's their relationship? Syncing from DB2; Exporting into HBase; Importing from SQL Server; Exporting into SQL Server; Exporting into DB2
- 7. Place of Hadoop in a Relational Data Warehouse? Conflicting; Mutually exclusive; Coexisting; Complementing
- 8. What's the new Data Warehouse for data and analytics? Complementing: Polyglot Persistence
- 9. Questions? [email protected]