Cloudera Impala Technical Deep Dive
Post on 06-May-2015
DESCRIPTION: Why Cloudera built Impala and how it achieves its blistering speed.
- 1. Practical Performance Analysis and Tuning for Cloudera Impala. Marcel Kornacker | email@example.com. 2013-11-11. Copyright 2013 Cloudera Inc. All rights reserved.
- 2. Just how fast is Impala? You might just be surprised!
- 3. How fast is Impala? The DeWitt Clause prohibits using the DBMS vendor's name!
- 4. Practical Performance: Running with Impala!
- 5. Pre-execution Checklist: data types, partitioning, file format.
- 6. Data Type Choices: Define integer columns as INT/BIGINT; operations on INT/BIGINT are more efficient than on STRING. Convert external data to good internal types on load, e.g. CAST date strings to TIMESTAMPs; this avoids expensive CASTs in queries later.
- 7. Partitioning: The fastest I/O is the one that never takes place. Understand your query filter predicates; for time-series data, this is usually the date/timestamp column. Use this column (or columns) as the partition key(s), and validate that queries leverage partition pruning using EXPLAIN. You can have too much of a good thing: a few thousand partitions per table is probably OK; tens of thousands of partitions is probably too much. Partitions/files should be no less than a few hundred MBs.
- 8. Partition Pruning in EXPLAIN: explain select count(*) from sales_fact where sold_date in ('2000-01-01','2000-01-02'); Note the partition count, and that the filter predicate is missing because pruning happens at parse time! Partition pruning (partitioned table): 0:SCAN HDFS table=grahn.sales_fact #partitions=2 size=803.15MB tuple ids: 0. Filter predicate (non-partitioned table): 0:SCAN HDFS table=grahn.sales_fact #partitions=1 size=799.42MB predicates: sold_date IN ('2000-01-01', '2000-01-02') tuple ids: 0.
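The data-type and partitioning advice above (slides 6-8) can be sketched in Impala SQL. This is a minimal sketch, not the deck's own code: the table and column names (sales_fact, staging_sales, item_sk, amount, sold_date) are illustrative, and the DDL assumes an Impala release of that era.

```sql
-- Partition the fact table on the column queries usually filter on
-- (for time-series data, the date column).
CREATE TABLE sales_fact (
  item_sk BIGINT,   -- prefer INT/BIGINT over STRING for keys
  amount  DOUBLE
)
PARTITIONED BY (sold_date STRING);

-- Convert external data to good internal types once, on load,
-- rather than paying for CASTs in every later query.
INSERT INTO sales_fact PARTITION (sold_date)
SELECT CAST(item_sk AS BIGINT), CAST(amount AS DOUBLE), sold_date
FROM staging_sales;

-- Validate pruning: the SCAN HDFS node should report only the
-- matching partitions (#partitions=2 here, not the whole table).
EXPLAIN SELECT COUNT(*) FROM sales_fact
WHERE sold_date IN ('2000-01-01', '2000-01-02');
```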
- 9. Why use Parquet, a Columnar Format for HDFS? Well-defined open format: http://parquet.io/. Works in Impala, Pig, Hive & Map/Reduce. I/O reduction by reading only the necessary columns. Columnar layout compresses/encodes better. Supports nested data by shredding columns, using techniques from Google's ColumnIO. Impala loads use Snappy compression by default; gzip is available: set PARQUET_COMPRESSION_CODEC=gzip; A quick word on Snappy vs. Gzip follows.
- 10. Parquet: A Motivating Example.
- 11. Quick Note on Compression: Snappy has faster compression/decompression speeds and uses fewer CPU cycles, but a lower compression ratio. Gzip/Zlib has slower compression/decompression speeds and uses more CPU cycles, but a higher compression ratio. It's all about trade-offs!
- 12. It's all about the compression codec: the Hortonworks graphic fails to call out that different codecs are used (Parquet compressed with Snappy, ORCFile compressed with Gzip). Gzip compresses better than Snappy, but we all knew that. Source: http://hortonworks.com/blog/orcle-in-hdp-2-better-compression-better-performance/
- 13. TPC-DS 500GB Scale Factor, % of original size (lower is better): with Snappy, Parquet 32.3% vs. ORCFile 34.8%; with Gzip, Parquet 26.9% vs. ORCFile 27.4%. Using the same compression codec produces comparable results.
- 14. Query Execution: What happens behind the scenes.
- 15. Left-Deep Join Tree: The largest* table should be listed first in the FROM clause; joins are done in the order tables are listed in the FROM clause. Filter early: most selective joins/tables first. v1.2.1 will do JOIN ordering. (Diagram: a left-deep tree joining R1, R2, R3, R4.)
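The Parquet defaults in slide 9 can be exercised directly. A hedged sketch (the table names are illustrative; PARQUET_COMPRESSION_CODEC is the query option named on the slide):

```sql
-- Impala writes Snappy-compressed Parquet by default; switch to gzip
-- for a higher compression ratio at the cost of more CPU cycles.
SET PARQUET_COMPRESSION_CODEC=gzip;

-- Rewrite an existing table into gzip-compressed Parquet.
CREATE TABLE sales_fact_parquet STORED AS PARQUET
AS SELECT * FROM sales_fact;

-- Restore the default for subsequent loads.
SET PARQUET_COMPRESSION_CODEC=snappy;
```

As slide 13 notes, the codec, not the file format, dominates the size difference, so compare formats only under the same codec.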
- 16. Two Types of Hash Joins: The default hash join type is BROADCAST (aka replicated): each node ends up with a copy of the right table(s)*; the left side is read locally and streamed through the local hash join(s). Best choice for a star join: a single large fact table with multiple small dims. The alternate hash join type is SHUFFLE (aka partitioned): the right side is hashed and shuffled so each node gets ~1/Nth of the data; the left side is hashed and shuffled, then streamed through the join. Best choice for large_table JOIN large_table. *Only available if ANALYZE was used to gather table/column stats.
- 17. How to use ANALYZE. Table stats (from the Hive shell): analyze table T1 [partition(partition_key)] compute statistics; Column stats (from the Hive shell): analyze table T1 [partition(partition_key)] compute statistics for columns c1, c2, ...; Impala 1.2.1 will have a built-in ANALYZE command!
- 18. Hinting Joins: select ... from large_fact join [broadcast] small_dim; select ... from large_fact join [shuffle] large_dim; *square brackets required.
- 19. Determining Join Type From EXPLAIN: explain select s_state, count(*) from store_sales join store on (ss_store_sk = s_store_sk) group by s_state; shows 2:HASH JOIN, join op: INNER JOIN (BROADCAST), hash predicates: ss_store_sk = s_store_sk, with a single EXCHANGE (4:EXCHANGE) feeding the build side and a local 0:SCAN HDFS of tpcds.store_sales. explain select c_preferred_cust_flag, count(*) from store_sales join customer on (ss_customer_sk = c_customer_sk) group by c_preferred_cust_flag; shows 2:HASH JOIN, join op: INNER JOIN (PARTITIONED), hash predicates: ss_customer_sk = c_customer_sk, with an EXCHANGE on both sides (5:EXCHANGE and 4:EXCHANGE).
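The stats-then-hint workflow of slides 16-18 can be sketched end to end. A hedged sketch using the deck's TPC-DS table names; the column list is illustrative:

```sql
-- From the Hive shell (pre-Impala-1.2.1): gather table and column
-- stats so Impala can choose, or be hinted into, a SHUFFLE join.
ANALYZE TABLE store_sales COMPUTE STATISTICS;
ANALYZE TABLE store_sales COMPUTE STATISTICS
  FOR COLUMNS ss_store_sk, ss_customer_sk;

-- In Impala: hint the join type explicitly (square brackets required).
-- BROADCAST suits fact JOIN small dim; SHUFFLE suits two large tables.
SELECT c_preferred_cust_flag, COUNT(*)
FROM store_sales JOIN [shuffle] customer
  ON (ss_customer_sk = c_customer_sk)
GROUP BY c_preferred_cust_flag;
```

Re-run EXPLAIN afterwards, as in slide 19, to confirm the plan shows the join type you intended.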
- 20. Memory Requirements for Joins & Aggregates: Impala does not spill to disk; pipelines are in-memory. Each operator's memory usage needs to fit within the memory limit. This is not the same as all data needing to fit in memory: buffered data is generally significantly smaller than the total accessed data. An aggregation's memory usage is proportional to the number of groups. This applies for each in-flight query (sum of the total). A minimum of 128GB of RAM is recommended for Impala nodes.
- 21. Impala Debug Web Pages: a great resource for debug information.
- 22. (image slide)
- 23. Quick Review: the Impala performance checklist.
- 24. Impala Performance Checklist: Choose appropriate data types. Leverage partition pruning. Adopt the Parquet format. Gather table & column stats. Order JOINs optimally (automatic in v1.2.1). Validate the JOIN type. Monitor hardware resources.
- 25. Test drive Impala. Impala Community: Download: http://cloudera.com/impala; Github: https://github.com/cloudera/impala; User group: impala-user on groups.cloudera.org.
- 26. SELECT questions FROM audience; Thank you!
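Because Impala (as of this deck) does not spill to disk, it is worth capping per-query memory so a runaway query fails cleanly instead of destabilizing a node. A hedged sketch, assuming the MEM_LIMIT query option available in contemporary releases; the limit value and query are illustrative:

```sql
-- Cap this session's per-node memory for each query; a query that
-- exceeds the limit is cancelled rather than exhausting the node.
SET MEM_LIMIT=2g;

-- Aggregation memory grows with the number of groups, so estimate
-- the group count before running a huge GROUP BY.
SELECT COUNT(DISTINCT s_state) FROM store;
```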