cloudera impala presentation

Download Cloudera Impala presentation

Post on 26-Jan-2015




6 download

Embed Size (px)


Presentation on Cloudera Impala at PDX Data Science Group at Portland. Delievered on February 27, 2013. Presentation slides borrowed from Cloudera Impala's architect, Marcel Kornacker.


  • 1. Cloudera Impala:A Modern SQL Engine for Apache HadoopMark GroverSoftware Engineer, ClouderaFebruary 27, 2013

2. What is Impala? General-purpose SQL engine Real-time queries in Apache Hadoop Beta version released since October 2012 General availability (GA) release slated for April 2013 Open source under Apache license 3. Overview User View of Impala Architecture of Impala Comparing Impala with Dremel Comparing Impala with Hive Impala Roadmap 4. Impala Overview: Goals General-purpose SQL query engine: should work both for analytical and transactional workloads will support queries that take from milliseconds to hours Runs directly within Hadoop: reads widely used Hadoop file formats talks to widely used Hadoop storage managers runs on same nodes that run Hadoop processes High performance: C++ instead of Java runtime code generation completely new execution engine that doesnt build on MapReduce 5. User View of Impala: Overview Runs as a distributed service in cluster: one Impala daemon on eachnode with data User submits query via ODBC/JDBC to any of the daemons Query is distributed to all nodes with relevant data If any node fails, the query fails Impala uses Hives metadata interface, connects to Hives metastore Supported file formats: uncompressed/lzo-compressed text files sequence files and RCFile with snappy/gzip compression GA: Avro data files GA: columnar format (more on that later) 6. User View of Impala: SQL SQL support: patterned after Hives version of SQL essentially SQL-92, minus correlated subqueries limited to Select, Project, Join, Union, Subqueries, Aggregation and Insert only equi-joins; no non-equi joins, no cross products Order By only with Limit GA: DDL support (CREATE, ALTER) Functional limitations: no custom UDFs, file formats, SerDes no beyond SQL (buckets, samples, transforms, arrays, structs, maps, xpath, json) only hash joins; joined table has to fit in memory: beta: of single node GA: aggregate memory of all (executing) nodes 7. User View of Impala: Apache HBase HBase functionality: uses Hives mapping of HBase table into metastore table predicates on rowkey columns are mapped into start/stop row predicates on other columns are mapped into SingleColumnValueFilters HBase functional limitations: no nested-loop joins all data stored as text 8. Impala Architecture Two binaries: impalad and statestored Impala daemon (impalad) handles client requests and all internal requests related to query execution exports Thrift services for these two roles State store daemon (statestored) provides name service and metadata distribution also exports a Thrift service 9. Impala Architecture Query execution phases request arrives via odbc/jdbc planner turns request into collections of plan fragments coordinator initiates execution on remote impalads during execution intermediate results are streamed between executors query results are streamed back to client subject to limitations imposed to blocking operators (top-n, aggregation) 10. Impala Architecture: Planner 2-phase planning process: single-node plan: left-deep tree of plan operators plan partitioning: partition single-node plan to maximize scan locality, minimize data movement Plan operators: Scan, HashJoin, HashAggregation, Union, TopN,Exchange Distributed aggregation: pre-aggregation in all nodes, mergeaggregation in single node.GA: hash-partitioned aggregation: re-partition aggregationinput on grouping columns in order to reduce per-nodememory requirement Join order = FROM clause orderGA target: rudimentary cost-based optimizer 11. Impala Architecture: Planner Example: query with join and aggregationSELECT state, SUM(revenue)FROM HdfsTbl h JOIN HbaseTbl b ON (...)GROUP BY 1 ORDER BY 2 desc LIMIT 10 TopNAggTopN Agg Hash HashAgg Join JoinHDFS HBase Exch Exch ScanScanHDFSHBase at coordinator at DataNodesat region serversScan Scan 12. Impala Architecture: Query ExecutionRequest arrives via odbc/jdbcSQL App HiveHDFS NN Statestore ODBC MetastoreSQLrequest Query Planner Query Planner Query PlannerQuery Coordinator Query CoordinatorQuery Coordinator Query Executor Query ExecutorQuery Executor HDFS DNHBase HDFS DN HBaseHDFS DNHBase 13. Impala Architecture: Query ExecutionPlanner turns request into collections of plan fragmentsCoordinator initiates execution on remote impaladsSQL App HiveHDFS NN Statestore ODBC Metastore Query Planner Query Planner Query PlannerQuery Coordinator Query CoordinatorQuery Coordinator Query ExecutorQuery Executor Query ExecutorHDFS DN HBase HDFS DN HBaseHDFS DNHBase 14. Impala Architecture: Query ExecutionIntermediate results are streamed between impalads Queryresults are streamed back to client SQL AppHiveHDFS NN StatestoreODBCMetastore queryresults Query Planner Query Planner Query Planner Query CoordinatorQuery CoordinatorQuery CoordinatorQuery Executor Query Executor Query Executor HDFS DN HBaseHDFS DN HBaseHDFS DNHBase 15. Impala Architecture Metadata handling: utilizes Hives metastore caches metadata: no synchronous metastore API calls during query execution beta: impalads read metadata from metastore at startup Post-GA: metadata distribution through statestore Post-GA: HCatalog 16. Impala Architecture Execution engine written in C++ runtime code generation for "big loops" internal in-memory tuple format plus fixed-width data at fixed offsets uses intrinsics/special cpu instructions for text parsing, crc32 computation, etc. 17. Impala Execution Engine More on runtime code generation example of "big loop": insert batch of rows into hash table known at query compile time: # of tuples in a batch, tuple layout, column types, etc. generate at compile time: unrolled loop that inlines all function calls, contains no dead code, minimizes branches code generated using llvm 18. Impalas Statestore Central system state repository name service (membership) Post-GA: metadata Post-GA: other scheduling-relevant or diagnostic state Soft-state all data can be reconstructed from the rest of the system cluster continues to function when statestore fails, but per-node state becomes increasingly stale Sends periodic heartbeats pushes new data checks for liveness 19. Statestore: Why not ZooKeeper? ZK is not a good pub-sub system Watch API is awkward and requires a lot of client logic multiple round-trips required to get data for changes to nodes children push model is more natural for our use case Dont need all the guarantees ZK provides: serializability persistence prefer to avoid complexity where possible ZK is bad at the things we care about and good at thethings we dont 20. Comparing Impala to Dremel What is Dremel? columnar storage for data with nested structures distributed scalable aggregation on top of that Columnar storage in Hadoop: joint project between Clouderaand Twitter new columnar format: Parquet; derived from Doug Cuttings Trevni stores data in appropriate native/binary types can also store nested structures similar to Dremels ColumnIO Distributed aggregation: Impala Impala plus Parquet: a superset of the published version ofDremel (which didnt support joins) 21. More about Parquet What is it: container format for all popular serialization formats: Avro, Thrift, Protocol Buffers successor to Trevni jointly developed between Cloudera and Twitter open source; hosted on github Features rowgroup format: file contains multiple horiz. slices supports storing each column in separate file supports fully shredded nested data; repetition and definition levels similar to Dremels ColumnIO column values stored in native types (bool, int, float, double, byte array) support for index pages for fast lookup extensible value encodings 22. Comparing Impala to Hive Hive: MapReduce as an execution engine High latency, low throughput queries Fault-tolerance model based on MapReduces on-disk checkpointing; materializes all intermediate results Java runtime allows for easy late-binding of functionality: file formats and UDFs. Extensive layering imposes high runtime overhead Impala: direct, process-to-process data exchange no fault tolerance an execution engine designed for low runtime overhead 23. Comparing Impala to Hive Impalas performance advantage over Hive: no hardnumbers, but Impala can get full disk throughput (~100MB/sec/disk); I/O-bound workloads often faster by 3-4x queries that require multiple map-reduce phases in Hive see a higher speedup queries that run against in-memory data see a higher speedup (observed up to 100x) 24. Impala Roadmap: GA April 2013 New data formats: Avro Parquet Improved query execution: partitioned joins Further performance improvements Guidelines for production deployment: load balancing across impalads resource isolation within MR cluster 25. Impala Roadmap: 2013 Additional SQL: UDFs SQL authorization and DDL ORDER BY without LIMIT window functions support for structured data types Improved HBase support: composite keys, complex types in columns, index nested-loop joins, INSERT/UPDATE/DELETE 26. Impala Roadmap: 2013 Runtime optimizations: straggler handling join order optimization improved cache management data collocation for improved join performance Better metadata handling: automatic metadata distribution through statestore Resource management: goal: run exploratory and production workloads in same cluster, against same data, w/o impacting production jobs 27. Try it out! Beta version available since 10/24/12 Latest version is 0.6 We have packages for: RHEL 6.2/5.7 Ubuntu 10.04 and 12.04 SLES 11 Debian 6 We are targeting GA for April 2013 Questions/comments? My email address: My twitter handle: mark_grover