Conquering "Big Data": An Introduction to Shard-Query


Post on 26-Jan-2015






This talk introduces Shard-Query, an MPP (massively parallel processing) distributed middleware solution for MySQL. Shard-Query is a federation engine that provides a virtual "grid computing" layer on top of MySQL. It can be used to access data spread over many machines (sharded) as well as data partitioned within MySQL tables using the MySQL partitioning option, which is similar to using partitions for parallelism with Oracle Parallel Query. The talk focuses on why Shard-Query is needed, how it works (at a high level) and the best schema to use with it. Shard-Query is designed to scan massive amounts of data in parallel.


  • 1. Conquering "big data": An Introduction to Shard-Query. An MPP distributed middleware solution for MySQL databases

2. Big Data is a buzzword
   - Shard-Query works with big data, but it works with small data too
   - You don't have to have big data to have big performance problems with queries

3. Big performance problems
   - MySQL typically has performance problems on OLAP workloads for even tens of gigabytes* of data
       - Analytics
       - Reporting
       - Data mining
   - MySQL is generally not scalable*^ for these workloads
   * By itself. The point of this talk is to show how Shard-Query fixes this :)
   ^ Another presentation goes into depth as to why MySQL doesn't scale for OLAP

4. Not only MySQL has these issues
   - All major open source databases have problems with these workloads
   - Why? Single-threaded queries
   - When all data is in memory, accessing X rows is generally X times as expensive as accessing ONE row, even when multiple CPUs could be used

5. MySQL scalability model is good for OLTP
   - MySQL was created at a time when commodity machines
       - Had a small number of CPU cores (usually one)
       - Had small amounts of memory and limited disk IOPS
       - Managed a small amount of data
   - It did not make sense to code intra-query parallelism for these servers. They couldn't take advantage of it anyway.

6. The new age of multi-core
   "If your time to you is worth saving, then you better start swimming. Or you'll sink like a stone. For the times they are a-changing." - Bob Dylan
   [Diagram: four CPUs, each with eight cores]

7. It is 2013. Still only single-threaded queries.
   - Building a multi-threaded query plan is a lot different than building a single-threaded query plan
   - The time investment to build a parallel query interface inside of MySQL would be very high
   - MySQL has continued to focus on excellence for OLTP workloads while leaving the OLAP market untapped
   - Just adding basic subquery options to the optimizer has taken many years

8.
MySQL scales great for OLTP because
   - MySQL has been improved significantly, especially in 5.5 and 5.6
   - Many small queries are balanced over many CPUs naturally
   - Large memories allow vast quantities of hot data
   - And very fast disk IO means that
       - The penalty for a cache miss is lower
       - No seek penalty for SSD especially reduces the cost of concurrent misses from multiple threads (no head movement)

9. But not for OLAP
   - Big queries "peg" one CPU and can use no more CPU resources (low efficiency queries)
   - Numerous large queries can "starve" smaller queries
       - This is often when innodb_thread_concurrency needs to be set > 0

10. But not for OLAP (cont)
   - When the data set is significantly larger than memory, single-threaded queries often cause the buffer pool to "churn"
   - While SSD helps somewhat, one thread cannot read from an SSD at maximum device capacity
       - Disk may be capable of 1000s of MB/sec, but the single thread is generally limited to [...]

[...] map/reduce interfaces
   - Apache Hadoop/Apache Hive
   - Impala
   - Map/R
   - Cloudera CDH
   - Google built a SQL interface to BigTable too
   - Limitations
       - No correlated subqueries for example

13. What do those map/reduce things do?
   - Split data up over multiple servers (HDFS)
   - During query processing
       - Map (fetch/extract/select/etc) raw data from files or tables on HDFS
       - Write the data into temporary areas
       - Shuffle temporary data to reduce workers
       - Final reduce written
       - Return results

14. Those sound expensive
   - They are (in terms of dollars for closed solutions)
   - They are (in terms of execution time for open solutions)
   - The map is especially expensive when data is unstructured and it must be done repeatedly for each different query you run

15. And complicated
   - You get a whole new toolchain
   - A new set of data management tools
   - A new set of high availability tools
   - And all new monitoring tools to learn!

16. Even if MySQL supported parallel query: MySQL* doesn't do distributed queries
   - Those Map/Reduce solutions (and the closed source databases) can use more than one server!
   - Building a query plan for queries that must execute over a sharded data set has additional challenges:
       SELECT AVG(expr)
     must be computed as:
       SUM(expr)/COUNT(expr) AS `AVG(expr)`
     Probably the simplest example of a necessary rewrite
   * Again, Shard-Query does. Almost there.

17. MySQL network storage engines
   - Don't these engines claim to be parallel?
       - Fetching of data from remote machines may be done in parallel, but query processing is coordinated by a serial query thread
       - A sum still has to examine each individual row from every server serially
       - Joins are still evaluated serially (in many cases)
   - The engine is parallel, but the SQL layer using the engine is not.

18. NDB
   - NDB is bad for star schema
       - Dimension table rows are not usually co-located with fact rows.
       - Engine condition pushdown may help somewhat to alleviate network traffic, but joins still have to traverse the network, which is expensive
       - Aggregation still serial

19. SPIDER
   - SPIDER is bad for star schema too
       - Nested loops may be very bad for SPIDER and star schema if the fact table isn't scanned first (must use STRAIGHT_JOIN hint extensively).
       - MRR/BKA in MariaDB might help?
       - Still no parallel aggregation or join.

20. CONNECT
   - Has ECP
   - No ICP or ability to expose remote indexes
   - Always uses join buffer (BNLJ) or BKAJ
   - Fetches in parallel
   - No parallel join
   - No parallel aggregation

21. Those are not parallel query solutions
   - Those engines are not OLAP parallel query
       - They are for OLTP lookup and/or filtering performance. Often can't sort in parallel.
   - They can offer improved performance when large numbers of rows are filtered from many machines in parallel
   - When aggregating, a query must return a small resultset before aggregation for good performance
       - star schema should be avoided

22. Enter Shard-Query
   Massively parallel query execution for MySQL variants

23.
Enter Shard-Query
   - Keep using MySQL
       - Choose a row store like XtraDB, InnoDB or TokuDB*
       - Choose a column store like ICE*, Groonga**
       - Use CSV, TAB, XML, or other data with the CONNECT** engine in MariaDB 10
   ** These engines have not been thoroughly tested
   * These engines work, but with some limitations due to bugs

24. Shard-Query connects to 3306
   - Shard-Query can use any MySQL variant as a data source
   - You continue to use regular SQL, no map/reduce
   - Is built on MySQL, PHP and Gearman: well-proven technologies. You probably already know these things.

25. Shard-Query re-writes SQL
   - Flexible
       - Does not have to re-implement complex SQL functionality because it uses SQL directly
       - Hundreds of MySQL functions and features available out of the box
       - Small subset* of functions not available
           - last_insert_id(), get_lock(), etc.

26. Shard-Query re-writes SQL
   - Familiar SQL
       - ORDER BY, GROUP BY, LIMIT, HAVING, subqueries, even WITH ROLLUP, all continue to work as normal
       - Support for all MySQL aggregate functions including count(distinct)
       - Aggregation and join happen in parallel*

27. You don't have to know PHP to use Shard-Query! Just use SQL

28. You can still connect to 3306 (and more)!
   - Shard-Query has multiple ways of interacting with your application
   - The PHP OO API is the underlying interface. The other interfaces are built on it:
       - MySQL Proxy Lua script (virtual database)
       - HTTP or HTTPS web/REST interface
           - Access the database directly from Javascript?
       - Submit Gearman jobs (as SQL) directly from almost any programming language

29. MySQL Proxy

30. Web Interface

31.
Command line (with explain plan)

    echo "select * from (select count(*) from lineorder) sq;" | php run_query --verbose

    SQL SET TO SEND TO SHARDS:
    Array ( [0] => SELECT COUNT(*) AS expr_2942896428 FROM lineorder AS `lineorder` WHERE 1=1 ORDER BY NULL )
    SENDING GEARMAN SET to: 2 shards
    SQL FOR COORDINATOR NODE:
    SELECT SUM(expr_2942896428) AS `count(*)` FROM `aggregation_tmp_21498632`
    SQL SET TO SEND TO SHARDS:
    Array ( [0] => SELECT * FROM ( SELECT SUM(expr_2942896428) AS `count(*)` FROM `aggregation_tmp_21498632` ) AS `sq` WHERE 1=1 )
    SENDING GEARMAN SET to: 1 shards
    SQL TO SEND TO COORDINATOR NODE:
    SELECT * FROM `aggregation_tmp_88629847`
    [count(*)] => 1199721041 rows returned Exec time: 0.053546905517578

32. Shard-Query constructs parallel queries
   - MySQL can't run a single query in multiple threads, but it can run multiple queries at once in multiple threads (with multiple cores)
   - Shard-Query breaks one query into multiple smaller queries (aka tasks)
   - Tasks can run in parallel on one or more servers

33. OLAP into OLTP

34. Partitioning tables for parallelism
   This is similar to Oracle Parallel Query

35. Partitioning splits queries on a single machine
   - Supports partitioning to divide up a table
       - RANGE, LIST and RANGE/LIST COLUMNS over a single column
   - Each partition can be accessed in parallel as an individual task

36. A different way to look at it: You get to move all the pieces at the same time
   [Diagram: SINGLE THREADED versus PARALLEL* execution of tasks (T1, T4, T8, T32, T48, T64)]
   * Small portion of execution is still serial, so speedup won't be quite linear (but should be close)

37. Sharding

38. Sharded tables split data over many servers
   - Works similarly to partitioning.
   - You specify a "shard key". This is like a partitioning key, but it applies to ALL tables in the schema.
   - If a table contains the "shard key", then the table is spread over the shards based on the values of that column
   - Pick a "shard key" with an even data distribution
   - Currently only a single column is supported

39.
Unsharded Tables
   - Tables that don't contain the "shard key" are called "unsharded" tables
   - A copy of these tables is replicated on ALL nodes
   - It is a good idea to keep these tables relatively small and update them infrequently
   - You can freely join between sharded and unsharded tables
   - You can only join between sharded tables when the join includes the shard key*
   * A CONNECT or FEDERATED table to a Shard-Query proxy can be used to support cross-shard joins. Consider MySQL Cluster for cross-shard joins.

40. [Architecture diagram; labels: Parallel Execution, Sharding and/or Partitioned Tables, Gearman, Shard-Query, REST, Prox...]
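
The slides on query rewriting (AVG computed as SUM/COUNT, and the explain plan where each shard returns a COUNT(*) that the coordinator then SUMs) can be illustrated with a small sketch. This is not Shard-Query's implementation, which the talk says is built on PHP and Gearman. It is a minimal Python approximation under stated assumptions: in-memory SQLite databases stand in for MySQL shards, a thread pool stands in for Gearman workers, rows are placed by a simple modulo on the shard key, and the column names `lo_custkey` and `lo_revenue` as well as the helpers `shard_for`, `run_task` and `distributed_avg` are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3  # assumption: three shards, chosen only for illustration


def shard_for(key):
    """Hypothetical placement rule: modulo on the shard key."""
    return key % NUM_SHARDS


def make_shard(rows):
    """Create one in-memory SQLite database acting as a shard."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE lineorder (lo_custkey INTEGER, lo_revenue INTEGER)")
    conn.executemany("INSERT INTO lineorder VALUES (?, ?)", rows)
    return conn


# The task pushed down to every shard: partial SUM and COUNT, never AVG.
SHARD_SQL = "SELECT SUM(lo_revenue), COUNT(lo_revenue) FROM lineorder"


def run_task(conn):
    """Run the pushed-down query on one shard and return (sum, count)."""
    return conn.execute(SHARD_SQL).fetchone()


def distributed_avg(shards):
    """Coordinator step: combine partials as SUM(sums) / SUM(counts)."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(run_task, shards))
    total = sum(s for s, _ in partials if s is not None)
    count = sum(c for _, c in partials)
    return (total / count if count else None), count, partials


# Spread ten rows over the shards by shard key.
rows = [(k, k * k) for k in range(1, 11)]
buckets = [[] for _ in range(NUM_SHARDS)]
for row in rows:
    buckets[shard_for(row[0])].append(row)
shards = [make_shard(bucket) for bucket in buckets]

avg, count, partials = distributed_avg(shards)

# Averaging the per-shard averages gives a different, wrong answer
# whenever the shards hold different numbers of rows.
naive_avg = sum(s / c for s, c in partials) / len(partials)

print(avg, count, naive_avg)
```

The final comparison shows why the rewrite is necessary: an average of per-shard averages weights every shard equally regardless of how many rows it holds, whereas carrying SUM and COUNT separately lets the coordinator compute the exact result.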
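
The rules for sharded and unsharded tables can also be sketched in a few lines. Again, this is only an illustration under assumptions, not Shard-Query code: SQLite databases stand in for MySQL shards, the table and column names (`orders`, `region`, `customer_id`) are hypothetical, and `customer_id` is assumed to be the shard key. The `orders` table contains the shard key, so its rows are spread over the shards. The `region` table does not, so a full copy is placed on every shard, which is what makes the join safe to run independently on each shard.

```python
import sqlite3
from collections import defaultdict

NUM_SHARDS = 2  # assumption: two shards

# Unsharded table (no shard key column): replicated to ALL shards.
REGIONS = [(1, "EU"), (2, "US")]

# Sharded table: (customer_id, region_id, amount); customer_id is the shard key.
ORDERS = [(1, 1, 10), (2, 1, 20), (3, 2, 30), (4, 2, 40), (5, 1, 50)]


def new_shard():
    """Create an empty shard with both tables and a full copy of region."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE region (region_id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, region_id INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO region VALUES (?, ?)", REGIONS)
    return conn


shards = [new_shard() for _ in range(NUM_SHARDS)]

# Route each order to exactly one shard based on the shard key value.
for order in ORDERS:
    shards[order[0] % NUM_SHARDS].execute(
        "INSERT INTO orders VALUES (?, ?, ?)", order
    )

# Each shard joins its slice of orders against its local copy of region.
SHARD_SQL = """
    SELECT r.name, SUM(o.amount)
    FROM orders AS o
    JOIN region AS r ON r.region_id = o.region_id
    GROUP BY r.name
"""

# Coordinator step: merge the per-shard groups by summing the partial sums.
totals = defaultdict(int)
for conn in shards:
    for name, partial_sum in conn.execute(SHARD_SQL):
        totals[name] += partial_sum

totals = dict(totals)
print(totals)
```

The shards are queried one after another here to keep the sketch short. Because each per-shard query touches only local data, the same tasks could be dispatched in parallel without changing the result.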