mysql visual analysis and scale-out strategy definition - webinar deck

MySQL Visual Analysis and Scale Out Options February 26, 2015

Upload: vladi-vexler

Post on 22-Jan-2018


TRANSCRIPT

MySQL Visual Analysis and Scale Out Options

February 26, 2015

2

Agenda

• MySQL visual analysis

• Design considerations

• Web scale challenges

• Characteristics of a distributed database

• ScaleBase Analysis Genie

• Demo

• Q & A

– Please enter your questions on the GTW side panel

3

Vladi Vexler

Vice President, Technology and Product Marketing

• Over 15 years of experience in software development and product management

• Experienced in cloud, web and enterprise

• Author of patents in the fields of database innovation, dynamic data caching and machine learning analytics

4

Who Are We?

Distributed Database Management System

Architected for the Cloud

Simple. Reliable. Powerful.

Scale Out Design Considerations

6

What Is Your Goal?

• Scale (mostly) reads

• Scale (mostly) writes

• Performance of reads

– Affected by joins and scans of big tables

• Performance of writes

– Affected by I/O reads and writes, CPU and table indexes (a growing overhead)

• Locks

• CPU/IO/RAM issues

• Load peaks

• Data growth

• Geo-distribution, special data distribution needs

7

Database And Tables Metrics to Review

• Size

– Physical size on disk, Logical size (number of rows)

• Multiple/large indices

– Physical impact (write time) and logical impact (RAM)

• Reads vs. Writes

– Number of queries per table?

– % of total MySQL traffic

– % of table’s traffic

• Logical data relations – identify and analyze

– Joins – complexity of data distribution and data access

– Logical Data Chunks – related data in multiple tables
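The read/write metrics on this slide can be derived from a captured query log. The following is a minimal sketch, assuming a hypothetical log of (table, statement type) pairs; the function name `traffic_metrics` and the sample data are illustrative and are not part of any ScaleBase tool.

```python
from collections import Counter

READ_STATEMENTS = {"SELECT"}
WRITE_STATEMENTS = {"INSERT", "UPDATE", "DELETE"}

def traffic_metrics(query_log):
    """Return per-table read/write counts and traffic shares.

    query_log: iterable of (table_name, statement_type) pairs
               (a hypothetical, simplified log format).
    """
    reads, writes = Counter(), Counter()
    for table, statement in query_log:
        kind = statement.upper()
        if kind in READ_STATEMENTS:
            reads[table] += 1
        elif kind in WRITE_STATEMENTS:
            writes[table] += 1
    total = sum(reads.values()) + sum(writes.values())
    metrics = {}
    for table in set(reads) | set(writes):
        table_total = reads[table] + writes[table]
        metrics[table] = {
            "reads": reads[table],
            "writes": writes[table],
            # Share of all captured traffic that touches this table.
            "pct_of_total_traffic": 100.0 * table_total / total,
            # Share of this table's own traffic that is reads.
            "pct_reads_in_table": 100.0 * reads[table] / table_total,
        }
    return metrics

sample_log = [
    ("users", "SELECT"), ("users", "SELECT"), ("users", "UPDATE"),
    ("photos", "INSERT"),
]
metrics = traffic_metrics(sample_log)
```

A table with a high share of total traffic and a high write percentage is usually the first candidate to examine when the goal is to scale writes.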

8

Example Visual Analysis: Tables

9

Scale Out Platform Considerations

DIY <> NewSQL <> NoSQL <> ScaleBase

• Short-term cost vs long-term cost

– Do-it-yourself - open source is not truly free

– Time to market

– Pareto principle – 20% of complications will take 80% of time

– High overhead cost in maintenance and future developments

• Reliability (ACID) vs. simplicity (BASE)

• Maturity and availability/reliability

• Features and limitations

• How to define a good data distribution policy?

– How to evaluate efficiency of a policy for data distribution and access?

– How to simulate different distribution policies and compare?
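One way to approach the last two questions is to replay a captured workload against each candidate policy and count how many queries could be served by a single shard. The sketch below is a simplification under stated assumptions: the policy and workload representations are hypothetical, only reads are considered, and two tables sharded on the same column name are assumed to be co-located.

```python
def single_shard_share(policy, workload):
    """Fraction of queries that a policy can serve from one shard.

    policy:   maps table name -> shard key column, or None for a table
              replicated to every shard (hypothetical representation).
    workload: list of queries, each a dict naming the tables it touches
              and the columns it filters on with an equality predicate.
    """
    single = 0
    for query in workload:
        keys = {policy[t] for t in query["tables"] if policy[t] is not None}
        # Single shard if only replicated tables are read, or if every
        # sharded table uses the same key and the query pins that key.
        if not keys or (len(keys) == 1 and keys <= set(query["equality_columns"])):
            single += 1
    return single / len(workload)

workload = [
    {"tables": ["users"], "equality_columns": ["user_id"]},
    {"tables": ["users", "photos"], "equality_columns": ["user_id"]},
    {"tables": ["photos"], "equality_columns": ["photo_id"]},
    {"tables": ["photos", "countries"], "equality_columns": ["user_id"]},
]

# Two candidate policies for the same schema.
policy_by_user = {"users": "user_id", "photos": "user_id", "countries": None}
policy_by_photo = {"users": "user_id", "photos": "photo_id", "countries": None}

share_by_user = single_shard_share(policy_by_user, workload)
share_by_photo = single_shard_share(policy_by_photo, workload)
```

For this sample workload, sharding photos by user_id keeps more queries on one shard than sharding by photo_id. A real comparison would also weight queries by frequency and cost.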

10

Scale Out Methodologies Comparison

Characteristics of a Distributed Database

12

Distributed Table Types

• MASTER: Data on one shard only

– Example: general settings

• GLOBAL: Data copied to all shards

– Example: lookups

• DISTRIBUTED (root): Data on a single shard, based on a key

– Example: Users table

• CASCADED (distributed child table): Data on a single shard; however, distribution and access depend on the parent table

– Example: User_Photos, User_Photos_Likes – depend on Users

Note: Not all sharding platforms support Cascaded and Master table types
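The four table types can be read as four routing rules. The sketch below illustrates them under assumptions that the slide does not state: a fixed cluster of four shards, a CRC32 hash of the key as the distribution function, and shard 0 as the home of Master tables. It is not ScaleBase's implementation.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size

def shard_for_key(key):
    """Map a distribution key to a shard with a stable hash (assumed scheme)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SHARDS

def target_shards(table_type, root_key=None):
    """Return the shard numbers that hold a row of the given table type."""
    if table_type == "MASTER":
        return [0]                       # one designated shard (assumed: shard 0)
    if table_type == "GLOBAL":
        return list(range(NUM_SHARDS))   # copied to every shard
    if table_type in ("DISTRIBUTED", "CASCADED"):
        # A cascaded row is routed by its parent's key, so it lands
        # on the same shard as its root row.
        return [shard_for_key(root_key)]
    raise ValueError(f"unknown table type: {table_type}")

user_row = target_shards("DISTRIBUTED", root_key=42)   # Users row, user_id = 42
photo_row = target_shards("CASCADED", root_key=42)     # User_Photos row owned by user 42
lookup_row = target_shards("GLOBAL")                   # lookup table row
```

Because the cascaded row reuses the parent's key, a join between Users and User_Photos for one user never has to leave that user's shard.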

13

Distributed Query Types

• ONE_DB – Single-shard execution: Global or Master tables, Distributed & Cascaded tables, joins of Distributed and Global tables

• ALL_DB – All-shards execution, on one DB node per shard cluster:

– SELECT and aggregate data from many shards: parallel execution (“map reduce” style) on all shards, then Aggregate, Order, Group-By, Limit

– DDL statements

– DML on Global tables

• FULL_DB – Session statements (USE, SET), sent to all database nodes in all shard clusters

• CROSS_DB – Sharding conflict resolution, such as cross-shard joins.

Note: Not all sharding platforms support ALL_DB, FULL_DB and CROSS_DB queries.
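The "map reduce" style of an ALL_DB query can be illustrated by its merge step. The sketch below assumes each shard has already returned partial counts for a query such as SELECT country, COUNT(*) FROM users GROUP BY country; the function name and sample data are hypothetical, and the shards would be queried in parallel in practice.

```python
from collections import Counter

def all_db_group_count(shard_results, limit):
    """Merge per-shard GROUP BY counts, then apply ORDER BY count DESC and LIMIT.

    shard_results: one dict per shard, mapping group value -> partial count.
    """
    merged = Counter()
    for partial in shard_results:   # "reduce" step: sum the partial counts
        merged.update(partial)
    # Ordering and limiting happen after the merge, not on each shard;
    # ties are broken by group name to keep the result deterministic.
    ordered = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:limit]

shard_results = [
    {"US": 3, "DE": 1},
    {"US": 2, "FR": 4},
    {"DE": 2, "FR": 1},
]
top_two = all_db_group_count(shard_results, limit=2)
```

Applying LIMIT on each shard before merging could drop a group that is small on every shard but large in total, which is why the final ordering is done centrally.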

14

Importance of Logical Data Chunks

• Example: A Logical Data Chunk in a Facebook app:

– All rows in tables containing information related to George, from:

Users, Photos, Comments, Likes, Posts, Friends etc…

• Goals:

1. Optimal data distribution: keep each logical data chunk within a single shard wherever possible

2. Maximize ONE_DB and ALL_DB queries

3. Handle all complex cases, where related data is in multiple shards

– ALL_DB, CROSS_DB, FULL_DB queries

15

Data Relationships can be Extremely Complex

Usually, scale out is applied to growing, mature apps.

How do you define an optimal data distribution policy?

Analysis Genie: MySQL Visual Analysis & Optimal Distribution Policy Configuration

17

ScaleBase Analysis Genie

• A tool for MySQL visual analysis and for building an optimal data distribution policy; designed for DBAs, architects and development managers

• Two-step process:

– Analysis Assistant

– An agent that captures app/DB information, including SQL traffic and database metrics

– Obfuscates, summarizes and packages the app/DB data

– Analysis Genie

– A SaaS application that receives the Analysis Assistant (AA) package, presents the visual analysis and details the policy configuration
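The slide does not say how the Analysis Assistant obfuscates and summarizes SQL traffic. A common approach, shown here purely as an assumed illustration, is to replace literal values with placeholders so that only the query shape remains, and then to count how often each shape occurs. The function names and regular expressions are hypothetical and cover only simple cases.

```python
import re
from collections import Counter

# Quoted strings first, then standalone numbers (not digits inside identifiers).
_STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.|'')*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")

def obfuscate(sql):
    """Replace literal values with '?' so that only the query shape remains."""
    sql = _STRING_LITERAL.sub("?", sql)
    return _NUMBER_LITERAL.sub("?", sql)

def summarize(queries):
    """Count how many captured queries share each obfuscated shape."""
    return Counter(obfuscate(q) for q in queries)

shape = obfuscate("SELECT * FROM users WHERE name = 'George' AND age > 30")
summary = summarize([
    "SELECT * FROM t1 WHERE id = 1",
    "SELECT * FROM t1 WHERE id = 2",
])
```

The packaged summary then contains query shapes and counts but no user data, which is the property that matters when the package is sent to a hosted service.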

[Diagram: Analysis Assistant and Analysis Genie]

18

ScaleBase Analysis Genie

• Advanced analytics

– Your schemas, data & queries

• Identification of best data distribution policy

– Customized for even the most complex apps

• Complete policy control

• Quality assurance

– Review before production

• Policy simulation

– “What-if” analysis

https://www.scalebase.com/software/

19

MySQL Visual Analysis: Data and Data Access

20

Relationship Identification

Mapping includes:

• Schema structures

• Table and column name matching

• Query parsing and identification of joined tables and columns

• Statistics on every object's size and access
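The query-parsing step in this mapping can be sketched with a simplified parser that extracts which table columns are joined and how often. This is an assumption-laden illustration, not the Analysis Genie's parser: it uses regular expressions, handles only FROM/JOIN clauses with optional aliases and conditions of the form alias.column = alias.column, and ignores subqueries and comma-separated table lists.

```python
import re
from collections import Counter

# Words that may follow a table name but are not an alias.
_KEYWORDS = r"(?:JOIN|INNER|LEFT|RIGHT|CROSS|ON|WHERE|GROUP|ORDER|LIMIT)\b"
# Matches "FROM users u" or "JOIN photos AS p" to map aliases to tables.
_TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+(\w+)(?:\s+(?:AS\s+)?(?!" + _KEYWORDS + r")(\w+))?",
    re.IGNORECASE,
)
# Matches conditions such as "u.id = p.user_id".
_JOIN_CONDITION = re.compile(
    r"([A-Za-z_]\w*)\.(\w+)\s*=\s*([A-Za-z_]\w*)\.(\w+)"
)

def joined_columns(sql):
    """Return the set of ((table, column), (table, column)) pairs joined in one query."""
    aliases = {}
    for table, alias in _TABLE_REF.findall(sql):
        aliases[table] = table
        if alias:
            aliases[alias] = table
    pairs = set()
    for left, left_col, right, right_col in _JOIN_CONDITION.findall(sql):
        if left in aliases and right in aliases:
            a = (aliases[left], left_col)
            b = (aliases[right], right_col)
            pairs.add(tuple(sorted((a, b))))  # make the pair direction-independent
    return pairs

def relationship_counts(queries):
    """Count in how many queries each pair of columns is joined."""
    counts = Counter()
    for sql in queries:
        counts.update(joined_columns(sql))
    return counts

counts = relationship_counts([
    "SELECT * FROM users u JOIN photos p ON u.id = p.user_id WHERE u.id = 7",
    "SELECT p.id FROM photos AS p INNER JOIN users ON users.id = p.user_id",
])
```

Column pairs that are joined most often are natural candidates for a parent/child (Distributed/Cascaded) relationship in the distribution policy.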

21

Analyzing Relationships: From Chaos to Order

Understanding and mapping complex relationships

22

Complete Control to Refine, Change and Simulate

23

Complete Control to Refine, Change and Simulate

Demo

25

ScaleBase Genie and ScaleBase Enterprise

Demo Environment

• Visual analysis

• Distribution policy identification and configuration

• Scale out load via data sharding (massive scale out)

[Diagram: ScaleBase Enterprise and Analysis Genie]

Summary

27

Customer: Million+ User Online Gaming Company

Who:

• Mobile gaming company expanding globally

• Hosted on SoftLayer cloud in Hong Kong

Problem:

• Over a million downloads - peak period overload

• Needed scaling in place for expansion

Alternatives considered:

• Manual sharding/open source tools

• Other commercial solutions were too costly

Solution:

• Used visual analysis to determine optimized policy

• Up and running within a few weeks of initial download and now supports hundreds of thousands of daily users

• Fully operational using data distribution and anticipating additional scale out within next quarter

28

Scale out to unlimited users

Continuous availability

Dynamic workload optimization

Fast and simple deployment

Easily scale out a single MySQL instance

Optimized for the Cloud

Reduces time-to-market

No changes needed to app or database

Database usage analytics

Intelligent load balancing

Centralized data management

ScaleBase Distributed Database Management System

29

Get Instant Application/Database Insight!

Use visual analysis to plan your scale out strategy

Download the Analysis Genie here:

https://www.scalebase.com/software

Questions?

Contact Info: Paul Campaniello

[email protected]

Vladi Vexler

[email protected]

Resources: www.scalebase.com

www.scalebase.com/resources

www.scalebase.com/blog

[email protected]

(617) 630.2800