#perconalive - accelerating mysql in open source hyperscale systems

Accelerating MySQL in Open Source Hyperscale Systems Doubling MySQL performance with Native Flash Access Torben Mathiasen Percona Live 2013 – Santa Clara

Upload: sandisk

Post on 10-Jun-2015

DESCRIPTION

This is the slide deck presented by Torben Mathiasen at Percona Live 2013 in Santa Clara, California. Learn how to double MySQL performance with native flash access.

TRANSCRIPT

Page 1: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Accelerating MySQL in Open Source Hyperscale Systems

Doubling MySQL performance with Native Flash Access

Torben Mathiasen

Percona Live 2013 – Santa Clara

Page 2: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

The Hyperscale Market

• Characterized by distributed scale-out architecture

• Maximize compute efficiency with high volume servers

• Cost effective on both the hardware and software side

• Always looking for improvements. Even small changes can have a huge impact on overall efficiency

Page 3: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Hyperscale, continued

Scale-out for different purposes

• Increase DRAM

• Increase CPU core count

• Increase I/O performance with more storage devices

Scaling out does increase complexity

• Software architecture and efficiency

• Network latency and cost

• Often not using full hardware resources per box

Page 4: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Hyperscale software

▸ Distributed data processing

• Hadoop, Hbase, MySQL, memcached, Cassandra, mongodb, etc (this is a long list)

• Open-source software

▸ Some key characteristics

• Often scales horizontally

• Many of them are built for spindles (minimize random I/O)

• Moving to optimize for seek-less storage

• Often run as multiple instances on the same box

Page 5: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Application performance determines value

Linux Kernel (schedulers, I/O path, syscalls)

File System (XFS, Ext4, Btrfs, DirectFS)

Apps (MySQL, HBase, persistent memcached)

Page 6: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Application Performance research at Fusion-io

▸Remember, efficiency per box matters

• Reducing CPU

• Improving application I/O engines

• Reducing application required writes

• Lower application latency

• Hardware footprint

Page 7: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

App value starts with low latency

▸ Time to submit and complete a single I/O

▸ Also impacts asynchronous I/O at low queue depth

▸ Important for any application (besides block benchmarks)

▸ MySQL usually sees <32 outstanding requests

▸ Bandwidth rarely the limiting factor
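The first bullet, time to submit and complete a single I/O, can be made concrete with a small measurement. This is a minimal sketch, assuming a POSIX system with a writable /tmp. The function name, the 4 KiB request size, and the scratch path are illustrative choices and do not come from the talk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Times one synchronous 4 KiB write (write + fsync) to a scratch file.
 * Returns the elapsed time in nanoseconds, or -1 on error.
 * Name, request size, and path are assumptions made for this sketch. */
long long single_write_latency_ns(void)
{
    char path[] = "/tmp/io_latency_XXXXXX";
    char buf[4096];
    struct timespec start, end;
    long long ns = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                  /* the file goes away once fd is closed */
    memset(buf, 0xAB, sizeof buf);

    if (clock_gettime(CLOCK_MONOTONIC, &start) == 0 &&
        write(fd, buf, sizeof buf) == (ssize_t)sizeof buf &&
        fsync(fd) == 0 &&
        clock_gettime(CLOCK_MONOTONIC, &end) == 0) {
        ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
           + (end.tv_nsec - start.tv_nsec);
    }
    close(fd);
    return ns;
}
```

Because the write is issued alone and waited on, the result reflects per-request latency rather than bandwidth, which is the quantity the slide argues matters at low queue depth.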

Page 8: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

directFS: Direct File System

▸ Appears as a Linux file system

• Provides performance to applications “as is”

• Focuses only on file namespace

▸ Employs existing flash translation layer for:

• Large virtualized address space

• Direct flash access

• Crash recovery mechanisms

▸ Exports primitives through file namespace

• Application access through directFS or straight to device

Page 9: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

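File System Efficiency

[Diagram: two I/O stacks compared, each running from the application in user-space down into kernel-space.]

Conventional stack

• Application

• Linux VFS (virtual file system) abstraction layer

• Ext3: file metadata mgmt, block allocation, mapping, recycling, ACID updates, logging/journaling, crash-recovery

• Kernel block layer

directFS stack

• Application

• Linux VFS (virtual file system) abstraction layer

• directFS: file metadata mgmt

• Primitive Interfaces

• Native Flash Translation Layer: block allocation, mapping, recycling, ACID updates, logging/journaling, crash-recovery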

Page 10: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

directFS: Speed Through Simplicity

[Bar chart: lines of code per file system for directFS, ReiserFS, Ext4, Btrfs, and XFS; axis runs from 0 to 70,000 lines of code.]

Page 11: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Achieving Raw Device Performance

▸ directFS with Atomic Writes benefits

• File system convenience

• Performance of simple writes to raw device

▸ Parity indicates significantly more functionality without any performance impact

[Chart: bandwidth (MiB/s) vs. I/O size, Block vs. directFS, 8 threads.]

Page 12: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Fusion-io Software Development Kit

Traditional Storage

• Applications

• Block I/O

• Proprietary Storage OS

• Storage Media

Software Defined Storage

• Applications

• Block I/O

• Enhanced I/O: Atomic Writes / directFS, Key-Value Store API

• Memory Access: Extended Memory, Auto Commit Memory

• Native Flash Translation Layer

• Storage Media

Page 13: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Percona Server and MariaDB

▸Efficient XtraDB storage engine

▸Well optimized for seek-less storage like flash

▸Many config parameters to fine-tune performance

▸ What else can be done?

• Lock contention can still be improved as seen by using multiple instances with the same storage device

• Tapping into the native performance of flash by exposing key FTL features to the application

Page 14: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Atomic writes

▸ File system ioctl() tells directFS that all I/O to this file should be treated as atomic

▸ Works with both synchronous I/O and the asynchronous io_submit() interface

▸ Provides low latency async IO with atomic guarantees

▸ Minimal application changes required

Page 15: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

MySQL Writes Comparison

Traditional MySQL Writes (SSD or HDD database)

1. Application initiates updates to pages A, B, and C.

2. MySQL copies updated pages to memory buffer.

3. MySQL writes to double-write buffer on the media.

4. Once step 3 is acknowledged, MySQL writes the updates to the actual tablespace.

MySQL with Atomic Writes (ioMemory database)

1. Application initiates updates to pages A, B, and C.

2. MySQL copies updated pages to memory buffer.

3. MySQL writes to actual tablespace, bypassing the double-write buffer step due to inherent atomicity guaranteed by the intelligent device.

[Diagram: in both cases pages A, B, and C move from the database server's DRAM buffer to the database; only the traditional path passes through the double-write buffer first.]
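The difference between the two write paths can be sketched with plain POSIX calls. This is an illustration only, assuming ordinary files stand in for the double-write area and the tablespace. All function names, the 16 KiB page size, and the scratch-file helper are assumptions for this sketch and are not MySQL code. On an ordinary file system the second function is not safe against torn pages; it is only correct where the device guarantees atomic page writes, as the slide describes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DB_PAGE_SIZE 16384   /* assumed page size for this sketch */

/* Write one page into slot `slot` of the file; 0 on success, -1 on error. */
static int write_page(int fd, const char *page, off_t slot)
{
    ssize_t w = pwrite(fd, page, DB_PAGE_SIZE, slot * DB_PAGE_SIZE);
    return w == DB_PAGE_SIZE ? 0 : -1;
}

/* Traditional path: returns bytes sent to the media, or -1 on error. */
long long flush_with_doublewrite(int dblwr_fd, int data_fd, const char *pages,
                                 const off_t *page_no, size_t n)
{
    assert(pages != NULL && page_no != NULL);
    /* Step 3: copy every page into the double-write area and make it durable. */
    for (size_t i = 0; i < n; i++)
        if (write_page(dblwr_fd, pages + i * DB_PAGE_SIZE, (off_t)i) != 0)
            return -1;
    if (fsync(dblwr_fd) != 0)
        return -1;
    /* Step 4: only now write each page to its home in the tablespace. */
    for (size_t i = 0; i < n; i++)
        if (write_page(data_fd, pages + i * DB_PAGE_SIZE, page_no[i]) != 0)
            return -1;
    if (fsync(data_fd) != 0)
        return -1;
    return 2LL * (long long)n * DB_PAGE_SIZE;
}

/* Atomic-write path: pages go straight to the tablespace, once. */
long long flush_atomic(int data_fd, const char *pages,
                       const off_t *page_no, size_t n)
{
    assert(pages != NULL && page_no != NULL);
    for (size_t i = 0; i < n; i++)
        if (write_page(data_fd, pages + i * DB_PAGE_SIZE, page_no[i]) != 0)
            return -1;
    if (fsync(data_fd) != 0)
        return -1;
    return (long long)n * DB_PAGE_SIZE;
}

/* Opens an unnamed scratch file for trying the functions out; -1 on error. */
int open_scratch_file(void)
{
    char path[] = "/tmp/flush_sketch_XXXXXX";
    int fd = mkstemp(path);
    if (fd >= 0)
        unlink(path);
    return fd;
}
```

For the same set of dirty pages, the traditional path returns twice the byte count of the atomic path and issues an extra fsync(), which is where the halved write volume in the talk comes from.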

Page 16: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Minimal application change

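/* ioctl request that switches a directFS file to atomic writes */
#define DFS_IOCTL_ATOMIC_WRITE_SET _IOW(0x95, 2, uint)

/* When a data file is opened: give up on the file if atomic writes
   were requested but could not be enabled */
if (srv_use_atomic_writes && type == OS_DATA_FILE
    && os_file_set_atomic_writes(file, name)) {
        close(file);
        *success = FALSE;
        file = -1;
}

/* The call that enables atomic writes on the file */
int ret = ioctl (file, DFS_IOCTL_ATOMIC_WRITE_SET, &atomic_option);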
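The slide's fragments can be gathered into one small helper. This is a minimal sketch, assuming Linux; the request code is the one shown on the slide, while the helper name `enable_atomic_writes`, the `unsigned int` spelling of `uint`, and the option value 1 are assumptions made for illustration. On a file that is not on directFS the ioctl is expected to fail, and the helper reports that to the caller.

```c
#include <assert.h>
#include <sys/ioctl.h>

/* Request code as shown on the slide, with `uint` spelled out. */
#define DFS_IOCTL_ATOMIC_WRITE_SET _IOW(0x95, 2, unsigned int)

/* Asks the file system to treat all I/O on fd as atomic.
 * Returns 0 on success, -1 on failure (errno is left as set by ioctl).
 * Hypothetical helper; the option value 1 is an assumption. */
int enable_atomic_writes(int fd)
{
    unsigned int atomic_option = 1;

    if (ioctl(fd, DFS_IOCTL_ATOMIC_WRITE_SET, &atomic_option) == -1)
        return -1;
    return 0;
}
```

A caller would typically invoke this once, right after opening a data file, and fall back to the double-write buffer if it returns -1.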

Page 17: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Atomic benchmarks

First, let's sum up the MySQL benefits here:

• Writing only 50% of the data otherwise required for ACID compliance

That’s pretty much it…but it gives us

▸ Twice the flash endurance

▸ Much better latency because of fewer syscalls

▸ Much better application throughput due to less I/O

▸ Better concurrency due to fewer locks

Page 18: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Atomics TPC-C throughput

Page 19: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Atomics latency

Page 20: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Percona’s testing of atomics

Page 21: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Flash Evolution Details

FLASH AS DISK (Conventional I/O Access)

• Application source code converts native data structures into block I/O

• Block I/O

• Proprietary Storage OS

FLASH BEYOND DISK (Native: Enhanced I/O)

• Application source code does I/O with native data structures

• Atomic I/O Transaction, Key-Value Transaction, User-Defined Object Transaction

• Open Interface Layer

FLASH AS MEMORY (Native: Persistent Memory)

• Application source code manipulates native data structures directly in persistent memory

• High-speed Logging, Memory Transaction, Checkpointed Memory

• Open Interface Layer

Page 22: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Fusion-io advanced development: Storage Class Memory

• Small Capacity DRAM (volatile), $$/GB

• Big Capacity Flash, $/GB

• Small Capacity SCM (persistent)

• Memory-speed persistence

• Byte-addressable vs. block-addressable

[Diagram labels: Server Virtual Memory, Server]

Page 23: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

SCM research

▸ Let's look at keeping a database log using memory semantics

▸ Goal is to reduce latency and the cost of flushing data to a persistent state, and to further minimize writes

▸ SCM testing using a modified Innosim tool

Page 24: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

SCM logger interface

▸ logger_open()

Open and initialize logging infrastructure within the FTL

▸ logger_close()

Clean-up

▸ logger_append()

Append to head of log at memory speeds. This basically translates to a memcpy()

▸ logger_sync()

Serialize data using assembler ‘mfence’ instruction
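The four calls above can be mocked up to show the intended semantics. This is a sketch under stated assumptions: the slide gives only the function names, so the `sketch_` prefixed signatures and the handle struct are hypothetical, ordinary heap memory stands in for the persistent region inside the FTL, and a C11 full fence stands in for the `mfence` instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle; heap memory stands in for persistent memory. */
typedef struct {
    char  *base;      /* start of the log region */
    size_t capacity;  /* region size in bytes */
    size_t head;      /* offset where the next record is appended */
} sketch_log;

/* Counterpart of logger_open(): set up the log region; 0 on success. */
int sketch_logger_open(sketch_log *log, size_t capacity)
{
    log->base = malloc(capacity);
    log->capacity = capacity;
    log->head = 0;
    return log->base ? 0 : -1;
}

/* Counterpart of logger_append(): copy the record to the head of the
 * log, which is just a memcpy(). Returns -1 if the record does not fit. */
int sketch_logger_append(sketch_log *log, const void *rec, size_t len)
{
    if (len > log->capacity - log->head)
        return -1;   /* region full; a real log would wrap or drain */
    memcpy(log->base + log->head, rec, len);
    log->head += len;
    return 0;
}

/* Counterpart of logger_sync(): order all earlier stores before
 * anything that follows. A portable full fence is used here in place
 * of the slide's mfence instruction. */
void sketch_logger_sync(sketch_log *log)
{
    (void)log;
    atomic_thread_fence(memory_order_seq_cst);
}

/* Counterpart of logger_close(): clean-up. */
void sketch_logger_close(sketch_log *log)
{
    free(log->base);
    log->base = NULL;
    log->capacity = 0;
    log->head = 0;
}
```

Note that no system call appears on the append or sync path, which is the point of the interface: committing a log record costs a memory copy and a fence rather than a block write.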

Page 25: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Practical Database Use Case: MySQL

[Bar chart: Innosim ops/sec, y-axis from 0 to 18,000]

• Baseline (log transaction through block I/O): 8,000

• Scenario 1 (no logging): 16,000

• Scenario 2 (log to Fusion-io ACM): 15,750

Nearly as fast as disabling the transaction log completely.
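The chart's figures can be turned into ratios with a few lines of arithmetic. This sketch assumes the values read off the chart map to the scenarios in legend order (8,000 for block I/O logging, 16,000 for no logging, 15,750 for logging to ACM); the function name is illustrative.

```c
#include <assert.h>

/* Ratio of one throughput figure (ops/sec) to a reference figure. */
double throughput_ratio(double ops, double reference_ops)
{
    assert(reference_ops > 0.0);
    return ops / reference_ops;
}
```

With those inputs, `throughput_ratio(15750, 8000)` is about 1.97, close to a doubling over block I/O logging, and `throughput_ratio(15750, 16000)` is about 0.98, that is, within roughly 2% of running with no transaction log at all.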

Page 26: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

High-Speed Transaction Logging

▸ 64KB of persistent memory used as head of log

▸ 2x throughput increase for update intensive transaction workloads (MySQL innosim)

▸ 30% reduction in writes to flash

▸ Performance very close to disabling logs entirely

[Chart: Innosim Reads (update intensive transaction workload); ops/s (0 to 20,000) vs. number of users (1, 4, 8, 16, 32, 64); series: ACM txlog/binlog, Regular txlog/binlog, Disable txlog/binlog.]

Page 27: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

The coming shift in software development

▸ As an SSD, flash accelerates applications.

At full maturity, Non-Volatile Memory will transform software development.

Page 28: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Async atomics Availability

▸ Percona Server dev : 5.5.31/32

▸ MariaDB mainline: 5.5.31

▸ directFS drop available next week

• Register at developer.fusionio.com

• Download at support.fusionio.com

▸ Oracle InnoDB atomics patch on launchpad next week

Page 29: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

Hyperscale native flash benefits

• Avoid scaling out for DRAM

• More work per node, fewer nodes

• Less DRAM per node for the same workload

• Less DRAM per node may mean lower cost servers

Continue to work with vendors to

• Improve application efficiency

• Move applications to be fully flash aware

• Further allow application access to the FTL (ptrim, etc.)

MySQL Atomics and Storage Class Memory shown at Fusion-io booth!

Page 30: #PerconaLive - Accelerating MySQL in Open Source Hyperscale Systems

fusionio.com | REDEFINE WHAT’S POSSIBLE

THANK YOU

Click to download the Whitepaper

MySQL Low Latency and High Throughput with directFS and Atomics