Database Stalls, From the Ordinary to the Obscure

Preetam Jinka (@PreetamJinka), Software Engineer

Percona Live 2017



VividCortex’s database monitoring application is the best way to improve your database performance, efficiency, and uptime. Supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora, VividCortex uses patented algorithms to reveal key insights, helping users fix performance problems before they impact customers. Say hello and see a demo, Booth #205.

We’re hiring!


This talk isn’t about the math. Come to the O’Reilly booth after the talk to pick up a free copy of our book!


What is a stall?


Stalls

● Short periods when work isn’t being done

● We’re detecting stalls as short as 1 second

● We do this with zero configuration and no fixed thresholds

○ The secret sauce: we have a model.


We’re trying to catch small problems before they turn into bigger ones.


Little’s Law

● L = λ × W

● Concurrency = Throughput × Latency

● Little’s Law provides a model to relate throughput and concurrency

In MySQL:

● Concurrency: threads_running

○ There’s one thread per query.

○ From SHOW STATUS

● Throughput: queries completed per second
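
Little's Law can be rearranged to estimate average latency from the two quantities listed above (W = L / λ). A minimal sketch in Python; the function name and the sample numbers are illustrative assumptions, not figures from the talk:

```python
def average_latency_seconds(concurrency: float, throughput_per_sec: float) -> float:
    """Rearrange Little's Law (L = lambda * W) to get W = L / lambda."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return concurrency / throughput_per_sec


# Hypothetical sample: 8 threads running on average,
# 2000 queries completed per second.
latency = average_latency_seconds(concurrency=8, throughput_per_sec=2000)
print(f"average latency: {latency * 1000:.1f} ms")  # average latency: 4.0 ms
```

The same relationship works in the other direction: if latency rises while throughput stays flat, concurrency has to rise with it.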


MySQL Server Stall Example


More queries in progress

Fewer being completed


All of the stalled queries are completing after the fault ends.
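
The pattern in this example (concurrency up, throughput down) can be sketched in a few lines of Python. This is a deliberately naive stand-in and not the detection method used in the talk: it depends on a fixed `factor` and `window`, both of which are assumptions, and the sample data is hypothetical.

```python
from statistics import median


def find_stalls(concurrency, throughput, window=5, factor=2.0):
    """Return indexes of samples that look like stalls.

    A sample is flagged when concurrency is at least `factor` times the
    median of the previous `window` samples while throughput is at most
    1/`factor` of its median over the same window.
    """
    if len(concurrency) != len(throughput):
        raise ValueError("series must have the same length")
    stalls = []
    for i in range(window, len(concurrency)):
        base_c = median(concurrency[i - window:i])
        base_t = median(throughput[i - window:i])
        if concurrency[i] >= factor * base_c and throughput[i] <= base_t / factor:
            stalls.append(i)
    return stalls


# Hypothetical one-second samples: a stall at indexes 5 and 6, followed by
# a burst at index 7 as the queued queries complete.
threads_running = [4, 5, 4, 4, 5, 30, 42, 6, 4, 5]
queries_per_sec = [900, 950, 920, 940, 910, 120, 80, 2400, 930, 940]
print(find_stalls(threads_running, queries_per_sec))  # [5, 6]
```

Using the median of the preceding window keeps a single extreme sample from shifting the baseline much, which is why the burst at index 7 is not flagged.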


Where do stalls come from?


● Running out of credits on EBS volumes

● MySQL query cache

● Lock contention

● A bad network cable!

● Transparent huge pages (THP)

○ “If a transparent huge page isn’t available, the application will stall to let memory compaction run to free a page.”
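
For the THP item, a quick way to see which mode a Linux host is using is to read the sysfs file where the kernel exposes it; the active value is the one in square brackets. A minimal sketch, assuming a Linux kernel with THP support (the helper names are made up for illustration):

```python
from pathlib import Path

# Standard sysfs location on Linux kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_setting(text: str) -> str:
    """Return the active value from text such as 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active setting found in {text!r}")


def current_thp_setting(path: Path = THP_ENABLED):
    """Read the active THP mode, or return None when the file cannot be read."""
    try:
        return parse_thp_setting(path.read_text())
    except OSError:
        return None


print(current_thp_setting())
```

On hosts without that file (non-Linux systems, or kernels built without THP) the function returns None rather than raising.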


But we don’t really care about any of those things.

We’re focused on the work your database is doing.


Work-centric monitoring


Work-centric monitoring in one slide

● Focus on the work your systems are doing

● Find relationships between metrics (maybe using a model)

● Monitor what you want to optimize

● Focus on heavy hitters

● Automatically detect changes
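
One reading of "focus on heavy hitters" is to rank query families by the total time they consume and look at the top few first. A minimal sketch under that assumption; the query names and timings are hypothetical:

```python
from collections import defaultdict


def heavy_hitters(samples, top=3):
    """Sum execution time per query family and return the largest `top`."""
    totals = defaultdict(float)
    for name, seconds in samples:
        totals[name] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical (query family, execution time in seconds) observations.
observed = [
    ("select_orders", 0.5),
    ("update_inventory", 0.25),
    ("select_orders", 1.5),
    ("report_rollup", 4.0),
    ("update_inventory", 0.25),
]
for name, total in heavy_hitters(observed, top=2):
    print(f"{name}: {total:.2f} s")
```

Ranking by total time rather than per-query latency surfaces fast queries that run very often as well as slow ones that run rarely.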


How to respond to database stalls


Slowness is about spending time on something.

Things spend time doing work or waiting.


Work

● CPU

● Disk I/O

● Various storage engine metrics

● Slow queries

○ Large scans

Waiting

● Lock contention

● Disk I/O

● Memory compaction


Walkthrough


Be careful about causality.


Thread states


Back pressure


Back pressure is about systems receiving more work than they can process.


It’s much better to handle back pressure higher up the stack.


Clients


APIs

Database

System


Low-level back pressure can cause unfair slowdowns higher up the stack.*

*Totally untested hypothesis. :)


50 ms shift


50 ms shift

~1 sec queries stay ~1 sec queries (1x)

~1 ms queries become ~50 ms queries (50x)
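
The arithmetic behind these figures can be checked directly, assuming the 50 ms is added to every query regardless of how long it normally takes. Under that assumption the exact ratios are 1.05x and 51x, which match the approximate 1x and 50x above. The function name is illustrative:

```python
def slowdown_factor(base_ms: float, shift_ms: float) -> float:
    """Ratio of the shifted latency to the original latency."""
    return (base_ms + shift_ms) / base_ms


for base_ms in (1000.0, 1.0):
    factor = slowdown_factor(base_ms, 50.0)
    print(f"{base_ms:g} ms query + 50 ms shift: {factor:.2f}x")
```

The same absolute delay is barely visible on slow queries and dominates fast ones, which is the sense in which the slowdown is unfair.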


Ways to deal with back pressure

● Rate limiting / throttling

● Use a queue to contain requests at a higher level

● Somehow prioritize some requests over others
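
As an illustration of the rate limiting option, here is a minimal token bucket sketch in Python. The talk does not prescribe a particular algorithm; the token bucket, the class name, and the parameters are all assumptions chosen for the example. The limiter would sit in the client or API layer so that excess requests are rejected before they reach the database.

```python
import time


class TokenBucket:
    """Allow about `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.clock = clock        # injectable so tests can control time
        self.tokens = capacity
        self.last = clock()

    def allow(self):
        """Return True and spend a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=100, capacity=10)
accepted = sum(bucket.allow() for _ in range(50))
print(f"accepted {accepted} of 50 back-to-back requests")
```

Rejected requests can be failed fast, retried later, or placed on a bounded queue, which connects this option to the second bullet.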


Can you eliminate stalls?

Probably not all.

Most? Perhaps!


Come find me at the O’Reilly booth!


Questions?

Twitter: @PreetamJinka

Email: [email protected]