five steps perform_2013

94
1 1 1 2 1 5 1 4 1 3 Five Steps to PostgreSQL Performance Josh Berkus PostgreSQL Project MelPUG 2013

Upload: postgresql-experts-inc

Post on 02-Nov-2014

3.933 views

Category:

Documents


2 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Five steps perform_2013

1 1

1 2

1 51 4

1 3

Five Stepsto PostgreSQL Performance

Josh BerkusPostgreSQL Project

MelPUG 2013

Page 2: Five steps perform_2013

1 1

1 2

1 51 4

1 3Hardware

OS &Filesystem

postgresql.conf

ApplicationDesign

QueryTuning

Page 3: Five steps perform_2013

0. Getting Outfitted

Page 4: Five steps perform_2013

5 Layer Cake

HardwareStorage

Operating System

PostgreSQL

Middleware

Application

Filesystem

Schema

Drivers

Queries

RAM/CPU Network

Kernel

Config

Connections Caching

Transactions

Page 5: Five steps perform_2013

5 Layer Cake

HardwareStorage

Operating System

PostgreSQL

Middleware

Application

Filesystem

Schema

Drivers

Queries

RAM/CPU Network

Kernel

Config

Connections Caching

Transactions

Page 6: Five steps perform_2013

Scalability Funnel

HW

Application

Middleware

PostgreSQL

OS

Page 7: Five steps perform_2013

1OWhat Flavor is Your DB?

►Web Application (Web)●DB smaller than RAM●90% or more “one-liner” queries

W

Page 8: Five steps perform_2013

1OWhat Flavor is Your DB?

►Online Transaction Processing (OLTP)●DB slightly larger than RAM to 1TB●20-70% small data write queries,

some large transactions

O

Page 9: Five steps perform_2013

1OWhat Flavor is Your DB?

►Data Warehousing (DW)●Large to huge databases (100GB to

100TB)●Large complex reporting queries●Large bulk loads of data●Also called "Decision Support" or

"Business Intelligence"

D

Page 10: Five steps perform_2013

Tips for Good Form

►Engineer for the problems you have●not for the ones you don't

1O

Page 11: Five steps perform_2013

Tips for Good Form

►A little overallocation is cheaper than downtime●unless you're an OEM, don't stint a few

GB●resource use will grow over time

1O

Page 12: Five steps perform_2013

Tips for Good Form

►Test, Tune, and Test Again●you can't measure performance by “it

seems fast”

1O

Page 13: Five steps perform_2013

Tips for Good Form

►Most server performance is thresholded●“slow” usually means “25x slower”●it's not how fast it is, it's how close you

are to capacity

1O

Page 14: Five steps perform_2013

11Application

Design

Page 15: Five steps perform_2013

Schema Design

►Table design●do not optimize prematurely

▬normalize your tables and wait for a proven issue to denormalize

▬Postgres is designed to perform well with normalized tables

●Entity-Attribute-Value tables and other innovative designs tend to perform poorly

1 1

Page 16: Five steps perform_2013

Schema Design

►Table design●consider using natural keys

▬can cut down on the number of joins you need

●BLOBs can be slow▬have to be completely rewritten, compressed

▬can also be fast, thanks to compression

1 1

Page 17: Five steps perform_2013

Schema Design

►Table design●think of when data needs to be updated,

as well as read▬sometimes you need to split tables which will be updated at different times

▬don't trap yourself into updating the same rows multiple times

1 1

Page 18: Five steps perform_2013

Schema Design

►Indexing●index most foreign keys●index common WHERE criteria●index common aggregated columns●learn to use special index types:

expressions, full text, partial

1 1

Page 19: Five steps perform_2013

Schema Design

►Not Indexing●indexes cost you on updates, deletes

▬especially with HOT

●too many indexes can confuse the planner

●don't index: tiny tables, low-cardinality columns

1 1

Page 20: Five steps perform_2013

Right indexes?

►pg_stat_user_indexes●shows indexes not being used●note that it doesn't record unique index

usage

►pg_stat_user_tables●shows seq scans: index candidates?●shows heavy update/delete tables: index

less

1 1

Page 21: Five steps perform_2013

Partitioning

►Partition large or growing tables●historical data

▬data will be purged▬massive deletes are server-killers

●very large tables▬anything over 10GB / 10m rows▬partition by active/passive

1 1

Page 22: Five steps perform_2013

Partitioning

►Application must be partition-compliant●every query should call the partition key●pre-create your partitions

▬do not create them on demand … they will lock

1 1

Page 23: Five steps perform_2013

Query design

►Do more with each query●PostgreSQL does well with fewer larger

queries●not as well with many small queries●avoid doing joins, tree-walking in

middleware

1 1

Page 24: Five steps perform_2013

Query design

►Do more with each transaction●batch related writes into large

transactions

1 1

Page 25: Five steps perform_2013

Query design

►Know the query gotchas (per version)●Always try rewriting subqueries as joins●try swapping NOT IN and NOT EXISTS

for bad queries●try to make sure that index/key types

match●avoid unanchored text searches "ILIKE

'%josh%'"

1 1

Page 26: Five steps perform_2013

But I use ORM!

►ORM != high performance●ORM is for fast development, not fast

databases●make sure your ORM allows "tweaking"

queries●applications which are pushing the limits

of performance probably can't use ORM▬but most don't have a problem

1 1

Page 27: Five steps perform_2013

It's All About Caching

►Use prepared queries●whenever you have repetitive loops

W O

1 1

Page 28: Five steps perform_2013

It's All About Caching

►Cache, cache everywhere●plan caching: on the PostgreSQL server●parse caching: in some drivers●data caching:

▬in the appserver▬in memcached/varnish/nginx▬in the client (javascript, etc.)

●use as many kinds of caching as you can

W O

1 1

Page 29: Five steps perform_2013

It's All About Caching

But …►think carefully about cache invalidation

●and avoid “cache storms”

1 1

Page 30: Five steps perform_2013

Connection Management

►Connections take resources●RAM, CPU●transaction checking

W O

1 1

Page 31: Five steps perform_2013

Connection Management

►Make sure you're only using connections you need●look for “<IDLE>” and “<IDLE> in

Transaction”●log and check for a pattern of connection

growth●make sure that database and appserver

timeouts are synchronized

W O

1 1

Page 32: Five steps perform_2013

Pooling

►Over 100 connections? You need pooling!

PostgreSQLPool

Webserver

Webserver

Webserver

1 1

Page 33: Five steps perform_2013

Pooling

►New connections are expensive●use persistent connections or connection

pooling sofware▬appservers▬pgBouncer▬pgPool (sort of)

●set pool side to maximum connections needed

1 1

Page 34: Five steps perform_2013

12QueryTuning

Page 35: Five steps perform_2013

Bad Queries

5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

0

1000

2000

3000

4000

5000

Ranked Query Execution Times

% ranking

exe

cutio

n tim

e

1 2

Page 36: Five steps perform_2013

Optimize Your Queriesin Test

1 2►Before you go production

●simulate user load on the application●monitor and fix slow queries●look for worst procedures

Page 37: Five steps perform_2013

Optimize Your Queriesin Test

1 2►Look for “bad queries”

●queries which take too long●data updates which never complete●long-running stored procedures●interfaces issuing too many queries●queries which block

Page 38: Five steps perform_2013

Finding bad queries

►Log Analysis●dozens of logging

options●log_min_duration_

statement●pgfouine●pgBadger

1 2

Page 39: Five steps perform_2013

Fixing bad queries

►EXPLAIN ANALYZE►things to look for:

●bad rowcount estimates●sequential scans●high-count loops●large on-disk sorts

1 2

Page 40: Five steps perform_2013

Fixing bad queries

►reading explain analyze is an art●it's an inverted tree●look for the deepest level at which the

problem occurs

►try re-writing complex queries several ways

1 2

Page 41: Five steps perform_2013

Query Optimization Cycle

log queries run pgbadger

explain analyzeworst queries

troubleshootworst queries

apply fixes

1 2

Page 42: Five steps perform_2013

Query Optimization Cycle(new)

check pg_stat_statements

explain analyzeworst queries

troubleshootworst queries

apply fixes

1 2

Page 43: Five steps perform_2013

Procedure OptimizationCycle

log queries run pg_fouine

instrumentworstfunctions

find slowoperations

apply fixes

1 2

Page 44: Five steps perform_2013

Procedure Optimization Cycle (new)

check pg_stat_function

find slowoperations

instrumentworstfunctions

apply fixes

1 2

Page 45: Five steps perform_2013

13postgresql.conf

Page 46: Five steps perform_2013

max_connections

►As many as you need to use●web apps: 100 to 300●analytics: 20 to 40

►If you need more than 100 regularly, use a connection pooler●like pgbouncer

13

W O

D

Page 47: Five steps perform_2013

shared_buffers

►1/4 of RAM on a dedicated server●not more than 8GB (test)●cache_miss statistics can tell you if you

need more

►less buffers to preserve cache space

13

W O

D

Page 48: Five steps perform_2013

Other memory parameters

►work_mem●non-shared

▬lower it for many connections▬raise it for large queries

●watch for signs of misallocation▬swapping RAM: too much work_mem▬log temp files: not enough work_mem

●probably better to allocate by task/ROLE

13

W O

D

Page 49: Five steps perform_2013

Other memory parameters

►maintenance_work_mem●the faster vacuum completes, the better

▬but watch out for multiple autovacuum workers!

●raise to 256MB to 1GB for large databases

●also used for index creation▬raise it for bulk loads

13

Page 50: Five steps perform_2013

Other memory parameters

►temp_buffers●max size of temp tables before swapping

to disk●raise if you use lots of temp tables

►wal_buffers●raise it to 32MB

13

D

Page 51: Five steps perform_2013

Commits

►checkpoint_segments●more if you have the disk: 16, 64, 128

►synchronous_commit●response time more important than data

integrity?●turn synchronous_commit = off●lose a finite amount of data in a

shutdown

13

W

Page 52: Five steps perform_2013

Query tuning

►effective_cache_size●RAM available for queries●set it to 3/4 of your available RAM

►default_statistics_target●raise to 200 to 1000 for large databases●now defaults to 100●setting statistics per column is better

13

D

Page 53: Five steps perform_2013

Query tuning

►effective_io_concurrency●set to number of disks or channels●advisory only●Linux only

13

Page 54: Five steps perform_2013

A word aboutRandom Page Cost►Abused as a “force index use”

parameter►Lower it if the seek/scan ratio of your

storage is actually different●SSD/NAND: 1.0 to 2.0●EC2: 1.1 to 2.0●High-end SAN: 2.5 to 3.5

►Never below 1.0

13

Page 55: Five steps perform_2013

Maintenance

►Autovacuum●leave it on for any application which gets

constant writes●not so good for batch writes -- do manual

vacuum for bulk loads

13

W O

D

Page 56: Five steps perform_2013

Maintenance

►Autovacuum●have 100's or 1000's of tables?

multiple_autovacuum_workers▬but not more than ½ cores

●large tables? raise autovacuum_vacuum_cost_limit

●you can change settings per table

13

Page 57: Five steps perform_2013

14OS &

Filesystem

Page 58: Five steps perform_2013

Spread Your Files Around

►Separate the transaction log if possible●pg_xlog directory●on a dedicated disk/array, performs

10-50% faster●many WAL options only work if you have

a separate drive

1 4O D

Page 59: Five steps perform_2013

Spread Your Files Around 1 4

number of drives/arrays 1 2 3which partition

OS/applications 1 1 1transaction log 1 1 2database 1 2 3

Page 60: Five steps perform_2013

Spread Your Files Around

►Tablespaces for temp files●more frequently useful if you do a lot of

disk sorts●Postgres can round-robin multiple temp

tablespaces

D

1 4

Page 61: Five steps perform_2013

Linux Tuning

►Filesystems●Use XFS or Ext4

▬butrfs not ready yet, may never work for DB▬Ext3 has horrible flushing behavior

●Reduce logging▬data=ordered, noatime, nodiratime

1 4

Page 62: Five steps perform_2013

Linux Tuning

►OS tuning●must increase shmmax, shmall in kernel●use deadline or noop scheduler to speed

writes●disable NUMA memory localization

(recent)●check your kernel version carefully for

performance issues!

1 4

Page 63: Five steps perform_2013

Linux Tuning

►Turn off the OOM Killer!● vm.oom-kill = 0● vm.overcommit_memory = 2● vm.overcommit_ratio = 80

1 4

Page 64: Five steps perform_2013

OpenSolaris/IIlumos

►Filesystems●Use ZFS

▬reduce block size to 8K

●turn off full_page_writes

►OS configuration●no need to configure shared memory●use packages compiled with Sun

compiler

W O

1 4

Page 65: Five steps perform_2013

Windows, OSX Tuning

►You're joking, right?

1 4

Page 66: Five steps perform_2013

What about The Cloud?

►Configuring for cloud servers is different●shared resources●unreliable I/O●small resource limits

►Also depends on which cloud●AWS, Rackspace, Joyent, GoGrid

… so I can't address it all here.

1 4

Page 67: Five steps perform_2013

What about The Cloud?

►Some general advice:●make sure your database fits in RAM

▬except on Joyent

●Don't bother with most OS/FS tuning▬just some basic FS configuration options

●use synchronous_commit = off if possible

1 4

Page 68: Five steps perform_2013

Set up Monitoring!

►Get warning ahead of time●know about performance problems

before they go critical●set up alerts

▬80% of capacity is an emergency!

●set up trending reports▬is there a pattern of steady growth

1 4

Page 69: Five steps perform_2013

1 5

Hardware

Page 70: Five steps perform_2013

Hardware Basics

►Four basic components:●CPU●RAM●I/O: Disks and disk bandwidth●Network

1 5

Page 71: Five steps perform_2013

Hardware Basics

►Different priorities for different applications●Web: CPU, Network, RAM, ... I/O●OLTP: balance all●DW: I/O, CPU, RAM

W

O

D

1 5

Page 72: Five steps perform_2013

Getting Enough CPU

►One Core, One Query●How many concurrent queries do you

need?●Best performance at 1 core per no more

than two concurrent queries

►So if you can up your core count, do►Also: L1, L2 cache size matters

1 5

Page 73: Five steps perform_2013

Getting Enough RAM

►RAM use is "thresholded"●as long as you are above the amount of

RAM you need, even 5%, server will be fast

●go even 1% over and things slow down a lot

1 5

Page 74: Five steps perform_2013

Getting Enough RAM

►Critical RAM thresholds●Do you have enough RAM to keep the

database in shared_buffers?▬Ram 3x to 6x the size of DB

W

1 5

Page 75: Five steps perform_2013

Getting Enough RAM

►Critical RAM thresholds●Do you have enough RAM to cache the

whole database? ▬RAM 2x to 3x the on-disk size of the database

●Do you have enough RAM to cache the “working set”?▬the data which is needed 95% of the time

O

1 5

Page 76: Five steps perform_2013

Getting Enough RAM

►Critical RAM thresholds●Do you have enough RAM for sorts &

aggregates?▬What's the largest data set you'll need to work with?

▬For how many users

D

1 5

Page 77: Five steps perform_2013

Other RAM Issues

►Get ECC RAM●Better to know about bad RAM before it

corrupts your data.

►What else will you want RAM for?●RAMdisk?●SWRaid?●Applications?

1 5

Page 78: Five steps perform_2013

Getting Enough I/O

►Will your database be I/O Bound?●many writes: bound by transaction log●database much larger than RAM: bound

by I/O for many/most queries

1 5

O

D

Page 79: Five steps perform_2013

Getting Enough I/O

►Optimize for the I/O you'll need●if you DB is terabytes, spend most of

your money on disks●calculate how long it will take to read

your entire database from disk▬backups▬snapshots

●don't forget the transaction log!

1 5

Page 80: Five steps perform_2013

I/O Decision Treelots ofwrites?

fits inRAM?

affordgood HW

RAID?

terabytesof data?

mirrored

SW RAID

HW RAID

StorageDevice

mostlyread?

RAID 5 RAID 1+0

Yes

No Yes

No

No

Yes

Yes

No

Yes No

1 5

Page 81: Five steps perform_2013

I/O Tips

►RAID●get battery backup and turn your write

cache on●SAS has 2x the real throughput of SATA●more spindles = faster database

▬big disks are generally slow

1 5

Page 82: Five steps perform_2013

I/O Tips

►DAS/SAN/NAS●measure lag time: it can kill response

time●how many channels?

▬“gigabit” is only 100mb/s▬make sure multipath works

●use fiber if you can afford it

1 5

Page 83: Five steps perform_2013

I/O Tips

iSCSI=

death

1 5

Page 84: Five steps perform_2013

SSD

►Very fast seeks●great for index access on large tables●up to 20X faster

►Not very fast random writes●low-end models can be slower than HDD●most are about 2X speed

►And use server models, not desktop!

D

1 5

Page 85: Five steps perform_2013

NAND (FusionIO)

All the advantages of SSD, Plus:►Very fast writes ( 5X to 20X )

●more concurrency on writes●MUCH lower latency

►But … very expensive (50X)

1 5

W O

Page 86: Five steps perform_2013

Tablespaces for NVRAM

►Have a "hot" and a "cold" tablespace●current data on "hot"●older/less important data on "cold"●combine with partitioning

►compromise between speed and size

1 5

O D

Page 87: Five steps perform_2013

Network

►Network can be your bottleneck●lag time●bandwith●oversubscribed switches●NAS

1 5

Page 88: Five steps perform_2013

Network

►Have dedicated connections●between appserver and database server●between database server and failover

server●between database and storage

1 5

Page 89: Five steps perform_2013

Network

►Data Transfers●Gigabit is only 100MB/s●Calculate capacity for data copies,

standby, dumps

1 5

Page 90: Five steps perform_2013

The Most ImportantHardware Advice:

►Quality matters●not all CPUs are the same●not all RAID cards are the same●not all server systems are the same●one bad piece of hardware, or bad driver,

can destroy your application performance

1 5

Page 91: Five steps perform_2013

The Most ImportantHardware Advice:

►High-performance databases means hardware expertise●the statistics don't tell you everything●vendors lie●you will need to research different

models and combinations●read the pgsql-performance mailing list

1 5

Page 92: Five steps perform_2013

The Most ImportantHardware Advice:►Make sure you test your hardware

before you put your database on it●“Try before you buy”●Never trust the vendor or your sysadmins

1 5

Page 93: Five steps perform_2013

The Most ImportantHardware Advice:►So Test, Test, Test!

●CPU: PassMark, sysbench, Spec CPU●RAM: memtest, cachebench, Stream●I/O: bonnie++, dd, iozone●Network: bwping, netperf●DB: pgBench, sysbench

1 5

Page 94: Five steps perform_2013

►Josh Berkus● [email protected]

● www.pgexperts.com▬ /presentations.html

● www.databasesoup.com

Questions? 1 6►More Advice

● www.postgresql.org/docs

● pgsql-performancemailing list

● planet.postgresql.org

● irc.freenode.net▬#postgresql

This talk is copyright 2013 Josh Berkus, and is licensed under the creative commons attribution license